
Is it possible to count the number of edges that connect two instances with a SPARQL query? I want to find a path between them.

Joshua Taylor
user2837896
  • Yes and no… Do you want just a path made of a specific property? Will there be just one path between the individuals in the graph? – Joshua Taylor Oct 25 '13 at 11:29
  • You'll need to elaborate more on what your data is, and what exactly you want as a result (a list of the edges of the path, the length of the path, etc.). In the meantime, you might find [Finding all steps in property path](http://stackoverflow.com/q/18024413/1281433) and [Is it possible to get the position of an element in an RDF Collection in SPARQL?](http://stackoverflow.com/q/17523804/1281433) helpful. – Joshua Taylor Oct 25 '13 at 11:30
  • Ah, it took me a minute or two to find it, but you should also look at [Calculate length of path between nodes?](http://stackoverflow.com/q/5198889/1281433). Still, we need to clarify whether you're looking to _count_ the number of edges and so find the _length_, or if you're looking for the actual _path_, which is a collection of edges. – Joshua Taylor Oct 25 '13 at 11:47
  • Also, I see that you've tagged this with [tag:dbpedia], but there's no mention of DBpedia in the question. Does this question involve DBpedia in an essential way? – Joshua Taylor Oct 25 '13 at 11:49
  • Yes, the instances come from DBpedia – user2837896 Oct 25 '13 at 13:06
  • OK, the problem doesn't _really_ depend essentially on DBpedia, then (since you could be querying data from another source, too). I suppose it is worth noting that the DBpedia SPARQL endpoint supports SPARQL 1.1, so it _does_ support property paths, and that's relevant. – Joshua Taylor Oct 25 '13 at 15:09
  • Did you end up having any luck with this? – Joshua Taylor Oct 31 '13 at 02:52

1 Answer


You can count the number of edges in a unique path using SPARQL's property paths and aggregate functions. For instance, with data like this, which contains two paths that we care about (a to c with two edges, and d to g with three edges):

@prefix : <https://stackoverflow.com/questions/19587520/sparql-path-between-two-instance/> .

:a :p :b .  # a to c is a path of length 2
:b :p :c .  

:d :p :e .  # d to g is a path of length 3
:e :p :f .
:f :p :g . 

you can use a query like the following one. Notice that I've used the specific property :p rather than a variable. This is necessary because the SPARQL 1.1 specification (section 9.1, Property Path Syntax) doesn't allow variables in property paths.

prefix : <https://stackoverflow.com/questions/19587520/sparql-path-between-two-instance/>

select ?start ?end (count(?mid) as ?length)
where {
  values (?start ?end) { (:a :c) (:d :g) }
  ?start :p+ ?mid .
  ?mid :p* ?end .
}
group by ?start ?end 

and get results like this:

$ sparql --query query.rq --data data.n3
------------------------
| start | end | length |
========================
| :d    | :g  | 3      |
| :a    | :c  | 2      |
------------------------

A fuller description of what's happening here can be found in:

The basic idea, though, is that if you have a path from ?start to ?end, then you've also got, for a bunch of different values of ?mid, a path from ?start to ?mid and a path from ?mid to ?end. The number of different values that you can pick for ?mid (if you allow one of the endpoints, and disallow the other) is exactly the length of the path.
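To make the counting argument concrete, here is a minimal Python sketch that mimics the query on the sample data above. It is not how a SPARQL engine evaluates property paths; the helper names `reachable` and `path_length` are made up for illustration, and, like the query, it only gives the path length when the path between the two nodes is unique.

```python
from collections import defaultdict

# Sample data from above, as (subject, object) pairs for the single property :p.
EDGES = [("a", "b"), ("b", "c"), ("d", "e"), ("e", "f"), ("f", "g")]


def reachable(edges, start, include_start):
    """Nodes reachable from start over one or more edges (like :p+),
    or over zero or more edges (like :p*) when include_start is True."""
    succ = defaultdict(set)
    for s, o in edges:
        succ[s].add(o)
    seen = set()
    stack = list(succ[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(succ[node])
    if include_start:
        seen.add(start)
    return seen


def path_length(edges, start, end):
    """Count the ?mid values with  start :p+ ?mid  and  ?mid :p* end."""
    mids = [m for m in reachable(edges, start, False)
            if end in reachable(edges, m, True)]
    return len(mids)


print(path_length(EDGES, "a", "c"))
print(path_length(EDGES, "d", "g"))
```

For a to c, the candidate values of ?mid are b and c (the start is excluded by `:p+`, the end is allowed by `:p*`), so the count matches the two edges on the path.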

Joshua Taylor
  • Does the property have to be specified? Can't I use `?p*` or `?p+` instead? – user2837896 Oct 25 '13 at 13:18
  • @user2837896 The property _does_ need to be specified. I updated my answer regarding this. The [property path syntax](http://www.w3.org/TR/sparql11-query/#pp-language) doesn't allow variables, unfortunately. – Joshua Taylor Oct 25 '13 at 15:07
  • I don't want to specify the property. I want to find a shortest path between the resources without specifying the property. Is that possible in Java? – user2837896 Oct 25 '13 at 16:05
  • Well, I don't think that you'll be able to do it using SPARQL, anyhow. If you had the data locally, you could certainly start at a resource and do an [iterative deepening DFS](https://en.wikipedia.org/wiki/Iterative_deepening_depth-first_search), or a [BFS](https://en.wikipedia.org/wiki/Breadth_first_search), but DBpedia data is _big_, so getting it locally will be a bit difficult. You could iteratively run some queries, but that will probably get you far too much data to send back and forth conveniently, and you'll probably run into DBpedia's rate limiting. – Joshua Taylor Oct 25 '13 at 16:16
  • I suppose that you could successively try longer and longer paths in the SPARQL, e.g., `<source> ?p <target>`, and then `<source> ?p1 [ ?p2 <target> ]`, and then `<source> ?p1 [ ?p2 [ ?p3 <target> ]]`, and so on. Assuming that your paths aren't too long, this probably wouldn't take too many iterations, but it's probably not a _cheap_ query, either, and it would probably take quite some time to complete. It also occurs to me now that you haven't specified whether you want directed paths or undirected paths. E.g., if Bill and George are both of type Person, should `Bill a Person ^a George` count as a path? – Joshua Taylor Oct 25 '13 at 17:01
  • I found this interesting paper : http://ceur-ws.org/Vol-996/papers/ldow2013-paper-04.pdf – user2837896 Oct 26 '13 at 08:43
  • For more expressive pattern matching you may want to look into Neo4j: [home](http://www.neo4j.org/), [linked data](http://www.neo4j.org/develop/linked_data), [dbpedia4neo](https://github.com/claudiomartella/dbpedia4neo). With the data in Neo4j you can write queries which traverse exactly one relationship type (property), several distinct types, or which are completely agnostic to types, or a combination. See [cypher patterns](http://docs.neo4j.org/chunked/milestone/introduction-pattern.html) for more info. – jjaderberg Oct 28 '13 at 21:27
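As a rough illustration of the BFS suggestion in the comments, here is a minimal Python sketch that finds a shortest path between two resources without fixing the property. It assumes the triples are already available locally as plain `(subject, property, object)` tuples. The function name `shortest_path`, the `undirected` flag, and the sample triples are all hypothetical, and nothing here talks to DBpedia or any SPARQL endpoint.

```python
from collections import defaultdict, deque


def shortest_path(triples, source, target, undirected=False):
    """Breadth-first search over local triples.

    Returns a shortest path as a list of (subject, property, object) triples,
    an empty list if source == target, or None if no path exists.
    With undirected=True, edges may also be followed from object to subject.
    """
    neighbors = defaultdict(list)  # node -> [(next node, triple used)]
    for s, p, o in triples:
        neighbors[s].append((o, (s, p, o)))
        if undirected:
            neighbors[o].append((s, (s, p, o)))

    came_from = {source: None}  # node -> (previous node, triple used)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back to the source, collecting the triples used.
            path = []
            while came_from[node] is not None:
                prev, triple = came_from[node]
                path.append(triple)
                node = prev
            return path[::-1]
        for nxt, triple in neighbors[node]:
            if nxt not in came_from:
                came_from[nxt] = (node, triple)
                queue.append(nxt)
    return None


# Hypothetical sample data, echoing the Bill/George example from the comments.
TRIPLES = [
    ("Bill", "knows", "Ann"),
    ("Ann", "worksFor", "Acme"),
    ("Bill", "type", "Person"),
    ("George", "type", "Person"),
]

print(shortest_path(TRIPLES, "Bill", "Acme"))
print(shortest_path(TRIPLES, "Bill", "George"))
print(shortest_path(TRIPLES, "Bill", "George", undirected=True))
```

The `undirected` flag reflects the directed-versus-undirected question raised above: with directed edges there is no path from Bill to George, while the undirected search connects them through Person. The length of the path is simply the length of the returned list.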