I have a SPARQL query that follows a zick-zack (zigzag) pattern like the following:

?p1 :infector ?p.
?p2 :infector ?p1.
?p3 :infector ?p2.
?p4 :infector ?p3.
?p5 :infector ?p4
.................

Basically, the subject of one triple pattern is used as the object of the next one. Is there any way to generalize this pattern, so that I do not need a long list of variables (?p to ?p5)? I also do not know in advance how many such variables I will need, because I run the query multiple times, so I cannot come up with a fixed set of variables. I need something generic. If you have any idea how to make this query generic, please let me know. I would highly appreciate any help.

Clarification:

I have an RDF graph like the following.

<http://ndssl.bi.vt.edu/chicago/dendrogram/experiment_id#7385/cell_id#86304/infectee_pid#446734805/iteration#0> <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infector_pid> <http://ndssl.bi.vt.edu/chicago/person/pid#449563560>.

<http://ndssl.bi.vt.edu/chicago/dendrogram/experiment_id#7385/cell_id#86304/infectee_pid#446734805/iteration#0> <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infectee_pid> <http://ndssl.bi.vt.edu/chicago/person/pid#446734805>.

<http://ndssl.bi.vt.edu/chicago/dendrogram/experiment_id#7385/cell_id#86304/infectee_pid#446753456/iteration#0> <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infector_pid> <http://ndssl.bi.vt.edu/chicago/person/pid#446734805>.

<http://ndssl.bi.vt.edu/chicago/dendrogram/experiment_id#7385/cell_id#86304/infectee_pid#446753456/iteration#0> <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infectee_pid> <http://ndssl.bi.vt.edu/chicago/person/pid#446753456>.

.......................................................................

The following SPARQL query fetches the chain present in the RDF graph above.

select * from <http://ndssl.bi.vt.edu/chicago/>
where {
{
?s <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infector_pid> ?o1.
?s <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infectee_pid> ?o2
}

{
?s1 <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infector_pid> ?o2.
?s1 <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infectee_pid> ?o3
}
 ..........................................................................

}

This chain-type query consists of several parts, where the infectee ID from one part is used as the infector in the next part. My query has many such parts. Is there any way to generalize it, so that instead of writing so many parts I can use just one and get the result? By the way, I need the path length and the intermediate nodes as well.

Beautiful Mind
  • Possible duplicate of [SPARQL Querying Transitive](http://stackoverflow.com/questions/8569810/sparql-querying-transitive) –  Jun 07 '16 at 23:06
  • Thank you for your reply. I need to know path length and intermediate node information as well. Solution available in the link does not provide that information. – Beautiful Mind Jun 07 '16 at 23:15
  • I see. Is this closer to your problem? http://stackoverflow.com/q/4056008/824425 –  Jun 08 '16 at 00:10
  • Today I learned zigzag (in American English) is zickzack in some other parts of the world. :) – Joshua Taylor Jun 08 '16 at 14:17

1 Answer

Basically, the subject of one triple pattern is used as the object of the next one. Is there any way to generalize this pattern?

First, note that if you read your triple patterns in the other direction, it's not so much a zig-zag as just a chain:

?p5 :infector ?p4 .
?p4 :infector ?p3 .
?p3 :infector ?p2 .
?p2 :infector ?p1 .
?p1 :infector ?p0 .

That's easy to capture with a repetition property path:

?p5 :infector* ?p0

If you would rather have ?p0 appear first in the text of your query, invert the property path with ^:

?p0 ^:infector* ?p5

I need to know path length and intermediate node information as well.

Since you talk about "the path length", it sounds like you want a maximal path. This makes things a little bit trickier, but it can still be done. You can apply the approach from "Is it possible to get the position of an element in an RDF Collection in SPARQL?". To get the length of the path from ?begin to ?end, you can do something like:

select ?begin ?end (count(?mid) as ?length) {
  ?end :infector* ?mid .
  ?mid :infector* ?begin .
}
group by ?begin ?end

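That will find the length of every :infector path, including the trivial path from each node to itself. Note that count(?mid) counts the nodes on the path, both endpoints included, so on a simple chain the number of :infector edges is one less than ?length. If you only want maximal paths, you'll need to make sure that the path can't be extended in either direction from ?begin or ?end: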

select ?begin ?end (count(?mid) as ?length) {
  ?end :infector* ?mid .
  ?mid :infector* ?begin .

  filter not exists { ?begin :infector ?beginEx }
  filter not exists { ?endEx :infector ?end }
}
group by ?begin ?end
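To see why count(?mid) measures the path and what the two filters select, here is a small pure-Python sketch that mimics the queries above on toy data. It is only an illustration, not how a SPARQL engine evaluates property paths; the edge set EDGES and the helper names star and node_counts are made up for this example.

```python
from collections import Counter

# Hypothetical toy data: each pair (x, y) stands for the triple "x :infector y".
EDGES = {("p3", "p2"), ("p2", "p1"), ("p1", "p0")}


def star(edges):
    """All (x, y) pairs matched by `x :infector* y` (zero or more steps)."""
    nodes = {n for edge in edges for n in edge}
    closure = {(n, n) for n in nodes}  # zero-length paths match every node
    frontier = set(closure)
    while frontier:
        # Extend every newly found path by one more :infector step.
        step = {(x, z) for (x, y) in frontier for (y2, z) in edges if y == y2}
        frontier = step - closure
        closure |= frontier
    return closure


def node_counts(edges, maximal_only=False):
    """Mimic count(?mid) grouped by (?begin, ?end)."""
    closure = star(edges)
    has_infector = {x for (x, _) in edges}  # x :infector something
    is_infector = {y for (_, y) in edges}   # something :infector y
    counts = Counter()
    for end, mid in closure:                # ?end :infector* ?mid
        for mid2, begin in closure:         # ?mid :infector* ?begin
            if mid != mid2:
                continue
            # The two "filter not exists" clauses of the maximal-path query.
            if maximal_only and (begin in has_infector or end in is_infector):
                continue
            counts[(begin, end)] += 1
    return counts


all_counts = node_counts(EDGES)
maximal = node_counts(EDGES, maximal_only=True)
print(all_counts[("p0", "p3")])  # nodes on the path p3 .. p0, endpoints included
print(dict(maximal))             # only the pair that cannot be extended
```

On this four-node chain the count for (p0, p3) is the number of nodes on the path, so the number of :infector hops is that count minus one.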

Computing the length requires aggregating over the ?mid variable, so you can't get non-aggregate information about the middle nodes in the same query that computes the length. If you drop the aggregation, though, you can get information about the middle nodes:

select * {
  ?end :infector* ?mid .
  ?mid :infector* ?begin .

  filter not exists { ?begin :infector ?beginEx }
  filter not exists { ?endEx :infector ?end }

  #-- information about ?mid, e.g.,
  #-- ?mid rdfs:label ?midLabel . 
}
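If you need the path length and the ordered intermediate nodes together, one option is to do the walk on the client side. The following is a minimal sketch, assuming you have already fetched (infectee, infector) pairs with the two-triple pattern from the question, and assuming each person has at most one infector. The function name chain_to_root is hypothetical, and the IDs are the ones from the sample triples in the question.

```python
# (infectee, infector) rows, as the two-triple pattern from the question
# would return them; the IDs are taken from the question's sample data.
ROWS = [
    ("446734805", "449563560"),
    ("446753456", "446734805"),
]


def chain_to_root(rows, start):
    """Follow infectee -> infector links from `start` until no infector is known."""
    infector_of = dict(rows)  # assumption: at most one infector per person
    chain = [start]
    seen = {start}
    while chain[-1] in infector_of:
        nxt = infector_of[chain[-1]]
        if nxt in seen:  # guard against cyclic data
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain


chain = chain_to_root(ROWS, "446753456")
length = len(chain) - 1      # number of infector hops
intermediate = chain[1:-1]   # nodes strictly between the two endpoints
print(chain, length, intermediate)
```

This trades one generic query for a little post-processing, which may or may not be acceptable depending on how large the result set is.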
Joshua Taylor
  • It is not exactly what I am looking for. I made a new post that clarifies my question (http://stackoverflow.com/questions/37711732/sparql-chain-query). Sorry for the inconvenience. It will be a great help if you provide some feedback on my new post. – Beautiful Mind Jun 08 '16 at 19:54
  • It would be better to keep this thread open and edit the question. – UninformedUser Jun 08 '16 at 20:48
  • @S.M.ShamimulHasan I agree with AKSW; if you have clarifications on this question, please clarify this question in preference to creating a new one. – Joshua Taylor Jun 08 '16 at 20:56