I'm a beginner in SPARQL and I'm working with this endpoint: http://spcdata.digitpa.gov.it:8899/sparql. I'd like to join its data with data from the DBpedia graph, using the owl:sameAs property to reference DBpedia resources.

I'd like to fetch the name and population of every city in the class pa:Comune for which a dbp:populationTotal value is defined. Here is my query:

PREFIX pa:  <http://spcdata.digitpa.gov.it/> 
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?label ?populationTotal WHERE {
  ?s a pa:Comune .
  ?s rdfs:label ?label .
  ?s owl:sameAs ?sameAs .
  ?sameAs dbp:populationTotal ?populationTotal .
}
ORDER BY ?label

Unfortunately, although the results are correct, I get only a very small subset of them. I've checked, and many more municipalities have a DBpedia reference with a value for the dbp:populationTotal property. I've tried all the different sponge values, but the results are still the same. I guess the problem might be that I'm fetching data from another graph, but I don't know what to do.


EDIT: I've tried this query after Ian Dickinson's suggestion, and it works!

PREFIX pa:   <http://spcdata.digitpa.gov.it/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp:  <http://dbpedia.org/ontology/>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?label ?sameAs ?populationTotal WHERE {
  ?s a pa:Comune .
  ?s rdfs:label ?label .
  ?s owl:sameAs ?sameAs .
  # Keep only the owl:sameAs links that point to DBpedia
  FILTER (REGEX(STR(?sameAs), "dbpedia", "i"))
  # Fetch the population from the remote DBpedia endpoint
  SERVICE <http://dbpedia.org/sparql> {
    ?sameAs dbp:populationTotal ?populationTotal .
  }
} LIMIT 1700

Unfortunately, there are 8000+ municipalities in Italy, so I had to cap the results (hence the LIMIT 1700, which is the highest number of hits I can get without a timeout).
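
One possible way around the cap is to page through the local municipalities in a subquery, so that each request sends only a bounded batch to DBpedia. This is only a sketch: the page size of 500 is an arbitrary assumption, and whether the endpoint evaluates the subquery before the SERVICE clause depends on the Virtuoso version and configuration.

```sparql
PREFIX pa:   <http://spcdata.digitpa.gov.it/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp:  <http://dbpedia.org/ontology/>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>

SELECT ?label ?sameAs ?populationTotal WHERE {
  {
    # One page of local municipalities that link to DBpedia.
    # ORDER BY keeps the pages stable from one request to the next.
    SELECT DISTINCT ?label ?sameAs WHERE {
      ?s a pa:Comune ;
         rdfs:label ?label ;
         owl:sameAs ?sameAs .
      FILTER (REGEX(STR(?sameAs), "dbpedia", "i"))
    }
    ORDER BY ?sameAs
    LIMIT 500    # assumed page size
    OFFSET 0     # then 500, 1000, ... on later requests
  }
  # Look up the population for this page only
  SERVICE <http://dbpedia.org/sparql> {
    ?sameAs dbp:populationTotal ?populationTotal .
  }
}
```

Running the query repeatedly while increasing OFFSET by the page size, until a page comes back empty, should cover every linked municipality.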

Dev-otchka

1 Answer

It's not clear to me what data source your Virtuoso endpoint is connected to, but there are not many places with a population total in your dataset. The following query returns only 28 results:

PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT * WHERE {
  ?sa dbo:populationTotal ?total
}

As you observe, the same query run against the DBpedia SPARQL endpoint returns many more results. I can only surmise that you have loaded only a subset of the data into the Virtuoso graph that you have put up at http://spcdata.digitpa.gov.it:8899/sparql.
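
To see how large the gap is, a query along these lines might help. It is a sketch that reuses the pa:Comune class and owl:sameAs links from the question, and it counts the DBpedia-linked municipalities for which the local store holds no dbo:populationTotal triple:

```sparql
PREFIX pa:  <http://spcdata.digitpa.gov.it/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT (COUNT(DISTINCT ?sameAs) AS ?missingPopulation) WHERE {
  ?s a pa:Comune ;
     owl:sameAs ?sameAs .
  # Only consider links that point to DBpedia
  FILTER (REGEX(STR(?sameAs), "dbpedia", "i"))
  # Keep the resources with no population triple in the local data
  FILTER NOT EXISTS { ?sameAs dbo:populationTotal ?population }
}
```

A large count here would be consistent with the DBpedia data being only partially loaded locally.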

Ian Dickinson
  • The endpoint is connected to this data: http://spcdata.digitpa.gov.it/dataIPA.html (sorry, it's in Italian only). I was convinced I could reference DBpedia data, but it seems it is only partially loaded, as you suggested. I've also tried the SERVICE keyword, but with no success. Do you suggest using DBpedia only? I'm having a hard time finding a class for municipalities. – Dev-otchka May 28 '13 at 23:43
  • Federated queries can be tricky. Is there something special about the 28 municipalities that are loaded into the `spcdata.digitpa.gov.it` endpoint? (You're right - I can't read the Italian I'm afraid). This link may help with using the service keyword with Virtuoso's SPARQL implementation: http://boards.openlinksw.com/phpBB3/viewtopic.php?f=12&t=1709 – Ian Dickinson May 28 '13 at 23:52
  • I couldn't say! It seems those 28 records don't share anything special. I tried your suggestion, and it worked quite well, thanks! Check my edit. – Dev-otchka Jun 06 '13 at 15:04