I'm working with the following SPARQL query, which is an example on the web-based end of my institution's SPARQL endpoint:

SELECT ?building_number ?name ?occupants WHERE {
  ?site a org:Site ;
        rdfs:label "Highfield Campus" .

  ?building spacerel:within ?site ;
            skos:notation ?building_number ;
            rdfs:label ?name .

  OPTIONAL {
    ?building soton:buildingOccupants ?occ .
    ?occ rdfs:label ?occupants .
  } .
} ORDER BY ?name

The problem is that, as well as getting data from 'Buildings and Places' (the dataset I'm interested in, and the one I would expect the example to use), it also gets data from the 'Facilities and Equipment' dataset, which isn't relevant. You should see this if you follow the link.

I suspect the example may pre-date the addition of the Facilities and Equipment dataset, but even with the research I've done into SPARQL, I can't see a clear way to define which datasets to include.

Can anyone recommend a starting point to limit it to just show 'Buildings' or, more specifically, results from the 'Buildings and Places' dataset?

Thanks

Matthew Higgins

1 Answer

First things first, you really need to use SELECT DISTINCT, as otherwise you'll get repeated results.

To answer your question, you can use GRAPH { ... } to restrict parts of a SPARQL query so that they only match data from a specific dataset. This only works if the SPARQL endpoint is divided up into named graphs (this one is). The solution you asked for isn't the best choice, though, as it assumes that things within sites in the 'places' dataset will always be restricted to buildings. That's risky, as the dataset might end up containing trees and signposts at some time in the future.

Step one is to just find out what graphs are in play:

SELECT DISTINCT ?g1 ?building_number ?name ?occupants WHERE {
  ?site a org:Site ;
        rdfs:label "Highfield Campus" .

  GRAPH ?g1 {
    ?building spacerel:within ?site ;
              skos:notation ?building_number ;
              rdfs:label ?name .
  }

  OPTIONAL {
    ?building soton:buildingOccupants ?occ .
    ?occ rdfs:label ?occupants .
  } .
} ORDER BY ?name

Try it here: http://is.gd/WdRAGX

From this you can see that http://id.southampton.ac.uk/dataset/places/latest and http://id.southampton.ac.uk/dataset/places/facilities are the two relevant ones.

To only look for things 'within' a site according to the "places" graph, use:

SELECT DISTINCT ?building_number ?name ?occupants WHERE {
  ?site a org:Site ;
        rdfs:label "Highfield Campus" .

  GRAPH <http://id.southampton.ac.uk/dataset/places/latest> {
    ?building spacerel:within ?site ;
              skos:notation ?building_number ;
              rdfs:label ?name .
  }

  OPTIONAL {
    ?building soton:buildingOccupants ?occ .
    ?occ rdfs:label ?occupants .
  } .
} ORDER BY ?name

Alternate solutions:


Using rdf:type

The above answers the question you asked, but it isn't the best answer to your underlying problem. Restricting by rdf:type is more semantic, as it actually says 'only give me buildings within the campus', which is what you really mean.

Instead of filtering by graph, you can restrict ?building to be of class 'building', which research facilities are not. Facilities are still sometimes listed as 'within' a site, usually when the university has only published which campus they are on but not which building.

?building a rooms:Building 
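
For context, here is a rough sketch of how that line might sit in the full query. Two assumptions are baked in: the rooms: prefix is bound to http://vocab.deri.ie/rooms# (the namespace this vocabulary is commonly published under; check the endpoint's own prefix list), and the other prefixes (org:, rdfs:, spacerel:, skos:, soton:) are pre-declared by the endpoint's web form, as in the queries above.

```sparql
# Assumption: rooms: is not pre-declared by the endpoint, and this is
# its namespace. Verify against the endpoint's own documentation.
PREFIX rooms: <http://vocab.deri.ie/rooms#>

SELECT DISTINCT ?building_number ?name ?occupants WHERE {
  ?site a org:Site ;
        rdfs:label "Highfield Campus" .

  # The type triple keeps only things declared to be buildings,
  # whichever dataset they come from.
  ?building a rooms:Building ;
            spacerel:within ?site ;
            skos:notation ?building_number ;
            rdfs:label ?name .

  OPTIONAL {
    ?building soton:buildingOccupants ?occ .
    ?occ rdfs:label ?occupants .
  }
} ORDER BY ?name
```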

Using FILTER

In extreme cases you may not have data in different GRAPHS and there may not be an elegant relationship to use to filter your results. In this case you can use a FILTER and turn the building URI into a string and use a regular expression to match acceptable ones:

FILTER regex(str(?building), "^http://id.southampton.ac.uk/building/")  

This is by far the worst option; don't use it unless you have to.


Belt and Braces

You can use any of these restrictions together, and a combination of restricting the GRAPH plus ensuring that all ?buildings really are buildings would be my recommended solution.
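
A minimal sketch of that combination, under some assumptions: rooms: is bound to http://vocab.deri.ie/rooms# (check the endpoint's prefix list), the remaining prefixes are pre-declared by the endpoint's web form, and the endpoint's default graph covers all datasets (the original query matching both suggests it does). The type triple is kept outside the GRAPH block so the sketch does not depend on which graph holds it.

```sparql
# Assumption: rooms: needs declaring and this is its namespace.
PREFIX rooms: <http://vocab.deri.ie/rooms#>

SELECT DISTINCT ?building_number ?name ?occupants WHERE {
  ?site a org:Site ;
        rdfs:label "Highfield Campus" .

  # Only accept 'within' statements published in the places dataset.
  GRAPH <http://id.southampton.ac.uk/dataset/places/latest> {
    ?building spacerel:within ?site ;
              skos:notation ?building_number ;
              rdfs:label ?name .
  }

  # Matched against the default graph, so it holds regardless of
  # which dataset asserts the type.
  ?building a rooms:Building .

  OPTIONAL {
    ?building soton:buildingOccupants ?occ .
    ?occ rdfs:label ?occupants .
  }
} ORDER BY ?name
```

If the places dataset later gains non-building things within a site, the type triple should still exclude them; if a building is ever listed as 'within' a site by another dataset only, the GRAPH block will exclude it.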

Christopher Gutteridge
  • Absolutely perfect, thank you very much, it's just worth noting that in your example, the prefix `rooms:` wasn't defined, but when I defined it as `PREFIX rooms: ` it worked absolutely perfectly! Thank you very much! – Matthew Higgins Mar 10 '13 at 15:01