I am very new to DBpedia and don't know how or where to start. From some research, my understanding is that the data can be accessed with the SPARQL query language (via Apache Jena). So I downloaded the .ttl file for the Ontology Infobox Properties. Extracted, the file is almost 2GB. This is where my problem started: no editor is able to open it. My sample program to access this file is here...
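package jena.tutorial;

import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.query.ResultSetFormatter;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.util.FileManager;

public class OntologyExample {
    public static void main(String[] args) {
        FileManager.get().addLocatorClassLoader(
                OntologyExample.class.getClassLoader());
        // Reads the whole .ttl file into an in-memory model
        Model model = FileManager
                .get()
                .loadModel("D:\\Dell XPS\\DBPEDIA\\instance_types_en.ttl\\instance_types_en.ttl");
        // Triple patterns are separated by " ." so the query parses
        String q = "SELECT * WHERE { "
                + "?e <http://dbpedia.org/ontology/series> <http://dbpedia.org/resource/The_Sopranos> . "
                + "?e <http://dbpedia.org/ontology/releaseDate> ?date . "
                + "?e <http://dbpedia.org/ontology/episodeNumber> ?number . "
                + "?e <http://dbpedia.org/ontology/seasonNumber> ?season . "
                + "} " + "ORDER BY DESC(?date)";
        Query query = QueryFactory.create(q);
        QueryExecution queryExecution = QueryExecutionFactory.create(query,
                model);
        ResultSet resultSet = queryExecution.execSelect();
        ResultSetFormatter.out(System.out, resultSet, query);
        queryExecution.close();
    }
}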
So the input for this program is that 2GB file. When I run this sample program, it throws the following exception:
Exception in thread "main" com.hp.hpl.jena.n3.turtle.TurtleParseException: GC overhead limit exceeded
at com.hp.hpl.jena.n3.turtle.ParserTurtle.parse(ParserTurtle.java:63)
at com.hp.hpl.jena.n3.turtle.TurtleReader.readWorker(TurtleReader.java:33)
at com.hp.hpl.jena.n3.JenaReaderBase.readImpl(JenaReaderBase.java:119)
at com.hp.hpl.jena.n3.JenaReaderBase.read(JenaReaderBase.java:84)
at com.hp.hpl.jena.rdf.model.impl.ModelCom.read(ModelCom.java:268)
at com.hp.hpl.jena.util.FileManager.readModelWorker(FileManager.java:403)
at com.hp.hpl.jena.util.FileManager.loadModelWorker(FileManager.java:306)
at com.hp.hpl.jena.util.FileManager.loadModel(FileManager.java:258)
at jena.tutorial.OntologyExample.main(OntologyExample.java:18)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOfRange(Unknown Source)
at java.lang.String.<init>(Unknown Source)
at org.apache.jena.iri.impl.LexerPath.yytext(LexerPath.java:420)
at org.apache.jena.iri.impl.AbsLexer.rule(AbsLexer.java:81)
at org.apache.jena.iri.impl.LexerPath.yylex(LexerPath.java:711)
at org.apache.jena.iri.impl.AbsLexer.analyse(AbsLexer.java:52)
at org.apache.jena.iri.impl.Parser.<init>(Parser.java:108)
at org.apache.jena.iri.impl.IRIImpl.<init>(IRIImpl.java:65)
at org.apache.jena.iri.impl.AbsIRIImpl.create(AbsIRIImpl.java:692)
at org.apache.jena.iri.IRI.resolve(IRI.java:432)
at com.hp.hpl.jena.n3.IRIResolver.resolve(IRIResolver.java:167)
at com.hp.hpl.jena.n3.turtle.ParserBase._resolveIRI(ParserBase.java:198)
at com.hp.hpl.jena.n3.turtle.ParserBase.resolveIRI(ParserBase.java:192)
at com.hp.hpl.jena.n3.turtle.ParserBase.resolveQuotedIRI(ParserBase.java:183)
at com.hp.hpl.jena.n3.turtle.parser.TurtleParser.IRI_REF(TurtleParser.java:737)
at com.hp.hpl.jena.n3.turtle.parser.TurtleParser.IRIref(TurtleParser.java:680)
at com.hp.hpl.jena.n3.turtle.parser.TurtleParser.GraphTerm(TurtleParser.java:496)
at com.hp.hpl.jena.n3.turtle.parser.TurtleParser.VarOrTerm(TurtleParser.java:420)
at com.hp.hpl.jena.n3.turtle.parser.TurtleParser.TriplesSameSubject(TurtleParser.java:150)
at com.hp.hpl.jena.n3.turtle.parser.TurtleParser.Statement(TurtleParser.java:97)
at com.hp.hpl.jena.n3.turtle.parser.TurtleParser.parse(TurtleParser.java:67)
at com.hp.hpl.jena.n3.turtle.ParserTurtle.parse(ParserTurtle.java:49)
... 8 more
I am running this code from Eclipse, and here are my eclipse.ini settings:
org.eclipse.epp.package.jee.product
--launcher.defaultAction
openFile
--launcher.XXMaxPermSize
512M
-showsplash
org.eclipse.platform
--launcher.XXMaxPermSize
512m
--launcher.defaultAction
openFile
-vmargs
-Dosgi.requiredJavaVersion=1.5
-Xms1024m
-Xmx2048m
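One thing worth checking here: the -Xms/-Xmx values in eclipse.ini size the JVM that runs the Eclipse IDE itself, while a program launched from Eclipse runs in a separate JVM whose heap comes from the VM arguments of its run configuration. A minimal sketch for printing the heap the program actually gets is below; the class name HeapCheck and its method are hypothetical and not part of the program above.

```java
// Hypothetical helper, not part of the original program: reports the
// maximum heap available to the JVM that runs this class.
class HeapCheck {

    /** Returns the maximum heap this JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this program: "
                + maxHeapMiB() + " MiB");
    }
}
```

If the printed value is well below 2048 MiB, the eclipse.ini setting is not reaching the program, and the heap would have to be raised in the run configuration's VM arguments instead.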
So my questions are:
- How can I access such large files?
- How can I use DBpedia properly?
Please help, I am stuck here. I am doing a project on DBpedia.