
Are there any existing ways of using the Freebase data dumps to create a database similar to what Freebase offers, but on your own server? Essentially Freebase, but hosted locally rather than accessed through the API?

I guess it would be possible to build, but are there any existing solutions for this already? Or any alternative sources of similar data that don't require an API? I didn't find anything like this for DBpedia either :|

freakshow

5 Answers


An alternative to freebase-quad-rdfize is here: https://github.com/castagna/freebase2rdf

I use Apache Jena's TDB store to load the RDF data and Fuseki to serve it over HTTP via the SPARQL protocol.


There is now another option as well: http://basekb.com/

castagna

I'm the creator of :BaseKB, the first usable conversion of Freebase to RDF.

There are key integrity problems in the Freebase quad dump that make it hard to get fully correct results from the quad dump. :BaseKB reconstructs the key structure of Freebase so that the unique name assumption holds. This is important, because the ability to write simple SPARQL queries that work like SQL queries depends on this.
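To make that concrete, here is a sketch of the kind of simple, SQL-like SPARQL query this enables. The prefix and the property URIs below are illustrative guesses, not necessarily :BaseKB's actual namespace:

```sparql
# Illustrative only: the prefix and property names are assumptions,
# not necessarily :BaseKB's actual vocabulary.
PREFIX fb: <http://rdf.basekb.com/ns/>

SELECT ?name WHERE {
  ?topic fb:type.object.type fb:people.person .
  ?topic fb:type.object.name ?name .
}
LIMIT 10
```

When the unique name assumption holds, each ?topic binds to exactly one node per real-world entity, so a query like this behaves like a straightforward SQL join instead of requiring extra identity-resolution steps.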

Right now, :BaseKB exists in two editions. There's a free edition that consists of 120 million facts about 4 million topics (the ones from Wikipedia) and there's a "Pro" edition that contains everything.

As for the performance issues brought up by Philip Kendall, I can say that it's mostly a matter of having enough RAM. With 24GB of RAM I can load the free edition into a triple store in an hour. Some queries take longer than I'd like, but overall query performance is good.

Anyone who wants to use the "Pro" edition is going to need unusually powerful hardware and will spend a good deal of effort getting their toolchain to work. I'm working with partners right now to deliver "Pro" to users in a satisfactory way.

  • Paul, can you expand on the hardware needed to run Pro? Can you expand on the partners? Is BaseKB expanding beyond Freebase/DBpedia data? Thanks – Luis Miguel Aug 28 '13 at 21:07

Importing the data into a triple store of your choice wouldn't be hard, but you'll have great difficulty getting any answers out in a reasonable time unless you're doing something trivial.

Someone did import the whole dataset into MySQL a few years ago; it took two weeks to load, and even a simple query like "the count of things typed as a person" took over a minute to answer. That was on big hardware, and the dataset is much bigger now than it was then.
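That "count of things typed as a person" query is, at bottom, just a scan-and-count over type assertions. Here is a minimal stdlib Python sketch over a toy stand-in for the quad dump; the sample rows and the helper function are invented for illustration, but at Freebase's scale this same scan is exactly what takes minutes without the right indexes:

```python
import csv
import io

# A tiny stand-in for the Freebase quad dump: tab-separated
# (subject, predicate, object, value) lines. The MIDs and the
# /type/object/type predicate follow Freebase conventions, but
# these sample rows are invented for illustration.
sample = io.StringIO(
    "/m/0jcx\t/type/object/type\t/people/person\t\n"
    "/m/0jcx\t/type/object/name\t/lang/en\tAlbert Einstein\n"
    "/m/02mjmr\t/type/object/type\t/people/person\t\n"
    "/m/05cgv\t/type/object/type\t/location/country\t\n"
)

def count_typed_as(quads, type_id):
    """Count distinct subjects carrying a given /type/object/type."""
    subjects = set()
    for subject, predicate, obj, _value in csv.reader(quads, delimiter="\t"):
        if predicate == "/type/object/type" and obj == type_id:
            subjects.add(subject)
    return len(subjects)

print(count_typed_as(sample, "/people/person"))  # → 2
```

A triple store with a predicate-object index answers this directly; a naive single-table layout has to touch every row, which is where the minute-plus query times come from.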

Philip Kendall
  • Gotta understand how Freebase data is laid out and then optimize it before attempting to load in MySQL. One way is described here - http://stackoverflow.com/a/12428232/756579 (loads all of Freebase and response time is fractions of a second). – ivaylo_iliev Sep 25 '12 at 16:47

Take a look at the freebase-quad-rdfize project on Google Code. It should allow you to download the weekly Freebase quad dump and load it into the RDF triple store of your choice.
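For a feel of what such a conversion involves, here is a much-simplified Python sketch that turns one tab-separated quad into an N-Triples line. This is not freebase-quad-rdfize's actual logic; it glosses over the hard cases (language-tagged literals, CVTs, keys, escaping) that the real tool handles:

```python
# Much-simplified sketch of a quad-to-RDF conversion: turn one
# tab-separated Freebase quad into an N-Triples line. The base URI
# matches the one Freebase's own RDF service used, but everything
# else is a naive illustration, not the real converter's behaviour.

BASE = "http://rdf.freebase.com/ns"

def fb_uri(fb_id):
    # /people/person -> <http://rdf.freebase.com/ns/people.person>
    return "<%s/%s>" % (BASE, fb_id.strip("/").replace("/", "."))

def quad_to_ntriple(line):
    subject, predicate, obj, value = line.rstrip("\n").split("\t")
    if value:  # literal object (e.g. a name); naive escaping only
        o = '"%s"' % value.replace('"', '\\"')
    else:      # object is another Freebase id
        o = fb_uri(obj)
    return "%s %s %s ." % (fb_uri(subject), fb_uri(predicate), o)

print(quad_to_ntriple("/m/0jcx\t/type/object/type\t/people/person\t"))
```

Once the dump is in N-Triples form, any standard RDF triple store can bulk-load it.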

Shawn Simister

If you can export the database to, say, tab-delimited or comma-separated values in a TXT file, or to database files such as MDB or XLS, or any other highly portable data format, you'd have no problem building your own MySQL database on your computer from that data. The main thing is making sure you can export the data in a form from which you can rebuild your own database.
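As a sketch of that workflow, here is a stdlib Python example that rebuilds a small table from a tab-delimited export, using SQLite as a stand-in for MySQL. The column names and sample rows are invented for illustration:

```python
import csv
import io
import sqlite3

# A tiny stand-in for a tab-delimited export; the columns and rows
# are invented for illustration.
export = io.StringIO(
    "id\tname\ttype\n"
    "/m/0jcx\tAlbert Einstein\t/people/person\n"
    "/m/05cgv\tSwitzerland\t/location/country\n"
)

# SQLite here stands in for MySQL: same idea, zero setup.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE topics (id TEXT PRIMARY KEY, name TEXT, type TEXT)")

# DictReader yields one dict per row, keyed by the header line,
# which feeds executemany's named placeholders directly.
reader = csv.DictReader(export, delimiter="\t")
db.executemany(
    "INSERT INTO topics (id, name, type) VALUES (:id, :name, :type)",
    reader,
)

row = db.execute("SELECT name FROM topics WHERE id = '/m/0jcx'").fetchone()
print(row[0])  # → Albert Einstein
```

The same pattern (stream the delimited file, bulk-insert) is how you'd feed MySQL's LOAD DATA INFILE or a batched INSERT script.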

DoctorLouie