
I want to extract some information that exists in DBpedia. So I've written an application using .NET's System.Net.WebClient that takes URLs and returns the content of each URL in N-Triples format (plain text).

The data the application extracts for the URL is:

<http://dbpedia.org/resource/AfghanistanCommunications> <http://dbpedia.org/ontology/wikiPageRedirects> <http://dbpedia.org/resource/Communications_in_Afghanistan> .
<http://dbpedia.org/resource/AfghanistanCommunications> <http://www.w3.org/ns/prov#wasDerivedFrom> <http://en.wikipedia.org/wiki/AfghanistanCommunications?oldid=74466499> .
<http://dbpedia.org/resource/AfghanistanCommunications> <http://xmlns.com/foaf/0.1/isPrimaryTopicOf> <http://en.wikipedia.org/wiki/AfghanistanCommunications> .
<http://dbpedia.org/resource/AfghanistanCommunications> <http://www.w3.org/2000/01/rdf-schema#label> "AfghanistanCommunications"@en .

But when I open the URL in my browser, I get very different content from what my application extracted.

I checked the request with Fiddler and then set the User-Agent header:

webClient.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");

Is DBpedia detecting the application as a bot and returning less data than it returns to a real browser, or have I missed something else?

Amir Pournasserian

1 Answer


What your application is requesting is certainly:

http://dbpedia.org/data/AfghanistanCommunications.ntriples

but what your Web browser is showing is:

http://dbpedia.org/data/Communications_in_Afghanistan.ntriples

In your Web browser, if you go to http://dbpedia.org/resource/AfghanistanCommunications or http://dbpedia.org/page/AfghanistanCommunications, you are redirected to http://dbpedia.org/page/Communications_in_Afghanistan, unless you ask for a specific format. The redirect exists because Wikipedia redirects http://en.wikipedia.org/wiki/AfghanistanCommunications to http://en.wikipedia.org/wiki/Communications_in_Afghanistan. You can see this in the triples you get with your application:

<http://dbpedia.org/ontology/wikiPageRedirects>
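
If you want your application to end up with the same data as the browser, one option is to look for that predicate in the triples you download and then request the data URL of the redirect target. The following is only a minimal sketch under a few assumptions: the class and method names (DbpediaRedirects, FindRedirectTarget, ToNTriplesUrl, DownloadFollowingRedirect) are hypothetical, the data URLs follow the /data/NAME.ntriples pattern shown above, the triples are matched with a simple regular expression instead of a real N-Triples parser, and only a single redirect hop is followed.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any DBpedia or .NET API.
static class DbpediaRedirects
{
    const string RedirectPredicate = "<http://dbpedia.org/ontology/wikiPageRedirects>";
    const string ResourcePrefix = "http://dbpedia.org/resource/";

    // Returns the URI of the redirect target, or null when the
    // triples contain no wikiPageRedirects statement.
    public static string FindRedirectTarget(string nTriples)
    {
        // Subject, the redirect predicate, then the object URI captured in group 1.
        var match = Regex.Match(
            nTriples,
            @"<[^>]+>\s+" + Regex.Escape(RedirectPredicate) + @"\s+<([^>]+)>\s*\.");
        return match.Success ? match.Groups[1].Value : null;
    }

    // Maps http://dbpedia.org/resource/NAME to http://dbpedia.org/data/NAME.ntriples
    // (assumes the URL pattern used earlier in this answer).
    public static string ToNTriplesUrl(string resourceUri)
    {
        if (!resourceUri.StartsWith(ResourcePrefix, StringComparison.Ordinal))
            throw new ArgumentException("Not a DBpedia resource URI", nameof(resourceUri));
        return "http://dbpedia.org/data/" + resourceUri.Substring(ResourcePrefix.Length) + ".ntriples";
    }

    // Downloads the triples; if they describe a redirect, downloads the target's triples instead.
    public static string DownloadFollowingRedirect(string dataUrl)
    {
        using (var client = new WebClient())
        {
            string triples = client.DownloadString(dataUrl);
            string target = FindRedirectTarget(triples);
            return target == null ? triples : client.DownloadString(ToNTriplesUrl(target));
        }
    }
}
```

With this sketch, calling DbpediaRedirects.DownloadFollowingRedirect("http://dbpedia.org/data/AfghanistanCommunications.ntriples") would first fetch the four triples from your question, find the redirect to Communications_in_Afghanistan, and then fetch http://dbpedia.org/data/Communications_in_Afghanistan.ntriples.
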
Antoine Zimmermann