Dataset that was used for the Billion Triples Challenge 2010:
The major part of the dataset was crawled from the Web of Linked Data during March/April 2010 based on datasets provided by Falcon-S, Sindice, Swoogle, SWSE, and Watson using the MultiCrawler/SWSE framework. We also included partial data from data.gov and data.gov.uk.
The downloaded content was parsed using the Redland toolkit with the rdfxml parser. We rewrote blank node identifiers to include the data source, so that blank nodes are unique to each data source, and appended the data source to the output file. The data is encoded in N-Quads format and split into chunks of 10 million statements each.
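The blank node rewriting and chunking described above can be sketched roughly as follows. This is an illustration only, not the code used to build the dataset: the function names `scope_bnodes` and `chunked`, the hash-based label scheme, and the simplified blank node pattern are all assumptions, since the page does not state how the identifiers were actually rewritten.

```python
import hashlib
import re
from itertools import islice

# Matches an IRI, a quoted literal, or a blank node label. Only the blank
# node alternative captures a group, so IRIs and literals pass through as-is.
# The label pattern is simplified and does not cover every legal label.
TOKEN = re.compile(r'<[^>]*>|"(?:[^"\\]|\\.)*"|_:([A-Za-z0-9]+)')


def scope_bnodes(triple_line, source):
    """Turn one N-Triples line into an N-Quads line scoped to `source`."""
    # Assumed scheme: prefix each label with a short hash of the source URL,
    # so equal labels from different sources no longer collide.
    tag = hashlib.sha1(source.encode("utf-8")).hexdigest()[:8]
    body = triple_line.strip()
    if not body.endswith("."):
        raise ValueError("expected a statement terminated by '.'")
    body = body[:-1].rstrip()

    def rewrite(match):
        if match.group(1) is None:
            return match.group(0)  # IRI or literal: leave unchanged
        return "_:x" + tag + "x" + match.group(1)

    # Append the data source as the fourth element of the statement.
    return TOKEN.sub(rewrite, body) + " <" + source + "> ."


def chunked(statements, size):
    """Yield successive lists of at most `size` statements."""
    iterator = iter(statements)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

With a scheme like this, the label `_:b0` found in two different documents becomes two distinct blank nodes, and calling `chunked(quads, 10_000_000)` would produce chunks of the size mentioned above.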
The datasets of the Billion Triples Challenges 2008 and 2009 are also still available.
|Author||Andreas Harth, Chris Bizer|
Billion Triples Challenge Dataset 2010. Andreas Harth, Chris Bizer.
Retrieved 04:56, May 21, 2013 (UTC).
the Data Hub