Billion Triples Challenge Dataset 2010

This is the dataset that was used for the Billion Triples Challenge 2010.

See: http://challenge.semanticweb.org/

The major part of the dataset was crawled from the Web of Linked Data during March/April 2010 based on datasets provided by Falcon-S, Sindice, Swoogle, SWSE, and Watson using the MultiCrawler/SWSE framework. We also included partial data from data.gov and data.gov.uk.

The downloaded content was parsed using the Redland toolkit with the rdfxml parser. We rewrote blank node identifiers to include the data source in order to provide unique blank nodes for each data source, and appended the data source to the output file. The data is encoded in N-Quads format and split into chunks of 10 million statements each.
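The post-processing described above can be illustrated with a minimal Python sketch. It is not the pipeline that produced the dataset: the hash-based blank node prefix, the placement of the data source as the fourth element of each statement, and the function names `scope_term`, `to_quad`, and `chunked` are all assumptions made for illustration.

```python
import hashlib
from itertools import islice


def scope_term(term, tag):
    """Rewrite a blank node label (a term starting with '_:') to carry the source tag."""
    return f"_:{tag}x{term[2:]}" if term.startswith("_:") else term


def to_quad(triple, source):
    """Serialize an (s, p, o) triple as one N-Quads line with the source as context.

    Terms are expected in N-Triples syntax, e.g. '<http://...>', '_:b0', '"text"'.
    The tag is a short hash of the source URL (an assumed scheme, not the original one).
    """
    tag = "s" + hashlib.md5(source.encode("utf-8")).hexdigest()[:8]
    s, p, o = (scope_term(t, tag) for t in triple)
    return f"{s} {p} {o} <{source}> ."


def chunked(items, size):
    """Yield successive lists of at most `size` items (the dataset used 10 million)."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


# The same blank node label from two sources yields two distinct blank nodes.
name = "<http://xmlns.com/foaf/0.1/name>"
quad_a = to_quad(("_:b0", name, '"Alice"'), "http://example.org/a")
quad_b = to_quad(("_:b0", name, '"Alice"'), "http://example.org/b")
```

Scoping the label by source prevents blank nodes that happen to share a label in two different documents from being merged when the chunks are loaded into a single store.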

The datasets of the Billion Triples Challenges 2008 and 2009 are also still available.

Additional Info

Field Value
Source http://km.aifb.kit.edu/projects/btc-2010/
Author Andreas Harth
Maintainer Andreas Harth
Version 2010
triples 3200000000
