Dataset

  • GeoSpecies Knowledge Base

    Data exposed: Information on Biological Orders, Families, Species as well as species occurrence records and related data. The data set currently contains information and linked...
  • UniProt

    About Data exposed: a large life sciences data set about proteins and their function. Openness Not open. Copyright page states: Copyright 2007-2012 UniProt Consortium. We...
  • STW Thesaurus for Economics

    The thesaurus provides vocabulary on any economic subject: about 6,000 standardized subject headings and about 18,000 entry terms to support individual keywords. You can also...
  • TCMGeneDIT Dataset

    Data exposed: Traditional Chinese medicine, gene and disease association dataset and a linkset mapping TCM gene symbols to Entrez Gene IDs, created by Neurocommons. Size of dump...
  • Airport data from Our Airports published as RDF

    Description Data exposed: Information about airports, originally from package:ourairports, here re-published as RDF. Notes: Dump available by contact Issues The dataset does...
  • Yale Senselab

    About Data exposed: Yale Senselab Size of dump and data set: 216 KB Notes: released without contract The Semantic Web development of SenseLab involves exporting data from...
  • YAGO

    YAGO2s is a huge semantic knowledge base, derived from Wikipedia, WordNet and GeoNames. Currently, YAGO2s has knowledge of more than 10 million entities (like persons,...
  • U.S. Census data

    Duplicate of package:2000-us-census-rdf
  • TWC Data-gov

    Duplicate of package:twc-logd
  • Texai Lexicon

    Data exposed: machine readable dictionary derived from WordNet 2.1, Wiktionary, the CMU Pronouncing Dictionary and the OpenCyc lexicon. Each lexicon word sense entry contains...
  • Telegraphis Linked Data

    Data exposed: Countries, continents, capitals, currencies collected from GeoNames and Wikipedia data. Size of dump and data set: <10k triples a piece Notes: Also has...
  • Thesaurus for Graphic Materials (t4gm.info)

    Published in 2009, t4gm.info was a Linked Data rendering in RDFa of the Library of Congress' Thesaurus for Graphic Materials. It is now an early example of a linked data set...
  • SwetoDblp

    Data exposed: ontology focused on bibliography data of publications from DBLP with additions that include affiliations, universities, and publishers Size of dump and data set:...
  • RAMEAU subject headings (STITCH)

    Data exposed: SKOS representation of the RAMEAU book indexing vocabulary, maintained by the French National Library (BnF) Size of dump and data set: 130 MB uncompressed...
  • SIMILE Data Collection

    About Data exposed: various data sets including CIA's World Factbook, Library of Congress' Thesaurus of Graphic Materials, National Cancer Institute's cancer thesaurus, Web...
  • Semantic Web Dog Food Corpus

    About Data exposed: Metadata (papers, presentations, people) for several semantic web related conferences and workshops, including the most recent ISWC, ESWC and WWW events....
  • Semantic Bible

    Data exposed: NTNames (for New Testament Names) is a semantic knowledge base describing each named thing in the New Testament. Size of dump and data set: about 600 names. NTNames base...
  • U.S. Securities and Exchange Commission Corporate Ownership RDF Data (rdfabout)

    Data exposed: corporate ownership. Size of dump and data set: 1.8 million triples. Notes: also found in the list of SPARQL Endpoints
  • Science Commons

    Data exposed: A bridging ontology, from Science Commons, importing other ontologies used in the prototype, defining classes and relations used to represent gene records and...
  • Rpm Find

    Data exposed: data exposed? Size of dump and data set: expands to about 1.3GB
  • RKB Explorer Data

    Data exposed: 45 different domains, each with a separate data set. The data sets are focused on scientific research; these include DBLP, Citeseer, CORDIS, NSF, EPSRC, RAE2001,...
  • Quotations Book

    Data exposed: at least 42,000 famous quotations with author and subject Size of dump and data set: size?
  • Ordnance Survey OpenData

    MiniScale® Data type: Raster Supply format: TIFF [LZW] Great Britain [304 MB] 1:250 000 Scale Colour Raster Data type: Raster...
  • Open Directory

    Data exposed: — Size of dump and data set: size? Notes: this is the classic RDF source but historically has had some problems with RDF correctness.
  • OpenCyc

    About Now it is even easier to use the rich and diverse collection of real-world concepts in OpenCyc to bring meaning to your semantic web applications! The full OpenCyc content...
  • NLM 2007 MeSH

    About Data exposed: NLM 2007 MeSH Size of dump and data set: 13 MB Notes: MeSH MOU Openness Appears to be in public domain. Copyright page states: Government...
  • Neurocommons text mining pilot

    About The complete dataset is composed of a set of smaller datasets. Each download is in one of two formats: (1) WARC or (2) tar.gz. You can read about the WARC format by...
  • MusicBrainz

    Data exposed: — Size of dump and data set: Currently the zipped version of this data is 102MB
  • MeSH titles

    Data exposed: Extracted from 2007 Medline baseline distribution Size of dump and data set: 670 MB Notes: contact Medline for use terms
  • MeSH pairs

    Data exposed: NLM 2007 MeSH descriptor/qualifier pairs Size of dump and data set: 13 MB Openness: OPEN See http://www.nlm.nih.gov/mesh/termscon.html (basically attribution...
  • MeSH, IPSV - SKOS RDF

    About Data exposed: (used by output of MeSH to SKOS conversion) Size of dump and data set: 2.2 KB Notes: released without contract Openness Copyright notice: Integrated...
  • MeSH headings

    About Data exposed: List of all associations of MeSH headings to papers indexed by Medline extracted from 2007 Medline baseline distribution Size of dump and data set: 758 MB...
  • Linked Movie DataBase

    Data exposed: Linked Data about Movies Size of data set: 6,148,121 triples. Openness: Open Mixture of material from Wikipedia, Freebase and Geonames and states on...
  • LinkedCT

    Data exposed: Linked Clinical Trials Size of dump and data set: ~25 million triples as of April 2011. 4.8GB NTriples dump CC by-nc-sa license You are free to copy,...
  • Lexvo

    About Data exposed: Linguistic Data Size of dump and data set: ~40MB Openness Download dump: CC-BY-SA 3.0 license The web service additionally provides some parts that are...
  • DBTune.org Jamendo RDF Server

    Description The package holds data from package:jamendo converted to RDF, available under the same license as the raw Jamendo data itself. The package also holds links...
  • Homologene

    Data exposed: what? Size of dump and data set: 626 KB Notes: NCBI Copyright and Disclaimers
  • GO annotations from National Center for Biotechnology Information (NCBI) and ...

    Data exposed: GO annotations from National Center for Biotechnology Information (NCBI) and European Bioinformatics Institute (EBI) Size of dump and data set: 73 MB Openness...
  • Galen from co-ode.org

    Data exposed: Galen from co-ode.org Size of dump and data set: 1.9 MB Notes: released without contract Openness: ? No license specified on home page though generic...
  • Freebase RDF Store

    Duplicate of package:freebase Data exposed: Freebase Views of Freebase Topics following the principles of Linked Data. The dataset extractions contain aggregated data from:...
  • Fly-TED

    Data exposed: derived from data published by www.fly-ted.org and provides metadata on images depicting in situ hybridisation in D. melanogaster testes. Size of dump and data...
  • FlyAtlas

    Data exposed: FlyAtlas and Affy D2 probe-to-gene. Size of dump and data set: size? Notes: also found in the list of SPARQL Endpoints
  • Entrez Gene Extract

    Data exposed: Entrez Gene Extract from [ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz] Size of dump and data set: 5.6 MB Notes: NCBI Copyright and Disclaimers
  • Entrez Gene

    About Data exposed: Select fields from Entrez Gene records Size of dump and data set: 7.7 MB Notes: NCBI Copyright and Disclaimers Openness Data appears to be in public...
  • DOAP Store

    About Data exposed: provides daily generated dumps with all its DOAP project descriptions Size of dump and data set: size? Notes: 2009-05-24: Both files seem to be empty -...
  • DOAPspace

    Data exposed: All 55,000+ DOAP profiles available as RDF/XML DOAP. This includes all DOAP created by doapspace and all DOAP spidered. Size of dump and data set: size? Notes:...
  • DMOZ RDF Dump

    Data exposed: DMOZ Size of dump and data set: size? Openness: OPEN (?) Uses the Open Directory License, which is, in essence, open (there may be some wrinkles about updates).
  • DBTune.org Magnatune RDF server

    Magnatune is an independent music label, allowing people to buy records for as much as they want. This package contains the Magnatune catalog in RDF format. The converted RDF...
  • DBTune.org John Peel sessions RDF server

    RDF conversion of a dataset released by the BBC, about the John Peel sessions, a long-lived series of live music performances on BBC Radio 1, hosted by DJ John Peel.
  • DBpedia

    Data exposed: Data set containing extracted data from Wikipedia. About 2.6 million concepts described by 247 million triples, including abstracts in 14 different languages Size...