34 datasets found

  • USPTO Patent data

    Linked Data version of the US Patent and Trademark Office (USPTO) data. Number of triples: 168,768,889. Links to other datasets: DBpedia, EUPatents, WorldBank and LinkedGeoData
  • GWPP Glossary

    The GWPP glossary is a set of scientific terms and their definitions that are used inside the Global Water Pathogen Project online book. This dataset is crowdsourced by a large...
  • Lidioms

    The LIDIOM dataset is a multilingual RDF representation of idioms covering five languages. The data set was crawled and integrated from various sources. For assuring the...
  • DBpedia abstract corpus

    This corpus contains a conversion of Wikipedia abstracts in six languages (Dutch, English, French, German, Italian and Spanish) into the NLP Interchange Format (NIF)....
  • LinkLion - A Link Repository for the Web of Data

    LinkLion is an open-source central repository for the storage of links among resources in the Linked Open Data web. The main goal of LinkLion is to facilitate the publication,...
  • SemanticQuran

    The Semantic Quran dataset is a multilingual RDF representation of translations of the Quran. The dataset was created by integrating data from two different semi-structured...
  • Linked TCGA

    Linked TCGA is the RDF version of the Cancer Genome Atlas, a pilot project started in 2005 by the National Cancer Institute (NCI) and the National Human Genome Research...
  • LODStats

    LODStats: The Data Web Census Dataset.
  • JRC-Names-MLODE

    From their web site: JRC-Names is a highly multilingual named entity resource for person and organisation names (called 'entities'). It consists of large lists of names and...
  • Caucasian Spiders

    The Caucasian Spiders Database aims at containing all records (published occurrences) of spiders (Araneae) in the Caucasus Ecoregion (the rayons Krasnodar and Stavropol in...
  • CORDIS corpus

    CORDIS (Community Research and Development Information Service) is the European Commission’s core public repository providing dissemination information for all EU-funded...
  • CORDIS

    todo
  • aksw.org Research Group dataset

    This dataset contains projects, subgroups, people and pages of the Agile Knowledge Engineering and Semantic Web (AKSW) Research Group at the University of Leipzig.
  • KORE 50 NIF NER Corpus

    KORE 50[1] (AIDA) is a subset of the larger AIDA corpus, which is based on the dataset of the CoNLL 2003 NER task. The dataset aims to capture hard-to-disambiguate mentions of...
  • ORCID

    ORCID (Open Researcher and Contributor ID) is a nonproprietary alphanumeric code to uniquely identify scientific and other academic authors. This dataset contains RDF conversion...
  • Statbel Corpus

    This corpus contains an RDF conversion of datasets from "Statistics Belgium" (also known as Statbel), which aims at collecting, processing and disseminating relevant, reliable...
  • Global airports in RDF

    This corpus contains an RDF conversion of the Global airports dataset, which was retrieved from openflights.org. The dataset contains information about airport names, their locations,...
  • Lion's Den

    Lion's Den is an RDF repository of link specifications. Lion's Den is intended to be an open community-driven dataset that allows data publishers to also publish their...
  • LSQ

    Linked SQ: a Linked Dataset describing SPARQL queries extracted from the logs of a variety of prominent public SPARQL endpoints. We argue that this dataset has a variety of uses...
  • Brown Corpus in RDF/NIF

    RDF version of the Brown Corpus (W. N. Francis, H. Kucera; Brown University; 1979). 1,014,312 words in 500 documents, taken from newspaper texts on diverse topics, non-fiction...