CORE - Semantic Similarity of Open Access publications

The CORE dataset contains information about similarities between scientific papers stored across Open Access repositories. The similarities are calculated using Natural Language Processing techniques applied to the full text of each paper, and are provided only for research articles whose full text is accessible and machine-readable. More information about the data structure can be found at:

RDF Statistics

At the moment we expose more than 92 million RDF triples describing similarities calculated on a set of more than 400k full-text articles harvested from over 230 Open Access repositories.


The similarity data are represented using the MuSIM and BIBO ontologies, with links to the OAI (RKBExplorer) repository available in the Linked Data cloud.

Data and Resources

Additional Info

Field Value
Author Petr Knoth
Maintainer Petr Knoth
Version 1.0
Last modified 29 July 2014, 09:36
Created 13 July 2011, 21:44
links:rkb-explorer-oai 200000
shortname CORE
triples 101526714