CORE - Semantic Similarity of Open Access publications

The CORE dataset describes similarities between scientific papers stored across Open Access repositories. The similarities are computed from the full text of each paper using Natural Language Processing techniques, so they are provided only for research articles whose full text is accessible and machine-readable. More information about the data structure can be found at:

RDF Statistics

At the moment we expose more than 92 million RDF triples describing similarities calculated on a set of more than 400k full-text articles harvested from over 230 Open Access repositories.


The similarity data are represented using the MuSIM and BIBO ontologies, with links to the OAI (RKBExplorer) repository available in the Linked Data cloud.
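
As an illustration of how such similarity statements might be consumed, the following is a minimal sketch that assumes each similarity is modelled as a node linking two documents together with a weight. The namespace, the predicate names (subject, object, weight), and the document URIs are hypothetical placeholders, not the actual MuSIM or BIBO terms or real CORE identifiers. The triples are given as plain Python tuples, as they might look after being parsed from an RDF dump.

```python
from collections import defaultdict

# Hypothetical placeholder URIs; the real ontology terms and CORE URIs may differ.
EX = "http://example.org/sim/"
SUBJECT = EX + "subject"
OBJECT = EX + "object"
WEIGHT = EX + "weight"

# Each similarity node links a pair of documents and carries a weight.
triples = [
    (EX + "s1", SUBJECT, "http://example.org/doc/A"),
    (EX + "s1", OBJECT, "http://example.org/doc/B"),
    (EX + "s1", WEIGHT, "0.82"),
    (EX + "s2", SUBJECT, "http://example.org/doc/A"),
    (EX + "s2", OBJECT, "http://example.org/doc/C"),
    (EX + "s2", WEIGHT, "0.47"),
]


def similar_documents(triples, doc_uri):
    """Return (other_doc, weight) pairs for doc_uri, most similar first."""
    # Group predicate/object pairs by the similarity node they describe.
    nodes = defaultdict(dict)
    for s, p, o in triples:
        nodes[s][p] = o

    results = []
    for props in nodes.values():
        # Keep only complete similarity nodes whose subject is doc_uri.
        if props.get(SUBJECT) == doc_uri and OBJECT in props and WEIGHT in props:
            results.append((props[OBJECT], float(props[WEIGHT])))
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

Calling `similar_documents(triples, "http://example.org/doc/A")` on the sample data lists document B before document C, because B has the higher weight. For the full dataset an RDF library or a SPARQL endpoint would be a more practical choice than in-memory tuples.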

Data and Resources

Additional Info

Field Value
Author Petr Knoth
Maintainer Petr Knoth
Version 1.0
links:rkb-explorer-oai 200000
shortname CORE
triples 101526714