KORE 50 NIF NER Corpus

KORE 50 [1] (AIDA) is a subset of the larger AIDA corpus, which is based on the dataset of the CoNLL 2003 NER task. The dataset aims to capture hard-to-disambiguate entity mentions and contains a large number of first names referring to persons whose identity must be deduced from the given context. It comprises 50 sentences from different domains, such as music, celebrities, and business, and is provided in a clear TSV format.

The corpus was converted to NLP Interchange Format (NIF).
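
As a rough illustration of what such a conversion involves, the following minimal Python sketch builds a NIF-style Turtle description for one entity mention. The property names (nif:beginIndex, nif:endIndex, nif:anchorOf, nif:referenceContext, itsrdf:taIdentRef) come from the NIF 2.0 Core and ITS RDF vocabularies; the base URI, the example sentence, the entity link, and the helper names find_mention and mention_to_turtle are assumptions made for this sketch and are not taken from the corpus files or from the converter actually used.

```python
from dataclasses import dataclass

# Prefix declarations for the NIF 2.0 Core and ITS RDF vocabularies.
PREFIXES = (
    "@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .\n"
    "@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
)


@dataclass
class Mention:
    begin: int   # character offset of the first character (0-based)
    end: int     # character offset one past the last character
    anchor: str  # surface form as it appears in the text
    entity: str  # URI of the linked entity


def find_mention(text: str, surface: str, entity_uri: str, start: int = 0) -> Mention:
    """Locate the first occurrence of a surface form at or after `start`."""
    begin = text.index(surface, start)  # raises ValueError if not found
    return Mention(begin, begin + len(surface), surface, entity_uri)


def escape_literal(value: str) -> str:
    """Escape backslashes and double quotes for a short Turtle string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def mention_to_turtle(base: str, text: str, m: Mention) -> str:
    """Render one mention as a NIF-style Turtle resource (illustrative only)."""
    context = f"<{base}#char=0,{len(text)}>"
    subject = f"<{base}#char={m.begin},{m.end}>"
    return (
        f"{subject}\n"
        f"    a nif:String, nif:RFC5147String ;\n"
        f"    nif:referenceContext {context} ;\n"
        f'    nif:beginIndex "{m.begin}"^^xsd:nonNegativeInteger ;\n'
        f'    nif:endIndex "{m.end}"^^xsd:nonNegativeInteger ;\n'
        f'    nif:anchorOf "{escape_literal(m.anchor)}" ;\n'
        f"    itsrdf:taIdentRef <{m.entity}> .\n"
    )


# Hypothetical inputs, chosen only to illustrate an ambiguous first name.
BASE = "http://example.org/kore50/doc1"
TEXT = "David and Victoria named their children Brooklyn."

if __name__ == "__main__":
    mention = find_mention(TEXT, "David", "http://dbpedia.org/resource/David_Beckham")
    print(PREFIXES)
    print(mention_to_turtle(BASE, TEXT, mention))
```

Each mention is addressed by its character offsets within the sentence, so the surface form can always be recovered by slicing the context string with the begin and end indices.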

[1] J. Hoffart, S. Seufert, D. B. Nguyen, M. Theobald, and G. Weikum. KORE: Keyphrase overlap relatedness for entity disambiguation. In Proc. of the 21st ACM International Conference on Information and Knowledge Management, pages 545–554. ACM, 2012.

Data and Resources

Additional Information

Field Value
Author Magnus Knuth
Maintainer Magnus Knuth
Last modified September 22, 2015, 10:28 (Etc/UTC)
Created September 5, 2014, 08:45 (Etc/UTC)
homepage http://www.yovisto.com/labs/ner-benchmarks/
links:dbpedia 144
triples 1410