3 datasets found

Tags: ir

  • Reuters-21578

    A set of documents from Reuters' 1986 newswire which have been classified. This dataset is appropriate for testing natural language processing and information retrieval...
  • RCV1-v2/LYRL2004

    This is a publicly available, tokenized version of the Reuters RCV1 corpus by David D. Lewis et al. The creator requests attribution.
  • The ClueWeb09 Dataset

    The ClueWeb09 dataset was created to support research on information retrieval and related human language technologies. It consists of about 1 billion web pages in ten languages...
You can also access this registry using the API (see API Docs).