Thomson Reuters Text Research Collection (TRC2)

The TRC2 corpus comprises 1,800,370 news stories (2,871,075,221 bytes) covering the period from 2008-01-01 00:00:03 to 2009-02-28 23:54:14. It was initially made available to participants of the 2009 blog track at the Text Retrieval Conference (TREC) to supplement the BLOGS08 corpus, which contains the results of a large blog crawl carried out at the University of Glasgow. TRC2 is distributed via web download.
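As a rough sanity check after download, the figures quoted above can be turned into an average story size and a daily story volume. The following is a minimal sketch, not part of any TRC2 tooling: the helper name `corpus_summary` is hypothetical, the timestamps are treated as naive (the description does not state a time zone), and the byte count is assumed to refer to the corpus as described.

```python
from datetime import datetime

# Figures quoted in the corpus description.
NUM_STORIES = 1_800_370
TOTAL_BYTES = 2_871_075_221
FIRST_STORY = datetime(2008, 1, 1, 0, 0, 3)    # time zone not stated in the source
LAST_STORY = datetime(2009, 2, 28, 23, 54, 14)


def corpus_summary():
    """Derive average story size and daily volume (hypothetical helper)."""
    span_days = (LAST_STORY - FIRST_STORY).total_seconds() / 86_400
    return {
        "avg_bytes_per_story": TOTAL_BYTES / NUM_STORIES,
        "span_days": span_days,
        "stories_per_day": NUM_STORIES / span_days,
    }


if __name__ == "__main__":
    for name, value in corpus_summary().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions the corpus averages about 1.6 KB per story and a little over 4,200 stories per day, which gives a quick way to judge whether a download looks complete.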

The stories in the Reuters Corpus are under the copyright of Reuters Ltd and/or Thomson Reuters, and their use is governed by the following agreements:

Organizational agreement

Agreement

This agreement must be signed by the person responsible for the data at your organization, and sent to NIST.

Individual agreement

Agreement

This agreement must be signed by all researchers using the Reuters Corpus at your organization, and kept on file at your organization.

Getting the corpus

  • Download and print the Organizational and Individual agreement forms above.

  • Send the Organizational form to NIST by one of the methods listed below:

      • Send a scanned pdf file

          • Complete the Reuters Organizational form and send a pdf file of the form to: reuters-request@nist.gov

          • In your email include the following:

              • Subject: request for Reuters corpus

              • In the body of the message include: your name, your complete postal address, and whether you are requesting RCV1, RCV2, TRC2 or all three.

              • (do not include other correspondence in this message)

Complete and keep the individual agreement form on file at your organization.

Subject to our approval, you will receive the corpus CDs by mail (in the case of RCV1 and RCV2) and/or a download URL, login, and password via email (in the case of TRC2).

Please allow seven business days for a response.

If you have already obtained some of the Reuters corpora and wish to obtain others, send an email to reuters-request@nist.gov. Please provide the name of your organization, the month and year you requested RCV1, RCV2, or TRC2, and the corpus you are interested in receiving. An Organizational agreement must be on file at NIST.

Resources

(none)

Additional information

Field Value
Source http://trec.nist.gov/data/reuters/reuters.html
Author Unknown author
Maintainer Unknown maintainer

Cite this

Thomson Reuters Text Research Collection (TRC2). No author.
Retrieved 17:13, May 24, 2013 (UTC).
the Data Hub
