Tag: corpora

There are 5 datasets tagged with corpora:

  • About (from website): As of November 2007, the European Commission's Directorate-General for Translation (DGT) made publicly accessible its multilingual Translation Memory for the Acquis...
  • VoxForge
    • 49 views
    • None (Not Openly Licensed)
    About: VoxForge was set up to collect transcribed speech for use with Free and Open Source Speech Recognition Engines (on Linux, Windows and Mac). We will make available all submitted...
  • The New York Times Annotated Corpus
    • 137 views
    • None (Not Openly Licensed)
    About (from website): The New York Times Annotated Corpus contains over 1.8 million articles written and published by the New York Times between January 1, 1987 and June 19, 2007 with...
  • Description (overview from home page): The Europarl parallel corpus is extracted from the proceedings of the European Parliament. It includes versions in 11 European languages: Romanic...
  • Web 1T 5-gram Version 1
    • 189 views
    • None (Not Openly Licensed)
    This data set, contributed by Google Inc., contains English word n-grams and their observed frequency counts. The length of the n-grams ranges from unigrams (single words) to five-grams....
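
To make the Web 1T description concrete, the sketch below counts word n-grams from unigrams up to five-grams in a toy sentence. It only illustrates what "n-grams and their observed frequency counts" means; it is not how the data set was built. The function name `ngram_counts`, the example sentence, and the whitespace tokenization are all assumptions made for illustration.

```python
from collections import Counter


def ngram_counts(tokens, max_n=5):
    """Count every n-gram of length 1..max_n in a list of tokens.

    Hypothetical helper for illustration; keys are tuples of words.
    """
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of width n across the token list.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Toy input (an assumption): naive whitespace tokenization.
tokens = "the cat sat on the mat".split()
counts = ngram_counts(tokens)

print(counts[("the",)])        # frequency of the unigram "the"
print(counts[("the", "cat")])  # frequency of the bigram "the cat"
```

A table of this kind maps each n-gram to how often it was observed, which is the shape of data the description above refers to.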