Epoch Data on AI Models

Comprehensive database of more than 21,000 AI/ML models tracking key factors driving machine learning progress, including parameters, training compute, training dataset size, publication date, organization, and more. Sourced from Epoch AI.

API Access

Access dataset files directly from scripts, code, or AI agents.

Browse dataset files
Dataset Files

Each file has a stable URL (r-link) that you can use directly in scripts, apps, or AI agents. These URLs are permanent and safe to hardcode.

https://datahub.io/ai/epoch-data-on-ai-models/_r/-/AGENTS.md
https://datahub.io/ai/epoch-data-on-ai-models/_r/-/README.md
https://datahub.io/ai/epoch-data-on-ai-models/_r/-/data/all_ai_models.csv
https://datahub.io/ai/epoch-data-on-ai-models/_r/-/data/frontier_ai_models.csv
https://datahub.io/ai/epoch-data-on-ai-models/_r/-/data/large_scale_ai_models.csv
https://datahub.io/ai/epoch-data-on-ai-models/_r/-/data/notable_ai_models.csv
https://datahub.io/ai/epoch-data-on-ai-models/_r/-/datapackage.json
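Since the r-links above share a common prefix, a small helper can build the stable URL for any file in the dataset. This is a minimal sketch using only the standard library; the `r_link` helper name is illustrative, not part of any official client:

```python
from urllib.parse import quote
from urllib.request import urlretrieve  # used in the commented example below

# Common prefix shared by all r-links in this dataset
BASE = "https://datahub.io/ai/epoch-data-on-ai-models/_r/-/"

def r_link(path: str) -> str:
    """Return the stable r-link URL for a file path within the dataset."""
    return BASE + quote(path)

# Example (requires network access):
# urlretrieve(r_link("data/all_ai_models.csv"), "all_ai_models.csv")
```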
Key Files

Start with these files — they give you everything you need to understand and access the dataset.

datapackage.json (metadata & schema)
https://datahub.io/ai/epoch-data-on-ai-models/_r/-/datapackage.json
README.md (documentation)
https://datahub.io/ai/epoch-data-on-ai-models/_r/-/README.md
Typical Usage
  1. Fetch datapackage.json to inspect schema and resources
  2. Download data resources listed in datapackage.json
  3. Read README.md for full context
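The steps above can be sketched with the standard library alone. This assumes only that datapackage.json follows the usual Frictionless layout, where each entry in `resources` lists a `path`; the `resource_paths` helper is illustrative:

```python
import json
from urllib.request import urlopen

PKG_URL = "https://datahub.io/ai/epoch-data-on-ai-models/_r/-/datapackage.json"

def resource_paths(package: dict) -> list[str]:
    """Extract the data file paths listed in a datapackage.json document."""
    return [r["path"] for r in package.get("resources", []) if "path" in r]

# Example (requires network access):
# with urlopen(PKG_URL) as resp:
#     pkg = json.load(resp)
# print(resource_paths(pkg))  # paths of the CSV resources
```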

Data Previews

All AI Models


Schema

Model (string): Name of the AI model
Domain (string): Domain(s) the model operates in (e.g. Language, Vision)
Task (string): Task(s) the model is designed for
Organization (string): Organization(s) that developed the model
Authors (string): Authors of the model or associated paper
Publication date (date, %Y-%m-%d): Date the model was published or released
Reference (string): Citation reference for the model
Link (string): URL to model paper or announcement
Citations (number): Number of citations
Notability criteria (string): Criteria that make this model notable
Notability criteria notes (string): Additional notes on notability criteria
Parameters (number): Number of model parameters
Parameters notes (string): Notes on parameter count
Training compute (FLOP) (number): Total training compute in floating point operations
Training compute notes (string): Notes on training compute estimate
Training dataset (string): Name or description of the training dataset
Training dataset notes (string): Notes on the training dataset
Training dataset size (datapoints) (number): Number of datapoints in the training dataset
Dataset size notes (string): Notes on dataset size
Training time (hours) (number): Total training time in hours
Training time notes (string): Notes on training time estimate
Training hardware (string): Hardware used for training (e.g. A100, H100)
Approach (string): Modeling approach or architecture type
Confidence (string): Confidence level of the data entries
Abstract (string): Abstract of the associated paper
Epochs (number): Number of training epochs
Benchmark data (string): Benchmark evaluation data
Model accessibility (string): Accessibility of the model weights (e.g. open, closed)
Country (of organization) (string): Country where the developing organization is based
Base model (string): Base model this model was fine-tuned from, if any
Finetune compute (FLOP) (number): Compute used for fine-tuning in FLOP
Finetune compute notes (string): Notes on fine-tune compute estimate
Hardware quantity (number): Number of hardware units used for training
Hardware utilization (MFU) (number): Model FLOP utilization (MFU) of training hardware
Last modified (string): Timestamp when the record was last modified
Training cloud compute vendor (string): Cloud vendor used for training compute
Training data center (string): Data center used for training
Archived links (string): Archived URLs for the model or paper
Batch size (number): Training batch size
Batch size notes (string): Notes on batch size
Organization categorization (string): Category of the developing organization (e.g. Industry, Academia)
Foundation model (boolean): Whether this is a foundation model
Training compute lower bound (number): Lower bound estimate of training compute in FLOP
Training compute upper bound (number): Upper bound estimate of training compute in FLOP
Training chip-hours (number): Total chip-hours used for training
Training code accessibility (string): Accessibility of training code
Accessibility notes (string): Notes on accessibility of model or code
Organization categorization (from Organization) (string): Organization category derived from organization field
Possibly over 1e23 FLOP (boolean): Whether training compute may exceed 1e23 FLOP
Training compute cost (2023 USD) (number): Estimated training compute cost in 2023 US dollars
Utilization notes (string): Notes on hardware utilization
Numerical format (string): Numerical precision format used in training (e.g. FP16, BF16)
Frontier model (boolean): Whether this model was a frontier model at the time of release
Training power draw (W) (number): Power consumption during training in watts
Training compute estimation method (string): Method used to estimate training compute
Hugging Face developer id (string): Hugging Face developer or organization identifier
Post-training compute (FLOP) (number): Compute used for post-training (RLHF, fine-tuning, etc.) in FLOP
Post-training compute notes (string): Notes on post-training compute estimate
Hardware utilization (HFU) (number): Hardware FLOP utilization (HFU) during training

Notable AI Models


Schema

Model (string): Name of the AI model
Organization (string): Organization(s) that developed the model
Publication date (date, %Y-%m-%d): Date the model was published or released
Domain (string): Domain(s) the model operates in (e.g. Language, Vision)
Task (string): Task(s) the model is designed for
Parameters (number): Number of model parameters
Parameters notes (string): Notes on parameter count
Training compute (FLOP) (number): Total training compute in floating point operations
Training compute notes (string): Notes on training compute estimate
Training dataset (string): Name or description of the training dataset
Training dataset size (datapoints) (number): Number of datapoints in the training dataset
Dataset size notes (string): Notes on dataset size
Confidence (string): Confidence level of the data entries
Link (string): URL to model paper or announcement
Reference (string): Citation reference for the model
Citations (number): Number of citations
Authors (string): Authors of the model or associated paper
Abstract (string): Abstract of the associated paper
Organization categorization (string): Category of the developing organization (e.g. Industry, Academia)
Country (of organization) (string): Country where the developing organization is based
Notability criteria (string): Criteria that make this model notable
Notability criteria notes (string): Additional notes on notability criteria
Epochs (number): Number of training epochs
Training time (hours) (number): Total training time in hours
Training time notes (string): Notes on training time estimate
Training hardware (string): Hardware used for training (e.g. A100, H100)
Hardware quantity (number): Number of hardware units used for training
Hardware utilization (MFU) (number): Model FLOP utilization (MFU) of training hardware
Training compute cost (2023 USD) (number): Estimated training compute cost in 2023 US dollars
Compute cost notes (string): Notes on compute cost estimate
Training power draw (W) (number): Power consumption during training in watts
Base model (string): Base model this model was fine-tuned from, if any
Finetune compute (FLOP) (number): Compute used for fine-tuning in FLOP
Finetune compute notes (string): Notes on fine-tune compute estimate
Batch size (number): Training batch size
Batch size notes (string): Notes on batch size
Model accessibility (string): Accessibility of the model weights (e.g. open, closed)
Training code accessibility (string): Accessibility of training code
Inference code accessibility (string): Accessibility of inference code
Accessibility notes (string): Notes on accessibility of model or code
Numerical format (string): Numerical precision format used in training (e.g. FP16, BF16)
Frontier model (boolean): Whether this model was a frontier model at the time of release
Hardware acquisition cost (number): Cost of acquiring the training hardware in USD
Hardware utilization (HFU) (number): Hardware FLOP utilization (HFU) during training
Training compute cost (cloud) (number): Estimated training compute cost using cloud pricing in USD
Training compute cost (upfront) (number): Estimated training compute cost using upfront hardware pricing in USD

Large-Scale AI Models


Schema

Model (string): Name of the AI model
Domain (string): Domain(s) the model operates in (e.g. Language, Vision)
Task (string): Task(s) the model is designed for
Authors (string): Authors of the model or associated paper
Model accessibility (string): Accessibility of the model weights (e.g. open, closed)
Link (string): URL to model paper or announcement
Citations (number): Number of citations
Reference (string): Citation reference for the model
Publication date (date, %Y-%m-%d): Date the model was published or released
Organization (string): Organization(s) that developed the model
Parameters (number): Number of model parameters
Parameters notes (string): Notes on parameter count
Training compute (FLOP) (number): Total training compute in floating point operations
Training compute notes (string): Notes on training compute estimate
Training dataset (string): Name or description of the training dataset
Training dataset notes (string): Notes on the training dataset
Training dataset size (datapoints) (number): Number of datapoints in the training dataset
Dataset size notes (string): Notes on dataset size
Training time (hours) (number): Total training time in hours
Training time notes (string): Notes on training time estimate
Training hardware (string): Hardware used for training (e.g. A100, H100)
Confidence (string): Confidence level of the data entries
Abstract (string): Abstract of the associated paper
Country (of organization) (string): Country where the developing organization is based
Base model (string): Base model this model was fine-tuned from, if any
Finetune compute (FLOP) (number): Compute used for fine-tuning in FLOP
Finetune compute notes (string): Notes on fine-tune compute estimate
Hardware quantity (number): Number of hardware units used for training
Hardware utilization (MFU) (number): Model FLOP utilization (MFU) of training hardware
Training code accessibility (string): Accessibility of training code
Accessibility notes (string): Notes on accessibility of model or code
Organization categorization (from Organization) (string): Organization category derived from organization field
Hardware utilization (HFU) (number): Hardware FLOP utilization (HFU) during training
Training compute cost (cloud) (number): Estimated training compute cost using cloud pricing in USD
Training compute cost (upfront) (number): Estimated training compute cost using upfront hardware pricing in USD

Frontier AI Models


Schema

Model (string): Name of the AI model
Domain (string): Domain(s) the model operates in (e.g. Language, Vision)
Task (string): Task(s) the model is designed for
Authors (string): Authors of the model or associated paper
Notability criteria (string): Criteria that make this model notable
Notability criteria notes (string): Additional notes on notability criteria
Model accessibility (string): Accessibility of the model weights (e.g. open, closed)
Link (string): URL to model paper or announcement
Citations (number): Number of citations
Reference (string): Citation reference for the model
Publication date (date, %Y-%m-%d): Date the model was published or released
Organization (string): Organization(s) that developed the model
Parameters (number): Number of model parameters
Parameters notes (string): Notes on parameter count
Training compute (FLOP) (number): Total training compute in floating point operations
Training compute notes (string): Notes on training compute estimate
Training dataset (string): Name or description of the training dataset
Training dataset notes (string): Notes on the training dataset
Training dataset size (datapoints) (number): Number of datapoints in the training dataset
Dataset size notes (string): Notes on dataset size
Epochs (number): Number of training epochs
Inference compute (FLOP) (number): Compute per inference pass in FLOP
Inference compute notes (string): Notes on inference compute estimate
Training time (hours) (number): Total training time in hours
Training time notes (string): Notes on training time estimate
Training hardware (string): Hardware used for training (e.g. A100, H100)
Approach (string): Modeling approach or architecture type
Compute cost notes (string): Notes on compute cost estimate
Compute sponsor categorization (string): Category of the compute sponsor
Confidence (string): Confidence level of the data entries
Abstract (string): Abstract of the associated paper
Last modified (string): Timestamp when the record was last modified
Created By (string): Person who created this record
Benchmark data (string): Benchmark evaluation data
Exclude (boolean): Whether this model is excluded from certain analyses
Country (of organization) (string): Country where the developing organization is based
Base model (string): Base model this model was fine-tuned from, if any
Finetune compute (FLOP) (number): Compute used for fine-tuning in FLOP
Finetune compute notes (string): Notes on fine-tune compute estimate
Hardware quantity (number): Number of hardware units used for training
Hardware utilization (MFU) (number): Model FLOP utilization (MFU) of training hardware
Training cost trends (string): Trend information for training costs
Training cloud compute vendor (string): Cloud vendor used for training compute
Training data center (string): Data center used for training
Archived links (string): Archived URLs for the model or paper
Batch size (number): Training batch size
Batch size notes (string): Notes on batch size
Organization categorization (string): Category of the developing organization (e.g. Industry, Academia)
Foundation model (boolean): Whether this is a foundation model
Training compute lower bound (number): Lower bound estimate of training compute in FLOP
Training compute upper bound (number): Upper bound estimate of training compute in FLOP
Training chip-hours (number): Total chip-hours used for training
Training code accessibility (string): Accessibility of training code
Accessibility notes (string): Notes on accessibility of model or code
Organization categorization (from Organization) (string): Organization category derived from organization field
Possibly over 1e23 FLOP (boolean): Whether training compute may exceed 1e23 FLOP
Training compute cost (2023 USD) (number): Estimated training compute cost in 2023 US dollars
Training dataset size (number): Size of the training dataset (alternative field)
Sparsity (number): Model sparsity ratio
Utilization notes (string): Notes on hardware utilization
Estimated over 1e25 FLOP (boolean): Whether training compute is estimated to exceed 1e25 FLOP
Power per GPU (number): Power draw per GPU unit in watts
Cluster total TDP (number): Total thermal design power of the training cluster in watts
Base model compute (number): Training compute of the base model in FLOP
Total compute - (base + finetune) (number): Total compute including base model and fine-tuning in FLOP
API prices (string): API pricing information for the model
Created (string): Timestamp when the record was created
Inference code accessibility (string): Accessibility of inference code
Numerical format (string): Numerical precision format used in training (e.g. FP16, BF16)
Model versions (string): Available versions of the model
Frontier model (boolean): Whether this model was a frontier model at the time of release
Training power draw (W) (number): Power consumption during training in watts
Benchmark evals (string): Benchmark evaluation results
FLOP/$ (number): Training compute efficiency in FLOP per dollar
Hardware release date (date, any): Release date of the training hardware
Hardware age (number): Age of the training hardware in years at time of training
Hardware FP32 (number): Hardware FP32 FLOP/s throughput
Hardware TF32 (number): Hardware TF32 FLOP/s throughput
Hardware count (number): Number of hardware units in the training cluster
Hardware TF16 (number): Hardware TF16 FLOP/s throughput
Hardware FP16 (number): Hardware FP16 FLOP/s throughput
Assumed precision (string): Assumed numerical precision for compute estimates
Assumed hardware FLOP/s (number): Assumed hardware throughput in FLOP/s used for compute estimates
Hardware type (string): Type of hardware used (e.g. GPU, TPU)
Training compute estimation method (string): Method used to estimate training compute
Biological model safeguards (string): Safeguards related to biological model risks
BenchmarkHub-v1 (string): BenchmarkHub v1 evaluation results
Hugging Face developer id (string): Hugging Face developer or organization identifier
Post-training compute (FLOP) (number): Compute used for post-training (RLHF, fine-tuning, etc.) in FLOP
Post-training compute notes (string): Notes on post-training compute estimate
Hardware maker (string): Manufacturer of the training hardware
benchmarks/models (string): Benchmark to model mapping data
Maybe over 1e25 FLOP (boolean): Whether training compute may exceed 1e25 FLOP
Updated dataset size (number): Updated or revised training dataset size
WT103 ppl (number): WikiText-103 perplexity score
WT2 ppl (number): WikiText-2 perplexity score
PTB ppl (number): Penn Treebank perplexity score
Distillation or synthetic data (string): Whether model was trained on distillation or synthetic data
Distillation or synthetic data compute (number): Compute used to generate distillation or synthetic training data in FLOP
Distillation or synthetic data compute notes (string): Notes on distillation or synthetic data compute
Knowledge cutoff (string): Training data knowledge cutoff date
Context window (number): Maximum context window size in tokens
Hardware utilization (HFU) (number): Hardware FLOP utilization (HFU) during training
Training compute cost (cloud) (number): Estimated training compute cost using cloud pricing in USD
Training compute cost (upfront) (number): Estimated training compute cost using upfront hardware pricing in USD

Data Files

  • all-ai-models: All AI models in the Epoch database (~21,000 entries). 5.72 MB
  • notable-ai-models: Subset of notable AI models with richer metadata (~7,400 entries). 1.85 MB
  • large-scale-ai-models: Large-scale AI models subset (~3,600 entries). 902 kB
  • frontier-ai-models: Frontier AI models subset, the most capable models at each point in time (~1,600 entries). 371 kB
Total size: 48.85 MB. License: Creative Commons Attribution 4.0. Source: Epoch AI — Notable AI Models.

Epoch Data on AI Models

Comprehensive database of AI/ML models tracking key factors driving machine learning progress. Sourced from Epoch AI.

Data

The dataset is split into four CSV files covering different subsets of models:

  • data/all_ai_models.csv: All AI models in the Epoch database (~21,600 rows)
  • data/notable_ai_models.csv: Notable models with richer metadata (~7,400 rows)
  • data/large_scale_ai_models.csv: Large-scale models subset (~3,600 rows)
  • data/frontier_ai_models.csv: Frontier models, the most capable at each point in time (~1,600 rows)

Key fields

  • Model — name of the model
  • Organization — developing organization (e.g. Google, OpenAI, DeepMind)
  • Publication date — date of release or publication (1950–2025)
  • Domain — area of application (Language, Vision, Multimodal, Robotics, etc.)
  • Task — specific task(s) the model performs
  • Parameters — number of model parameters
  • Training compute (FLOP) — total training compute in floating point operations
  • Training dataset — name/description of training data
  • Training dataset size (datapoints) — number of training examples
  • Training hardware — hardware used for training
  • Training compute cost (2023 USD) — estimated cost of training
  • Frontier model — whether the model was state-of-the-art at release
  • Model accessibility — open weights, closed, API-only, etc.

Full field-level documentation is in datapackage.json.
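As a minimal sketch of working with these fields (assuming pandas is available and that column names match the schema above; the `models_over` helper is illustrative, not part of the dataset):

```python
import pandas as pd

# Stable r-link for the notable models subset
CSV_URL = ("https://datahub.io/ai/epoch-data-on-ai-models/"
           "_r/-/data/notable_ai_models.csv")

def models_over(df: pd.DataFrame, flop_threshold: float) -> pd.DataFrame:
    """Return rows whose training compute exceeds the given FLOP threshold.

    Non-numeric or missing compute entries are coerced to NaN and excluded.
    """
    compute = pd.to_numeric(df["Training compute (FLOP)"], errors="coerce")
    return df[compute > flop_threshold]

# Example (requires network access):
# df = pd.read_csv(CSV_URL)
# big = models_over(df, 1e25)
# print(big[["Model", "Organization", "Publication date"]])
```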

License

Creative Commons Attribution 4.0 (CC-BY-4.0) — Epoch AI.

Source

Epoch AI — Notable AI Models: https://epochai.org/data/notable-ai-models