
Artificial Intelligence

Artificial intelligence has become one of the defining technological forces of our era. This collection brings together datasets tracking the progress of AI — from historical adoption curves and compute scaling to model capability benchmarks and economic impact indicators.

Datasets on DataHub

AI Models & Capabilities

  • Epoch AI — Notable ML Models: https://datahub.io/ai/epoch-data-on-ai-models. A curated dataset of notable machine learning models from 1950 to the present, tracking publication year, parameters, training compute (FLOPs), hardware, and organization.

  • Historical Adoption of Technology: https://datahub.io/ai/historical-adoption-of-technology. Long-run adoption curves for transformative technologies in the United States, including the internet, smartphones, and AI-related tools. Useful for contextualizing how quickly AI is being adopted relative to prior waves of technology.

Key Themes

Compute scaling: Training compute for frontier models, measured in total floating-point operations (FLOP), has roughly doubled every 6 months since 2010, far outpacing the roughly two-year doubling associated with Moore's Law. Epoch AI's dataset is a widely used source for tracking this trend.
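To make the gap between the two doubling rates concrete, here is a minimal arithmetic sketch. The 24-month figure is an assumption standing in for the doubling period commonly associated with Moore's Law, and the function names are illustrative rather than part of any dataset or library.

```python
import math


def growth_factor(years: float, doubling_months: float) -> float:
    """Multiplicative growth over `years` for a fixed doubling time in months."""
    return 2.0 ** (years * 12.0 / doubling_months)


def doubling_time_months(start: float, end: float, years: float) -> float:
    """Doubling time (in months) implied by growth from `start` to `end` over `years`."""
    return years * 12.0 * math.log(2.0) / math.log(end / start)


# Ten years at a 6-month doubling time: 20 doublings.
frontier = growth_factor(10, 6)    # 2**20 = 1,048,576x

# Ten years at an assumed 24-month doubling time: 5 doublings.
moore = growth_factor(10, 24)      # 2**5 = 32x

print(f"6-month doubling over 10 years:  {frontier:,.0f}x")
print(f"24-month doubling over 10 years: {moore:,.0f}x")
```

The second helper runs the calculation in reverse: given two training-compute values and the time between them, it returns the implied doubling time, which is one way to sanity-check a trend read off the dataset.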

Adoption curves: ChatGPT was reported to have reached 100 million users about two months after its launch, faster than any earlier consumer application. Historical adoption data provides the benchmark for comparison.

Concentration: A small number of organizations (Google, OpenAI, Meta, DeepMind, Anthropic) account for the majority of frontier model development. This concentration is visible in the Epoch dataset's organization field.
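One way to quantify concentration is the share of models attributable to the top few organizations. The sketch below assumes the organization values have already been pulled out of the dataset as a list of strings, one per model. The function name and the placeholder rows ("Org A" and so on) are hypothetical and are not taken from the Epoch data.

```python
from collections import Counter


def top_k_share(organizations: list[str], k: int) -> float:
    """Fraction of models belonging to the k most frequent organizations."""
    counts = Counter(organizations)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / total


# Placeholder rows for illustration only; not real dataset values.
sample = ["Org A"] * 4 + ["Org B"] * 3 + ["Org C"] * 2 + ["Org D"]

share = top_k_share(sample, 2)  # (4 + 3) / 10
print(f"Top-2 share: {share:.0%}")
```

If a row lists several organizations in a single cell, the values would need to be split before counting, and the choice of how to attribute joint work will change the resulting share.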

External Resources