Dataset: historical-adoption-of-technology

Cross-country Historical Adoption of Technology (CHAT) — an unbalanced panel dataset with adoption data for over 100 technologies in more than 150 countries from 1800 to 2008. Compiled by Diego Comin and Bart Hobijn at the National Bureau of Economic Research (NBER).

API Access

Access dataset files directly from scripts, code, or AI agents.

Dataset Files

Each file has a stable URL (r-link) that you can use directly in scripts, apps, or AI agents. These URLs are permanent and safe to hardcode.

/technology/historical-adoption-of-technology/
https://datahub.io/technology/historical-adoption-of-technology/_r/-/AGENTS.md
https://datahub.io/technology/historical-adoption-of-technology/_r/-/README.md
https://datahub.io/technology/historical-adoption-of-technology/_r/-/data/chat.csv
https://datahub.io/technology/historical-adoption-of-technology/_r/-/datapackage.json
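Because the r-link pattern above is uniform, file URLs can be assembled programmatically. A minimal sketch (the r_link helper is ours, not part of any DataHub library):

```python
# Base of the stable r-link URLs listed above.
BASE = "https://datahub.io/technology/historical-adoption-of-technology/_r/-"

def r_link(path: str) -> str:
    """Return the permanent download URL for a file in this dataset."""
    return f"{BASE}/{path.lstrip('/')}"

print(r_link("data/chat.csv"))
# https://datahub.io/technology/historical-adoption-of-technology/_r/-/data/chat.csv
```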
Key Files

Start with these files — they give you everything you need to understand and access the dataset.

datapackage.json — metadata & schema
https://datahub.io/technology/historical-adoption-of-technology/_r/-/datapackage.json
README.md — documentation
https://datahub.io/technology/historical-adoption-of-technology/_r/-/README.md
Typical Usage
  1. Fetch datapackage.json to inspect the schema and resources
  2. Download the data resources listed in datapackage.json
  3. Read README.md for full context
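The steps above can be sketched in Python. The inline descriptor below is a minimal stand-in for the real datapackage.json (step 1 would fetch it from the dataset's r-link base URL), and the resource_urls helper is illustrative, not part of any DataHub client:

```python
import json
from urllib.parse import urljoin

BASE = "https://datahub.io/technology/historical-adoption-of-technology/_r/-/"

def resource_urls(descriptor: dict, base: str = BASE) -> dict:
    """Step 2: map each resource name to its absolute download URL."""
    return {r["name"]: urljoin(base, r["path"]) for r in descriptor.get("resources", [])}

# Step 1 would fetch datapackage.json from BASE; a minimal inline copy is used here.
descriptor = json.loads('{"resources": [{"name": "chat", "path": "data/chat.csv"}]}')
print(resource_urls(descriptor))
# {'chat': 'https://datahub.io/technology/historical-adoption-of-technology/_r/-/data/chat.csv'}
```

Because the resource paths in datapackage.json are relative, urljoin against the r-link base resolves them to downloadable URLs.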

Schema

name             type     description
country_name     string   Country name
year             integer  Year of observation
cellphone        number   Mobile/cellular telephone subscriptions
telephone        number   Fixed telephone lines
internetuser     number   Internet users
computer         number   Personal computers
tv               number   Television sets
radio            number   Radio receivers
vehicle_car      number   Passenger cars
vehicle_com      number   Commercial vehicles
elecprod         number   Electricity production (kWh)
railline         number   Railway lines (km)
railpkm          number   Railway passenger-kilometres
railtkm          number   Railway freight tonne-kilometres
atm              number   Automated teller machines (ATMs)
newspaper        number   Daily newspaper circulation
mail             number   Pieces of mail handled
telegram         number   Telegrams sent
fert_total       number   Fertilizer consumption (metric tons)
ag_tractor       number   Agricultural tractors
xlpopulation     number   Total population
xlrealgdp        number   Real GDP
pctivliteracy    number   Literacy rate (%)
pctimmunizdpt    number   DPT immunization rate (%)
pctimmunizmeas   number   Measles immunization rate (%)
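Once downloaded, chat.csv can be read with Python's csv module. The snippet below reuses a few column names from the schema above; the sample values are placeholders for illustration, not actual CHAT figures:

```python
import csv
import io

# Placeholder rows using column names from the schema above
# (the numbers are illustrative, not real CHAT values).
sample = io.StringIO(
    "country_name,year,cellphone,tv\n"
    "Ghana,2000,130000,1730000\n"
    "Ghana,2005,2870000,2200000\n"
)

# Each row is a country-year observation; filter by country_name.
rows = [r for r in csv.DictReader(sample) if r["country_name"] == "Ghana"]
for r in rows:
    print(r["year"], r["cellphone"])
```

In practice you would open the downloaded chat.csv instead of the in-memory sample; the access pattern is the same.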

Data Files

File: historical-adoption-of-technology
Description: Unbalanced panel dataset covering adoption of 100+ technologies in 150+ countries, 1750–2008. Each row is a country-year observation. Technology columns contain adoption units (varies by technology).
Size: 6.37 MB
Last modified: about 2 months ago

Files: 1
License: Open Data Commons Public Domain Dedication and License
Source: NBER — The CHAT Dataset (Comin & Hobijn, 2009)

Dataset: historical-adoption-of-technology

This is a Frictionless Data Package.

Concepts

Data hierarchy (from broad to specific):

  • Catalog = a collection of datasets (maps to a DataHub publication, one GitHub repo)
  • Dataset = a coherent data concept with a defined schema and coverage — this directory
  • Data file = a concrete file artifact (csv, json, parquet…) listed as a resource in datapackage.json

Dataset lifecycle — a dataset doesn't need to be complete on day one:

  • capture — just a URL or note, intent to explore
  • stub — minimal entry: title, description, source link, no files yet
  • archived — raw files downloaded locally
  • structured — cleaned, normalised, schema documented
  • enriched — analysis, visualisations, derived data added
  • monitored — living source, versioned and updated over time

Catalog-as-repo pattern: if the source is a portal or collection containing many datasets (e.g. a data.gov agency, an institutional archive), give it its own repo and DataHub publication — not a subfolder here.


Structure

historical-adoption-of-technology/
  datapackage.json   # dataset metadata and resource list
  data/              # data files (csv, json, parquet, etc.)
  .datahubignore     # files to exclude when pushing (gitignore syntax)

datapackage.json

Keep resources in sync with what's in data/:

{
  "name": "historical-adoption-of-technology",
  "title": "Human readable title",
  "description": "What this dataset is about",
  "resources": [
    {
      "path": "data/my-file.csv",
      "name": "my-file",
      "mediatype": "text/csv"
    }
  ]
}
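One way to spot files that have drifted out of sync with the resources list is a quick directory diff. A sketch only — data pack . performs the real validation:

```python
import json
from pathlib import Path

def unlisted_files(dataset_dir: str) -> set:
    """Return files under data/ that are missing from datapackage.json resources."""
    root = Path(dataset_dir)
    descriptor = json.loads((root / "datapackage.json").read_text())
    listed = {r["path"] for r in descriptor.get("resources", [])}
    on_disk = {str(p.relative_to(root)) for p in (root / "data").glob("*") if p.is_file()}
    return on_disk - listed
```

Run it against the dataset directory before pushing; an empty set means every file in data/ is listed as a resource.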

Workflow

# Add data files to data/
# Edit datapackage.json — update resources to list them
data pack .   # validate
dh push .     # publish to DataHub

Key rules

  • Every file in data/ that you want published must be listed in resources
  • name in datapackage.json must be URL-safe (lowercase, hyphens)
  • Use .datahubignore to exclude scratch files, large intermediaries, etc.
  • It is fine to push a stub — record the lifecycle stage in datapackage.json (e.g. "status": "stub") while the dataset is incomplete
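The URL-safe rule for name can be checked with a short regex — a sketch of the convention as described above (lowercase words separated by hyphens), not DataHub's exact validation:

```python
import re

# Lowercase letters and digits, in hyphen-separated groups; no leading or
# trailing hyphen. An approximation of the URL-safe convention, not the
# authoritative DataHub rule.
NAME_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")

def is_url_safe(name: str) -> bool:
    return bool(NAME_RE.fullmatch(name))

print(is_url_safe("historical-adoption-of-technology"))  # True
print(is_url_safe("My Dataset"))                         # False
```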