API Access
Access dataset files directly from scripts, code, or AI agents.
Each file has a stable URL (r-link) that you can use directly in scripts, apps, or AI agents. These URLs are permanent and safe to hardcode.
Start with these files — they give you everything you need to understand and access the dataset.
1. Fetch datapackage.json to inspect the schema and resources
2. Download the data resources listed in datapackage.json
3. Read README.md for full context
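The steps above can be sketched in Python. This is a minimal sketch, assuming all files of the dataset share a common URL prefix; `BASE_URL` is a placeholder rather than a real r-link, and the helper names `fetch_descriptor` and `resource_urls` are hypothetical. Resource paths are resolved relative to the location of datapackage.json.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Placeholder (assumption): replace with the stable URL prefix of your dataset.
BASE_URL = "https://example.com/my-dataset/"


def fetch_descriptor(base_url: str) -> dict:
    """Step 1: download and parse datapackage.json (performs a network request)."""
    with urlopen(urljoin(base_url, "datapackage.json")) as response:
        return json.load(response)


def resource_urls(descriptor: dict, base_url: str) -> dict:
    """Step 2 helper: map each resource name to the full URL of its file."""
    return {
        res["name"]: urljoin(base_url, res["path"])
        for res in descriptor.get("resources", [])
    }
```

Calling `resource_urls(fetch_descriptor(BASE_URL), BASE_URL)` would give one download URL per resource. README.md can be fetched the same way with `urljoin(BASE_URL, "README.md")`.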
Data Previews
Historical Adoption of Technology
Schema
| name | type | description |
|---|---|---|
| country_name | string | Country name |
| year | integer | Year of observation |
| cellphone | number | Mobile/cellular telephone subscriptions |
| telephone | number | Fixed telephone lines |
| internetuser | number | Internet users |
| computer | number | Personal computers |
| tv | number | Television sets |
| radio | number | Radio receivers |
| vehicle_car | number | Passenger cars |
| vehicle_com | number | Commercial vehicles |
| elecprod | number | Electricity production (kWh) |
| railline | number | Railway lines (km) |
| railpkm | number | Railway passenger-kilometres |
| railtkm | number | Railway freight tonne-kilometres |
| atm | number | Automated teller machines (ATMs) |
| newspaper | number | Daily newspaper circulation |
| | number | Pieces of mail handled |
| telegram | number | Telegrams sent |
| fert_total | number | Fertilizer consumption (metric tons) |
| ag_tractor | number | Agricultural tractors |
| xlpopulation | number | Total population |
| xlrealgdp | number | Real GDP |
| pctivliteracy | number | Literacy rate (%) |
| pctimmunizdpt | number | DPT immunization rate (%) |
| pctimmunizmeas | number | Measles immunization rate (%) |
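To show how the schema maps onto code, here is a minimal sketch that reads rows with `country_name` as a string, `year` as an integer, and every other column as a number. It assumes that missing observations (the panel is unbalanced) appear as empty cells, which the source does not confirm. The sample rows are made up for illustration and are not real data.

```python
import csv
import io
from typing import Optional


def parse_number(value: Optional[str]) -> Optional[float]:
    """Convert a CSV cell to float; an empty or absent cell becomes None."""
    value = (value or "").strip()
    return float(value) if value else None


def read_observations(text_stream):
    """Yield one dict per country-year row with typed values."""
    for row in csv.DictReader(text_stream):
        typed = {"country_name": row["country_name"], "year": int(row["year"])}
        for column, cell in row.items():
            if column not in typed:
                typed[column] = parse_number(cell)
        yield typed


# Made-up sample rows (assumption: empty cell = missing observation).
sample = io.StringIO(
    "country_name,year,telephone,internetuser\n"
    "Exampleland,1990,1200,\n"
    "Exampleland,2000,3400.5,150\n"
)
rows = list(read_observations(sample))
```

For the real file, pass an open text file handle instead of the `io.StringIO` sample.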
Data Files
| File | Description | Size | Last modified | Download |
|---|---|---|---|---|
| historical-adoption-of-technology | Unbalanced panel dataset covering adoption of 100+ technologies in 150+ countries, 1750–2008. Each row is a country-year observation. Technology columns contain adoption units (varies by technology). | 6.37 MB | about 2 months ago | historical-adoption-of-technology |
| Files | Size | Format | Created | Updated | License | Source |
|---|---|---|---|---|---|---|
| 1 | 6.37 MB | | | about 2 months ago | Open Data Commons Public Domain Dedication and License | NBER: The CHAT Dataset (Comin & Hobijn, 2009) |
Dataset: epoch-data-on-ai-models
This is a Frictionless Data Package.
Concepts
Data hierarchy (from broad to specific):
- Catalog = a collection of datasets (maps to a DataHub publication, one GitHub repo)
- Dataset = a coherent data concept with a defined schema and coverage — this directory
- Data file = a concrete file artifact (csv, json, parquet…) listed as a resource in datapackage.json
Dataset lifecycle — a dataset doesn't need to be complete on day one:
- capture — just a URL or note, intent to explore
- stub — minimal entry: title, description, source link, no files yet
- archived — raw files downloaded locally
- structured — cleaned, normalised, schema documented
- enriched — analysis, visualisations, derived data added
- monitored — living source, versioned and updated over time
Catalog-as-repo pattern: if the source is a portal or collection containing many datasets (e.g. a data.gov agency, an institutional archive), give it its own repo and DataHub publication — not a subfolder here.
Structure
epoch-data-on-ai-models/
    datapackage.json    # dataset metadata and resource list
    data/               # data files (csv, json, parquet, etc.)
    .datahubignore      # files to exclude when pushing (gitignore syntax)
datapackage.json
Keep resources in sync with what's in data/:
{
  "name": "epoch-data-on-ai-models",
  "title": "Human readable title",
  "description": "What this dataset is about",
  "resources": [
    {
      "path": "data/my-file.csv",
      "name": "my-file",
      "mediatype": "text/csv"
    }
  ]
}
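One way to keep the resource list in sync is to generate each entry from its file path. The following is a minimal sketch, not part of any DataHub tooling: `resource_entry` and the `MEDIATYPES` table are hypothetical, and deriving `name` from the file stem (lowercased, with underscores and spaces turned into hyphens) is an assumed convention.

```python
from pathlib import PurePosixPath

# Small explicit mapping (assumption); only "text/csv" appears in the example above.
MEDIATYPES = {".csv": "text/csv", ".json": "application/json"}


def resource_entry(path: str) -> dict:
    """Build a resource dict with path, name, and (when known) mediatype."""
    p = PurePosixPath(path)
    name = p.stem.lower().replace("_", "-").replace(" ", "-")
    entry = {"path": str(p), "name": name}
    mediatype = MEDIATYPES.get(p.suffix.lower())
    if mediatype:
        entry["mediatype"] = mediatype
    return entry
```

For `"data/my-file.csv"` this reproduces the resource entry shown in the example descriptor. File types missing from the mapping simply get no `mediatype` key.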
Workflow
# Add data files to data/
# Edit datapackage.json — update resources to list them
data pack . # validate
dh push . # publish to DataHub
Key rules
- Every file in data/ that you want published must be listed in resources
- name in datapackage.json must be URL-safe (lowercase, hyphens)
- Use .datahubignore to exclude scratch files, large intermediaries, etc.
- It is fine to push a stub: if the dataset is incomplete, set the lifecycle stage in datapackage.json as "status": "stub"
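The first two rules can be checked before pushing. This is a rough sketch under stated assumptions: `check_package` and `list_data_files` are hypothetical helpers, the URL-safe pattern (lowercase letters, digits, single hyphens) is one interpretation of the rule, the check is applied to both the package name and the resource names, and .datahubignore patterns are not taken into account.

```python
import re
from pathlib import Path

# Assumed interpretation of "URL-safe": lowercase letters, digits, hyphens.
URL_SAFE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def list_data_files(root: Path) -> set:
    """Collect paths of all files under root/data, relative to root."""
    return {
        p.relative_to(root).as_posix()
        for p in (root / "data").rglob("*")
        if p.is_file()
    }


def check_package(descriptor: dict, data_files: set) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    resources = descriptor.get("resources", [])
    listed = {res["path"] for res in resources}
    for missing in sorted(data_files - listed):
        problems.append(f"{missing} is in data/ but not listed in resources")
    names = [descriptor.get("name", "")] + [res.get("name", "") for res in resources]
    for name in names:
        if not URL_SAFE.match(name):
            problems.append(f"name {name!r} is not URL-safe")
    return problems
```

A typical call would load datapackage.json with `json.load` and pass `list_data_files(Path("."))` as the second argument.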