API Access
Access dataset files directly from scripts, code, or AI agents.
Each file has a stable URL (r-link) that you can use directly in scripts, apps, or AI agents. These URLs are permanent and safe to hardcode.
Start with these files — they give you everything you need to understand and access the dataset.
1. Fetch datapackage.json to inspect schema and resources
2. Download data resources listed in datapackage.json
3. Read README.md for full context
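The first two steps can be sketched in Python as follows. This is a minimal sketch, not the site's official client: the base URL and the resource path are hypothetical placeholders, and it assumes resource paths in datapackage.json are relative to the descriptor's own location (absolute URLs are passed through unchanged).

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def _with_slash(base_url: str) -> str:
    # urljoin replaces the last path segment unless the base ends with "/".
    return base_url if base_url.endswith("/") else base_url + "/"


def resource_urls(base_url: str, descriptor: dict) -> dict:
    """Map each resource name to an absolute URL built from its path."""
    base = _with_slash(base_url)
    return {
        r["name"]: urljoin(base, r["path"])
        for r in descriptor.get("resources", [])
    }


def fetch_descriptor(base_url: str) -> dict:
    """Download and parse datapackage.json (performs a network request)."""
    with urlopen(urljoin(_with_slash(base_url), "datapackage.json")) as resp:
        return json.load(resp)


# Hypothetical base URL and resource path, for illustration only.
BASE = "https://example.org/my-dataset"
example = {
    "resources": [
        {"name": "wealth-by-country", "path": "data/wealth-by-country.csv"},
    ]
}
urls = resource_urls(BASE, example)
print(urls)
```

In real use, pass the result of `fetch_descriptor(...)` to `resource_urls(...)` with the dataset's actual r-link base, then download each URL.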
Data Views
Data Previews
wealth-distribution-global
Schema
| name | type | description |
|---|---|---|
| year | integer | Year of observation |
| wealth_band | string | Wealth band label (e.g. under $10k, $10k-$100k, $100k-$1M, over $1M) |
| lower_bound_usd | number | Lower bound of the wealth band in USD (0 for the lowest band) |
| upper_bound_usd | number | Upper bound of the wealth band in USD (empty for the top band) |
| adults_millions | number | Number of adults in this wealth band (millions) |
| adults_share_pct | number | Share of global adult population in this wealth band (%) |
| wealth_usd_billions | number | Total wealth held by adults in this band (USD billions) |
| wealth_share_pct | number | Share of global private wealth held by this band (%) |
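A minimal sketch of reading this resource with the types from the schema above, using only the standard library. The sample row holds placeholder values, not real figures from the dataset; the point is that an empty upper_bound_usd (the top band) is mapped to None rather than failing numeric conversion.

```python
import csv
import io

INT_FIELDS = {"year"}
NUM_FIELDS = {
    "lower_bound_usd", "upper_bound_usd", "adults_millions",
    "adults_share_pct", "wealth_usd_billions", "wealth_share_pct",
}


def parse_rows(text: str) -> list:
    """Parse CSV text into dicts typed according to the schema table."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {}
        for key, value in raw.items():
            if key in INT_FIELDS:
                row[key] = int(value)
            elif key in NUM_FIELDS:
                # Empty cells (e.g. the top band's upper bound) become None.
                row[key] = float(value) if value != "" else None
            else:
                row[key] = value
        rows.append(row)
    return rows


# Placeholder values for illustration only, NOT real data.
SAMPLE = (
    "year,wealth_band,lower_bound_usd,upper_bound_usd,adults_millions,"
    "adults_share_pct,wealth_usd_billions,wealth_share_pct\n"
    "2023,over $1M,1000000,,1.0,1.0,1.0,1.0\n"
)
rows = parse_rows(SAMPLE)
print(rows[0]["wealth_band"], rows[0]["upper_bound_usd"])
```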
wealth-by-country
Schema
| name | type | description |
|---|---|---|
| year | integer | Year of observation |
| country | string | Country name |
| country_code | string | ISO 3166-1 alpha-3 country code |
| adults_millions | number | Adult population (millions) |
| mean_wealth_usd | number | Mean wealth per adult (USD) |
| median_wealth_usd | number | Median wealth per adult (USD) |
| total_wealth_usd_billions | number | Total household wealth (USD billions) |
| gini | number | Wealth Gini coefficient (0–100 scale; higher = more unequal) |
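The units in this schema can be cross-checked with a little arithmetic. The sketch below assumes, without confirmation from the source, that total household wealth equals adult population times mean wealth per adult; under that assumption, millions of adults times USD per adult divided by 1,000 gives USD billions. The input numbers are made up for the unit conversion only.

```python
def implied_total_wealth_usd_billions(adults_millions: float,
                                      mean_wealth_usd: float) -> float:
    """Convert (millions of adults) x (USD per adult) into USD billions.

    (adults_millions * 1e6) * mean_wealth_usd / 1e9
        == adults_millions * mean_wealth_usd / 1e3
    """
    return adults_millions * mean_wealth_usd / 1_000


# Made-up inputs: 10 million adults with a mean of 50,000 USD each.
print(implied_total_wealth_usd_billions(10, 50_000))  # 500.0
```

A large gap between this implied value and total_wealth_usd_billions in a row would suggest either a unit mismatch or that the assumption above does not hold for the data.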
Data Files
| File | Description | Size | Last modified | Download |
|---|---|---|---|---|
| wealth-distribution-global | Global wealth pyramid by wealth band, 2010–2023. Each row represents one wealth band for one year, showing the number of adults in that band, their share of the global adult population, the total wealth held, and its share of global private wealth. | 2.2 kB | about 1 month ago | wealth-distribution-global |
| wealth-by-country | Country-level wealth statistics for 20 major economies, 2023. Includes mean and median wealth per adult, total household wealth, adult population, and the wealth Gini coefficient. | 1.08 kB | about 1 month ago | wealth-by-country |
| Files | Size | Format | Created | Updated | License | Source |
|---|---|---|---|---|---|---|
| 2 | 3.28 kB | csv | about 1 month ago | Open Data Commons Public Domain Dedication and License | Credit Suisse Research Institute — Global Wealth Databook 2016 |
DataPressr — AI Agent Instructions
You are helping wrangle raw data finds into clean, publishable datasets on DataHub.
Concepts
Data hierarchy
- Catalog: a collection of datasets. Maps to one GitHub repo + one DataHub publication. Example: "World Bank Open Data", "Our World in Data".
- Dataset: a coherent data concept with defined schema and coverage. One directory, one datapackage.json. Example: "World GDP 1960–2024".
- Data file: a concrete file artifact (csv, json, parquet…). Listed as a resource in datapackage.json.
Catalog-as-repo rule: if the source is a portal or collection containing many datasets, give it its own repo and DataHub publication — not a subfolder inside another dataset.
Dataset lifecycle
A dataset doesn't need to be complete to be published. Lifecycle stages:
| Stage | Description |
|---|---|
| capture | Just a URL or note (intent to explore) |
| stub | Title, description, source link. No files yet. Publishable. |
| archived | Raw files downloaded locally |
| structured | Cleaned, normalised, schema documented |
| enriched | Analysis, visualisations, derived data added |
| monitored | Living source, versioned and updated over time |
Set "status": "<stage>" in datapackage.json to track this.
Dataset structure
Every dataset is a directory:
<name>/
  datapackage.json   # metadata and resource list (required)
  data/              # data files go here
  .datahubignore     # gitignore-style exclusions for dh push
  AGENTS.md          # this file (copy into new datasets)
datapackage.json
Minimal valid example:
{
  "name": "world-gdp",
  "title": "World GDP",
  "description": "GDP by country from World Bank, 1960–2024",
  "status": "structured",
  "resources": [
    {
      "path": "data/gdp.csv",
      "name": "gdp",
      "title": "GDP by Country",
      "mediatype": "text/csv"
    }
  ]
}
Rules:
- name must be URL-safe: lowercase, hyphens only
- Every file in data/ that should be published must be in resources
- status should reflect the lifecycle stage above
- Use .datahubignore to exclude scratch files, large intermediaries, raw downloads
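The rules above can be checked mechanically. The following is a hypothetical helper, not part of the dh CLI. It assumes that "URL-safe" permits digits as well as lowercase letters and hyphens, and that the list of data files passed in has already had .datahubignore exclusions applied.

```python
import re

STAGES = {"capture", "stub", "archived", "structured", "enriched", "monitored"}
# Assumption: digits are allowed alongside lowercase letters and hyphens.
NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def check_descriptor(descriptor: dict, data_files: list) -> list:
    """Return a list of rule violations (empty when everything passes)."""
    issues = []
    name = descriptor.get("name", "")
    if not NAME_RE.fullmatch(name):
        issues.append(f"name {name!r} is not URL-safe")
    status = descriptor.get("status")
    if status is not None and status not in STAGES:
        issues.append(f"unknown status {status!r}")
    listed = {r.get("path") for r in descriptor.get("resources", [])}
    for path in data_files:
        if path not in listed:
            issues.append(f"{path} is not listed in resources")
    return issues


descriptor = {
    "name": "world-gdp",
    "status": "structured",
    "resources": [{"path": "data/gdp.csv", "name": "gdp"}],
}
print(check_descriptor(descriptor, ["data/gdp.csv"]))
```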
Adding charts (views)
Add a views array to datapackage.json to render charts on the dataset page:
{
  "views": [
    {
      "name": "gdp-over-time",
      "title": "GDP Over Time",
      "specType": "simple",
      "resources": ["gdp"],
      "spec": {
        "type": "line",
        "group": "year",
        "series": ["gdp_usd"]
      }
    }
  ]
}
Supported chart types: line, bar, lines-and-points. Only CSV and GeoJSON resources can be visualised. group is the x-axis field; series is the list of y-axis fields.
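A view definition can be sanity-checked before pushing. This is a hypothetical sketch, not a DataHub API: it assumes that entries in a view's resources array refer to resource names (as "gdp" does in the example), and it checks only the constraints stated here, namely the chart type, a group field, and a non-empty series list.

```python
SUPPORTED_TYPES = {"line", "bar", "lines-and-points"}


def check_views(descriptor: dict) -> list:
    """Return human-readable problems found in descriptor["views"]."""
    issues = []
    resource_names = {r.get("name") for r in descriptor.get("resources", [])}
    for view in descriptor.get("views", []):
        label = view.get("name", "<unnamed>")
        # Each referenced resource must exist in the resources list.
        for ref in view.get("resources", []):
            if ref not in resource_names:
                issues.append(f"{label}: unknown resource {ref!r}")
        spec = view.get("spec", {})
        if spec.get("type") not in SUPPORTED_TYPES:
            issues.append(f"{label}: unsupported chart type {spec.get('type')!r}")
        if not spec.get("group"):
            issues.append(f"{label}: spec.group (x-axis field) is missing")
        if not spec.get("series"):
            issues.append(f"{label}: spec.series (y-axis fields) is empty")
    return issues


descriptor = {
    "resources": [{"name": "gdp", "path": "data/gdp.csv"}],
    "views": [
        {
            "name": "gdp-over-time",
            "resources": ["gdp"],
            "spec": {"type": "line", "group": "year", "series": ["gdp_usd"]},
        }
    ],
}
print(check_views(descriptor))
```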
Workflow
Start a new dataset
Create the directory structure:
mkdir -p <name>/data
cd <name>
Create datapackage.json with at minimum name, title, description. Add "status": "stub" if no data files yet.
Copy this AGENTS.md into the new directory so future AI sessions have context.
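The scaffolding steps above could be scripted roughly as follows. This is an illustrative sketch rather than what /init or dh actually does: the function name is hypothetical, it writes only the minimum fields plus "status": "stub", and it runs in a temporary directory so nothing is left behind.

```python
import json
import tempfile
from pathlib import Path


def scaffold(root: Path, name: str, title: str, description: str) -> Path:
    """Create <name>/data and a stub datapackage.json under root."""
    dataset = root / name
    (dataset / "data").mkdir(parents=True, exist_ok=True)
    descriptor = {
        "name": name,
        "title": title,
        "description": description,
        "status": "stub",  # no data files yet
    }
    (dataset / "datapackage.json").write_text(
        json.dumps(descriptor, indent=2) + "\n", encoding="utf-8"
    )
    return dataset


# Demonstrate in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold(Path(tmp), "world-gdp", "World GDP",
                       "GDP by country from World Bank")
    print(sorted(p.name for p in created.iterdir()))
```

Copying AGENTS.md into the new directory is left out here because its source location depends on your repo layout.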
Push to DataHub
dh push .
Requires env vars:
export DATAHUB_API_URL=https://datahub.io
export DATAHUB_API_TOKEN=<your-token>
export DATAHUB_PUBLICATION=<your-publication-slug>
dh is the DataHub CLI — install from datopian/datahub-next.
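Before running dh push from a script, it can help to confirm that the three environment variables are set. The helper below is a hypothetical pre-flight check; it does not call dh and makes no assumptions about the CLI's own error handling.

```python
import os

REQUIRED = ("DATAHUB_API_URL", "DATAHUB_API_TOKEN", "DATAHUB_PUBLICATION")


def missing_env(env=None) -> list:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED if not env.get(key)]


# Example with an explicit mapping instead of the real environment.
print(missing_env({"DATAHUB_API_URL": "https://datahub.io"}))
```

Calling `missing_env()` with no argument inspects the real process environment.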
Delete a dataset
dh delete <name>
Claude Code skills
If using Claude Code, the following slash commands are available in this repo:
| Command | What it does |
|---|---|
| /init <name> | Scaffold a new dataset directory |
| /push | Push current directory to DataHub |
| /validate | Check datapackage.json for common issues |