API Access
Access dataset files directly from scripts, code, or AI agents.
Each file has a stable URL (r-link) that you can use directly in scripts, apps, or AI agents. These URLs are permanent and safe to hardcode.
Start with these files — they give you everything you need to understand and access the dataset.
1. Fetch datapackage.json to inspect schema and resources
2. Download the data resources listed in datapackage.json
3. Read README.md for full context
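The three steps above can be sketched in Python using only the standard library. The base URL here is a placeholder, and the helper names are my own — substitute the r-links shown on the dataset page:

```python
import json
import urllib.request

# Placeholder — substitute the dataset's actual stable URL (r-link) base.
BASE_URL = "https://datahub.io/<publisher>/<dataset>"


def fetch_datapackage(base_url: str) -> dict:
    """Step 1: fetch and parse datapackage.json."""
    with urllib.request.urlopen(f"{base_url}/datapackage.json") as resp:
        return json.load(resp)


def resource_urls(pkg: dict, base_url: str) -> dict:
    """Step 2: map each resource name to a download URL."""
    return {r["name"]: f"{base_url}/{r['path']}" for r in pkg.get("resources", [])}


# Illustrative fragment only — nothing is fetched from the network here.
sample_pkg = {"resources": [{"name": "satcat", "path": "data/satcat.csv"}]}
urls = resource_urls(sample_pkg, BASE_URL)
```

Calling `fetch_datapackage(BASE_URL)` against a real dataset returns the parsed metadata, after which each URL in `resource_urls(...)` can be downloaded directly (step 2) before reading README.md for context (step 3).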
Data Previews
Satellite Catalog (satcat)
Schema
| name | type | format | description | constraints | title |
|---|---|---|---|---|---|
| jcat | string | | Jonathan's Catalog ID — unique GCAT identifier. Prefix letter indicates catalog (S = standard satcat). | | JCAT ID |
| satcat | string | | US Space Force / NORAD catalog number. Not set for all objects. | | NORAD Catalog Number |
| name | string | | Official or common name of the space object. | | Object Name |
| launch_date | string | default | Date of launch (ISO 8601, YYYY-MM-DD). Partial dates (year only) appear when exact date is uncertain. | | Launch Date |
| launch_year | number | | Four-digit launch year, extracted from launch_date. Suitable for aggregation. | | Launch Year |
| object_type | string | | Simplified object classification. Values: Payload, Rocket Body, Debris, Component, Suborbital Payload, Unknown. Derived from the GCAT SatType byte 1. | { "enum": [ "Payload", "Rocket Body", "Debris", "Component", "Suborbital Payload", "Unknown" ] } | Object Type |
| state | string | | ISO country code of the owning nation or organization (e.g. US, SU, CN, RU, FR). Historical codes like SU (Soviet Union) are preserved. | | State |
| owner | string | | Abbreviated name of the owning organization or agency. | | Owner |
| status | string | | Current or final orbital status. Values: In Orbit, Decayed, Deorbited, Beyond Earth Orbit, Exploded. | | Status |
| orbit_class | string | | Operational orbit category code (e.g. LEO/I, GEO/S, MEO, HEO). See https://planet4589.org/space/gcat/web/intro/orbits.html for definitions. | | Orbit Class |
| perigee_km | number | | Perigee altitude above Earth's surface in kilometres, at last known orbital epoch. | | Perigee (km) |
| apogee_km | number | | Apogee altitude above Earth's surface in kilometres, at last known orbital epoch. | | Apogee (km) |
| inclination_deg | number | | Orbital inclination in degrees, at last known orbital epoch. | | Inclination (degrees) |
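A quick sketch of working with this schema once the satcat CSV is downloaded. The rows below are inline illustrative samples in the schema above (the values are approximate, not taken from the real catalog):

```python
import csv
import io

# Two illustrative rows in the satcat schema — approximate values for demonstration only.
sample_csv = """\
jcat,satcat,name,launch_date,launch_year,object_type,state,owner,status,orbit_class,perigee_km,apogee_km,inclination_deg
S00005,5,Vanguard 1,1958-03-17,1958,Payload,US,NRL,In Orbit,LEO/I,650,3830,34.3
S99999,,Example Object,2020-01-01,2020,Debris,US,EX,Decayed,LEO/I,200,300,51.6
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Filter on status, then derive a midpoint altitude from perigee and apogee.
in_orbit = [r for r in rows if r["status"] == "In Orbit"]
mid_altitude_km = {
    r["jcat"]: (float(r["perigee_km"]) + float(r["apogee_km"])) / 2 for r in in_orbit
}
```

For the real file, replace `io.StringIO(sample_csv)` with an open file handle on the downloaded satcat CSV; note that satcat and the numeric fields can be empty for some objects, so guard the `float(...)` conversions accordingly.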
Objects Launched per Year by Type
Schema
| name | type | description | title |
|---|---|---|---|
| year | number | | Year |
| Payload | number | Number of payloads launched. | Payloads |
| Rocket Body | number | Number of rocket bodies (launch vehicle stages) tracked. | Rocket Bodies |
| Debris | number | Number of fragmentation debris pieces tracked. | Debris |
| Component | number | Number of payload components tracked. | Components |
Data Files
| File | Description | Size | Last modified | Download |
|---|---|---|---|---|
| satcat | Standard catalog of all artificial space objects. One row per phase; most objects have a single phase. Covers all objects ever tracked in Earth orbit and beyond. | 6.41 MB | about 2 months ago | satcat |
| objects-per-year | Pre-aggregated count of objects launched per year, broken down by object type. Used for the bar chart view. | 1.46 kB | about 2 months ago | objects-per-year |
| Files | Size | Format | Created | Updated | License | Source |
|---|---|---|---|---|---|---|
| 2 | 6.41 MB | csv | | about 2 months ago | Open Data Commons Public Domain Dedication and License | GCAT — Jonathan McDowell's General Catalog of Artificial Space Objects |
DataPressr — AI Agent Instructions
You are helping wrangle raw data finds into clean, publishable datasets on DataHub.
Concepts
Data hierarchy
- Catalog — a collection of datasets. Maps to one GitHub repo + one DataHub publication. Example: "World Bank Open Data", "Our World in Data".
- Dataset — a coherent data concept with defined schema and coverage. One directory, one datapackage.json. Example: "World GDP 1960–2024".
- Data file — a concrete file artifact (csv, json, parquet…). Listed as a resource in datapackage.json.
Catalog-as-repo rule: if the source is a portal or collection containing many datasets, give it its own repo and DataHub publication — not a subfolder inside another dataset.
Dataset lifecycle
A dataset doesn't need to be complete to be published. Lifecycle stages:
| Stage | Description |
|---|---|
| capture | Just a URL or note — intent to explore |
| stub | Title, description, source link. No files yet. Publishable. |
| archived | Raw files downloaded locally |
| structured | Cleaned, normalised, schema documented |
| enriched | Analysis, visualisations, derived data added |
| monitored | Living source, versioned and updated over time |
Set "status": "<stage>" in datapackage.json to track this.
Dataset structure
Every dataset is a directory:
```
<name>/
  datapackage.json   # metadata and resource list (required)
  data/              # data files go here
  .datahubignore     # gitignore-style exclusions for dh push
  AGENTS.md          # this file (copy into new datasets)
```
datapackage.json
Minimal valid example:
```json
{
  "name": "world-gdp",
  "title": "World GDP",
  "description": "GDP by country from World Bank, 1960–2024",
  "status": "structured",
  "resources": [
    {
      "path": "data/gdp.csv",
      "name": "gdp",
      "title": "GDP by Country",
      "mediatype": "text/csv"
    }
  ]
}
```
Rules:
- `name` must be URL-safe: lowercase, hyphens only
- Every file in `data/` that should be published must be in `resources`
- `status` should reflect the lifecycle stage above
- Use `.datahubignore` to exclude scratch files, large intermediaries, raw downloads
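A minimal sketch of checking these rules programmatically — the function name and messages are my own, not the official /validate implementation:

```python
import re

# URL-safe name: lowercase alphanumerics separated by single hyphens.
NAME_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
VALID_STATUSES = {"capture", "stub", "archived", "structured", "enriched", "monitored"}


def check_datapackage(pkg: dict) -> list:
    """Return a list of problems; an empty list means the basic rules pass."""
    problems = []
    if not NAME_RE.match(pkg.get("name", "")):
        problems.append("name must be URL-safe: lowercase, hyphens only")
    if pkg.get("status") not in VALID_STATUSES:
        problems.append("status should be one of the lifecycle stages")
    for res in pkg.get("resources", []):
        if "path" not in res or "name" not in res:
            problems.append("every resource needs a path and a name")
    return problems


ok = check_datapackage({
    "name": "world-gdp",
    "status": "structured",
    "resources": [{"name": "gdp", "path": "data/gdp.csv"}],
})
```

The "every file in data/ must be in resources" rule needs a filesystem walk and is omitted here; a fuller check would compare `os.listdir("data")` against the resource paths.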
Adding charts (views)
Add a views array to datapackage.json to render charts on the dataset page:
```json
{
  "views": [
    {
      "name": "gdp-over-time",
      "title": "GDP Over Time",
      "specType": "simple",
      "resources": ["gdp"],
      "spec": {
        "type": "line",
        "group": "year",
        "series": ["gdp_usd"]
      }
    }
  ]
}
```
Supported chart types: line, bar, lines-and-points. Only CSV and GeoJSON resources can be visualised. group is the x-axis field, series is the list of y-axis fields.
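A view can also be appended to an existing datapackage programmatically. This is a sketch; the resource and field names ("gdp", "year", "gdp_usd") are carried over from the example above, not fixed by the format:

```python
import json

# Minimal datapackage to extend — stand-in for a loaded datapackage.json.
pkg = {"name": "world-gdp", "resources": [{"name": "gdp", "path": "data/gdp.csv"}]}

# group is the x-axis field, series lists the y-axis fields.
pkg.setdefault("views", []).append({
    "name": "gdp-bar",
    "title": "GDP per Year",
    "specType": "simple",
    "resources": ["gdp"],
    "spec": {"type": "bar", "group": "year", "series": ["gdp_usd"]},
})

serialized = json.dumps(pkg, indent=2)
```

Writing `serialized` back to datapackage.json and running `dh push .` re-renders the dataset page with the new chart.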
Workflow
Start a new dataset
Create the directory structure:
```shell
mkdir -p <name>/data
cd <name>
```
Create datapackage.json with at minimum name, title, description. Add "status": "stub" if no data files yet.
Copy this AGENTS.md into the new directory so future AI sessions have context.
Push to DataHub
```shell
dh push .
```
Requires env vars:
```shell
export DATAHUB_API_URL=https://datahub.io
export DATAHUB_API_TOKEN=<your-token>
export DATAHUB_PUBLICATION=<your-publication-slug>
```
dh is the DataHub CLI — install from datopian/datahub-next.
Delete a dataset
```shell
dh delete <name>
```
Claude Code skills
If using Claude Code, the following slash commands are available in this repo:
| Command | What it does |
|---|---|
| /init <name> | Scaffold a new dataset directory |
| /push | Push current directory to DataHub |
| /validate | Check datapackage.json for common issues |