API Access
Access dataset files directly from scripts, code, or AI agents.
Each file has a stable URL (r-link) that you can use directly in scripts, apps, or AI agents. These URLs are permanent and safe to hardcode.
Start with these files — they give you everything you need to understand and access the dataset.
1. Fetch datapackage.json to inspect the schema and resources.
2. Download the data resources listed in datapackage.json.
3. Read README.md for full context.
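The steps above can be sketched in Node.js. This is a minimal sketch, assuming Node 18+ (for the global `fetch`) and a placeholder base URL. `BASE_URL`, `fetchDescriptor`, and `listResources` are hypothetical names used only for illustration, not part of any DataHub tooling; substitute the dataset's actual r-link base.

```javascript
// Hypothetical placeholder: replace with the stable r-link base of the dataset.
const BASE_URL = "https://example.org/r-link-base/";

// Step 1: fetch and parse datapackage.json (needs Node 18+ for global fetch).
async function fetchDescriptor(baseUrl = BASE_URL) {
  const res = await fetch(new URL("datapackage.json", baseUrl));
  if (!res.ok) throw new Error(`HTTP ${res.status} fetching datapackage.json`);
  return res.json();
}

// Step 2: resolve each resource path in the descriptor to a download URL.
// Only resources with a single string "path" are handled in this sketch.
function listResources(descriptor, baseUrl = BASE_URL) {
  return (descriptor.resources || [])
    .filter((r) => typeof r.path === "string")
    .map((r) => ({ name: r.name, url: new URL(r.path, baseUrl).href }));
}
```

Calling `fetchDescriptor().then((d) => listResources(d))` would then give the list of files to download, with README.md available under the same base.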
Data Previews
core-list
Schema
| name | type | description |
|---|---|---|
| name | string | Name of the dataset |
| github_url | string | The location in GitHub |
| run_date | string | Last run date |
| modified | string | Frequency information (year-A, quarter-Q, month-M, day-D, no-N) |
| validated_metadata | string | Metadata validation status |
| validated_data | string | Data validation status |
| published | string | Published location on DataHub |
| ok_on_datahub | string | Status on DataHub |
| validated_metadata_message | string | Error messages if validation fails |
| validated_data_message | string | Error messages if validation fails |
| auto_publish | string | Published by DataHub automatically |
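As an illustration of working with this schema, the sketch below parses CSV text into one object per row, keyed by the header columns. It is a minimal sketch, not the registry's own code: `splitCsvLine` and `parseCoreList` are hypothetical helpers, and the parser assumes no line breaks inside quoted fields.

```javascript
// Split one CSV line into fields, honoring double-quoted fields
// (a doubled quote "" inside a quoted field stands for a literal quote).
function splitCsvLine(line) {
  const fields = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        field += '"';
        i++; // skip the second quote of the escaped pair
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      fields.push(field);
      field = "";
    } else {
      field += ch;
    }
  }
  fields.push(field);
  return fields;
}

// Turn CSV text into an array of records keyed by the header row.
// Missing trailing values become empty strings.
function parseCoreList(csvText) {
  const lines = csvText.split(/\r?\n/).filter((l) => l.length > 0);
  if (lines.length === 0) return [];
  const header = splitCsvLine(lines[0]);
  return lines.slice(1).map((line) => {
    const values = splitCsvLine(line);
    return Object.fromEntries(header.map((h, i) => [h, values[i] ?? ""]));
  });
}
```

All columns are typed as string in the schema, so the sketch leaves values uninterpreted.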
Data Files
| File | Description | Size | Last modified | Download |
|---|---|---|---|---|
| core-list | | 13.8 kB | 26 days ago | core-list |
| Files | Size | Format | Created | Updated | License | Source |
|---|---|---|---|---|---|---|
| 1 | 13.8 kB | over 1 year ago | Open Data Commons Public Domain Dedication and License v1.0 |
Core data registry and tooling.
Registry
The registry is maintained as a Tabular Data Package, with the list of datasets in core-list.csv.
To add a dataset, add a row for it to core-list.csv. We recommend the fork-and-pull workflow.
Discussion of proposals for new datasets and for incorporation of prepared datasets takes place in the issues.
To propose a new dataset for inclusion, please create a new issue.
Core Dataset Tools
Installation
$ npm install
Usage
Environment variables:
- DOMAIN - testing or production environment. For example: https://datahub.io
- TYPE - type of dataset. For example: examples or core
node index.js [COMMAND] [PATH]
# PATH - path to csv file
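For scripted use, the invocation above can be wrapped from Node.js. This is a minimal sketch under stated assumptions: `buildInvocation` and `run` are hypothetical helpers, the command list is taken from the commands described in this README, and the script is assumed to be started from the directory containing index.js.

```javascript
const { spawnSync } = require("node:child_process");

// Commands described in this README.
const COMMANDS = ["clone", "check", "norm", "push"];

// Build the process arguments and environment for `node index.js [COMMAND] [PATH]`.
// DOMAIN and TYPE are only set when provided; otherwise the caller's environment is used as-is.
function buildInvocation(command, csvPath, { domain, type } = {}) {
  if (!COMMANDS.includes(command)) {
    throw new Error(`Unknown command: ${command}`);
  }
  const env = { ...process.env };
  if (domain) env.DOMAIN = domain;
  if (type) env.TYPE = type;
  return { file: "node", args: ["index.js", command, csvPath], env };
}

// Run the CLI synchronously, streaming its output to this process.
function run(command, csvPath, options) {
  const { file, args, env } = buildInvocation(command, csvPath, options);
  return spawnSync(file, args, { env, stdio: "inherit" });
}
```

For example, `run("check", "core-list.csv", { domain: "https://datahub.io", type: "core" })` would correspond to running the check command with both variables set.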
Clone datasets
To clone all core datasets run the following command:
node index.js clone [PATH]
This clones each core dataset into the following directory: data/${pkg_name}
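To locate a cloned dataset from another script, the target directory can be computed as below. This is a minimal sketch, assuming `pkg_name` is the dataset's name as listed in the registry; `cloneTarget` is a hypothetical helper, not part of the tool.

```javascript
const path = require("node:path");

// Compute data/${pkg_name} for a dataset name (assumed to be the registry "name" value).
// Names containing path separators or ".." are rejected so the result stays under `root`.
function cloneTarget(pkgName, root = "data") {
  if (!pkgName || /[\\/]/.test(pkgName) || pkgName === "." || pkgName === "..") {
    throw new Error(`Invalid package name: ${pkgName}`);
  }
  // posix.join keeps forward slashes, matching the data/${pkg_name} form used here.
  return path.posix.join(root, pkgName);
}
```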
Check datasets
To check all core datasets run the following command:
node index.js check [PATH]
It will validate metadata and data according to the latest spec.
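To give a sense of what a metadata check involves, the sketch below tests a few structural requirements of a Tabular Data Package descriptor. It is not the tool's actual validation: `checkDescriptor` is a hypothetical helper, and the rules it checks (a non-empty `resources` array, each resource having `path` or `data`, and a `schema`) are a small subset assumed from the Data Package specifications.

```javascript
// Return a list of human-readable problems; an empty list means the sketch's checks passed.
function checkDescriptor(descriptor) {
  if (!descriptor || typeof descriptor !== "object") {
    return ["descriptor must be an object"];
  }
  if (!Array.isArray(descriptor.resources) || descriptor.resources.length === 0) {
    return ["resources must be a non-empty array"];
  }
  const errors = [];
  descriptor.resources.forEach((resource, i) => {
    // A resource points at its data either by location or inline.
    if (resource.path === undefined && resource.data === undefined) {
      errors.push(`resources[${i}] needs "path" or "data"`);
    }
    // Tabular resources are expected to describe their columns.
    if (resource.schema === undefined) {
      errors.push(`resources[${i}] is missing "schema"`);
    }
  });
  return errors;
}
```

Collecting every problem in a list, rather than throwing on the first one, mirrors the way the registry stores validation messages alongside a status.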
Normalize datasets
To normalize all core datasets run the following command:
node index.js norm [PATH]
This normalizes each core dataset into the following directory: data/${pkg_name}
Push datasets
To publish all core data packages run the following command:
node index.js push [PATH]
Running tests
We use AVA for our tests. To run them:
$ [sudo] npm test
To run tests in watch mode:
$ [sudo] npm run watch:test