Registry of Core Datasets
Files | Size | Format | Created | Updated | License | Source |
---|---|---|---|---|---|---|
1 | 13.8 kB | csv | over 6 years ago | Open Data Commons Public Domain Dedication and License v1.0 |
Registry of published datasets in the Core Datasets Project
Data Files
File | Description | Size | Last modified | Download |
---|---|---|---|---|
core-list | | 13.8 kB | over 6 years ago | core-list |
Data Previews
core-list
Schema
name | type | description |
---|---|---|
name | string | Name of the dataset |
github_url | string | The location in GitHub |
run_date | string | Last run date |
modified | string | Frequency information (year-A, quarter-Q, month-M, day-D, no-N) |
validated_metadata | string | Metadata validation status |
validated_data | string | Data validation status |
published | string | Published location on DataHub |
ok_on_datahub | string | Status on DataHub |
validated_metadata_message | string | Error messages if validation fails |
validated_data_message | string | Error messages if validation fails |
auto_publish | string | Published by DataHub automatically |
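As a rough illustration of the schema above, the following sketch parses registry CSV text into one object per row and decodes the frequency codes listed for the modified column. The function names parseRegistry and decodeFrequency are hypothetical and not part of the project's tooling, and the plain comma splitting assumes that no field contains quoted commas or embedded newlines.

```javascript
// Column names taken from the schema table above.
const REGISTRY_COLUMNS = [
  "name", "github_url", "run_date", "modified",
  "validated_metadata", "validated_data", "published", "ok_on_datahub",
  "validated_metadata_message", "validated_data_message", "auto_publish",
];

// Frequency codes as listed in the description of the "modified" column.
const FREQUENCY_LABELS = new Map([
  ["A", "year"], ["Q", "quarter"], ["M", "month"], ["D", "day"], ["N", "no"],
]);

// Hypothetical helper: parse registry CSV text into one object per row.
// Assumption: fields contain no quoted commas or embedded newlines;
// use a real CSV parser if they do.
function parseRegistry(csvText) {
  const lines = csvText.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return [];
  const header = lines[0].split(",").map((h) => h.trim());
  const missing = REGISTRY_COLUMNS.filter((c) => !header.includes(c));
  if (missing.length > 0) {
    throw new Error(`Missing columns: ${missing.join(", ")}`);
  }
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    const row = {};
    header.forEach((column, i) => {
      row[column] = (cells[i] ?? "").trim();
    });
    return row;
  });
}

// Hypothetical helper: turn a frequency code into its label.
function decodeFrequency(code) {
  return FREQUENCY_LABELS.get(code) ?? "unknown";
}
```

Checking the header against the schema before reading any rows means a renamed or missing column fails early instead of silently producing empty fields.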
Core data registry and tooling.
Registry
The registry is maintained as a Tabular Data Package, with the list of datasets in core-list.csv.
To add a dataset, add it to core-list.csv. We recommend the fork-and-pull workflow.
Discussion of proposals for new datasets and for incorporation of prepared datasets takes place in the issues.
To propose a new dataset for inclusion, please create a new issue.
Core Dataset Tools
Installation
$ npm install
Usage
- Environment variables
  - DOMAIN - testing or production environment. For example: https://datahub.io
  - TYPE - type of dataset. For example: examples or core
node index.js [COMMAND] [PATH]
# PATH - path to csv file
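A minimal sketch of how a Node.js script might read the two environment variables described above. The helper readConfig is hypothetical and is not taken from index.js, which may read or default these values differently. The trailing-slash trimming is likewise an assumption made for illustration.

```javascript
// Hypothetical helper: not taken from index.js.
// Reads DOMAIN and TYPE from the given environment object
// (process.env by default) and fails early if either is missing.
function readConfig(env = process.env) {
  const domain = env.DOMAIN;
  const type = env.TYPE;
  if (!domain) {
    throw new Error("DOMAIN is not set, for example: https://datahub.io");
  }
  if (!type) {
    throw new Error("TYPE is not set, for example: examples or core");
  }
  // Assumption: a trailing slash on the domain is not significant.
  return { domain: domain.replace(/\/+$/, ""), type };
}
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.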
Clone datasets
To clone all core datasets run the following command:
node index.js clone [PATH]
It will clone all core datasets into the following directory: data/${pkg_name}
Check datasets
To check all core datasets run the following command:
node index.js check [PATH]
It will validate metadata and data according to the latest spec.
Normalize datasets
To normalize all core datasets run the following command:
node index.js norm [PATH]
It will normalize all core datasets into the following directory: data/${pkg_name}
Push datasets
To publish all core data packages run the following command:
node index.js push [PATH]
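The four commands above share the same shape: a command name followed by the path to the csv file. The following sketch shows one way such arguments could be validated. parseArgs is a hypothetical helper, and index.js may handle its arguments differently.

```javascript
// Commands documented in this README.
const COMMANDS = ["clone", "check", "norm", "push"];

// Hypothetical argument parser: index.js may handle arguments differently.
// argv has the shape of process.argv: [node, script, COMMAND, PATH].
function parseArgs(argv) {
  const [command, csvPath] = argv.slice(2);
  if (!COMMANDS.includes(command)) {
    throw new Error(
      `Unknown command "${command}". Expected one of: ${COMMANDS.join(", ")}`
    );
  }
  if (!csvPath) {
    throw new Error("Missing PATH to the csv file");
  }
  return { command, csvPath };
}
```

In a script this would be called as parseArgs(process.argv), so that a mistyped command is rejected before any dataset is touched.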
Running tests
We use Ava for our tests. To run the tests, use:
$ [sudo] npm test
To run tests in watch mode:
$ [sudo] npm run watch:test