This post walks you through the major changes in the Data Package v1 specs compared to pre-v1. It covers changes in the full suite of Data Package specifications including Data Resources and Table Schema. It is particularly valuable if:

  • you were using Data Packages pre v1 and want to know how to upgrade your datasets
  • if you are implementing Data Package related tooling and want to know how to upgrade your tools or want to support or auto-upgrade pre-v1 Data Packages for backwards compatibility

It also includes a script we have created (in JavaScript) that we've been using ourselves to automate upgrades of the Core Data.

The Changes

Two major changes in v1 were presentational:

  • Creating Data Resource as a separate spec from Data Package. This did not change anything substantive in terms of how data packages worked but is important presentationally. In parallel, we also split out a Tabular Data Resource from the Tabular Data Package.
  • Renaming JSON Table Schema to just Table Schema

In addition, there were a fair number of substantive changes. We summarize these in the sections below. For more detailed info see the current specifications and the old site containing the pre spec v1 specifications.

Table Schema

Link to spec: https://specs.frictionlessdata.io/table-schema/

Data Resource

Link to spec: https://specs.frictionlessdata.io/data-resource/

Note: Data Resource did not exist as a separate spec pre-v1 so strictly we are comparing the Data Resource section of the old Data Package spec with the new Data Resource spec.

Tabular Data Resource

Link to spec: https://specs.frictionlessdata.io/data-resource/

Just as Data Resource split out from Data Package so Tabular Data Resource split out from the old Tabular Data Package spec.

There were no significant changes here beyond those in Data Resource.

Data Package

Link to spec: https://specs.frictionlessdata.io/data-package/

Tabular Data Package

Link to spec: https://specs.frictionlessdata.io/tabular-data-package/

Tabular Data Package is unchanged.

Profiles

Profiles arrived in v1:

http://specs.frictionlessdata.io/profiles/

Profiles are the first step on supporting a rich ecosystem of "micro-schemas" for data. They provide a very simple way to quickly state that your data follows a specific structure and/or schema. From the docs:

Different kinds of data need different formats for their data and metadata. To support these different data and metadata formats we need to extend and specialise the generic Data Package. These specialized types of Data Package (or Data Resource) are termed profiles.

For example, there is a Tabular Data Package profile that specializes Data Packages specifically for tabular data. And there is a "Fiscal" Data Package profile designed for government financial data that includes requirements that certain columns are present in the data e.g. Amount or Date and that they contain data of certain types.

We think profiles are an easy, lightweight way to starting adding more structure to your data.

Profiles can be specified on both resources and packages.

Automate upgrading your descriptor according to the spec v1

We have created a data package normalization script that you can use to automate the process of upgrading a datapackage.json or Table Schema from pre-v1 to v1.

The script enables you to automate updating your datapackage.json for the following properties: path, contributors, resources, sources and licenses.

This is a simple script that you can download directly from here:

https://raw.githubusercontent.com/datahq/datapackage-normalize-js/master/normalize.js

e.g. using wget:

wget https://raw.githubusercontent.com/datahq/datapackage-normalize-js/master/normalize.js
# path (optional) is the path to datapackage.json
# if not provided looks in current directory
normalize.js [path]

# prints out updated datapackage.json

You can also use as a library:

# install it from npm
npm install datapackage-normalize

so you can use it in your javascript:

const normalize = require('datapackage-normalize')

const path = 'path/to/datapackage.json'
normalize(path)

Conclusion

The above summarizes the main changes for v1 of Data Package suite of specs and instructions on how to upgrade.

If you want to see specification for more details, please visit Data Package specifications. You can also visit the Frictionless Data initiative for more information about Data Packages.