Files Size Format Created Updated License Source
4 0B xlsx csv zip 7 months ago

Data Files

| File | Description | Size | Last changed | Download |
| --- | --- | --- | --- | --- |
| sample-2sheets | | 5kB | | xlsx (5kB) |
| sample-2sheets-sheet-1 | | 64B | | csv (64B), json (166B) |
| sample-2sheets-sheet-2 | | 64B | | csv (64B), json (166B) |
| datapackage_zip | Compressed versions of dataset. Includes normalized CSV and JSON data with original data and datapackage.json. | 14kB | | zip (14kB) |
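The zip entry above is described as bundling the data files together with datapackage.json. A minimal sketch of inspecting such an archive with Python's standard library follows. The helper name `read_descriptor` is hypothetical, and the location of datapackage.json inside the archive is an assumption, so the code accepts it at any depth.

```python
import io
import json
import zipfile


def read_descriptor(zip_bytes):
    """Return (member_names, descriptor) for a zipped Data Package.

    `descriptor` is the parsed datapackage.json, or None if the archive
    does not contain one.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = archive.namelist()
        # Assumption: the descriptor may sit at the top level or inside a
        # subfolder, so match on the file name only and take the first hit.
        candidates = [n for n in names if n.split("/")[-1] == "datapackage.json"]
        descriptor = json.loads(archive.read(candidates[0])) if candidates else None
    return names, descriptor
```

Given the downloaded archive as bytes, `read_descriptor` returns the list of member paths and the parsed descriptor, which can then be used to locate the CSV and JSON files.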


This is a preview version. There might be more data in the original version.



Field information

| Field Name | Order | Type (Format) | Description |
| --- | --- | --- | --- |
| header1 | 1 | string (default) | |
| header2 | 2 | string (default) | |
| header3 | 3 | string (default) | |



Field information

| Field Name | Order | Type (Format) | Description |
| --- | --- | --- | --- |
| header4 | 1 | string (default) | |
| header5 | 2 | string (default) | |
| header6 | 3 | string (default) | |
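The two field listings above can be written down as plain dictionaries and used to check a CSV header before loading the file. This is a minimal sketch using only the standard library. The resource names come from the file list, but pairing header1 to header3 with sheet 1 and header4 to header6 with sheet 2 is an assumption based on the order of the listings, and `check_header` is a hypothetical helper.

```python
import csv
import io

# Field names and types as listed under "Field information".
# Assumption: the first listing describes sheet 1, the second sheet 2.
SCHEMAS = {
    "sample-2sheets-sheet-1": [
        {"name": "header1", "type": "string"},
        {"name": "header2", "type": "string"},
        {"name": "header3", "type": "string"},
    ],
    "sample-2sheets-sheet-2": [
        {"name": "header4", "type": "string"},
        {"name": "header5", "type": "string"},
        {"name": "header6", "type": "string"},
    ],
}


def check_header(resource_name, csv_text):
    """Return True if the first CSV row matches the expected field names in order."""
    expected = [field["name"] for field in SCHEMAS[resource_name]]
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    return header == expected
```

Calling `check_header("sample-2sheets-sheet-1", text)` on the contents of the matching CSV gives a quick check before the file is passed to a parser.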

Integrate this dataset into your favourite tool

Use our data-cli tool designed for data wranglers:

data get
data info anuveyatsu/sample-2sheets-blue-eagle-55
tree anuveyatsu/sample-2sheets-blue-eagle-55
# Get a list of dataset's resources
curl -L -s | grep path

# Get resources

curl -L

curl -L

curl -L

curl -L
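The `curl ... | grep path` step above lists resource locations by text-matching the descriptor. A rough Python equivalent parses the JSON instead, sketched here under the assumption that the descriptor follows the usual Data Package layout (a top-level `resources` array whose entries carry a `path`). `resource_paths` is a hypothetical helper, not part of data-cli.

```python
import json


def resource_paths(descriptor_text):
    """Return the `path` of every resource in a datapackage.json document."""
    descriptor = json.loads(descriptor_text)
    paths = []
    for resource in descriptor.get("resources", []):
        path = resource.get("path")
        # `path` may be a single string or a list of strings;
        # resources without a path (inline data) are skipped.
        if isinstance(path, str):
            paths.append(path)
        elif isinstance(path, list):
            paths.extend(path)
    return paths
```

Unlike `grep`, this only returns values of the `path` key on resources, so the word "path" appearing elsewhere in the descriptor does not produce false matches.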

If you are using R here's how to get the data you want quickly loaded:

install.packages("jsonlite", repos="")
library("jsonlite")

json_file <- ''
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# get list of all resources:
print(json_data$resources$name)

# print all tabular data (if any exists)
for(i in 1:length(json_data$resources$datahub$type)){
  if(json_data$resources$datahub$type[i]=='derived/csv'){
    path_to_file = json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}

Note: You might need to run the script with root permissions if you are running it on a Linux machine.

If you are using pandas, install the Frictionless Data `datapackage` library along with pandas itself:

pip install datapackage
pip install pandas

Now you can load the Data Package into pandas:

import datapackage
import pandas as pd

data_url = ''

# to load Data Package into storage
package = datapackage.Package(data_url)

# to load only tabular data
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print(data)

For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):

pip install datapackage

To get the Data Package into your Python environment, run the following code:

from datapackage import Package

package = Package('')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if any exists)
for resource in package.resources:
    if resource.descriptor['datahub']['type'] == 'derived/csv':
        print(resource.read())

If you are using JavaScript, please follow the instructions below:

Install the data.js module using npm:

  $ npm install data.js

Once the package is installed, use the following code snippet:

const {Dataset} = require('data.js')

const path = ''

// We're using a self-invoking function here as we want to use async-await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data (if any exists)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()