datahub-qa (unlisted)

examples

Files: 2   Size: 48kB   Format: csv, zip   Created: 1 year ago   Updated: 1 year ago

Data Files

Download files in this dataset

• issues: 1kB; download as csv (1kB) or json (5kB)
• datahub-qa-issues-tracker_zip: 4kB; compressed version of the dataset, including normalized CSV and JSON data together with the original data and datapackage.json; download as zip (4kB)

issues  


Field information

Field Name     Order   Type (Format)       Description
date           1       date (%Y-%m-%d)
Critical       2       integer (default)
Major          3       integer (default)
Minor          4       integer (default)
Trivial        5       integer (default)
NEW FEATURE    6       integer (default)
closed         7       integer (default)
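
As a quick illustration of the schema above, here is a minimal pandas sketch (pandas is also used in the integration examples further down) that loads the derived issues CSV via the r/0.csv URL shown in the curl examples and inspects the daily counts:

import pandas as pd

# Derived CSV for the "issues" resource (same URL as in the curl example below).
url = 'https://datahub.io/examples/datahub-qa-issues-tracker/r/0.csv'
df = pd.read_csv(url, parse_dates=['date'])

# One row per day; the remaining columns are integer counts per label.
print(df.dtypes)
print(df.tail())

# For example, the total number of issues recorded as closed:
print(df['closed'].sum())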

Integrate this dataset into your favourite tool

Use our data-cli tool designed for data wranglers:

data get https://datahub.io/examples/datahub-qa-issues-tracker
data info examples/datahub-qa-issues-tracker
tree examples/datahub-qa-issues-tracker
# Get a list of dataset's resources
curl -L -s https://datahub.io/examples/datahub-qa-issues-tracker/datapackage.json | grep path

# Get resources
curl -L https://datahub.io/examples/datahub-qa-issues-tracker/r/0.csv
curl -L https://datahub.io/examples/datahub-qa-issues-tracker/r/1.zip

If you are using R, here's how to quickly load the data you want:

install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")

json_file <- 'https://datahub.io/examples/datahub-qa-issues-tracker/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# get list of all resources:
print(json_data$resources$name)

# print all tabular data (if any exists)
for(i in 1:length(json_data$resources$datahub$type)){
  if(json_data$resources$datahub$type[i]=='derived/csv'){
    path_to_file = json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}

Note: You might need to run the script with root permissions if you are running it on a Linux machine.

Install the Frictionless Data datapackage library and pandas itself:

pip install datapackage
pip install pandas

Now you can use the Data Package with pandas:

import datapackage
import pandas as pd

data_url = 'https://datahub.io/examples/datahub-qa-issues-tracker/datapackage.json'

# to load Data Package into storage
package = datapackage.Package(data_url)

# to load only tabular data
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print(data)

For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):

pip install datapackage

To get the Data Package into your Python environment, run the following code:

from datapackage import Package

package = Package('https://datahub.io/examples/datahub-qa-issues-tracker/datapackage.json')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if any exists)
for resource in package.resources:
    if resource.descriptor['datahub']['type'] == 'derived/csv':
        print(resource.read())

If you are using JavaScript, please follow the instructions below:

Install the data.js module using npm:

  $ npm install data.js

Once the package is installed, use the following code snippet:

const {Dataset} = require('data.js')

const path = 'https://datahub.io/examples/datahub-qa-issues-tracker/datapackage.json'

// We're using a self-invoking function here as we want to use async/await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data (if any exists)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()

Read me

datahub

Bugs, issues and suggestions re datahub.io.

Create a new issue

If you have found a bug, have a suggestion for a feature or just have a question about DataHub.io, please:

Create a new issue »

Note on our use of severities (choose the correct label):

  • Critical
    • The system or a key scenario is broken, no workaround
    • System performance is highly degraded
    • High impact on user
    • Major security breach
    • Embarrassing
  • Major
    • The system or a key scenario is broken, with a workaround
    • A common scenario is broken, no workaround
    • System performance is moderately degraded
    • Moderate impact on user
  • Minor
    • A common scenario is broken, with a workaround
    • An uncommon scenario is broken, no workaround
    • System performance is slightly degraded
    • Low impact on user
  • Trivial
    • An uncommon scenario is broken, with a workaround
    • No impact on user
    • Minor cosmetic issues

We also have a Blocker label in cases where this issue blocks someone from working.

Chat Resources

If you would prefer to get help via live chat rather than the issue tracker in this repository, you can try:

Gitter Datahub.io room

Daily statistics

The Datahub-QA issues dataset shows daily statistics for this issue-tracker repo. View it here: https://datahub.io/examples/datahub-qa-issues-tracker

Data

The dataset counts issues with the labels 'Critical', 'Major', 'Minor', 'Trivial' and 'NEW FEATURE', plus a 'closed' column.

The last column shows how many issues were closed on that date (issues with the 'Duplicate' label are not counted).

Data is sourced from the GitHub API. The collection process is recorded and automated in a Python script, and the data is updated daily via a Travis cron job.
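
The automation script itself is not included here, but the following sketch shows one way such daily counts could be collected from the GitHub issues API in Python. The repository slug and the exact counting rules are assumptions for illustration only, not the actual script:

import requests
from collections import Counter
from datetime import date

# Assumption: illustrative repository slug; substitute the real issue-tracker repo.
REPO = 'datahq/datahub-qa'
LABELS = ['Critical', 'Major', 'Minor', 'Trivial', 'NEW FEATURE']

today = date.today().isoformat()
counts = Counter()
closed_today = 0
page = 1

while True:
    resp = requests.get(
        'https://api.github.com/repos/%s/issues' % REPO,
        params={'state': 'all', 'per_page': 100, 'page': page},
    )
    resp.raise_for_status()
    issues = resp.json()
    if not issues:
        break
    for issue in issues:
        if 'pull_request' in issue:
            continue  # the issues endpoint also returns pull requests; skip them
        labels = [label['name'] for label in issue.get('labels', [])]
        if issue['state'] == 'open':
            for label in LABELS:
                if label in labels:
                    counts[label] += 1
        elif (issue.get('closed_at') or '').startswith(today) and 'Duplicate' not in labels:
            closed_today += 1
    page += 1

# Emit one CSV row matching the field order above: date, label counts, closed.
row = [today] + [counts[label] for label in LABELS] + [closed_today]
print(','.join(str(value) for value in row))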
