Air quality

london

Files: 3
Size: 7MB
Format: csv, zip
Created: 4 years ago
Updated: 4 years ago
License: Open Government Licence
This dataset was scraped from the London data website. The data shows roadside and background average readings for Nitric Oxide, Nitrogen Dioxide, Oxides of Nitrogen, Ozone, Particulate Matter (PM10 and PM2.5), and Sulphur Dioxide, measured in micrograms per cubic metre of air (ug/m3).

Data Files

Download files in this dataset

  • monthly-averages: 34kB. Download as csv (34kB) or json (125kB).
  • time-of-day-per-month: 787kB. Download as csv (787kB) or json (3MB).
  • air-quality_zip: 660kB. Download as zip (660kB). Compressed versions of dataset. Includes normalized CSV and JSON data with original data and datapackage.json.

monthly-averages  

Field information

Order  Type (Format)     Field Name
1      any (default)     Month
2      any (default)     London Mean Roadside Nitric Oxide (ug/m3)
3      number (default)  London Mean Roadside Nitrogen Dioxide (ug/m3)
4      any (default)     London Mean Roadside Oxides of Nitrogen (ug/m3)
5      number (default)  London Mean Roadside Ozone (ug/m3)
6      number (default)  London Mean Roadside PM10 Particulate (ug/m3)
7      number (default)  London Mean Roadside PM2.5 Particulate (ug/m3)
8      number (default)  London Mean Roadside Sulphur Dioxide (ug/m3)
9      any (default)     London Mean Background Nitric Oxide (ug/m3)
10     number (default)  London Mean Background Nitrogen Dioxide (ug/m3)
11     any (default)     London Mean Background Oxides of Nitrogen (ug/m3)
12     number (default)  London Mean Background Ozone (ug/m3)
13     number (default)  London Mean Background PM10 Particulate (ug/m3)
14     any (default)     London Mean Background PM2.5 Particulate (ug/m3)
15     number (default)  London Mean Background Sulphur Dioxide (ug/m3)
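
Several fields in monthly-averages are typed any rather than number. One plausible reason (an assumption, not something this page states) is that those columns contain non-numeric placeholders for missing readings. A minimal pandas sketch of coercing such columns to numbers is shown here; the two sample rows and the "." placeholder are invented for illustration and are not taken from the dataset.

```python
import io

import pandas as pd

# Invented sample rows: the month labels, readings and "." placeholder are
# illustrative assumptions, not values from the real monthly-averages file.
sample_csv = io.StringIO(
    "Month,London Mean Roadside Nitric Oxide (ug/m3),"
    "London Mean Roadside Nitrogen Dioxide (ug/m3)\n"
    "2008-01,.,55.5\n"
    "2008-02,75.9,60.1\n"
)

df = pd.read_csv(sample_csv)

# Coerce every column except Month to a numeric dtype; anything that
# cannot be parsed as a number becomes NaN instead of raising an error.
for col in df.columns:
    if col != "Month":
        df[col] = pd.to_numeric(df[col], errors="coerce")

print(df.dtypes)
```

Coercing with errors="coerce" silently turns unparseable cells into NaN, so it is worth counting the NaN values per column afterwards to see how much was lost.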

time-of-day-per-month  

Field information

Order  Type (Format)     Field Name
1      string (default)  Month (text)
2      string (default)  GMT
3      string (default)  London Mean Roadside Nitric Oxide (ug/m3)
4      number (default)  London Mean Roadside Nitrogen Dioxide (ug/m3)
5      string (default)  London Mean Roadside Oxides of Nitrogen (ug/m3)
6      number (default)  London Mean Roadside Ozone (ug/m3)
7      number (default)  London Mean Roadside PM10 Particulate (ug/m3)
8      number (default)  London Mean Roadside PM2.5 Particulate (ug/m3)
9      number (default)  London Mean Roadside Sulphur Dioxide (ug/m3)
10     string (default)  London Mean Background Nitric Oxide (ug/m3)
11     number (default)  London Mean Background Nitrogen Dioxide (ug/m3)
12     string (default)  London Mean Background Oxides of Nitrogen (ug/m3)
13     number (default)  London Mean Background Ozone (ug/m3)
14     number (default)  London Mean Background PM10 Particulate (ug/m3)
15     any (default)     London Mean Background PM2.5 Particulate (ug/m3)
16     number (default)  London Mean Background Sulphur Dioxide (ug/m3)
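
Because time-of-day-per-month has one row per month and time of day, a typical first step is to average a pollutant for each time of day across months. The sketch below shows one way to do that with pandas. The sample rows, the month labels and the "HH:MM" form of the GMT values are all assumptions made for illustration; the page does not show the actual cell formats.

```python
import io

import pandas as pd

# Invented sample rows: month labels, GMT labels and readings are
# illustrative assumptions, not values from the real file.
sample_csv = io.StringIO(
    "Month (text),GMT,London Mean Roadside Nitrogen Dioxide (ug/m3)\n"
    "Jan-08,00:00,40.0\n"
    "Jan-08,08:00,70.0\n"
    "Feb-08,00:00,50.0\n"
    "Feb-08,08:00,80.0\n"
)

df = pd.read_csv(sample_csv)
column = "London Mean Roadside Nitrogen Dioxide (ug/m3)"

# Average the reading for each time of day across all months.
mean_by_hour = df.groupby("GMT")[column].mean()

print(mean_by_hour)
```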

Integrate this dataset into your favourite tool

Use our data-cli tool designed for data wranglers:

data get https://datahub.io/london/air-quality
data info london/air-quality
tree london/air-quality

If you are using curl:

# Get a list of dataset's resources
curl -L -s https://datahub.io/london/air-quality/datapackage.json | grep path

# Get resources
curl -L https://datahub.io/london/air-quality/r/0.csv
curl -L https://datahub.io/london/air-quality/r/1.csv
curl -L https://datahub.io/london/air-quality/r/2.zip
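
The curl and grep pipeline for listing resource paths can also be sketched in Python with only the standard library. The helper names fetch_descriptor and resource_paths are hypothetical, and example_descriptor is an invented fragment used so the block runs offline; the real datapackage.json may list more resources and different paths.

```python
import json
from urllib.request import urlopen

DESCRIPTOR_URL = "https://datahub.io/london/air-quality/datapackage.json"


def fetch_descriptor(url=DESCRIPTOR_URL):
    """Download and parse a datapackage.json descriptor (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def resource_paths(descriptor):
    """Return (name, path) pairs for every resource that declares a path."""
    return [
        (resource.get("name"), resource["path"])
        for resource in descriptor.get("resources", [])
        if "path" in resource
    ]


# Hypothetical descriptor fragment used only to demonstrate the helper;
# the real descriptor may list more resources and different paths.
example_descriptor = {
    "name": "air-quality",
    "resources": [
        {"name": "monthly-averages", "path": "data/monthly-averages.csv"},
        {"name": "time-of-day-per-month", "path": "data/time-of-day-per-month.csv"},
    ],
}

for name, path in resource_paths(example_descriptor):
    print(name, path)
```

Against the live dataset the call would be resource_paths(fetch_descriptor()); it is left out of the example so that nothing is downloaded when the block is run.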

If you are using R, here is how to load the data quickly:

install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")

json_file <- 'https://datahub.io/london/air-quality/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# get list of all resources:
print(json_data$resources$name)

# print all tabular data (if any exists)
# seq_along() avoids iterating over 1:0 when there are no resources
for (i in seq_along(json_data$resources$datahub$type)) {
  if (json_data$resources$datahub$type[i] == 'derived/csv') {
    path_to_file <- json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}

Note: You might need to run the script with root permissions if you are running it on a Linux machine.

To use pandas, install the Frictionless Data data package library and pandas itself:

pip install datapackage
pip install pandas

Now you can use the Data Package with pandas:

import datapackage
import pandas as pd

data_url = 'https://datahub.io/london/air-quality/datapackage.json'

# to load Data Package into storage
package = datapackage.Package(data_url)

# to load only tabular data
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print(data)

For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):

pip install datapackage

To load the Data Package into your Python environment, run the following code:

from datapackage import Package

package = Package('https://datahub.io/london/air-quality/datapackage.json')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if any exists)
for resource in package.resources:
    if resource.descriptor['datahub']['type'] == 'derived/csv':
        print(resource.read())
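
The loop above indexes resource.descriptor['datahub']['type'] directly, which would raise a KeyError for any resource whose descriptor has no datahub entry. Whether such resources occur in this dataset is not stated here, so the following is only a defensive sketch. It works on plain dictionaries; the helper name is_derived_csv and the example descriptors are hypothetical.

```python
def is_derived_csv(descriptor):
    """Return True when a resource descriptor is marked as DataHub 'derived/csv'."""
    # Treat a missing or empty "datahub" entry as "not derived".
    datahub = descriptor.get("datahub") or {}
    return datahub.get("type") == "derived/csv"


# Hypothetical resource descriptors for demonstration only.
examples = [
    {"name": "monthly-averages_csv", "datahub": {"type": "derived/csv"}},
    {"name": "air-quality_zip", "datahub": {"type": "derived/zip"}},
    {"name": "no-datahub-key"},
]

derived = [d["name"] for d in examples if is_derived_csv(d)]
print(derived)
```

With the datapackage library, the same check could be applied as is_derived_csv(resource.descriptor) inside the loop.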

If you are using JavaScript, follow the instructions below.

Install the data.js module using npm:

  $ npm install data.js

Once the package is installed, use the following code snippet:

const {Dataset} = require('data.js')

const path = 'https://datahub.io/london/air-quality/datapackage.json'

// We're using self-invoking function here as we want to use async-await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data(if exists any)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()

Read me

This dataset was scraped from the London data website.

The data shows roadside and background average readings for Nitric Oxide, Nitrogen Dioxide, Oxides of Nitrogen, Ozone, Particulate Matter (PM10 and PM2.5), and Sulphur Dioxide, measured in micrograms per cubic metre of air (ug/m3). The spreadsheet shows which Index level each reading falls in, and contains charts showing pollutant levels by time of day per month.

Data

The dataset used for this scraping was found at London Average Air Quality Levels.

Output data is located in the data directory and consists of two CSV files:

  • monthly-averages.csv
  • time-of-day-per-month.csv

Preparation

You will need Python 3.6 or greater and the dataflows library to run the script.

To update the data, run the processing script locally:

# Install dataflows
pip install dataflows

# Run the script
python london-air-quality.py

License

Open Government Licence

You are encouraged to use and re-use the Information that is available under this licence freely and flexibly, with only a few conditions.

Using Information under this licence

Use of copyright and database right material expressly made available under this licence (the ‘Information’) indicates your acceptance of the terms and conditions below. The Licensor grants you a worldwide, royalty-free, perpetual, non-exclusive licence to use the Information subject to the conditions below. This licence does not affect your freedom under fair dealing or fair use or any other copyright or database right exceptions and limitations.

You may find further information here
