London public journeys by type of transport

london

Files | Size | Format | Created | Updated | License | Source
2 | 194kB | csv, zip | 2 years ago | 2 years ago | |

Data Files

Download files in this dataset

File | Description | Size | Last changed | Download
london-public-transport | | 33kB | | csv (33kB), json (51kB)
public-transport_zip | Compressed versions of dataset. Includes normalized CSV and JSON data with original data and datapackage.json. | 40kB | | zip (40kB)

london-public-transport  


Field information

Field Name | Order | Type (Format) | Description
Period and Financial year | 1 | string (default) |
Reporting Period | 2 | integer (default) |
Days in period | 3 | integer (default) |
Period beginning | 4 | datetime (%Y-%m-%d %H:%M:%S) |
Period ending | 5 | datetime (%Y-%m-%d %H:%M:%S) |
Bus journeys (m) | 6 | number (default) |
Underground journeys (m) | 7 | number (default) |
DLR Journeys (m) | 8 | number (default) |
Tram Journeys (m) | 9 | number (default) |
Overground Journeys (m) | 10 | any (default) |
Emirates Airline Journeys (m) | 11 | any (default) |
TfL Rail Journeys (m) | 12 | any (default) |
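
The field types listed above can be applied by hand when reading the CSV. The following is a minimal sketch using only the Python standard library. The helper names (`to_float_or_none`, `parse_row`, `load_journeys`) are illustrative and not part of any DataHub tooling, and the sketch assumes that blank or non-numeric cells in the `any`-typed columns should be treated as missing values.

```python
import csv
from datetime import datetime

# Format string taken from the field table above.
DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"

INTEGER_FIELDS = ["Reporting Period", "Days in period"]
DATETIME_FIELDS = ["Period beginning", "Period ending"]
NUMBER_FIELDS = [
    "Bus journeys (m)",
    "Underground journeys (m)",
    "DLR Journeys (m)",
    "Tram Journeys (m)",
]
# Declared as "any" in the field table; assumed here to hold numbers or blanks.
ANY_FIELDS = [
    "Overground Journeys (m)",
    "Emirates Airline Journeys (m)",
    "TfL Rail Journeys (m)",
]


def to_float_or_none(value):
    """Return value as a float, or None if it is blank or not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def parse_row(row):
    """Convert one CSV row (a dict of strings) to typed Python values."""
    typed = dict(row)
    for name in INTEGER_FIELDS:
        typed[name] = int(row[name])
    for name in DATETIME_FIELDS:
        typed[name] = datetime.strptime(row[name], DATETIME_FORMAT)
    for name in NUMBER_FIELDS:
        typed[name] = float(row[name])
    for name in ANY_FIELDS:
        typed[name] = to_float_or_none(row.get(name))
    return typed


def load_journeys(lines):
    """Read typed rows from any iterable of CSV lines, such as an open file."""
    return [parse_row(row) for row in csv.DictReader(lines)]
```

Assuming the CSV has been downloaded to a local file, passing the open file object (opened with `newline=""`) to `load_journeys` would return one typed dictionary per reporting period.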

Integrate this dataset into your favourite tool

Use our data-cli tool designed for data wranglers:

data get https://datahub.io/london/public-transport
data info london/public-transport
tree london/public-transport

If you prefer curl:

# Get a list of the dataset's resources
curl -L -s https://datahub.io/london/public-transport/datapackage.json | grep path

# Get the resources
curl -L https://datahub.io/london/public-transport/r/0.csv

# -O saves the zip to a local file instead of writing binary data to the terminal
curl -L -O https://datahub.io/london/public-transport/r/1.zip
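
Grepping the descriptor for `path` works, but the same list can be read from the parsed JSON. The following is a rough sketch using only the Python standard library. The function names `fetch_descriptor` and `resource_paths` are illustrative, and the sketch assumes the descriptor holds a top-level `resources` list whose entries carry `name` and `path` keys, as the R and Python snippets on this page also assume.

```python
import json
from urllib.request import urlopen

DESCRIPTOR_URL = "https://datahub.io/london/public-transport/datapackage.json"


def fetch_descriptor(url=DESCRIPTOR_URL):
    """Download and decode a datapackage.json descriptor (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def resource_paths(descriptor):
    """Return (name, path) pairs for every resource that declares a path."""
    return [
        (resource.get("name"), resource["path"])
        for resource in descriptor.get("resources", [])
        if "path" in resource
    ]
```

Calling `resource_paths(fetch_descriptor())` would then give the name and path of each resource, and entries without a `path` key are skipped.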

If you are using R, here's how to load the data quickly:

install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")

json_file <- 'https://datahub.io/london/public-transport/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# get list of all resources:
print(json_data$resources$name)

# print all tabular data (if any exists)
for (i in seq_along(json_data$resources$datahub$type)) {
  if (json_data$resources$datahub$type[i] == 'derived/csv') {
    path_to_file <- json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}

Note: You might need to run the script with root permissions if you are running it on a Linux machine.

If you are using pandas, install the Frictionless Data `datapackage` library and pandas itself:

pip install datapackage
pip install pandas

Now you can load the Data Package into pandas:

import datapackage
import pandas as pd

data_url = 'https://datahub.io/london/public-transport/datapackage.json'

# load the Data Package descriptor
package = datapackage.Package(data_url)

# read each tabular resource into a DataFrame
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print(data)

For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):

pip install datapackage

To load the Data Package into your Python environment, run the following code:

from datapackage import Package

package = Package('https://datahub.io/london/public-transport/datapackage.json')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if any exists)
for resource in package.resources:
    # .get() avoids a KeyError for resources without a 'datahub' section
    if resource.descriptor.get('datahub', {}).get('type') == 'derived/csv':
        print(resource.read())

If you are using JavaScript, follow the instructions below.

Install the data.js module using npm:

  $ npm install data.js

Once the package is installed, use the following code snippet:

const {Dataset} = require('data.js')

const path = 'https://datahub.io/london/public-transport/datapackage.json'

// We're using self-invoking function here as we want to use async-await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data (if any exists)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()

Read me

London public journeys by type of transport. This dataset was scraped from London data.

Data

The dataset is inside the data folder. It gives the number of journeys on the public transport network by TfL reporting period and by type of transport. Data is broken down by bus, Underground, DLR, tram, Overground and cable car.

  • Period lengths are different in periods 1 and 13, and the data is not adjusted to account for that.
  • Docklands Light Railway journeys are based on automatic passenger counts at stations.
  • Overground and Tram journeys are based on automatic on-carriage passenger counts.
  • Reliable Overground journey numbers have only been available since October 2010.

The Emirates Air Line cable car service began 28 June 2012.
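
Because period lengths differ and the figures are not adjusted, one way to compare periods is to divide each journey count by the number of days in its period. The following is a minimal sketch of that idea, not a method described by the data publisher. The function name `journeys_per_day` is hypothetical, and the sketch assumes each row is a mapping keyed by the field names listed in the field information, with blank or non-numeric cells treated as missing.

```python
def journeys_per_day(row, field):
    """Return average daily journeys (in millions) for one period, or None.

    `row` is a mapping with a "Days in period" entry and the journey column
    named by `field`; values may be numbers or numeric strings.
    """
    value = row.get(field)
    days = row.get("Days in period")
    try:
        return float(value) / float(days)
    except (TypeError, ValueError, ZeroDivisionError):
        # Missing, blank, or non-numeric input, or a zero-length period.
        return None
```

Returning `None` rather than raising keeps periods without reliable figures, such as early Overground counts, out of any averages computed afterwards.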

Preparation

You will need Python 3.6 or greater and the dataflows library to run the script.

To update the data, run the process script locally:

# Install dataflows
pip install dataflows

# Run the script
python london-data.py

Licence

Open Government Licence

You are encouraged to use and re-use the Information that is available under this licence freely and flexibly, with only a few conditions.

Using Information under this licence

Use of copyright and database right material expressly made available under this licence (the ‘Information’) indicates your acceptance of the terms and conditions below.

The Licensor grants you a worldwide, royalty-free, perpetual, non-exclusive licence to use the Information subject to the conditions below.

This licence does not affect your freedom under fair dealing or fair use or any other copyright or database right exceptions and limitations.

You may find further information here
