Crime

london

| Files | Size | Format | Created | Updated | License | Source |
|---|---|---|---|---|---|---|
| 3 | 772kB | csv, zip | 2 years ago | 2 years ago | Open Government Licence | |
This dataset was scraped from the London data website. Numbers of recorded offences, and rates of offences per thousand population, by broad crime grouping, by financial year and borough. Rates are given per thousand population and are calculated using the mid-year population from the first part of the financial year.

Data Files

Download files in this dataset

| File | Description | Size | Last changed | Download |
|---|---|---|---|---|
| crime-rates | | 45kB | | csv (45kB), json (451kB) |
| recorded-offences | | 25kB | | csv (25kB), json (57kB) |
| crime_zip | Compressed versions of dataset. Includes normalized CSV and JSON data with original data and datapackage.json. | 50kB | | zip (50kB) |

crime-rates  


Field information

| Field Name | Order | Type (Format) | Description |
|---|---|---|---|
| Code | 1 | any (default) | |
| Borough | 2 | string (default) | |
| Mid-year estimates 1999 | 3 | any (default) | |
| Mid-year estimates 2000 | 4 | any (default) | |
| Mid-year estimates 2001 | 5 | any (default) | |
| Mid-year estimates 2002 | 6 | any (default) | |
| Mid-year estimates 2003 | 7 | any (default) | |
| Mid-year estimates 2004 | 8 | any (default) | |
| Mid-year estimates 2005 | 9 | any (default) | |
| Mid-year estimates 2006 | 10 | any (default) | |
| Mid-year estimates 2007 | 11 | any (default) | |
| Mid-year estimates 2008 | 12 | any (default) | |
| Mid-year estimates 2009 | 13 | any (default) | |
| Mid-year estimates 2010 | 14 | any (default) | |
| Mid-year estimates 2011 | 15 | any (default) | |
| Mid-year estimates 2012 | 16 | any (default) | |
| Mid-year estimates 2013 | 17 | any (default) | |
| Mid-year estimates 2014 | 18 | any (default) | |
| Mid-year estimates 2015 | 19 | any (default) | |
| Mid-year estimates 2016 | 20 | any (default) | |
| Year | 21 | date (%Y-%m-%d) | |
| Value | 22 | any (default) | |

recorded-offences  


Field information

| Field Name | Order | Type (Format) | Description |
|---|---|---|---|
| Code | 1 | any (default) | |
| Borough | 2 | string (default) | |
| Year | 3 | date (%Y-%m-%d) | |
| Value | 4 | any (default) | |
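
The field information above can be turned into a small parser. The following is a minimal sketch using only the Python standard library, assuming the CSV header matches the field names listed and that Year values follow the `%Y-%m-%d` format from the schema. The sample rows, including the code `X0000001` and the borough name, are hypothetical and not taken from the dataset.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample in the recorded-offences layout; the values are
# invented for illustration and are not taken from the dataset.
SAMPLE_CSV = """Code,Borough,Year,Value
X0000001,Example Borough,2008-01-01,1234
X0000001,Example Borough,2009-01-01,
"""


def parse_rows(text):
    """Parse rows, converting Year to a date and Value to a float or None."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        value = record["Value"].strip()
        # Fields typed "any" may be empty or non-numeric, so fall back to None.
        is_number = value.replace(".", "", 1).isdigit()
        rows.append({
            "code": record["Code"],
            "borough": record["Borough"],
            "year": datetime.strptime(record["Year"], "%Y-%m-%d").date(),
            "value": float(value) if is_number else None,
        })
    return rows


rows = parse_rows(SAMPLE_CSV)
for row in rows:
    print(row["borough"], row["year"].year, row["value"])
```

Treating unparseable values as `None` rather than zero keeps missing figures from being mistaken for a count of zero offences.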

Integrate this dataset into your favourite tool

Use our data-cli tool designed for data wranglers:

data get https://datahub.io/london/crime
data info london/crime
tree london/crime
# Get a list of dataset's resources
curl -L -s https://datahub.io/london/crime/datapackage.json | grep path

# Get resources

curl -L https://datahub.io/london/crime/r/0.csv

curl -L https://datahub.io/london/crime/r/1.csv

curl -L https://datahub.io/london/crime/r/2.zip
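
The `grep path` command above matches text rather than parsing JSON. As a rough alternative, here is a Python sketch that reads the resource list from a descriptor. The inline descriptor is a hypothetical, trimmed stand-in: the resource names come from this page, but the `path` and `format` values are assumptions and the real datapackage.json may differ.

```python
import json

# Hypothetical, trimmed descriptor; the real datapackage.json has more keys
# and its exact paths may differ.
DESCRIPTOR = json.loads("""
{
  "name": "crime",
  "resources": [
    {"name": "crime-rates", "path": "data/crime-rates.csv", "format": "csv"},
    {"name": "recorded-offences", "path": "data/recorded-offences.csv", "format": "csv"},
    {"name": "crime_zip", "path": "crime_zip.zip", "format": "zip"}
  ]
}
""")


def resource_paths(descriptor, fmt=None):
    """Return (name, path) pairs, optionally filtered by resource format."""
    return [
        (res["name"], res["path"])
        for res in descriptor.get("resources", [])
        if fmt is None or res.get("format") == fmt
    ]


csv_paths = resource_paths(DESCRIPTOR, fmt="csv")
for name, path in csv_paths:
    print(f"{name}: {path}")
```

To use it against the live dataset, the descriptor could be fetched first (for example with `urllib.request`) and passed to `resource_paths` in place of the inline sample.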

If you are using R, here is how to load the data quickly:

install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")

json_file <- 'https://datahub.io/london/crime/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# get list of all resources:
print(json_data$resources$name)

# print all tabular data (if any exists)
for (i in seq_along(json_data$resources$datahub$type)) {
  if (json_data$resources$datahub$type[i] == 'derived/csv') {
    path_to_file <- json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}

Note: You might need to run the script with root permissions if you are running on a Linux machine.

To use pandas, install the Frictionless Data datapackage library and pandas itself:

pip install datapackage
pip install pandas

Now you can use the Data Package in pandas:

import datapackage
import pandas as pd

data_url = 'https://datahub.io/london/crime/datapackage.json'

# to load Data Package into storage
package = datapackage.Package(data_url)

# to load only tabular data
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print(data)

For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):

pip install datapackage

To get the Data Package into your Python environment, run the following code:

from datapackage import Package

package = Package('https://datahub.io/london/crime/datapackage.json')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if any exists)
for resource in package.resources:
    if resource.descriptor['datahub']['type'] == 'derived/csv':
        print(resource.read())

If you are using JavaScript, follow the instructions below:

Install the data.js module using npm:

  $ npm install data.js

Once the package is installed, use the following code snippet:

const {Dataset} = require('data.js')

const path = 'https://datahub.io/london/crime/datapackage.json'

// We're using a self-invoking function here because we want to use async/await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data (if any exists)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()

Read me

This dataset was scraped from the London data website.

Numbers of recorded offences, and rates of offences per thousand population, by broad crime grouping, by financial year and borough.

Rates are given per thousand population and are calculated using the mid-year population estimate from the first part of the financial year; e.g. for financial year 2008-09, the mid-year estimates for 2008 are used.
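
The rate calculation described above can be sketched in a few lines of Python. This is an illustration only, not the scraper's actual code: the function names are hypothetical, the financial-year label is assumed to look like `2008-09`, and the offence and population figures are invented.

```python
def population_year(financial_year):
    """Return the mid-year estimate year for a label such as '2008-09'."""
    return int(financial_year.split("-")[0])


def rate_per_thousand(offences, population):
    """Offences per 1,000 population; None when no population figure exists."""
    if not population:
        return None
    return offences * 1000 / population


# Hypothetical figures, not taken from the dataset.
mid_year_estimates = {2008: 250_000}
offences = 20_000

year = population_year("2008-09")
rate = rate_per_thousand(offences, mid_year_estimates.get(year))
print(year, rate)  # 2008 80.0
```

Returning `None` when the population is missing keeps areas without population figures from producing a division error or a misleading rate.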

Offences: These are confirmed reports of crimes being committed. All data relates to “notifiable offences”, which are designated categories of crimes that all police forces in England and Wales are required to report to the Home Office. Crime rates are not available for Heathrow because there are no population figures for it.

There were changes to the police recorded crime classifications from April 2012. Therefore, caution should be used when comparing sub-groups of crime figures from 2012/13 with earlier years.

Action Fraud have taken over the recording of fraud offences on behalf of individual police forces. This process began in April 2011 and was rolled out to all police forces by March 2013. Due to this change, caution should be applied when comparing data over this transitional period and with earlier years.
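
One way to keep these two caveats in view during analysis is to flag year pairs that straddle the changes. The sketch below is a hypothetical helper, not part of the dataset tooling. It assumes financial years start on 1 April and are labelled like `2012-13` or `2012/13`, and it treats the fraud caveat as applying to any comparison that involves the April 2011 to March 2013 rollout or spans it.

```python
from datetime import date

# Dates taken from the notes above; the 1 April year start is an assumption.
CLASSIFICATION_CHANGE = date(2012, 4, 1)
FRAUD_TRANSITION_START = date(2011, 4, 1)
FRAUD_TRANSITION_END = date(2013, 3, 31)


def financial_year_start(label):
    """Convert '2012-13' or '2012/13' to date(2012, 4, 1)."""
    first_year = int(label.replace("/", "-").split("-")[0])
    return date(first_year, 4, 1)


def comparability_caveats(year_a, year_b):
    """List the caveats that apply when comparing two financial years."""
    earlier, later = sorted(
        (financial_year_start(year_a), financial_year_start(year_b))
    )
    caveats = []
    # The pair straddles the April 2012 classification change.
    if earlier < CLASSIFICATION_CHANGE <= later:
        caveats.append("crime classification change (April 2012)")
    # The pair touches or spans the Action Fraud rollout window.
    if later >= FRAUD_TRANSITION_START and earlier <= FRAUD_TRANSITION_END:
        caveats.append("Action Fraud transition (April 2011 to March 2013)")
    return caveats


print(comparability_caveats("2010-11", "2012-13"))
print(comparability_caveats("2013-14", "2014-15"))
```

The classification caveat matters mainly for sub-groups of crime and the fraud caveat for fraud offences, so a real analysis would probably apply each flag only to the affected categories.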

Data

The dataset used for this scraping was found at Recorded Crime: Borough Rates.

Output data is located in the data directory and consists of two CSV files:

  • crime-rates.csv
  • recorded-offences.csv

Preparation

You will need Python 3.6 or greater and the dataflows library to run the script.

To update the data, run the process script locally:

# Install dataflows
pip install dataflows

# Run the script
python london-crime.py

License

Open Government Licence

You are encouraged to use and re-use the Information that is available under this licence freely and flexibly, with only a few conditions.

Using Information under this licence

Use of copyright and database right material expressly made available under this licence (the ‘Information’) indicates your acceptance of the terms and conditions below.

The Licensor grants you a worldwide, royalty-free, perpetual, non-exclusive licence to use the Information subject to the conditions below.

This licence does not affect your freedom under fair dealing or fair use or any other copyright or database right exceptions and limitations.

You may find further information here
