US House Price Index (Case-Shiller)

Files Size Format Created Updated License Source
2 360kB csv zip 1 month ago public_domain_dedication_and_license Standard and Poors Case-Shiller Indices

Data Files

File | Description | Size | Last changed | Download | Other formats
cities [csv] | Case-Shiller US home price index levels at national and city level. Monthly. | 52kB | | cities [csv] | cities [json] (183kB)
datapackage_zip [zip] | Compressed versions of dataset. Includes normalized CSV and JSON data with original data and datapackage.json. | 72kB | | datapackage_zip [zip] |

cities  

Field information

Field Name Order Type (Format) Description
Date 1 date (%Y-%m-%d)
AZ-Phoenix 2 number
CA-Los Angeles 3 number
CA-San Diego 4 number
CA-San Francisco 5 number
CO-Denver 6 number
DC-Washington 7 number
FL-Miami 8 number
FL-Tampa 9 number
GA-Atlanta 10 number
IL-Chicago 11 number
MA-Boston 12 number
MI-Detroit 13 number
MN-Minneapolis 14 number
NC-Charlotte 15 number
NV-Las Vegas 16 number
NY-New York 17 number
OH-Cleveland 18 number
OR-Portland 19 number
TX-Dallas 20 number
WA-Seattle 21 number
Composite-10 22 number
Composite-20 23 number
National-US 24 number
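
The field list above can be turned into a small parser. The following is a minimal sketch, assuming the cities file is plain comma-separated text with a header row matching the field names listed here. The function name `parse_cities_csv` is hypothetical, the sample values are placeholders rather than real index levels, and the handling of empty cells is an assumption (some series may have no value for early dates).

```python
import csv
import io
from datetime import datetime


def parse_cities_csv(text):
    """Parse CSV text laid out like the cities file: a Date column plus numeric columns."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        # Date uses the %Y-%m-%d format given in the field information.
        row = {"Date": datetime.strptime(record["Date"], "%Y-%m-%d").date()}
        for name, value in record.items():
            if name == "Date":
                continue
            # Assumption: empty cells mean "no value for this date" and become None.
            row[name] = None if value in ("", None) else float(value)
        rows.append(row)
    return rows


# Placeholder sample, NOT real Case-Shiller values.
SAMPLE = """Date,AZ-Phoenix,Composite-20,National-US
2000-01-01,100.0,,100.0
2000-02-01,101.0,,100.5
"""

rows = parse_cities_csv(SAMPLE)
print(rows[0]["Date"], rows[0]["AZ-Phoenix"], rows[0]["Composite-20"])
```

To use it on the real file, pass the text of the downloaded cities CSV instead of `SAMPLE`.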

Read me

Case-Shiller Index of US residential house prices. Data comes from S&P Case-Shiller data and includes both the national index and the indices for 20 metropolitan regions. The indices are created using a repeat-sales methodology.

Data

As per the home page for the indices on the S&P website:

The S&P/Case-Shiller U.S. National Home Price Index is a composite of single-family home price indices for the nine U.S. Census divisions and is calculated monthly. It is included in the S&P/Case-Shiller Home Price Index Series which seeks to measure changes in the total value of all existing single-family housing stock.

Documentation of the methodology can be found at: http://www.spindices.com/documents/methodologies/methodology-sp-cs-home-price-indices.pdf

Key points are (excerpted from methodology):

  • The indices use the “repeat sales method” of index calculation which uses data on properties that have sold at least twice, in order to capture the true appreciated value of each specific sales unit.
  • The quarterly S&P/Case-Shiller U.S. National Home Price Index aggregates nine quarterly U.S. Census division repeat sales indices using a base period and estimates of the aggregate value of single family housing stock for those periods.
  • The S&P/Case-Shiller Home Price Indices originated in the 1980s by Case Shiller Weiss’s research principals, Karl E. Case and Robert J. Shiller. At the time, Case and Shiller developed the repeat sales pricing technique. This methodology is recognized as the most reliable means to measure housing price movements and is used by other home price index publishers, including the Office of Federal Housing Enterprise Oversight (OFHEO).
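
To make the repeat sales idea concrete, here is a minimal sketch of an unweighted repeat-sales regression (in the style of Bailey, Muth and Nourse). It is not the S&P implementation, and the published methodology describes additional steps that are omitted here. The function names, the period numbering, and the sale pairs are all hypothetical; the prices are synthetic and chosen so that the index rises 10% per period.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: some period has no usable sale pairs")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def repeat_sales_index(pairs, n_periods, base=100.0):
    """Estimate an index from (first_period, second_period, first_price, second_price) pairs."""
    k = n_periods - 1  # period 0 is the base period, so its log index is fixed at 0
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for t1, t2, p1, p2 in pairs:
        y = math.log(p2 / p1)  # log price change of the same property
        x = [0.0] * k
        if t1 > 0:
            x[t1 - 1] = -1.0  # first sale period
        if t2 > 0:
            x[t2 - 1] = 1.0   # second sale period
        for i in range(k):
            xty[i] += x[i] * y
            for j in range(k):
                xtx[i][j] += x[i] * x[j]
    betas = solve_linear_system(xtx, xty)  # ordinary least squares via normal equations
    return [base] + [base * math.exp(beta) for beta in betas]


# Synthetic sale pairs, NOT real transactions.
PAIRS = [
    (0, 1, 200000.0, 220000.0),
    (1, 2, 300000.0, 330000.0),
    (0, 2, 150000.0, 181500.0),
]
index = repeat_sales_index(PAIRS, n_periods=3)
print([round(level, 2) for level in index])
```

Only properties that sold at least twice contribute a row, which is the point of the method: each row compares a property with itself, so differences in quality between properties drop out.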

Preparation

To download and process the data, run:

python scripts/process.py

The updated data files will then be in the data directory.

Note: the URLs and structure of the source data have evolved over time, with the source data URLs changing on every release.

Originally (2013) the site provided a table of links, but these were not direct file URLs and you had to dig around in S&P’s JavaScript to find the actual download locations. As of mid-2014 the data is consolidated in one primary XLS file, but the HTML you see in your browser and the source HTML are different. In addition, the actual location of the XLS file continues to change on each release.

License

Any rights of the maintainer are licensed under the PDDL. The exact legal status of the source data (and hence of the resulting processed data) is unclear, but there could be a presumption of public domain given its factual nature and US provenance. However, the current application of the PDDL is indicative of the maintainer’s best guess (and comes with no warranty).

Import into your tool

If you are using R here's how to get the data you want quickly loaded:

install.packages("jsonlite")
library("jsonlite")

json_file <- "http://datahub.io/core/house-prices-us/datapackage.json"
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# access csv file by the index starting from 1
path_to_file = json_data$resources$path[1][1]
data <- read.csv(url(path_to_file))
print(data)

In order to work with Data Packages in Pandas you need to install the Frictionless Data data package library and the pandas extension:

pip install datapackage
pip install jsontableschema-pandas

To get the data, run the following code:

import datapackage

data_url = "http://datahub.io/core/house-prices-us/datapackage.json"

# to load Data Package into storage
storage = datapackage.push_datapackage(data_url, 'pandas')

# data frames available (corresponding to data files in original dataset)
storage.buckets

# you can access datasets inside storage, e.g. the first one:
storage[storage.buckets[0]]

For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):

pip install datapackage

To get the Data Package into your Python environment, run the following code:

from datapackage import Package

package = Package('http://datahub.io/core/house-prices-us/datapackage.json')

# get the list of resource names:
resources = package.descriptor['resources']
resourceList = [resource['name'] for resource in resources]
print(resourceList)

data = package.resources[0].read()
print(data)
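
Once the rows are loaded, a common next step with a monthly index is the year-over-year change. The following is a minimal sketch, assuming the rows arrive as lists in field order (Date first), sorted by date with one row per month. The helper name `year_over_year_change` is hypothetical and the sample rows are synthetic placeholders, not real index values.

```python
def year_over_year_change(rows, headers, column):
    """Return (date, percent change versus 12 rows earlier) for one column."""
    idx = headers.index(column)
    # Keep only rows where the column has a value; float() also accepts Decimal values.
    series = [(row[0], float(row[idx])) for row in rows if row[idx] is not None]
    return [
        (date, (value / series[i][1] - 1.0) * 100.0)
        for i, (date, value) in enumerate(series[12:])
    ]


# Synthetic sample: 13 consecutive months, NOT real Case-Shiller values.
headers = ["Date", "National-US"]
sample_rows = [["2000-%02d-01" % month, 100.0 + month - 1] for month in range(1, 13)]
sample_rows.append(["2001-01-01", 112.0])

changes = year_over_year_change(sample_rows, headers, "National-US")
print(changes)
```

With the real data, `headers` would hold the field names from the field information and `rows` would be the `data` list read from the resource.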

If you are using JavaScript, follow the instructions below:

Install data.js module using npm:

  $ npm install data.js

Once the package is installed, use the following code snippet:

const {Dataset} = require('data.js')

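// The trailing semicolon matters: without it, the parenthesized function below
// would be parsed as a call on the string literal.
const path = 'http://datahub.io/core/house-prices-us/datapackage.json';

// We're using an immediately invoked async function here because we want to use async/await syntax: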
(async () => {
  const dataset = await Dataset.load(path)

  // Get the first data file in this dataset
  const file = dataset.resources[0]
  // Get a raw stream
  const stream = await file.stream()
  // entire file as a buffer (be careful with large files!)
  const buffer = await file.buffer
})()

If you are using Ruby, install the datapackage library using gem:

gem install datapackage

Now get the dataset and read the data:

require 'datapackage'

path = 'http://datahub.io/core/house-prices-us/datapackage.json'

package = DataPackage::Package.new(path)
# The package variable now contains the metadata. Print it to inspect it:
puts package

# Read data itself:
resource = package.resources[0]
data = resource.read
puts data
Datapackage.json