Real Estate Across The United States Building Inventory

JohnSnowLabs

Files: 2
Size: 0B
Format: csv, zip
Updated: 6 months ago
License: johnsnowlabs
Source: Data.gov

Data Files

File: real-estate-across-the-united-states-building-inventory-csv
Size: 2MB
Download: csv (2MB), json (6MB)

File: datapackage_zip
Description: Compressed version of the dataset. Includes normalized CSV and JSON data along with the original data and datapackage.json.
Size: 951kB
Download: zip (951kB)
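
The zip bundle can also be fetched and inspected programmatically. Here is a minimal sketch using Python's standard library, assuming the /r/1.zip URL shown in the curl examples further down this page:

import io
import urllib.request
import zipfile

url = ('https://datahub.io/JohnSnowLabs/'
       'real-estate-across-the-united-states-building-inventory/r/1.zip')

# Download the archive into memory and list the files it contains.
with urllib.request.urlopen(url) as resp:
    archive = zipfile.ZipFile(io.BytesIO(resp.read()))
print(archive.namelist())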

real-estate-across-the-united-states-building-inventory-csv  

This is a preview version; the original dataset may contain more data.

Field information

Field Name Order Type (Format) Description
Location_Code 1 string Unique identifier for the PBS owned or leased building.
Region 2 string The Region field identifies the region where the building is located.
Building_Address1 3 string Line one of the street address of the building.
Building_Address2 4 string Optional 2nd line of a building’s address.
Building_City 5 string The city in which the building is located.
Building_County 6 string The county in which the building is located.
Building_State 7 string The state or country abbreviation where the building is located.
Building_Zip 8 integer A 9-digit code that identifies the city zip code for the building location.
Congressional_District 9 string The number of the Congressional District where the building is located.
Building_Status 10 string The current status of the building.
Property_Type 11 string FRPC (Federal Real Property Council) Real Property Type identifies the asset as one of the following categories of real property: Building, Land, or Structure. This field works in conjunction with the Property Type field.
American_National_Standards_Institute_Usable_Squarefeet 12 number The sum of usable SQFT for a given building.
Total_Parking_Spaces 13 number Total number of parking spaces for a given building record.
Owned_Or_Leased 14 string Internal system value that identifies whether the building is owned or leased.
Construction_Date 15 string This field is used to identify the date the building is substantially completed.
Historical_Type 16 string Code identifying the historical type of the location or area.
Historical_Status 17 string Code identifying the historical value of the location.
Architectural_Barriers_Act_Accessibility_Flag 18 string The Architectural Barriers Act (ABA) requires that facilities designed, built, altered, or leased with federal funds are accessible to the physically handicapped. This field indicates if the building does or does not meet these guidelines.
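
As a quick check that this schema matches what you download, here is a minimal pandas sketch. It is an assumption-laden example: the /r/0.csv URL is the derived CSV listed in the curl commands below, and the column names are taken from the field table above.

import pandas as pd

url = ('https://datahub.io/JohnSnowLabs/'
       'real-estate-across-the-united-states-building-inventory/r/0.csv')

# Read Building_Zip as a string so 9-digit codes keep any leading zeros,
# even though the schema declares the field an integer.
df = pd.read_csv(url, dtype={'Building_Zip': str})

print(df.dtypes)
print(df[['Location_Code', 'Building_City', 'Building_State']].head())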

Import into your tool

data-cli (or simply data) is the program for getting and posting your data with DataHub.
Download the CLI tool and use it with DataHub much like you use git with GitHub:

# Download the dataset and its datapackage.json
data get https://datahub.io/JohnSnowLabs/real-estate-across-the-united-states-building-inventory

# Show information about the dataset
data info JohnSnowLabs/real-estate-across-the-united-states-building-inventory

# List the downloaded files
tree JohnSnowLabs/real-estate-across-the-united-states-building-inventory
# Get a list of the dataset's resources
curl -L -s https://datahub.io/JohnSnowLabs/real-estate-across-the-united-states-building-inventory/datapackage.json | grep path

# Get the resources

# download the derived CSV
curl -L https://datahub.io/JohnSnowLabs/real-estate-across-the-united-states-building-inventory/r/0.csv

# download the compressed datapackage
curl -L https://datahub.io/JohnSnowLabs/real-estate-across-the-united-states-building-inventory/r/1.zip

If you are using R, here's how to quickly load the data you want:

install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")

json_file <- 'https://datahub.io/JohnSnowLabs/real-estate-across-the-united-states-building-inventory/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# get list of all resources:
print(json_data$resources$name)

# print all tabular data (if any exists)
for(i in 1:length(json_data$resources$datahub$type)){
  if(json_data$resources$datahub$type[i]=='derived/csv'){
    path_to_file = json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}

Note: You might need to run the script with root permissions if you are running it on a Linux machine.

Install the Frictionless Data datapackage library and pandas:

pip install datapackage
pip install pandas

Now you can use the Data Package with pandas:

import datapackage
import pandas as pd

data_url = 'https://datahub.io/JohnSnowLabs/real-estate-across-the-united-states-building-inventory/datapackage.json'

# to load Data Package into storage
package = datapackage.Package(data_url)

# to load only tabular data
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print(data)
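
Once loaded, each frame can be filtered like any other pandas DataFrame. The following sketch continues from the loop above; the column names come from the field table, but the exact codes stored in Owned_Or_Leased are an assumption, so inspect the real values first:

# 'OWNED' is an assumed code; check the actual values with:
print(data['Owned_Or_Leased'].unique())

# Hypothetical follow-up filter: owned buildings in New York.
ny_owned = data[(data['Building_State'] == 'NY') &
                (data['Owned_Or_Leased'] == 'OWNED')]
print(len(ny_owned), 'owned buildings in NY')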

For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):

pip install datapackage

To load the Data Package into your Python environment, run the following code:

from datapackage import Package

package = Package('https://datahub.io/JohnSnowLabs/real-estate-across-the-united-states-building-inventory/datapackage.json')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if any exists)
for resource in package.resources:
    if resource.descriptor['datahub']['type'] == 'derived/csv':
        print(resource.read())
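
If you prefer each row as a dict keyed by field name, the read() method also accepts a keyed flag. A short sketch, assuming the field names match the table at the top of this page:

from datapackage import Package

package = Package('https://datahub.io/JohnSnowLabs/real-estate-across-the-united-states-building-inventory/datapackage.json')

# Read rows as dicts keyed by field name instead of positional lists.
for resource in package.resources:
    if resource.descriptor['datahub']['type'] == 'derived/csv':
        rows = resource.read(keyed=True)
        # Building_City and Building_State come from the field table above
        print(rows[0]['Building_City'], rows[0]['Building_State'])
        break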

If you are using JavaScript, follow the instructions below:

Install the data.js module using npm:

  $ npm install data.js

Once the package is installed, use the following code snippet:

const {Dataset} = require('data.js')

const path = 'https://datahub.io/JohnSnowLabs/real-estate-across-the-united-states-building-inventory/datapackage.json'

// We're using a self-invoking function here so we can use async/await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data (if any exists)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()