Now you can request additional data and/or customized columns!

Try It Now!

Climate model simulation crashes

machine-learning

Files Size Format Created Updated License Source
3 310kB arff csv zip 10 months ago 10 months ago Open Data Commons Public Domain Dedication and License
The resources for this dataset can be found at https://www.openml.org/d/1467 Author: D. Lucas, R. Klein, J. Tannahill, D. Ivanova, S. Brandon, D. Domyancic, Y. Zhang. Source: UCI Please Cite: Lucas, D. D., Klein, R., Tannahill, J., Ivanova, D., Brandon, S., Domyancic, D., and Zhang, Y.: Failure read more
Download Developers

Data Files

Download files in this dataset

File Description Size Last changed Download
climate-model-simulation-crashes_arff 89kB arff (89kB)
climate-model-simulation-crashes 86kB csv (86kB) , json (175kB)
climate-model-simulation-crashes_zip Compressed versions of dataset. Includes normalized CSV and JSON data with original data and datapackage.json. 160kB zip (160kB)

climate-model-simulation-crashes_arff  

Signup to Premium Service for additional or customised data - Get Started

This is a preview version. There might be more data in the original version.

climate-model-simulation-crashes  

Signup to Premium Service for additional or customised data - Get Started

This is a preview version. There might be more data in the original version.

Field information

Field Name Order Type (Format) Description
V1 1 number (default)
V2 2 number (default)
V3 3 number (default)
V4 4 number (default)
V5 5 number (default)
V6 6 number (default)
V7 7 number (default)
V8 8 number (default)
V9 9 number (default)
V10 10 number (default)
V11 11 number (default)
V12 12 number (default)
V13 13 number (default)
V14 14 number (default)
V15 15 number (default)
V16 16 number (default)
V17 17 number (default)
V18 18 number (default)
V19 19 number (default)
V20 20 number (default)
Class 21 number (default)

Integrate this dataset into your favourite tool

Use our data-cli tool designed for data wranglers:

data get https://datahub.io/machine-learning/climate-model-simulation-crashes
data info machine-learning/climate-model-simulation-crashes
tree machine-learning/climate-model-simulation-crashes
# Get a list of dataset's resources
curl -L -s https://datahub.io/machine-learning/climate-model-simulation-crashes/datapackage.json | grep path

# Get resources

curl -L https://datahub.io/machine-learning/climate-model-simulation-crashes/r/0.arff

curl -L https://datahub.io/machine-learning/climate-model-simulation-crashes/r/1.csv

curl -L https://datahub.io/machine-learning/climate-model-simulation-crashes/r/2.zip

If you are using R here's how to get the data you want quickly loaded:

install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")

json_file <- 'https://datahub.io/machine-learning/climate-model-simulation-crashes/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# get list of all resources:
print(json_data$resources$name)

# print all tabular data(if exists any)
for(i in 1:length(json_data$resources$datahub$type)){
  if(json_data$resources$datahub$type[i]=='derived/csv'){
    path_to_file = json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}

Note: You might need to run the script with root permissions if you are running on Linux machine

Install the Frictionless Data data package library and the pandas itself:

pip install datapackage
pip install pandas

Now you can use the datapackage in the Pandas:

import datapackage
import pandas as pd

data_url = 'https://datahub.io/machine-learning/climate-model-simulation-crashes/datapackage.json'

# to load Data Package into storage
package = datapackage.Package(data_url)

# to load only tabular data
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print (data)

For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):

pip install datapackage

To get Data Package into your Python environment, run following code:

from datapackage import Package

package = Package('https://datahub.io/machine-learning/climate-model-simulation-crashes/datapackage.json')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if exists any)
for resource in package.resources:
    if resource.descriptor['datahub']['type'] == 'derived/csv':
        print(resource.read())

If you are using JavaScript, please, follow instructions below:

Install data.js module using npm:

  $ npm install data.js

Once the package is installed, use the following code snippet:

const {Dataset} = require('data.js')

const path = 'https://datahub.io/machine-learning/climate-model-simulation-crashes/datapackage.json'

// We're using self-invoking function here as we want to use async-await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data(if exists any)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()

Read me

The resources for this dataset can be found at https://www.openml.org/d/1467

Author: D. Lucas, R. Klein, J. Tannahill, D. Ivanova, S. Brandon, D. Domyancic, Y. Zhang.

Source: UCI

Please Cite: Lucas, D. D., Klein, R., Tannahill, J., Ivanova, D., Brandon, S., Domyancic, D., and Zhang, Y.: Failure analysis of parameter-induced simulation crashes in climate models, Geosci. Model Dev. Discuss., 6, 585-623, Web Link, 2013.

Source:

D. Lucas (ddlucas .at. alum.mit.edu), Lawrence Livermore National Laboratory; R. Klein (rklein .at. astron.berkeley.edu), Lawrence Livermore National Laboratory & U.C. Berkeley; J. Tannahill (tannahill1 .at. llnl.gov), Lawrence Livermore National Laboratory; D. Ivanova (ivanova2 .at. llnl.gov), Lawrence Livermore National Laboratory; S. Brandon (brandon1 .at. llnl.gov), Lawrence Livermore National Laboratory; D. Domyancic (domyancic1 .at. llnl.gov), Lawrence Livermore National Laboratory; Y. Zhang (zhang24 .at. llnl.gov), Lawrence Livermore National Laboratory .

This data was constructed using LLNL’s UQ Pipeline, was created under the auspices of the US Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, was funded by LLNL’s Uncertainty Quantification Strategic Initiative Laboratory Directed Research and Development Project under tracking code 10-SI-013, and is released under UCRL number LLNL-MISC-633994.

Data Set Information:

This dataset contains records of simulation crashes encountered during climate model uncertainty quantification (UQ) ensembles. Ensemble members were constructed using a Latin hypercube method in LLNL’s UQ Pipeline software system to sample the uncertainties of 18 model parameters within the Parallel Ocean Program (POP2) component of the Community Climate System Model (CCSM4). Three separate Latin hypercube ensembles were conducted, each containing 180 ensemble members. 46 out of the 540 simulations failed for numerical reasons at combinations of parameter values. The goal is to use classification to predict simulation outcomes (fail or succeed) from input parameter values, and to use sensitivity analysis and feature selection to determine the causes of simulation crashes. Further details about the data and methods are given in the publication ‘Failure Analysis of Parameter-Induced Simulation Crashes in Climate Models,’ Geoscientific Model Development (Web Link).

Attribute Information:

The goal is to predict climate model simulation outcomes (column 21, fail or succeed) given scaled values of climate model input parameters (columns 3-20).

  • Column 1: Latin hypercube study ID (study 1 to study 3)

  • Column 2: simulation ID (run 1 to run 180)

  • Columns 3-20: values of 18 climate model parameters scaled in the interval [0, 1]

  • Column 21: simulation outcome (0 = failure, 1 = success)

Relevant Papers:

Lucas, D. D., Klein, R., Tannahill, J., Ivanova, D., Brandon, S., Domyancic, D., and Zhang, Y.: Failure analysis of parameter-induced simulation crashes in climate models, Geosci. Model Dev. Discuss., 6, 585-623, Web Link, 2013.

Datapackage.json

Request Customized Data


Notifications of data updates and schema changes

Warranty / guaranteed updates

Workflow integration (e.g. Python packages, NPM packages)

Customized data (e.g. you need different or additional data)

Or suggest your own feature from the link below