Speed dating


Files: 2
Size: 2MB
Format: csv, zip
Created: 8 months ago
Updated: 8 months ago
Source: SpeedDating
This dataset is about speed dating. The data was gathered from participants in experimental speed dating events from 2002-2004. During the events, attendees had a four-minute "first date" with every other participant of the opposite sex; at the end of the four minutes, participants were asked whether they would like to see their date again.

Data Files

Download files in this dataset

speed-dating: 5MB; available as csv (5MB) and json (24MB)
speed-dating_zip: 1MB; zip archive with compressed versions of the dataset, including normalized CSV and JSON data with the original data and datapackage.json



Field information

Field Name Order Type (Format) Description
has_null 1 number (default)
wave 2 number (default)
gender 3 string (default)
age 4 number (default)
age_o 5 number (default)
d_age 6 number (default)
d_d_age 7 string (default)
race 8 string (default)
race_o 9 string (default)
samerace 10 number (default)
importance_same_race 11 number (default)
importance_same_religion 12 number (default)
d_importance_same_race 13 string (default)
d_importance_same_religion 14 string (default)
field 15 string (default)
pref_o_attractive 16 number (default)
pref_o_sincere 17 number (default)
pref_o_intelligence 18 number (default)
pref_o_funny 19 number (default)
pref_o_ambitious 20 number (default)
pref_o_shared_interests 21 number (default)
d_pref_o_attractive 22 string (default)
d_pref_o_sincere 23 string (default)
d_pref_o_intelligence 24 string (default)
d_pref_o_funny 25 string (default)
d_pref_o_ambitious 26 string (default)
d_pref_o_shared_interests 27 string (default)
attractive_o 28 number (default)
sinsere_o 29 number (default)
intelligence_o 30 number (default)
funny_o 31 number (default)
ambitous_o 32 number (default)
shared_interests_o 33 number (default)
d_attractive_o 34 string (default)
d_sinsere_o 35 string (default)
d_intelligence_o 36 string (default)
d_funny_o 37 string (default)
d_ambitous_o 38 string (default)
d_shared_interests_o 39 string (default)
attractive_important 40 number (default)
sincere_important 41 number (default)
intellicence_important 42 number (default)
funny_important 43 number (default)
ambtition_important 44 number (default)
shared_interests_important 45 number (default)
d_attractive_important 46 string (default)
d_sincere_important 47 string (default)
d_intellicence_important 48 string (default)
d_funny_important 49 string (default)
d_ambtition_important 50 string (default)
d_shared_interests_important 51 string (default)
attractive 52 number (default)
sincere 53 number (default)
intelligence 54 number (default)
funny 55 number (default)
ambition 56 number (default)
d_attractive 57 string (default)
d_sincere 58 string (default)
d_intelligence 59 string (default)
d_funny 60 string (default)
d_ambition 61 string (default)
attractive_partner 62 number (default)
sincere_partner 63 number (default)
intelligence_partner 64 number (default)
funny_partner 65 number (default)
ambition_partner 66 number (default)
shared_interests_partner 67 number (default)
d_attractive_partner 68 string (default)
d_sincere_partner 69 string (default)
d_intelligence_partner 70 string (default)
d_funny_partner 71 string (default)
d_ambition_partner 72 string (default)
d_shared_interests_partner 73 string (default)
sports 74 number (default)
tvsports 75 number (default)
exercise 76 number (default)
dining 77 number (default)
museums 78 number (default)
art 79 number (default)
hiking 80 number (default)
gaming 81 number (default)
clubbing 82 number (default)
reading 83 number (default)
tv 84 number (default)
theater 85 number (default)
movies 86 number (default)
concerts 87 number (default)
music 88 number (default)
shopping 89 number (default)
yoga 90 number (default)
d_sports 91 string (default)
d_tvsports 92 string (default)
d_exercise 93 string (default)
d_dining 94 string (default)
d_museums 95 string (default)
d_art 96 string (default)
d_hiking 97 string (default)
d_gaming 98 string (default)
d_clubbing 99 string (default)
d_reading 100 string (default)
d_tv 101 string (default)
d_theater 102 string (default)
d_movies 103 string (default)
d_concerts 104 string (default)
d_music 105 string (default)
d_shopping 106 string (default)
d_yoga 107 string (default)
interests_correlate 108 number (default)
d_interests_correlate 109 string (default)
expected_happy_with_sd_people 110 number (default)
expected_num_interested_in_me 111 number (default)
expected_num_matches 112 number (default)
d_expected_happy_with_sd_people 113 string (default)
d_expected_num_interested_in_me 114 string (default)
d_expected_num_matches 115 string (default)
like 116 number (default)
guess_prob_liked 117 number (default)
d_like 118 string (default)
d_guess_prob_liked 119 string (default)
met 120 number (default)
decision 121 number (default)
decision_o 122 number (default)
match 123 number (default)
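The schema mixes number and string fields, and most numeric ratings have a string counterpart with a `d_` prefix (e.g. `attractive_o` and `d_attractive_o`). The page does not document the prefix; a reasonable assumption is that the `d_*` string fields are binned versions of their numeric twins, but that is an assumption. A minimal pandas sketch of splitting the two kinds of column by dtype, using a tiny invented frame that mimics a few fields from the table above:

```python
import pandas as pd

# Invented sample following the schema above (the real CSV has 123 columns).
# Note that `d_age` is itself numeric in the schema, while most other `d_*`
# fields are string counterparts of numeric fields (values here are made up).
df = pd.DataFrame({
    "gender": ["female", "male"],
    "age": [21, 24],
    "d_age": [3, 1],
    "d_d_age": ["[2-3]", "[0-1]"],
    "attractive_o": [6.0, 8.0],
    "d_attractive_o": ["[5-6]", "[7-10]"],
    "match": [0, 1],
})

# Separate numeric columns from string columns by dtype.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
string_cols = df.select_dtypes(exclude="number").columns.tolist()
print("numeric:", numeric_cols)
print("string: ", string_cols)
```

Splitting by dtype is safer than splitting on the `d_` prefix, because `d_age` is numeric in the schema even though it carries the prefix.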

Integrate this dataset into your favourite tool

Use our data-cli tool designed for data wranglers:

data get https://datahub.io/machine-learning/speed-dating
data info machine-learning/speed-dating
tree machine-learning/speed-dating
# Get a list of the dataset's resources
curl -L -s https://datahub.io/machine-learning/speed-dating/datapackage.json | grep path

# Get resources

curl -L https://datahub.io/machine-learning/speed-dating/r/0.csv

curl -L https://datahub.io/machine-learning/speed-dating/r/1.zip

If you are using R, here's how to quickly load the data you want:

install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")

json_file <- 'https://datahub.io/machine-learning/speed-dating/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# get list of all resources:
print(json_data$resources$name)

# print all tabular data (if any exists)
for (i in seq_along(json_data$resources$datahub$type)) {
  if (json_data$resources$datahub$type[i] == 'derived/csv') {
    path_to_file = json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}

Note: you might need to run the script with root permissions if you are running it on a Linux machine.

Install the Frictionless Data datapackage library and pandas itself:

pip install datapackage
pip install pandas

Now you can load the Data Package into pandas:

import datapackage
import pandas as pd

data_url = 'https://datahub.io/machine-learning/speed-dating/datapackage.json'

# to load Data Package into storage
package = datapackage.Package(data_url)

# to load only tabular data
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print (data)

For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):

pip install datapackage

To load the Data Package into your Python environment, run the following code:

from datapackage import Package

package = Package('https://datahub.io/machine-learning/speed-dating/datapackage.json')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if any exists)
for resource in package.resources:
    if resource.descriptor['datahub']['type'] == 'derived/csv':
        print(resource.read())

If you are using JavaScript, follow the instructions below.

Install the data.js module using npm:

  $ npm install data.js

Once the package is installed, use the following code snippet:

const {Dataset} = require('data.js')

const path = 'https://datahub.io/machine-learning/speed-dating/datapackage.json'

// We're using a self-invoking function here so we can use async/await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data (if any exists)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()

Read me

This dataset is about speed dating.

This data was gathered from participants in experimental speed dating events from 2002-2004. During the events, the attendees would have a four-minute “first date” with every other participant of the opposite sex. At the end of their four minutes, participants were asked if they would like to see their date again. They were also asked to rate their date on six attributes:

  • Attractiveness
  • Sincerity
  • Intelligence
  • Fun
  • Ambition
  • Shared Interests

The dataset also includes questionnaire data gathered from participants at different points in the process. These fields include:

  • demographics
  • dating habits
  • self-perception across key attributes
  • beliefs on what others find valuable in a mate
  • and lifestyle information
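As a quick illustration of working with these questionnaire fields, here is a pandas sketch that slices a few demographic and preference columns and computes the sample match rate. The frame is a tiny invented stand-in, with made-up values and an illustrative (not official) grouping of fields; in practice you would read data/speed-dating.csv as shown in the snippets above.

```python
import pandas as pd

# Tiny invented stand-in; column names follow the field list, values are made up.
df = pd.DataFrame({
    "gender": ["female", "male", "female"],
    "age": [21.0, 24.0, 26.0],
    "race": ["Asian", "European", "Latino"],
    "importance_same_race": [2.0, 8.0, 5.0],
    "like": [6.0, 7.0, 5.0],
    "match": [0, 1, 0],
})

# An illustrative split of questionnaire fields by theme.
demographics = ["gender", "age", "race"]
preferences = ["importance_same_race"]

print(df[demographics + preferences])

# `match` is 0/1, so its mean is the match rate in this sample.
match_rate = df["match"].mean()
print(f"match rate in sample: {match_rate:.2f}")
```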

Data

This dataset was found under the name SpeedDating on OpenML.org.

The data is located in the data directory:

data/speed-dating.csv

Preparation

Python 3.x is needed to run the preparation script, which is located in the scripts directory:

scripts/main.py

License

Licensed under the Public Domain Dedication and License (assuming either no rights or public domain license in source data).
