machine-learning

Machine learning datasets

Since January 2018


Datasets 94

Har

har | files 2 | 345MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/1478. Author: Jorge L. Reyes-Ortiz, Davide Anguita, Alessandro Ghio, Luca Oneto and Xavier Parra. Source: UCI. Please cite: Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra and Jorge L. Reyes-Ortiz. A Public Domain ...

Qsar biodeg

qsar-biodeg | files 3 | 1MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/1494. Author: Kamel Mansouri, Tine Ringsted, Davide Ballabio. Source: UCI. Please cite: Mansouri, K., Ringsted, T., Ballabio, D., Todeschini, R., ...

Heart statlog

heart-statlog | files 3 | 144kB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/53. Author: not listed. Source: Unknown. Please cite: not listed. This database contains 13 attributes (which have been extracted from a larger set of 75). Attribute Information: 1. age; 2. sex ...

Breast cancer

breast-cancer | files 3 | 149kB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/13. Author: not listed. Source: Unknown. Please cite: Citation Request: This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. ...

Bank marketing

bank-marketing | files 3 | 27MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/1461. Author: Paulo Cortez, Sérgio Moro. Source: UCI. Please cite: S. Moro, R. Laureano and P. Cortez. Using Data Mining for Bank Direct Marketing: An Application of the CRISP-DM Methodology. In P. Novais et al. (Eds.), ...

One hundred plants texture

one-hundred-plants-texture | files 3 | 4MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/1493. Author: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. Source: UCI - 2010. Please cite: Charles Mallah, James Cope, James Orwell. Plant Leaf Classification Using Probabilistic Integration of Shape, Texture and ...

Liver disorders

liver-disorders | files 3 | 77kB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/8. Author: BUPA Medical Research Ltd. Donor: Richard S. Forsyth. Source: UCI - 5/15/1990. Please cite: BUPA liver disorders. The first 5 variables are all blood tests which are thought to be sensitive to liver disorders that ...

Sick

sick | files 3 | 4MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/38. Author: Ross Quinlan. Source: UCI. Please cite: Thyroid disease records supplied by the Garavan Institute and J. Ross Quinlan, New South Wales Institute, Sydney, Australia. 1987. Attribute information: sick, negative. ...

Mfeat karhunen

mfeat-karhunen | files 3 | 10MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/16. Author: Robert P.W. Duin, Department of Applied Physics, Delft University of Technology. Source: UCI - 1998. Please cite: UCI Multiple Features Dataset: Karhunen. One of a set of 6 datasets describing features of handwritten ...

Mammography

mammography | files 3 | 6MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/310. Author: not listed. Source: U...

Letter

letter | files 3 | 9MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/6. Author: David J. Slate. Source: UCI - 01-01-1991. Please cite: P. W. Frey and D. J. Slate. "Letter Recognition Using Holland-style Adaptive Classifiers". Machine Learning 6(2), 1991. Title: Letter Image Recognition Data ...

Tic Tac Toe Endgame

tic-tac-toe-endgame | files 2 | 167kB
updated 1 year ago

This dataset contains tic-tac-toe endgame snapshots. The first nine attributes represent the nine fields of the tic-tac-toe board, and the tenth is the class attribute, which indicates whether player x won. Data: this dataset was found on UCI (Tic-Tac-Toe Endgame Data Set). This database encodes the complete ...
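The nine-fields-plus-class layout described above can be illustrated with a minimal Python sketch. It assumes, since the listing does not say, that the nine fields are stored row by row and that each field holds "x", "o", or "b" for blank; the names `WIN_LINES` and `x_won` are hypothetical and not part of the dataset.

```python
from typing import Sequence

# Index triples that form a line on a 3x3 board.
# Assumption: the nine fields are stored row by row, top-left first.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def x_won(board: Sequence[str]) -> bool:
    """Return True if player x occupies a full row, column, or diagonal.

    Assumption: each field is "x", "o", or "b" (blank).
    """
    if len(board) != 9:
        raise ValueError("expected nine fields")
    return any(all(board[i] == "x" for i in line) for line in WIN_LINES)


# A hypothetical record: nine board fields, with the class derived from them.
example = ["x", "x", "x", "o", "o", "b", "b", "b", "b"]
label = x_won(example)
```

Recomputing the label this way could serve as a sanity check on the tenth attribute, provided the assumed encoding matches the actual files.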

Mnist_784

mnist_784 | files 2 | 1GB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/554. Author: Yann LeCun, Corinna Cortes, Christopher J.C. Burges. Source: MNIST Website - Date unknown. Please cite: The MNIST database of handwritten digits with 784 features, raw data available at: ...

Speeddating

speeddating | files 2 | 46MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/40536. Author: Ray Fisman and Sheena Iyengar. Source: Columbia Business School - 2004. Please cite: None. This data was gathered from participants in experimental speed dating events from 2002-2004. During the events, the ...

Gina_prior2

gina_prior2 | files 3 | 100MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/1041. Author: not listed. Source: Unknown - Date unknown. Please cite: not listed. Note: Identical to the MNIST dataset? Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch). Dataset from: ...

Madelon

madelon | files 3 | 61MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/1485. Author: Isabelle Guyon. Source: UCI. Please cite: Isabelle Guyon, Steve R. Gunn, Asa Ben-Hur, Gideon Dror, 2004. Result analysis of the NIPS 2003 feature selection challenge. Abstract: MADELON is an artificial dataset, ...

Satimage

satimage | files 3 | 17MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/182. Author: Ashwin Srinivasan, Department of Statistics and Data Modeling, University of Strathclyde. Source: UCI - 1993. Please cite: UCI. The database consists of the multi-spectral values of pixels in 3x3 neighbourhoods in a ...

Waveform 5000

waveform-5000 | files 3 | 9MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/60. Author: Breiman, L., Friedman, J.H., Olshen, R.A., & Stone, C.J. Source: UCI - 1988. Please cite: UCI. Waveform Database Generator. A generator producing 3 classes of waves. Each class is generated from a combination of 2 of 3 ...

Sonar

sonar | files 3 | 700kB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/40. Author: not listed. Source: Unknown. Please cite: not listed. Name: Sonar, Mines vs. Rocks. Summary: This is the data set used by Gorman and Sejnowski in their study of the classification of sonar signals using a neural network [1]. The task ...

Page blocks

page-blocks | files 3 | 3MB
updated 1 year ago

The resources for this dataset can be found at https://www.openml.org/d/30. Author: not listed. Source: Unknown. Please cite: not listed. Title of Database: Blocks Classification. Sources: (a) Donato Malerba, Dipartimento di Informatica, University of Bari, via Orabona 4, 70126 Bari - ...