Files | Size | Format | Created | Updated | License | Source |
---|---|---|---|---|---|---|
2 | 87kB | csv zip | 4 years ago | 4 years ago | | OpenML - dermatology |
Download files in this dataset
File | Description | Size | Last changed | Download |
---|---|---|---|---|
dermatology | | 26kB | | csv (26kB), json (335kB) |
dermatology_zip | Compressed versions of dataset. Includes normalized CSV and JSON data with original data and datapackage.json. | 23kB | | zip (23kB) |
Field Name | Order | Type (Format) | Description |
---|---|---|---|
erythema | 1 | integer (default) | 0,1,2,3 |
scaling | 2 | integer (default) | 0,1,2,3 |
definite_borders | 3 | integer (default) | 0,1,2,3 |
itching | 4 | integer (default) | 0,1,2,3 |
koebner_phenomenon | 5 | integer (default) | 0,1,2,3 |
polygonal_papules | 6 | integer (default) | 0,1,2,3 |
follicular_papules | 7 | integer (default) | 0,1,2,3 |
oral_mucosal_involvement | 8 | integer (default) | 0,1,2,3 |
knee_and_elbow_involvement | 9 | integer (default) | 0,1,2,3 |
scalp_involvement | 10 | integer (default) | 0,1,2,3 |
family_history | 11 | integer (default) | 0,1 |
melanin_incontinence | 12 | integer (default) | 0,1,2,3 |
eosinophils_in_the_infiltrate | 13 | integer (default) | 0,1,2,3 |
pnl_infiltrate | 14 | integer (default) | 0,1,2,3 |
fibrosis_of_the_papillary_dermis | 15 | integer (default) | 0,1,2,3 |
exocytosis | 16 | integer (default) | 0,1,2,3 |
acanthosis | 17 | integer (default) | 0,1,2,3 |
hyperkeratosis | 18 | integer (default) | 0,1,2,3 |
parakeratosis | 19 | integer (default) | 0,1,2,3 |
clubbing_of_the_rete_ridges | 20 | integer (default) | 0,1,2,3 |
elongation_of_the_rete_ridges | 21 | integer (default) | 0,1,2,3 |
thinning_of_the_suprapapillary_epidermis | 22 | integer (default) | 0,1,2,3 |
spongiform_pustule | 23 | integer (default) | 0,1,2,3 |
munro_microabcess | 24 | integer (default) | 0,1,2,3 |
focal_hypergranulosis | 25 | integer (default) | 0,1,2,3 |
disappearance_of_the_granular_layer | 26 | integer (default) | 0,1,2,3 |
vacuolisation_and_damage_of_basal_layer | 27 | integer (default) | 0,1,2,3 |
spongiosis | 28 | integer (default) | 0,1,2,3 |
saw-tooth_appearance_of_retes | 29 | integer (default) | 0,1,2,3 |
follicular_horn_plug | 30 | integer (default) | 0,1,2,3 |
perifollicular_parakeratosis | 31 | integer (default) | 0,1,2,3 |
inflammatory_monoluclear_inflitrate | 32 | integer (default) | 0,1,2,3 |
band-like_infiltrate | 33 | integer (default) | 0,1,2,3 |
age | 34 | integer (default) | |
class | 35 | integer (default) | 1,2,3,4,5,6 |
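The value ranges in the table above lend themselves to a quick sanity check on downloaded rows. The following is a minimal sketch, not part of the DataHub tooling: `validate_row` and `ALLOWED` are hypothetical names, only a handful of fields are listed for brevity, the sample rows are made up, and treating an empty `age` cell as missing is an assumption, since the table gives no range for `age`.

```python
import csv
import io

# Allowed values per field, copied from the schema table (subset for brevity).
GRADED = {0, 1, 2, 3}
ALLOWED = {
    "erythema": GRADED,
    "scaling": GRADED,
    "family_history": {0, 1},
    "class": {1, 2, 3, 4, 5, 6},
}


def validate_row(row):
    """Return a list of (field, raw_value) pairs that violate the schema."""
    problems = []
    for field, allowed in ALLOWED.items():
        raw = row.get(field, "")
        try:
            ok = int(raw) in allowed
        except (TypeError, ValueError):
            ok = False
        if not ok:
            problems.append((field, raw))
    # `age` has no listed range; assumption: an empty cell means "missing".
    age = row.get("age", "")
    if age not in ("", None) and not age.isdigit():
        problems.append(("age", age))
    return problems


# Hypothetical sample rows standing in for data/dermatology.csv.
SAMPLE = """erythema,scaling,family_history,age,class
2,2,0,55,2
3,5,1,,1
"""

for line_no, row in enumerate(csv.DictReader(io.StringIO(SAMPLE)), start=2):
    for field, raw in validate_row(row):
        print(f"line {line_no}: {field}={raw!r} is outside the schema")
```

Extending `ALLOWED` to the remaining graded fields is a matter of adding one entry per field name from the table.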
Use our data-cli tool designed for data wranglers:
data get https://datahub.io/machine-learning/dermatology
data info machine-learning/dermatology
tree machine-learning/dermatology
# Get a list of dataset's resources
curl -L -s https://datahub.io/machine-learning/dermatology/datapackage.json | grep path
# Get resources
curl -L https://datahub.io/machine-learning/dermatology/r/0.csv
curl -L https://datahub.io/machine-learning/dermatology/r/1.zip
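The `grep path` step above can also be done by parsing the descriptor as JSON, which avoids matching unrelated lines. A minimal Python sketch follows; `resource_paths` is a hypothetical helper, and the inline descriptor is an illustrative stand-in whose resource names and paths are placeholders, not the real contents of this dataset's `datapackage.json`.

```python
import json


def resource_paths(descriptor_text):
    """Return (name, path) pairs for every resource in a datapackage.json string."""
    descriptor = json.loads(descriptor_text)
    return [
        (res.get("name"), res.get("path"))
        for res in descriptor.get("resources", [])
    ]


# Illustrative descriptor; names and paths are placeholders.
EXAMPLE = json.dumps({
    "name": "example",
    "resources": [
        {"name": "table_csv", "path": "data/table.csv", "format": "csv"},
        {"name": "table_zip", "path": "archive/table.zip", "format": "zip"},
    ],
})

for name, path in resource_paths(EXAMPLE):
    print(f"{name}: {path}")
```

To run this against the live descriptor, fetch the URL used in the `curl` command (for example with `urllib.request.urlopen`) and pass the decoded body to `resource_paths`.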
If you are using R, here is how to load the data quickly:
install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")
json_file <- 'https://datahub.io/machine-learning/dermatology/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))
# get list of all resources:
print(json_data$resources$name)
# print all tabular data (if any exists)
for (i in seq_along(json_data$resources$datahub$type)) {
  if (json_data$resources$datahub$type[i] == 'derived/csv') {
    path_to_file <- json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}
Note: You might need to run the script with root permissions if you are running it on a Linux machine.
Install the Frictionless Data `datapackage` library and pandas:
pip install datapackage
pip install pandas
Now you can load the Data Package into pandas:
import datapackage
import pandas as pd
data_url = 'https://datahub.io/machine-learning/dermatology/datapackage.json'
# load the Data Package
package = datapackage.Package(data_url)
# load only the tabular resources
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print(data)
For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):
pip install datapackage
To load the Data Package into your Python environment, run the following code:
from datapackage import Package
package = Package('https://datahub.io/machine-learning/dermatology/datapackage.json')
# print list of all resources:
print(package.resource_names)
# print processed tabular data (if any exists)
for resource in package.resources:
    if resource.descriptor['datahub']['type'] == 'derived/csv':
        print(resource.read())
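Once the rows are loaded, a common next step is to separate the 34 feature columns from the `class` label. The sketch below assumes each row is a list of values in the field order shown in the schema table, with `class` last; `split_rows` is a hypothetical helper, and the three short rows are made-up values truncated to a few columns for illustration.

```python
from collections import Counter


def split_rows(rows):
    """Split rows into (features, labels), taking the last column as the label."""
    features = [list(row[:-1]) for row in rows]
    labels = [row[-1] for row in rows]
    return features, labels


# Made-up rows, shortened to three feature columns plus the class label.
rows = [
    [2, 2, 0, 2],
    [3, 3, 3, 1],
    [2, 1, 2, 1],
]

features, labels = split_rows(rows)
print(features[0])
print(Counter(labels).most_common())
```

Counting the labels this way gives a quick view of how the six classes are distributed before any modelling.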
If you are using JavaScript, follow the instructions below:
Install the `data.js` module using npm:
$ npm install data.js
Once the package is installed, use the following code snippet:
const {Dataset} = require('data.js')
const path = 'https://datahub.io/machine-learning/dermatology/datapackage.json'
// We're using a self-invoking function here because we want to use async-await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data (if any exists)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()
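This dataset contains clinical and histopathological observations of patients with erythemato-squamous skin diseases; the `class` field records the diagnosis as one of six classes.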
This dataset was found on OpenML - dermatology
Original owners:
Nilsel Ilter, M.D., Ph.D., Gazi University, School of Medicine 06510 Ankara, Turkey
H. Altay Guvenir, PhD., Bilkent University, Department of Computer Engineering and Information Science, 06533 Ankara, Turkey
Donor:
Data is located in directory data
data/dermatology.csv
Scripts are in directory scripts
scripts/main.py
Licensed under the Public Domain Dedication and Licence (assuming either no rights or public domain licence in source data).