
Dermatology

machine-learning

Files: 2
Size: 87kB
Format: csv, zip
Created: 8 months ago
Updated: 6 months ago
Source: OpenML - dermatology

Data Files

Download files in this dataset

  • dermatology: 26kB; available as csv (26kB) and json (335kB)
  • dermatology_zip: compressed versions of the dataset, including normalized CSV and JSON data with the original data and datapackage.json; 23kB; available as zip (23kB)


Field information

Field Name Order Type (Format) Description
erythema 1 integer (default) 0,1,2,3
scaling 2 integer (default) 0,1,2,3
definite_borders 3 integer (default) 0,1,2,3
itching 4 integer (default) 0,1,2,3
koebner_phenomenon 5 integer (default) 0,1,2,3
polygonal_papules 6 integer (default) 0,1,2,3
follicular_papules 7 integer (default) 0,1,2,3
oral_mucosal_involvement 8 integer (default) 0,1,2,3
knee_and_elbow_involvement 9 integer (default) 0,1,2,3
scalp_involvement 10 integer (default) 0,1,2,3
family_history 11 integer (default) 0,1
melanin_incontinence 12 integer (default) 0,1,2,3
eosinophils_in_the_infiltrate 13 integer (default) 0,1,2,3
pnl_infiltrate 14 integer (default) 0,1,2,3
fibrosis_of_the_papillary_dermis 15 integer (default) 0,1,2,3
exocytosis 16 integer (default) 0,1,2,3
acanthosis 17 integer (default) 0,1,2,3
hyperkeratosis 18 integer (default) 0,1,2,3
parakeratosis 19 integer (default) 0,1,2,3
clubbing_of_the_rete_ridges 20 integer (default) 0,1,2,3
elongation_of_the_rete_ridges 21 integer (default) 0,1,2,3
thinning_of_the_suprapapillary_epidermis 22 integer (default) 0,1,2,3
spongiform_pustule 23 integer (default) 0,1,2,3
munro_microabcess 24 integer (default) 0,1,2,3
focal_hypergranulosis 25 integer (default) 0,1,2,3
disappearance_of_the_granular_layer 26 integer (default) 0,1,2,3
vacuolisation_and_damage_of_basal_layer 27 integer (default) 0,1,2,3
spongiosis 28 integer (default) 0,1,2,3
saw-tooth_appearance_of_retes 29 integer (default) 0,1,2,3
follicular_horn_plug 30 integer (default) 0,1,2,3
perifollicular_parakeratosis 31 integer (default) 0,1,2,3
inflammatory_monoluclear_inflitrate 32 integer (default) 0,1,2,3
band-like_infiltrate 33 integer (default) 0,1,2,3
age 34 integer (default)
class 35 integer (default) 1,2,3,4,5,6
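
The value ranges in the table above can be turned into a simple row check. The following is a minimal sketch using only the Python standard library; the helper names `allowed_values` and `invalid_fields` are hypothetical, and the sketch assumes each row arrives as a dict of strings (as `csv.DictReader` would produce). Since the table gives no range for `age`, that field is skipped.

```python
# Allowed codes per field, taken from the field table above.
# Any field not named here is treated as an ordinal 0-3 attribute.
BINARY_FIELDS = {"family_history"}
CLASS_FIELD = "class"
AGE_FIELD = "age"


def allowed_values(field):
    """Return the set of allowed integer codes for a field, or None for age."""
    if field == AGE_FIELD:
        return None  # the table lists no range for age
    if field == CLASS_FIELD:
        return set(range(1, 7))  # 1..6
    if field in BINARY_FIELDS:
        return {0, 1}
    return {0, 1, 2, 3}


def invalid_fields(row):
    """Return the names of fields whose value is outside the documented codes."""
    bad = []
    for field, raw in row.items():
        allowed = allowed_values(field)
        if allowed is None:
            continue
        try:
            value = int(raw)
        except (TypeError, ValueError):
            bad.append(field)  # not an integer at all
            continue
        if value not in allowed:
            bad.append(field)
    return bad


# Made-up row for illustration; family_history may only be 0 or 1.
sample = {"erythema": "2", "family_history": "3", "age": "55", "class": "1"}
print(invalid_fields(sample))  # ['family_history']
```

A check like this is cheap to run once after download, before the integer codes are fed into a model.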

Integrate this dataset into your favourite tool

Use our data-cli tool designed for data wranglers:

data get https://datahub.io/machine-learning/dermatology
data info machine-learning/dermatology
tree machine-learning/dermatology
# Get a list of dataset's resources
curl -L -s https://datahub.io/machine-learning/dermatology/datapackage.json | grep path

# Get resources

curl -L https://datahub.io/machine-learning/dermatology/r/0.csv

curl -L https://datahub.io/machine-learning/dermatology/r/1.zip
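
The `grep path` step above can also be done by parsing the descriptor as JSON, which avoids matching unrelated lines. This is a minimal sketch, not part of the DataHub tooling: the function name `resource_paths` is hypothetical, and the descriptor shown is a trimmed, made-up example whose resource names and paths are assumptions rather than the real contents of datapackage.json.

```python
import json


def resource_paths(descriptor_text):
    """Return (name, path) pairs for every resource in a datapackage.json string."""
    descriptor = json.loads(descriptor_text)
    return [(r.get("name"), r.get("path")) for r in descriptor.get("resources", [])]


# Hypothetical, trimmed descriptor used for illustration only.
example = json.dumps({
    "name": "dermatology",
    "resources": [
        {"name": "dermatology_csv", "path": "data/dermatology_csv.csv"},
        {"name": "dermatology_zip", "path": "data/dermatology_zip.zip"},
    ],
})

for name, path in resource_paths(example):
    print(name, path)
```

In practice the descriptor text would come from the `curl` command above, for example saved to a file and read back in.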

If you are using R, here's how to load the data quickly:

install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")

json_file <- 'https://datahub.io/machine-learning/dermatology/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# get list of all resources:
print(json_data$resources$name)

# print all tabular data (if any exists)
# seq_along() avoids iterating over 1:0 when there are no resources
for (i in seq_along(json_data$resources$datahub$type)) {
  if (json_data$resources$datahub$type[i] == 'derived/csv') {
    path_to_file <- json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}

Note: You might need to run the script with root permissions if you are running it on a Linux machine.

If you are using Python with pandas, install the Frictionless Data `datapackage` library and pandas itself:

pip install datapackage
pip install pandas

Now you can load the Data Package into pandas:

import datapackage
import pandas as pd

data_url = 'https://datahub.io/machine-learning/dermatology/datapackage.json'

# to load Data Package into storage
package = datapackage.Package(data_url)

# to load only tabular data
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print(data)

For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):

pip install datapackage

To load the Data Package into your Python environment, run the following code:

from datapackage import Package

package = Package('https://datahub.io/machine-learning/dermatology/datapackage.json')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if any exists)
for resource in package.resources:
    # .get() avoids a KeyError for resources without a 'datahub' section
    if resource.descriptor.get('datahub', {}).get('type') == 'derived/csv':
        print(resource.read())

If you are using JavaScript, follow the instructions below.

Install the data.js module using npm:

  $ npm install data.js

Once the package is installed, use the following code snippet:

const {Dataset} = require('data.js')

const path = 'https://datahub.io/machine-learning/dermatology/datapackage.json'

// We're using a self-invoking function here because we want to use async/await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data (if any exists)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()

Read me

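This dataset contains instances for the differential diagnosis of six skin diseases (see the Class list under Attribute information); despite the description it was originally published with, none of these classes is a cancer.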

Data

This dataset was found on OpenML - dermatology

Original owners:

  • Nilsel Ilter, M.D., Ph.D., Gazi University, School of Medicine 06510 Ankara, Turkey

  • H. Altay Guvenir, PhD., Bilkent University, Department of Computer Engineering and Information Science, 06533 Ankara, Turkey

Donor:

  • H. Altay Guvenir, Bilkent University, Department of Computer Engineering and Information Science, 06533 Ankara, Turkey

The data is located in the data directory:

data/dermatology.csv

Attribute information

Class

  • 1: psoriasis
  • 2: seborrheic dermatitis
  • 3: lichen planus
  • 4: pityriasis rosea
  • 5: chronic dermatitis
  • 6: pityriasis rubra pilaris

Family history:

  • 1: if any of these diseases has been observed in the family
  • 0: otherwise

Age:

  • Represents the age of the patient

All other attributes:

  • 0: feature not present
  • 1, 2: relative intermediate values
  • 3: largest amount possible
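
The class codes listed above can be decoded into disease names when summarizing the data. The following is a minimal sketch using only the standard library; `CLASS_NAMES` and `class_counts` are hypothetical names, and the three-row CSV is made-up sample data that assumes only that the file has a `class` column as described in the field table.

```python
import csv
import io
from collections import Counter

# Class codes as listed under "Attribute information".
CLASS_NAMES = {
    1: "psoriasis",
    2: "seborrheic dermatitis",
    3: "lichen planus",
    4: "pityriasis rosea",
    5: "chronic dermatitis",
    6: "pityriasis rubra pilaris",
}


def class_counts(csv_text):
    """Count rows per disease name, reading the integer `class` column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(CLASS_NAMES[int(row["class"])] for row in reader)


# Made-up rows for illustration; the real file has 35 columns.
sample_csv = "erythema,age,class\n2,55,1\n3,8,3\n2,26,1\n"
print(class_counts(sample_csv))
```

To run this against the real file, the text of data/dermatology.csv could be passed in place of `sample_csv`.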

Preparation

The scripts are located in the scripts directory:

scripts/main.py

Licence

Licensed under the Public Domain Dedication and Licence (assuming either no rights or public domain licence in source data).

