
Breast cancer


API Access

Access dataset files directly from scripts, code, or AI agents.

Dataset Files

Each file has a stable URL (r-link) that you can use directly in scripts, apps, or AI agents. These URLs are permanent and safe to hardcode.

https://datahub.io/core/breast-cancer/_r/-/README.md
https://datahub.io/core/breast-cancer/_r/-/data/breast-cancer.csv
https://datahub.io/core/breast-cancer/_r/-/datapackage.json
Key Files

Start with these files — they give you everything you need to understand and access the dataset.

datapackage.json (metadata & schema)
https://datahub.io/core/breast-cancer/_r/-/datapackage.json
README.md (documentation)
https://datahub.io/core/breast-cancer/_r/-/README.md
Typical Usage
  1. Fetch datapackage.json to inspect schema and resources
  2. Download data resources listed in datapackage.json
  3. Read README.md for full context
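
The first two steps can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names (`fetch_json`, `resource_urls`, `download_all`) are hypothetical, and it assumes datapackage.json follows the usual Data Package layout, in which each entry under "resources" has a "name" and a "path" that is either relative to the descriptor or a full URL.

```python
import json
import urllib.request
from urllib.parse import urljoin

# Base r-link prefix, taken from the file URLs listed above.
BASE_URL = "https://datahub.io/core/breast-cancer/_r/-/"


def fetch_json(url: str) -> dict:
    """Download a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def resource_urls(package: dict, base_url: str = BASE_URL) -> dict:
    """Map each resource name to an absolute URL.

    Assumption: every resource has a string "path" that is either a
    relative path or a full URL. Resources without one are skipped.
    """
    urls = {}
    for resource in package.get("resources", []):
        path = resource.get("path")
        if isinstance(path, str):
            name = resource.get("name", path)
            urls[name] = urljoin(base_url, path)
    return urls


def download_all(base_url: str = BASE_URL) -> dict:
    """Step 1: fetch the descriptor. Step 2: download each listed resource."""
    package = fetch_json(urljoin(base_url, "datapackage.json"))
    data = {}
    for name, url in resource_urls(package, base_url).items():
        with urllib.request.urlopen(url, timeout=30) as response:
            data[name] = response.read()  # raw bytes of the resource
    return data
```

Nothing is fetched at import time; calling `download_all()` performs the network requests and returns the raw bytes of each resource keyed by name.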


Schema

| name | type | format | description |
|---|---|---|---|
| age | string | default | 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, 90-99 |
| menopause | string | default | lt40, ge40, premeno |
| tumor-size | string | default | 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59 |
| inv-nodes | string | default | 0-2, 3-5, 6-8, 9-11, 12-14, 15-17, 18-20, 21-23, 24-26, 27-29, 30-32, 33-35, 36-39 |
| node-caps | boolean | default | yes, no |
| deg-malig | integer | default | 1, 2, 3 |
| breast | string | default | left, right |
| breast-quad | string | default | left-up, left-low, right-up, right-low, central |
| irradiat | boolean | default | yes, no |
| class | string | default | no-recurrence-events, recurrence-events |
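
Because every field has a small, enumerated set of values, rows can be checked against the schema before analysis. The following is a rough sketch rather than part of the dataset's own tooling: the helper names (`invalid_fields`, `check_csv`) are hypothetical, the CSV header is assumed to use the field names from the schema, the boolean fields are assumed to be stored as "yes"/"no", and the missing-value markers (empty string or "?") are a guess that should be checked against the actual file.

```python
import csv
import io

# Allowed values per column, copied from the schema above.
ALLOWED = {
    "age": {"10-19", "20-29", "30-39", "40-49", "50-59",
            "60-69", "70-79", "80-89", "90-99"},
    "menopause": {"lt40", "ge40", "premeno"},
    "tumor-size": {"0-4", "5-9", "10-14", "15-19", "20-24", "25-29",
                   "30-34", "35-39", "40-44", "45-49", "50-54", "55-59"},
    "inv-nodes": {"0-2", "3-5", "6-8", "9-11", "12-14", "15-17", "18-20",
                  "21-23", "24-26", "27-29", "30-32", "33-35", "36-39"},
    "node-caps": {"yes", "no"},
    "deg-malig": {"1", "2", "3"},
    "breast": {"left", "right"},
    "breast-quad": {"left-up", "left-low", "right-up", "right-low", "central"},
    "irradiat": {"yes", "no"},
    "class": {"no-recurrence-events", "recurrence-events"},
}

# Assumption: missing values appear as an empty cell or "?".
MISSING = {"", "?"}


def invalid_fields(row: dict) -> list:
    """Return the columns whose value is present but not in the schema."""
    bad = []
    for column, allowed in ALLOWED.items():
        value = (row.get(column) or "").strip()
        if value in MISSING:
            continue  # missing values are tolerated, not flagged
        if value not in allowed:
            bad.append(column)
    return bad


def check_csv(text: str) -> list:
    """Return (row_number, bad_columns) for every row that fails the check."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    for number, row in enumerate(reader, start=1):
        bad = invalid_fields(row)
        if bad:
            problems.append((number, bad))
    return problems
```

Passing the decoded contents of data/breast-cancer.csv to `check_csv` returns an empty list when every non-missing value matches the schema.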

Data Files

| File | Description | Size | Last modified |
|---|---|---|---|
| breast-cancer | | 20.2 kB | about 2 months ago |

breast-cancer

| Files | Size | Format | Created / Updated | Source |
|---|---|---|---|---|
| 1 | 20.2 kB | csv | over 1 year ago | OpenML - breast-cancer |


This is a dataset about breast cancer occurrences.

Data

This dataset is taken from OpenML - breast-cancer.

This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data. Please include this citation if you plan to use this database.

  • Creators: Matjaz Zwitter & Milan Soklic (physicians), Institute of Oncology, University Medical Center, Ljubljana, Yugoslavia
  • Donors: Ming Tan and Jeff Schlimmer (Jeffrey.Schlimmer@a.gp.cs.cmu.edu)
  • Date: 11 July 1988

  • 286 instances
  • 10 attributes
  • Missing values: yes

Class Distribution:

  • no-recurrence-events: 201 instances
  • recurrence-events: 85 instances
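
The two classes are imbalanced, which matters when judging a classifier. As a small worked example using only the counts stated above, the class shares and the accuracy of a trivial model that always predicts the majority class can be computed as follows (the variable names are illustrative):

```python
from collections import Counter

# Class counts as stated in the class distribution above.
counts = Counter({"no-recurrence-events": 201, "recurrence-events": 85})

total = sum(counts.values())  # 201 + 85 = 286 instances
shares = {label: n / total for label, n in counts.items()}

# A model that always predicts the most common class is right this often.
majority_label, majority_count = counts.most_common(1)[0]
baseline_accuracy = majority_count / total

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"majority baseline ({majority_label}): {baseline_accuracy:.1%}")
```

Any model trained on this data should be compared against that majority baseline of roughly 70% rather than against 50%.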

Output data

Output data is located in the data directory:

data/breast-cancer.csv

Scripts

Scripts for the dataset are located in the scripts directory:

scripts/main.py

Licence

Licensed under the Public Domain Dedication and License (assuming either no rights or public domain license in source data).