Cervical cancer

Files: 1
Size: 95.9 kB
Format: csv
Updated: almost 7 years ago
Source: UCI - Cervical cancer (Risk Factors) Data Set


Data Files

File             Size     Last modified
cervical-cancer  95.9 kB  almost 7 years ago

Schema

name                                type     format
Age                                 integer  default
Number of sexual partners           number   default
First sexual intercourse            number   default
Num of pregnancies                  number   default
Smokes                              number   default
Smokes (years)                      number   default
Smokes (packs/year)                 number   default
Hormonal Contraceptives             number   default
Hormonal Contraceptives (years)     number   default
IUD                                 number   default
IUD (years)                         number   default
STDs                                number   default
STDs (number)                       number   default
STDs:condylomatosis                 number   default
STDs:cervical condylomatosis        number   default
STDs:vaginal condylomatosis         number   default
STDs:vulvo-perineal condylomatosis  number   default
STDs:syphilis                       number   default
STDs:pelvic inflammatory disease    number   default
STDs:genital herpes                 number   default
STDs:molluscum contagiosum          number   default
STDs:AIDS                           number   default
STDs:HIV                            number   default
STDs:Hepatitis B                    number   default
STDs:HPV                            number   default
STDs: Number of diagnosis           integer  default
STDs: Time since first diagnosis    string   default
STDs: Time since last diagnosis     string   default
Dx:Cancer                           integer  default
Dx:CIN                              integer  default
Dx:HPV                              integer  default
Dx                                  integer  default
Hinselmann                          integer  default
Schiller                            integer  default
Citology                            integer  default
Biopsy                              integer  default

This is a dataset about cervical cancer occurrences. Cervical cancer is one of the most frequent cancers in women. The dataset records some factors that might influence cervical cancer.

Data

This dataset was found on UCI under the name "Cervical cancer (Risk Factors) Data Set".

The dataset was collected at 'Hospital Universitario de Caracas' in Caracas, Venezuela. It comprises demographic information, habits, and historical medical records of 858 patients. Several patients declined to answer some of the questions because of privacy concerns, so the data contain missing values.

  • 858 instances
  • 36 attributes
  • Missing values: yes

The output data is located in the data directory:

data/cervical-cancer.csv

The attributes are the same as in the input data.
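
Because missing answers appear as empty cells in the output file, it can be useful to see how many values each attribute is missing. The following is a minimal sketch using only the Python standard library. The function name `count_missing` and the in-memory sample rows are illustrative assumptions and are not part of the dataset or its scripts.

```python
import csv
import io


def count_missing(src):
    """Return {column name: number of empty cells} for an open CSV file object."""
    reader = csv.DictReader(src)
    # Start every column at zero so fully populated columns are reported too.
    missing = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in missing:
            if row[name] == "":
                missing[name] += 1
    return missing


# Made-up rows for illustration only (not taken from the dataset).
sample = io.StringIO("Age,Smokes,IUD\n18,,0\n42,1,\n")
counts = count_missing(sample)
print(counts)
```

To run it on the real file, open data/cervical-cancer.csv with `open(..., newline="")` and pass the file object to `count_missing`.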

Preparation

To produce the output data, one change is made to the input data:

  • missing values, marked with "?" in the input, are replaced with an empty string ("")
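
The replacement step above can be sketched as follows. This is not the contents of scripts/main.py, only an illustration of the described transformation under stated assumptions: the function name `replace_missing`, the constant `MISSING_MARKER`, and the sample rows are all hypothetical.

```python
import csv
import io

MISSING_MARKER = "?"  # marker for missing values in the input, per the text above


def replace_missing(src, dst, marker=MISSING_MARKER):
    """Copy CSV rows from src to dst, blanking every cell equal to `marker`."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow(["" if cell.strip() == marker else cell for cell in row])


# Made-up rows for illustration only (not taken from the dataset).
sample = io.StringIO("Age,Smokes,IUD\n18,?,0\n42,1,?\n")
cleaned = io.StringIO()
replace_missing(sample, cleaned)
print(cleaned.getvalue())
```

Only cells that consist solely of the marker are blanked, so a "?" appearing inside a longer value would be left untouched.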

The Python scripts are located in the scripts directory:

scripts/main.py

License

Licensed under the Public Domain Dedication and License (assuming either no rights or public domain license in source data).