Novel Coronavirus 2019

Files: 9
Size: 310 MB
Formats: csv
License: ODC-PDDL

Coronavirus disease 2019 (COVID-19) time series listing confirmed cases, reported deaths and reported recoveries. Data is disaggregated by country (and sometimes subregion).

API Access

Access dataset files directly from scripts, code, or AI agents.

Dataset Files

Each file has a stable URL (r-link) that you can use directly in scripts, apps, or AI agents. These URLs are permanent and safe to hardcode.

/core/covid-19/
https://datahub.io/core/covid-19/_r/-/.gitignore
https://datahub.io/core/covid-19/_r/-/README.md
https://datahub.io/core/covid-19/_r/-/data/countries-aggregated-sample.csv
https://datahub.io/core/covid-19/_r/-/data/countries-aggregated.csv
https://datahub.io/core/covid-19/_r/-/data/key-countries-pivoted.csv
https://datahub.io/core/covid-19/_r/-/data/reference.csv
https://datahub.io/core/covid-19/_r/-/data/time-series-19-covid-combined-sample.csv
https://datahub.io/core/covid-19/_r/-/data/time-series-19-covid-combined.csv
https://datahub.io/core/covid-19/_r/-/data/us_confirmed-sample.csv
https://datahub.io/core/covid-19/_r/-/data/us_confirmed.csv
https://datahub.io/core/covid-19/_r/-/data/us_deaths-sample.csv
https://datahub.io/core/covid-19/_r/-/data/us_deaths.csv
https://datahub.io/core/covid-19/_r/-/data/us_simplified-sample.csv
https://datahub.io/core/covid-19/_r/-/data/us_simplified.csv
https://datahub.io/core/covid-19/_r/-/data/worldwide-aggregate.csv
https://datahub.io/core/covid-19/_r/-/data/worldwide-mortality-rate.csv
https://datahub.io/core/covid-19/_r/-/datapackage.json
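
As a minimal sketch of using these stable URLs from a script, the helpers below build a file URL from its path and parse the CSV body into rows. It uses only the Python standard library. The names resource_url, parse_csv and fetch_rows are illustrative, not part of any DataHub client, and the sketch assumes the files are served as UTF-8 text.

```python
import csv
import io
import urllib.request

# Common prefix of the stable file URLs listed above.
BASE_URL = "https://datahub.io/core/covid-19/_r/-/"


def resource_url(path):
    """Build the stable URL for a file path such as 'data/reference.csv'."""
    return BASE_URL + path.lstrip("/")


def parse_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def fetch_rows(path, timeout=30):
    """Download one CSV file and return its rows (needs network access).

    Assumes the response body is UTF-8 encoded.
    """
    with urllib.request.urlopen(resource_url(path), timeout=timeout) as resp:
        return parse_csv(resp.read().decode("utf-8"))
```

For a first look, calling fetch_rows("data/countries-aggregated-sample.csv") downloads one of the smaller sample files rather than a full time series.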
Key Files

Start with these files — they give you everything you need to understand and access the dataset.

datapackage.json: metadata & schema
https://datahub.io/core/covid-19/_r/-/datapackage.json
README.md: documentation
https://datahub.io/core/covid-19/_r/-/README.md
Typical Usage
  1. Fetch datapackage.json to inspect schema and resources
  2. Download data resources listed in datapackage.json
  3. Read README.md for full context
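
The first two steps above might be sketched as follows. This assumes datapackage.json follows the usual Data Package layout, with a top-level "resources" array whose entries carry "name", "path" and "schema.fields". The actual descriptor may differ, so inspect it before relying on these keys. The function names are illustrative.

```python
import json
import urllib.request

DATAPACKAGE_URL = "https://datahub.io/core/covid-19/_r/-/datapackage.json"


def fetch_datapackage(url=DATAPACKAGE_URL, timeout=30):
    """Download and parse the descriptor (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def list_resources(package):
    """Return (name, path, field names) for each resource in a parsed descriptor.

    Assumes the conventional "resources" / "schema" / "fields" keys;
    missing keys yield None or an empty list instead of raising.
    """
    summary = []
    for res in package.get("resources", []):
        fields = res.get("schema", {}).get("fields", [])
        names = [field.get("name") for field in fields]
        summary.append((res.get("name"), res.get("path"), names))
    return summary
```

Passing the result of fetch_datapackage() to list_resources() gives a quick overview of which files exist and which columns each one declares, before downloading any of the larger CSVs.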

Data Files

countries-aggregated: CSV, 5.51 MB, last updated 9 February 2026
reference: CSV, 416 kB, last updated 9 February 2026
key-countries-pivoted: CSV, 57.6 kB, last updated 9 February 2026
time-series-19-covid-combined: CSV, 8.58 MB, last updated 9 February 2026
us_confirmed: CSV, 97.1 MB, last updated 9 February 2026

About this dataset


COVID-19 dataset

Coronavirus disease 2019 (COVID-19) time series listing confirmed cases, reported deaths and reported recoveries. Data is disaggregated by country (and sometimes subregion). COVID-19 is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and has had a worldwide effect. On March 11, 2020, the World Health Organization (WHO) declared it a pandemic, pointing to more than 118,000 cases of the illness in over 110 countries and territories at the time.

This dataset includes time series data tracking the number of people affected by COVID-19 worldwide, including:

  • confirmed tested cases of Coronavirus infection
  • the number of people who have reportedly died while sick with Coronavirus
  • the number of people who have reportedly recovered from it
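
If the counts in these series are cumulative totals per date (an assumption here, not something this page states), per-day figures can be derived by differencing consecutive values. A minimal sketch with a hypothetical helper name:

```python
def daily_new(cumulative):
    """Convert a cumulative series into per-day increments.

    Assumes the values are ordered by date and represent running totals.
    The first increment equals the first total. Negative increments can
    appear if upstream counts were revised downward.
    """
    increments = []
    previous = 0
    for total in cumulative:
        increments.append(total - previous)
        previous = total
    return increments


# Running totals of 1, 3, 3, 10 correspond to 1, 2, 0 and 7 new cases.
print(daily_new([1, 3, 3, 10]))
```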

Data

Data is in CSV format and updated daily. It is sourced from the upstream repository maintained by the amazing team at the Johns Hopkins University Center for Systems Science and Engineering (CSSE), who have been doing a great public service from an early point by collating data from around the world.

We have cleaned and normalized that data, for example by tidying dates and consolidating several files into normalized time series. We have also added metadata such as column descriptions and packaged the result as a data package.
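
To illustrate the kind of normalization described here, the sketch below reshapes a wide table (one column per date) into a long time series with ISO dates. It is not the repository's Pandas code. It uses only the standard library and assumes an upstream layout with the identifier columns Province/State, Country/Region, Lat and Long, followed by date columns formatted like 1/22/20 holding integer counts. Those column names and the function name are assumptions for illustration.

```python
import csv
import io
from datetime import datetime

# Assumed identifier columns of the wide upstream file; all other
# columns are treated as dates in month/day/two-digit-year form.
ID_COLUMNS = ("Province/State", "Country/Region", "Lat", "Long")


def wide_to_long(text, value_name="Confirmed"):
    """Turn wide CSV text into one record per (region, date)."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        for column, value in row.items():
            if column in ID_COLUMNS:
                continue
            # Tidy the date: "1/22/20" becomes "2020-01-22".
            date = datetime.strptime(column, "%m/%d/%y").date().isoformat()
            records.append({
                "Date": date,
                "Country/Region": row["Country/Region"],
                "Province/State": row["Province/State"],
                value_name: int(value),
            })
    return records
```

Running the same function over separate confirmed, deaths and recovered files with different value_name arguments, then joining the results on date and region, would give a combined time series.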

You can view the data and its structure, and download it in alternative formats (e.g. JSON), from the DataHub:

https://datahub.io/core/covid-19

Sources

The upstream dataset currently lists the following upstream data sources:

We will endeavour to provide more detail on how regularly and by which technical means the data is updated. Additional background is available in the CSSE blog, and in the Lancet paper (DOI), which includes this figure:

[Figure: countries timeline]

Preparation

This repository uses Pandas to process and normalize the data.

You first need to install the dependencies:

pip install -r scripts/requirements.txt

Then run the following scripts:

python scripts/process_worldwide.py
python scripts/process_us.py

Python 3.8 (see .github/workflows/actions.yml)

License

This dataset is licensed under the Open Data Commons Public Domain Dedication and License.

The data comes from a variety of public sources and was collated in the first instance by Johns Hopkins University on GitHub. We have used that data and processed it further. Given the public sources and factual nature of the data, we believe that it is in the public domain and are therefore releasing the results under the Public Domain Dedication and License. We are also, of course, explicitly licensing any contribution of ours under that license.