For part of our day, we decided to clean up and package some data on COVID-19 (coronavirus). The data includes province/state, country/region, latitude, longitude, date, confirmed, recovered, and deaths. Our source was the Data Repository by Johns Hopkins CSSE, which is updated daily by Johns Hopkins Whiting School of Engineering.
To clean up the data, we used a Python library called dataflows, which is available on PyPI and on GitHub. We used this library to unpivot the data, accumulate the daily cases, and consolidate our three sources (Johns Hopkins has separate CSV files for confirmed cases, recovered cases, and deaths).
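To make the unpivot and consolidate steps concrete, here is a minimal sketch in plain Python. It uses only the standard library rather than dataflows, so it is not the pipeline we actually ran. The sample rows, the column names, and the helper functions unpivot and consolidate are all assumptions made up for illustration, and the accumulation step is left out.

```python
import csv
import io

# Made-up miniature inputs in a wide layout (one column per date).
# The place names and counts are illustrative, not real figures.
CONFIRMED = (
    "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20\n"
    "Example Province,Example Country,10.0,20.0,1,3\n"
)
RECOVERED = (
    "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20\n"
    "Example Province,Example Country,10.0,20.0,0,1\n"
)
DEATHS = (
    "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20\n"
    "Example Province,Example Country,10.0,20.0,0,0\n"
)

# Columns that identify a location; every other column is treated as a date.
ID_FIELDS = ["Province/State", "Country/Region", "Lat", "Long"]


def unpivot(text, value_name):
    """Turn each date column into its own row (wide -> long)."""
    long_rows = {}
    for record in csv.DictReader(io.StringIO(text)):
        ident = tuple(record[field] for field in ID_FIELDS)
        for column, value in record.items():
            if column not in ID_FIELDS:
                long_rows[ident + (column,)] = {value_name: int(value)}
    return long_rows


def consolidate(*tables):
    """Merge long tables that share the same (location, date) keys."""
    merged = {}
    for table in tables:
        for key, values in table.items():
            merged.setdefault(key, {}).update(values)
    rows = []
    for key, values in merged.items():
        row = dict(zip(ID_FIELDS + ["Date"], key))
        row.update(values)
        rows.append(row)
    return rows


rows = consolidate(
    unpivot(CONFIRMED, "Confirmed"),
    unpivot(RECOVERED, "Recovered"),
    unpivot(DEATHS, "Deaths"),
)
for row in rows:
    print(row)
```

Each output row carries one location, one date, and the three counts side by side, which is the shape of the packaged data described above.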
Whether or not you’ve participated in Open Data Day before, we hope to see you participate next year!