The severity of the current SARS-CoV-2 epidemic is undeniable: since the last months of 2019, the COVID-19 outbreak has had a significant impact on the world at the macro level, spreading first from China to the Asia-Pacific region and then around the rest of the globe.
More than 41 articles about COVID-19 have been published since the beginning of 2020, the CSSEGISandData repository has been forked on GitHub more than 10,000 times, and various open data sets (including on GitHub and on Kaggle) share the number of cases, the number of people affected and other metrics.
Many scientists around the globe have participated in a global hackathon, trying to address current issues (lockdowns, lack of resources, etc.) and to fit, predict and estimate the impact of the epidemic. Which of these models are close to reality, and which have a chance of being implemented as solutions for humanitarian response?
There are various types of models in epidemiology, which are used to predict prevalence (the total number of people infected), the duration of an epidemic and other variables.
Among the most commonly used are compartmental models, which originate from the Kermack–McKendrick theory of the 1920s. In compartmental models, the population is divided into compartments of people with the same properties: people who are susceptible to the virus, those who have recovered, and so on. There are three main types of compartmental models in epidemiology, upon which many other models can be built:
In the SIR model, one studies the number of people in each of three compartments: susceptible, infectious and recovered, denoted by the variables S, I and R respectively. The SIR system can be expressed by the set of ordinary differential equations proposed by William Ogilvy Kermack and Anderson Gray McKendrick.
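As a minimal sketch of such a system, the standard textbook form of the SIR equations is dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I and dR/dt = gamma*I, where N = S + I + R. The code below integrates them with a simple forward Euler scheme. The function names, the transmission rate `beta`, the recovery rate `gamma`, the population size and the step size are all illustrative assumptions, not values fitted to COVID-19 data.

```python
def sir_derivatives(s, i, r, beta, gamma):
    """Right-hand side of the SIR equations (standard textbook form).

    beta and gamma are assumed per-day transmission and recovery rates.
    """
    n = s + i + r
    new_infections = beta * s * i / n  # flow S -> I
    recoveries = gamma * i             # flow I -> R
    return -new_infections, new_infections - recoveries, recoveries


def simulate_sir(beta, gamma, s0, i0, r0=0.0, days=160, steps_per_day=10):
    """Integrate the SIR equations with fixed-step forward Euler.

    Returns a list of (day, S, I, R) tuples, one entry per day.
    """
    dt = 1.0 / steps_per_day
    s, i, r = float(s0), float(i0), float(r0)
    history = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            ds, di, dr = sir_derivatives(s, i, r, beta, gamma)
            s, i, r = s + dt * ds, i + dt * di, r + dt * dr
        history.append((day, s, i, r))
    return history


# Hypothetical parameters: 10,000 people, 10 initially infectious.
baseline = simulate_sir(beta=0.30, gamma=0.10, s0=9990, i0=10)
# Same population with half the transmission rate (e.g. fewer contacts).
reduced = simulate_sir(beta=0.15, gamma=0.10, s0=9990, i0=10)

peak_baseline = max(row[2] for row in baseline)
peak_reduced = max(row[2] for row in reduced)
print(f"peak infectious, baseline: {peak_baseline:.0f}")
print(f"peak infectious, reduced contact: {peak_reduced:.0f}")
```

Comparing the two runs gives a rough feel for how lowering the transmission rate flattens the curve. Forward Euler is chosen here only for readability; an adaptive ODE solver would be a more robust choice for real work.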
Another model, which has proven to be much more realistic for some epidemics, is the SEIR model. The main idea of this model is that for many important infections there is a significant incubation period during which individuals have been infected but are not yet infectious themselves. During this period, the individual is in a special state E (exposed), which is added to the three states of the SIR model.
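The extra compartment can be sketched in code as follows, assuming the common textbook SEIR formulation in which exposed individuals become infectious at a rate `sigma` (the inverse of the mean incubation period). All names and parameter values here are hypothetical and chosen only for illustration.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0=0.0,
                  days=200, steps_per_day=10):
    """Integrate a textbook SEIR model with fixed-step forward Euler.

    beta:  assumed transmission rate per day
    sigma: assumed rate of leaving the exposed state (1 / incubation period)
    gamma: assumed recovery rate per day
    Returns a list of (day, S, E, I, R) tuples, one entry per day.
    """
    dt = 1.0 / steps_per_day
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    n = s + e + i + r
    history = [(0, s, e, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            infections = beta * s * i / n  # flow S -> E
            onsets = sigma * e             # flow E -> I
            recoveries = gamma * i         # flow I -> R
            s -= dt * infections
            e += dt * (infections - onsets)
            i += dt * (onsets - recoveries)
            r += dt * recoveries
        history.append((day, s, e, i, r))
    return history


# Hypothetical run: 5-day mean incubation (sigma = 0.2), 10-day infectious period.
seir = simulate_seir(beta=0.30, sigma=0.20, gamma=0.10, s0=9990, e0=0, i0=10)
peak_day, peak_infectious = max(((row[0], row[3]) for row in seir),
                                key=lambda pair: pair[1])
print(f"peak infectious: {peak_infectious:.0f} on day {peak_day}")
```

Setting `sigma` very large makes the exposed stage negligibly short, so the model then behaves much like the three-compartment SIR model.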
Some models described here are also explained on this page.
During COVID-19 many open data sources have become available, which explains the temptation to fit curves to empirical data or to infer the parameters of spreading models. Here, we list some of the main issues with a simple “curve fitting” approach:
For more publications on COVID-19, please see recent peer-reviewed articles, such as Ferguson et al.
Find more illustrations and code by B. Goncalves at Epidemiology101.
The next waves of COVID-19 can be predicted using various models, including the classical simple SIR model or its variations. One can already see the effect of curve flattening by changing parameters such as the mobility index or the average number of interactions between people. For spatial statistics, one may need to use spatially embedded models, such as the more generic and complex GLEaMviz model. To calibrate these models, one needs to use the expert-curated healthcare datasets available to COVID-19 researchers and data scientists. Mobility trends are important components for predicting how epidemics spread, so we are working on making open mobility datasets easier to find. Below we have collected some references to datasets and peer-reviewed publications: