2020 04 15 20066068v1 Full
1 Faculty of Economics, Ss. Cyril and Methodius University in Skopje
2 Macedonian Academy of Sciences and Arts
3 Fraunhofer Heinrich Hertz Institute
4 Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University
Abstract
The magnitude of the coronavirus disease (COVID-19) pandemic has an enormous impact
on the social life and the economic activities in almost every country in the world. Besides
the biological and epidemiological factors, a multitude of social and economic criteria also
govern the extent of the coronavirus disease spread in the population. Consequently, there is an
active debate regarding the critical socio-economic determinants that contribute to the resulting
pandemic. In this paper, we contribute towards the resolution of the debate by leveraging
Bayesian model averaging techniques and country level data to investigate the potential of
35 determinants, describing a diverse set of socio-economic characteristics, in explaining the
coronavirus pandemic outcome.
1 Introduction
The coronavirus pandemic began as a simple outbreak in December 2019 in Wuhan, China. How-
ever, it quickly propagated to other countries and became a primary global threat. It seems that
most countries were not prepared for this pandemic. As a consequence, hospitals were over-
crowded with patients and death rates due to the disease skyrocketed. In particular, as of the time
of this writing (11th April 2020), there have been over 1.5 million cases and over 100 thousand
deaths worldwide caused by the coronavirus-induced disease, COVID-19¹.
∗
This is a preliminary report which includes data gathered up to 11th April 2020. It will be updated weekly so as
to only include results based on data that is not older than two weeks.
†
Corresponding author: [email protected]
1 Source: Worldometers coronavirus tracker: https://www.worldometers.info/coronavirus/
NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.
medRxiv preprint doi: https://doi.org/10.1101/2020.04.15.20066068; this version posted April 17, 2020. The copyright holder for this preprint
(which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
It is made available under a CC-BY-NC-ND 4.0 International license .
Figure 1: Histogram based on probability density estimation for the cases and deaths per million
population using country level data. The x-axis describes the observed value, whereas the y-axis is the
estimated probability density. Data taken on 11th April 2020.
In order to reduce the impact of the disease spread, most governments implemented social dis-
tancing restrictions such as closure of schools, airports, borders, restaurants and shopping malls [1].
In the most severe cases there were even lockdowns – all citizens were prohibited from leaving their
homes. This subsequently led to a major economic downturn: stock markets plummeted, international
trade slowed down, businesses went bankrupt and people were left unemployed. While
in some countries the implemented restrictions had a significant impact on reducing the expected
shock from the coronavirus, the extent of the disease spread in the population greatly varied from
one economy to another, as illustrated in Fig 1.
A multitude of social and economic criteria have been attributed as potential determinants for
the observed variety in the coronavirus outcome. Some experts say that the hardest hit countries
also had an aging population [2, 3], or an underdeveloped healthcare system [4, 5]. Others em-
phasize the role of the natural environment [6, 7]. In addition, while the developments in most
of the countries follow certain common patterns, several countries are notably outliers, both in
the number of documented cases and in the disease outcome. Having in mind the ongoing debate, a
comprehensive empirical study of the critical socio-economic determinants of the coronavirus pandemic
would not only provide a glimpse of their potential impact, but would also offer guidance
for future policies that aim at preventing the emergence of epidemics.
Motivated by this observation, here we perform a detailed statistical analysis on a large set
of potential socio-economic determinants and explore their potential to explain the variety in the
observed coronavirus cases/deaths among countries. To construct the set of potential determi-
nants we conduct a thorough review of the literature describing the social and economic factors
which contribute to the spread of an epidemic. We identify a total of 35 potential determinants
that describe a diverse ensemble of social and economic factors, including: healthcare infrastruc-
ture, societal characteristics, economic performance, demographic structure etc. To investigate
the performance of each variable in explaining the coronavirus outcome, we utilize the technique
of Bayesian model averaging (BMA). BMA allows us to isolate the most important determinants
by calculating the posterior probability that they truly regulate the process. At the same time,
BMA provides estimates for their relative impact, while also accounting for the uncertainty in the
selection of potential determinants [8–10].
Based on the current data, we observe patterns that suggest that there are only a few determinants
(factors) that have explanatory power for the coronavirus outcome. As we will discuss in more
detail in the sequel, we observe that some of these factors are strongly related to the level of
economic development and to the effect of population size in social interactions. However, we
stress that at this point in time, our observations require a careful interpretation for several reasons,
including the following: (i) the analysis is based on aggregate quantities, i.e. averages across
geographic locations. As such, it does not include spatial inhomogeneity and hence cannot capture
(potentially) significant interactive local dynamics; (ii) the parameters governing the time evolution
of the disease spread and pandemic outcome are themselves dynamic, and depend on the stage of
the country’s epidemic and on the changing social response efforts. While being aware of these
potential shortcomings of our formulation, in the absence of realistic models that adequately cover
all relevant aspects, this study provides the first step towards a more comprehensive understanding
of the socio-economic factors of the coronavirus pandemic. We expect that, with the availability
of new data and the improved understanding of the dynamics of the coronavirus pandemic, some
of these shortcomings will be overcome, yielding a more reliable interpretation of the results.
2 Results
2.1 Preliminaries
In a formal setting, both the log of registered COVID-19 cases and the log of COVID-19 deaths
are a result of a disease spreading process [11, 12]. The extent to which a disease spreads within a
population is uniquely determined by its reproduction number. This number describes the expected
number of cases directly generated by one case in a population in which all individuals are sus-
ceptible to infection [13, 14]. Obviously, its magnitude depends on various natural characteristics
of the disease, such as its infectivity or the duration of infectiousness [15], and on the social distancing
measures imposed by the government [1]. It also depends on a plethora of socio-economic
factors that govern the behavioral interactions within a population [16, 17].
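As a minimal illustration of how infectivity and the duration of infectiousness enter the reproduction number, the sketch below assumes the textbook SIR setting, in which the basic reproduction number is the product of a transmission rate and the mean infectious period. This is an assumption made for illustration only and is not the model estimated in this paper; the function name and all numerical values are hypothetical.

```python
def basic_reproduction_number(transmission_rate: float, infectious_days: float) -> float:
    """R0 in a simple SIR model: infections caused per day times days infectious."""
    if transmission_rate < 0 or infectious_days < 0:
        raise ValueError("inputs must be non-negative")
    return transmission_rate * infectious_days


# Hypothetical values, chosen only to show the mechanics.
r0_baseline = basic_reproduction_number(0.5, 5.0)

# Social distancing is modeled here (as an assumption) as halving the transmission rate.
r0_distancing = basic_reproduction_number(0.25, 5.0)

print(r0_baseline, r0_distancing)
```

Under these assumed values, a measure that halves the transmission rate also halves the reproduction number, which is the channel through which the social distancing term is expected to matter.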
In general, we never observe the reproduction number, but instead its outcome, i.e. the number
of cases/deaths. Hence, we can utilize the known properties of the reproduction number to derive
a linear regression model $M_m$ for the coronavirus outcome as
$$y_i = \beta_0 + \beta_m^{T} X_i^{m} + \gamma s_i + \delta d_i + u_i,$$
where for simplicity we denote both the log of registered COVID-19 cases per million population
and the log of COVID-19 deaths per million population of country $i$ as $y_i$. We focus on registered
quantities normalized on a per capita basis for the dependent variable instead of raw values
as a means to eliminate the bias in the outcome arising as a consequence of the discrepancies
in the sizes of the studied countries. In the equation, $X_i^m$ is a $k_m$-dimensional vector of socio-economic
explanatory variables that determine the dependent variable, $\beta_m$ is the vector describing
their marginal contributions, $\beta_0$ is the intercept of the regression, and $u_i$ is the error term. The $s_i$
term controls for the impact of the social distancing measures of the countries, and $\gamma$ is its coefficient.
Finally, we also include the term $d_i$, with $\delta$ capturing its marginal effect, which measures the duration
of the pandemic within the economy. This allows us to control for the possibility that the
countries are in different stages of the disease spreading process.
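For a single fixed model, the regression above can be estimated by ordinary least squares. The following is a minimal sketch under stated assumptions: the data are synthetic, the function name `fit_outcome_regression` is hypothetical, and plain OLS stands in for whatever estimator is used within each model of the averaging procedure.

```python
import numpy as np


def fit_outcome_regression(y, X, s, d):
    """OLS fit of y_i = beta0 + beta^T X_i + gamma * s_i + delta * d_i + u_i."""
    n, k = X.shape
    # Design matrix: intercept, k socio-economic variables, stringency, duration.
    design = np.column_stack([np.ones(n), X, s, d])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return {
        "beta0": coef[0],
        "beta": coef[1:1 + k],
        "gamma": coef[1 + k],
        "delta": coef[2 + k],
    }


# Synthetic data (an assumption for illustration, not the paper's data set).
rng = np.random.default_rng(0)
n, k = 100, 3
X = rng.normal(size=(n, k))   # socio-economic explanatory variables
s = rng.normal(size=n)        # social distancing (stringency) control
d = rng.normal(size=n)        # duration-of-pandemic control
true_beta = np.array([0.8, 0.0, -0.5])
y = 1.0 + X @ true_beta + 0.3 * s + 0.6 * d + rng.normal(scale=0.1, size=n)

fit = fit_outcome_regression(y, X, s, d)
print(fit)
```

With real data, `y` would hold the log of cases (or deaths) per million population, one entry per country.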
A central question that arises is the selection of the independent variables in $M_m$. While
the literature review offers a comprehensive overview of all potential determinants, in reality we
are never certain of their credibility. In order to circumvent the problem of choosing a model and
potentially ending up with a wrong selection, we resort to the technique of Bayesian Model Averaging
(BMA). BMA leverages Bayesian statistics to account for model uncertainty by estimating
each possible model, and thus evaluating the posterior distribution of each parameter value and
the probability that a particular model is the correct one [18].
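The idea of averaging over every possible model can be sketched for a small number of candidate variables. The example below makes several assumptions that are not taken from this paper: it enumerates all subsets exhaustively, gives every model equal prior probability, and approximates each model's marginal likelihood by exp(-BIC/2). The function name `bma_bic` and the synthetic data are hypothetical, and the priors actually used in the study may differ.

```python
import itertools
import numpy as np


def bma_bic(y, X):
    """Posterior inclusion probabilities and posterior means via BIC weights."""
    n, k = X.shape
    log_weights, betas, masks = [], [], []
    for r in range(k + 1):
        for subset in itertools.combinations(range(k), r):
            cols = np.array(subset, dtype=int)
            design = np.hstack([np.ones((n, 1)), X[:, cols]])
            coef, *_ = np.linalg.lstsq(design, y, rcond=None)
            rss = np.sum((y - design @ coef) ** 2)
            bic = n * np.log(rss / n) + design.shape[1] * np.log(n)
            log_weights.append(-0.5 * bic)
            full = np.zeros(k)          # coefficient is 0 when a variable is excluded
            full[cols] = coef[1:]
            betas.append(full)
            mask = np.zeros(k)
            mask[cols] = 1.0
            masks.append(mask)
    log_weights = np.array(log_weights)
    weights = np.exp(log_weights - log_weights.max())  # stabilize before normalizing
    weights /= weights.sum()                           # approximate posterior model probabilities
    pip = weights @ np.array(masks)                    # posterior inclusion probability
    post_mean = weights @ np.array(betas)              # model-averaged coefficient
    return pip, post_mean


# Synthetic data: only variables 0 and 2 truly enter the model.
rng = np.random.default_rng(1)
n, k = 120, 4
X = rng.normal(size=(n, k))
y = 0.5 + 1.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(scale=0.5, size=n)

pip, post_mean = bma_bic(y, X)
print(np.round(pip, 3), np.round(post_mean, 3))
```

Exhaustive enumeration is only feasible for a handful of variables, since the number of models doubles with each additional candidate; with 35 candidates a sampling scheme over the model space would be needed instead.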
[Figure: two panels, each with the x-axis "log of government stringency"; axis tick values omitted.]
Healthcare Infrastructure: The healthcare infrastructure essentially determines both the quantity
and the quality of the health care services delivered during an epidemic. As
measures for this determinant we include 4 variables which capture the quantity of hospital beds,
nurses and medical practitioners, as well as the quality of the coverage of essential health services.
On the one hand, studies report that well-structured healthcare resources positively affect
a country's capacity to deal with pandemic emergencies [19–25]. On the other hand, the healthcare
infrastructure also greatly impacts the country's ability to perform testing and reporting when
identifying the infected people. In this regard, economies with better healthcare infrastructure are able to
perform mass testing and more detailed reporting more easily [26–28].
National health statistics: The physical and mental state of a person plays an important role in the
degree to which the individual is susceptible to a disease. It is expected that a nation composed of
unhealthy individuals should also experience greater consequences of an emerging epidemic [29–32].
Specifically, metabolic disorders such as diabetes may intensify pandemic complications [33, 34],
whereas it has been observed that communicable diseases account for the majority of deaths in
complex emergencies [35]. In addition, there is empirical evidence that adequate hygiene greatly
reduces the rate of mortality [36, 37]. To quantify the national health characteristics we include 6
variables that assess the level of health in the studied countries.
Societal characteristics: The characteristics of a society often reveal the way in which people
interact, and thus spread the disease. In this aspect, properties such as education and media usage
reflect the level of a person’s reaction and promotion of self-induced measures for reducing the
spread of the disease [54–58]. Governing behavior such as control of corruption, rule of law or
government effectiveness further enhances societal responsibility [59, 60]. There are findings which
identify religious views as a critical determinant of health outcomes [61, 62]. Evidently,
religion drives a person's attitudes towards cooperation, government, legal rules, markets, and
thriftiness [63]. Finally, the way we mix in society may effectively control the spread of infectious
diseases [17, 53, 64–66]. To measure the characteristics of a society we identify 9 variables.
Demographic structure: Similarly to the national health statistics, the demographic structure
may indicate the susceptibility of the population to a disease. Certain age groups may simply have
weaker defensive health mechanisms to cope with the stress induced by the disease [67–69]. In
addition, the location of living may greatly affect the way in which the disease is spread [70, 71].
To express these phenomena we collect 6 variables.
Natural environment: A preserved natural environment ensures healthy lives and promotes
well-being for all at all ages. In contrast, countries where the natural environment has deteriorated
and where observables such as air pollution are of immense magnitude are also more vulnerable to epidemic
outbreaks [6, 7, 72, 73]. However, healthy natural environments also attract a plethora of
tourists and thus may ease the transformation of an epidemic into a pandemic. We gather the
data for 4 variables which capture the essence of this socio-economic characteristic [26].
Determinants with strong evidence: (PIP > 0.5). The first group describes the determinants
which have a far larger posterior inclusion probability (PIP) than the prior one, and thus there is strong
evidence for including them in the true model. We find two variables for which there is such evidence
in explaining the coronavirus cases: population size and GDP per capita (p.c.). The population size
is negatively related to the number of registered COVID-19 cases per million population, whereas
the GDP p.c. exhibits a positive effect on the same variable. In the situation of coronavirus deaths,
however, only the GDP p.c. remains a strong predictor, with a positive magnitude.
Determinants with medium evidence: (0.5 > PIP > 0.1). One variable displays medium evidence
for being a crucial socio-economic determinant of the registered COVID-19 cases: the
government health spending, with a positive impact. When looking at the BMA estimation of
COVID-19 deaths, besides this determinant, government effectiveness also shows a medium PIP
size.
Determinants with weak evidence: (0.1 > PIP > 0.01). These are determinants which have a
lower posterior than prior probability of being included in the true model, but which may still account
for some of the variation in the coronavirus outcome. For the cases per million population there
are 5 such determinants, out of which 2 have a negative impact: the percentage of the population
using the internet and the mortality rate from lack of hygiene. The results suggest that life expectancy,
population density and a predominantly Catholic population have weak evidence
for having a negative marginal effect on the observed COVID-19 cases.
There are 7 determinants for which there is weak evidence for inclusion in the true model
describing the coronavirus deaths. Four of them have a negative Post Mean: population size, rule
of law, control of corruption and mortality rate from lack of hygiene; whereas the presence of
Catholic religion and the number of physicians per capita exhibit a positive Post Mean value.
Determinants with negligible evidence: (PIP < 0.01). In total, we find that 27 potential
determinants have negligible evidence for explaining the coronavirus cases, and that 25 potential
determinants have negligible evidence for explaining the coronavirus deaths.
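The four evidence groups amount to a simple thresholding of the posterior inclusion probability. A minimal sketch follows; the function name `evidence_group` is hypothetical, and because the text states only strict inequalities, the assignment of exact boundary values (0.5, 0.1, 0.01) to the lower group is an assumption.

```python
def evidence_group(pip: float) -> str:
    """Map a posterior inclusion probability to the evidence group used in the text."""
    if not 0.0 <= pip <= 1.0:
        raise ValueError("PIP must lie in [0, 1]")
    if pip > 0.5:
        return "strong"
    if pip > 0.1:
        return "medium"
    if pip > 0.01:
        return "weak"
    # Assumption: boundary values fall into the lower group.
    return "negligible"


# Hypothetical PIP values, not taken from Tables 2 or 3.
groups = [evidence_group(p) for p in (0.9, 0.3, 0.05, 0.001)]
print(groups)
```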
Table 2: BMA results with COVID-19 cases per million population as dependent variable.
Table 3: BMA results with COVID-19 deaths per million population as dependent variable.
3 Discussion
The preliminary analysis suggests that only a handful of socio-economic determinants are able to
explain the current extent of the coronavirus pandemic. The sole determinant strongly related to
both coronavirus cases and coronavirus deaths is the level of economic development. Wealthier
countries display larger susceptibility to the disease. Interestingly, besides the level of economic
development, the population size is also a credible predictor of the registered coronavirus cases per
million population, with more populated economies showing greater resistance to being infected
by the virus.
A number of reasons can be offered as possible interpretations of these results. For instance,
it is known that in structured populations, the degree of epidemic spread scales inversely with
population size [75]. This is because, everything else considered, in larger populations it is easier
to identify and target the critical individuals that are susceptible to the disease [76]. In a similar
fashion, various explanations can be found for the observed effect of economic development, such
as increased population mobility and aging population. However, it could also be the case that more
developed countries have a greater testing capacity and thus provide better evidence of the coronavirus
situation. In fact, this may be suggested by our finding that there is medium evidence for past
government health expenditure to be positively associated with the coronavirus outcome.
Clearly, the interpretation of our analysis requires a more detailed background, for several
reasons. Among these reasons is the fact that we include several potential determinants only
through crude approximations. In particular, the level of social mixing is given simply as the aver-
age number of persons in a household or the dominant religion in the country. We do not follow the
exact social network structures within a population. It is evident that the inhomogeneous nature of
these spatial patterns has an essential role in propagation of diseases [17]. In this regard, in future
versions of this study we aim to incorporate more detailed measures which capture the essence
of social connectedness [77], the degree of individualism [78] and/or community mobility data³.
In addition, the spread of the coronavirus is obviously still in a transient regime. Even though
we include a proxy for the duration of the coronavirus pandemic in each country, this transient regime essentially
hinders the development of a coherent modeling framework.
These underexpressed effects may play a significant role in the final outcome of the coronavirus
pandemic. Nonetheless, in the absence of a unifying framework covering all relevant aspects, our
investigation acts as the starting point for the development of a more comprehensive understanding
of the socio-economic factors of the coronavirus pandemic. We believe that with the availability
of new data and the improved understanding of the dynamics of the coronavirus pandemic, some
of these shortcomings will be overcome, yielding a more reliable interpretation of the results.
References
[1] Neil Ferguson, Daniel Laydon, Gemma Nedjati Gilani, Natsuko Imai, Kylie Ainslie, Marc
Baguelin, Sangeeta Bhatia, Adhiratha Boonyasiri, Zulma Cucunuba Perez, Gina Cuomo-Dannenburg,
et al. Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce
COVID-19 mortality and healthcare demand. 2020.
[2] William Gardner, David States, and Nicholas Bagley. The coronavirus and the risks to the
elderly in long-term care. Journal of Aging & Social Policy, pages 1–6, 2020.
3 Community mobility data is available at https://www.google.com/covid19/mobility/.
[3] Carlos Kennedy Tavares Lima, Poliana Moreira de Medeiros Carvalho, Igor de Araújo Silva
Lima, José Victor Alexandre de Oliveira Nunes, Jeferson Seves Saraiva, Ricardo Inácio
de Souza, Claúdio Gleidiston Lima da Silva, and Modesto Leite Rolim Neto. The emo-
tional impact of coronavirus 2019-ncov (new coronavirus disease). Psychiatry Research,
page 112915, 2020.
[4] Janice Hopkins Tanne, Erika Hayasaki, Mark Zastrow, Priyanka Pulla, Paul Smith, and
Acer Garcia Rada. Covid-19: how doctors and healthcare systems are tackling coronavirus
worldwide. Bmj, 368, 2020.
[5] Ehab Mudher Mikhael and Ali Azeez Al-Jumaili. Can developing countries alone face corona
virus? an iraqi situation. Public Health in Practice, page 100004, 2020.
[6] Moreno Di Marco, Michelle L Baker, Peter Daszak, Paul De Barro, Evan A Eskew, Cecile M
Godde, Tom D Harwood, Mario Herrero, Andrew J Hoskins, Erica Johnson, et al. Opin-
ion: Sustainable development must account for pandemic risk. Proceedings of the National
Academy of Sciences, 117(8):3888–3892, 2020.
[7] Xiao Wu, Rachel C Nethery, Benjamin M Sabath, Danielle Braun, and Francesca Dominici.
Exposure to air pollution and covid-19 mortality in the united states. medRxiv, 2020.
[8] Adrian E Raftery, David Madigan, and Jennifer A Hoeting. Bayesian model averaging for
linear regression models. Journal of the American Statistical Association, 92(437):179–191,
1997.
[9] Jennifer A Hoeting, David Madigan, Adrian E Raftery, and Chris T Volinsky. Bayesian model
averaging: a tutorial. Statistical science, pages 382–401, 1999.
[10] Xavier Sala-i Martin, Gernot Doppelhofer, and Ronald I Miller. Determinants of long-term
growth: A bayesian averaging of classical estimates (bace) approach. American economic
review, pages 813–835, 2004.
[11] Joseph T Wu, Kathy Leung, and Gabriel M Leung. Nowcasting and forecasting the potential
domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a
modelling study. The Lancet, 395(10225):689–697, 2020.
[12] Adam J Kucharski, Timothy W Russell, Charlie Diamond, Yang Liu, John Edmunds, Sebas-
tian Funk, Rosalind M Eggo, Fiona Sun, Mark Jit, James D Munday, et al. Early dynamics of
transmission and control of covid-19: a mathematical modelling study. The Lancet Infectious
Diseases, 2020.
[13] Norman TJ Bailey et al. The mathematical theory of infectious diseases and its applications.
Charles Griffin & Company Ltd, 5a Crendon Street, High Wycombe, Bucks HP13 6LE.,
1975.
[14] P Van den Driessche and James Watmough. Further notes on the basic reproduction number.
In Mathematical epidemiology, pages 159–178. Springer, 2008.
[15] Ashleigh R Tuite, Amy L Greer, Michael Whelan, Anne-Luise Winter, Brenda Lee, Ping Yan,
Jianhong Wu, Seyed Moghadas, David Buckeridge, Babak Pourbohloul, et al. Estimated
epidemiologic parameters and morbidity associated with pandemic h1n1 influenza. Cmaj,
182(2):131–136, 2010.
[16] Matt J Keeling and Pejman Rohani. Modeling infectious diseases in humans and animals.
Princeton University Press, 2011.
[17] Petra Klepac, Adam J Kucharski, Andrew JK Conlan, Stephen Kissler, Maria Tang, Hannah
Fry, and Julia R Gog. Contacts in context: large-scale setting-specific social mixing matrices
from the bbc pandemic project. medRxiv, 2020.
[19] Stelios H Zanakis, Cecilia Alvarez, and Vivian Li. Socio-economic determinants of hiv/aids
pandemic and nations efficiencies. European Journal of Operational Research, 176(3):1811–
1838, 2007.
[20] Ralf L Itzwerth, C Raina MacIntyre, Smita Shah, and Aileen J Plant. Pandemic influenza
and critical infrastructure dependencies: possible impact on hospitals. Medical journal of
Australia, 185(S10):S70–S72, 2006.
[21] Richard J Whitley and Arnold S Monto. Seasonal and pandemic influenza preparedness: a
global threat. The Journal of infectious diseases, 194(Supplement 2):S65–S69, 2006.
[22] Robert F Breiman, Abdulsalami Nasidi, Mark A Katz, M Kariuki Njenga, and John Verte-
feuille. Preparedness for highly pathogenic avian influenza pandemic in africa. Emerging
infectious diseases, 13(10):1453, 2007.
[23] B Adini, A Goldberg, R Cohen, and Y Bar-Dayan. Relationship between equipment and
infrastructure for pandemic influenza and performance in an avian flu drill. Emergency
Medicine Journal, 26(11):786–790, 2009.
[24] Andrew L Garrett, Yoon Soo Park, and Irwin Redlener. Mitigating absenteeism in hospital
workers during a pandemic. Disaster medicine and public health preparedness, 3(S2):S141–
S147, 2009.
[25] Hitoshi Oshitani, Taro Kamigaki, and Akira Suzuki. Major issues and challenges of influenza
pandemic preparedness in developing countries. Emerging infectious diseases, 14(6):875,
2008.
[26] Parviez Hosseini, Susanne H Sokolow, Kurt J Vandegrift, A Marm Kilpatrick, and Peter
Daszak. Predictive power of air travel and socio-economic data for early pandemic spread.
PLoS One, 5(9), 2010.
[27] Sandra Crouse Quinn and Supriya Kumar. Health inequalities and infectious disease epi-
demics: a challenge for global health security. Biosecurity and bioterrorism: biodefense
strategy, practice, and science, 12(5):263–273, 2014.
[28] Daniel R Hogan, Gretchen A Stevens, Ahmad Reza Hosseinpoor, and Ties Boerma. Moni-
toring universal health coverage within the sustainable development goals: development and
baseline data for an index of essential health services. The Lancet Global Health, 6(2):e152–
e168, 2018.
[29] Michael Marmot. Social determinants of health inequalities. The lancet, 365(9464):1099–
1104, 2005.
[30] S-C Chen and C-M Liao. Modelling control measures to reduce the impact of pandemic
influenza among schoolchildren. Epidemiology & Infection, 136(8):1035–1045, 2008.
[31] Elaine Kelly. The scourge of asian flu in utero exposure to pandemic influenza and the
development of a cohort of british children. Journal of Human resources, 46(4):669–694,
2011.
[32] Jonathan S Nguyen-Van-Tam and Alan W Hampson. The epidemiology and clinical impact
of pandemic influenza. Vaccine, 21(16):1762–1768, 2003.
[33] Susan van Dieren, Joline WJ Beulens, Yvonne T van der Schouw, Diederick E Grobbee, and
Bruce Neal. The global burden of diabetes and its complications: an emerging pandemic.
European Journal of Cardiovascular Prevention & Rehabilitation, 17(1 suppl):s3–s8, 2010.
[34] Robert Allard, Pascale Leclerc, Claude Tremblay, and Terry-Nan Tannenbaum. Diabetes and
the severity of pandemic influenza a (h1n1) infection. Diabetes care, 33(7):1491–1493, 2010.
[35] Máire A Connolly, Michelle Gayer, Michael J Ryan, Peter Salama, Paul Spiegel, and David L
Heymann. Communicable diseases in complex emergencies: impact and challenges. The
Lancet, 364(9449):1974–1983, 2004.
[36] Carol W Bassim, Gretchen Gibson, Timothy Ward, Brian M Paphides, and Donald J De-
Nucci. Modification of the risk of mortality from pneumonia with oral hygiene care. Journal
of the American Geriatrics Society, 56(9):1601–1607, 2008.
[37] F Müller. Oral hygiene reduces the mortality from aspiration pneumonia in frail elders.
Journal of dental research, 94(3 suppl):14S–16S, 2015.
[38] John Strauss and Duncan Thomas. Health, nutrition, and economic development. Journal of
economic literature, 36(2):766–817, 1998.
[39] Guillem López i Casasnovas, Berta Rivera, Luis Currais, et al. Health and economic growth:
findings and policy implications. Mit Press, 2005.
[40] Jeffrey Sachs. Macroeconomics and health: investing in health for economic development.
World Health Organization, 2001.
[41] Quamrul H Ashraf, Ashley Lester, and David N Weil. When does improving health raise
gdp? NBER macroeconomics annual, 23(1):157–204, 2008.
[42] Sara Markowitz, Erik Nesson, and Joshua Robinson. The effects of employment on influenza
rates. Technical report, National Bureau of Economic Research, 2010.
[43] Samuel H Preston. The changing relation between mortality and level of economic develop-
ment. Population studies, 29(2):231–248, 1975.
[44] Spencer L James, Paul Gubbins, Christopher JL Murray, and Emmanuela Gakidou. Developing
a comprehensive time series of GDP per capita for 210 countries from 1950 to 2015.
Population health metrics, 10(1):12, 2012.
[45] Stephen Bezruchka. The effect of economic recession on population health. CMAJ,
181(5):281–285, 2009.
[46] José A Tapia Granados and Edward L Ionides. The reversal of the relation between eco-
nomic growth and health progress: Sweden in the 19th and 20th centuries. Journal of health
economics, 27(3):544–563, 2008.
[47] Richard Wilkinson and Kate Pickett. The spirit level. Why equality is better for everyone,
2010.
[48] Majid Ezzati, Ari B Friedman, Sandeep C Kulkarni, and Christopher JL Murray. The reversal
of fortunes: trends in county mortality and cross-county mortality disparities in the United
States. PLoS medicine, 5(4), 2008.
[49] Arjumand Siddiqi and Clyde Hertzman. Towards an epidemiological understanding of the
effects of long-term institutional changes on population health: a case study of Canada versus
the USA. Social science & medicine, 64(3):589–603, 2007.
[50] Ichiro Kawachi and Bruce P Kennedy. Income inequality and health: pathways and mecha-
nisms. Health services research, 34(1 Pt 2):215, 1999.
[51] Kim Krisberg. Income inequality: When wealth determines health: Earnings influential as
lifelong social determinant of health, 2016.
[52] Jérôme Adda. Economic activity and the spread of viral diseases: Evidence from high fre-
quency data. The Quarterly Journal of Economics, 131(2):891–941, 2016.
[53] Kiesha Prem, Alex R Cook, and Mark Jit. Projecting social contact matrices in 152 countries
using contact surveys and demographic data. PLoS computational biology, 13(9):e1005697,
2017.
[54] Robert Putnam. Social capital: Measurement and consequences. Canadian journal of policy
research, 2(1):41–51, 2001.
[55] Sherman Folland. Does “community social capital” contribute to population health? Social
science & medicine, 64(11):2342–2354, 2007.
[56] Chul-Joo Lee and Daniel Kim. A comparative analysis of the validity of US state- and county-level
social capital measures and their associations with population health. Social indicators
research, 111(1):307–326, 2013.
[57] David P Baker, Juan Leon, Emily G Smith Greenaway, John Collins, and Marcela Movit. The
education effect on population health: a reassessment. Population and development review,
37(2):307–332, 2011.
[58] Johan P Mackenbach, Irina Stirbu, Albert-Jan R Roskam, Maartje M Schaap, Gwenn Menvielle,
Mall Leinsalu, and Anton E Kunst. Socioeconomic inequalities in health in 22 European
countries. New England journal of medicine, 358(23):2468–2481, 2008.
[59] Elinor Ostrom. Governing the commons: The evolution of institutions for collective action.
Cambridge university press, 1990.
[60] Jane Mansbridge. The role of the state in governing the commons. Environmental Science &
Policy, 36:8–10, 2014.
[61] Linda M Chatters. Religion and health: Public health research and practice. Annual review
of public health, 21(1):335–367, 2000.
[62] George K Jarvis and Herbert C Northcott. Religion and differences in morbidity and mortal-
ity. Social science & medicine, 25(7):813–824, 1987.
[63] Luigi Guiso, Paola Sapienza, and Luigi Zingales. People’s opium? Religion and economic
attitudes. Journal of monetary economics, 50(1):225–282, 2003.
[64] Niel Hens, Girma Minalu Ayele, Nele Goeyvaerts, Marc Aerts, Joel Mossong, John W Ed-
munds, and Philippe Beutels. Estimating the impact of school closure on social mixing
behaviour and the transmission of close contact infections in eight European countries. BMC
infectious diseases, 9(1):187, 2009.
[65] Joël Mossong, Niel Hens, Mark Jit, Philippe Beutels, Kari Auranen, Rafael Mikolajczyk,
Marco Massari, Stefania Salmaso, Gianpaolo Scalia Tomba, Jacco Wallinga, et al. Social
contacts and mixing patterns relevant to the spread of infectious diseases. PLoS medicine,
5(3), 2008.
[66] Alessia Melegaro, Mark Jit, Nigel Gay, Emilio Zagheni, and W John Edmunds. What types
of contacts are important for the spread of infections? Using contact survey data to explore
European mixing patterns. Epidemics, 3(3-4):143–151, 2011.
[67] Jacco Wallinga, Peter Teunis, and Mirjam Kretzschmar. Using data on social contacts to esti-
mate age-specific transmission parameters for respiratory-spread infectious agents. American
journal of epidemiology, 164(10):936–944, 2006.
[68] Anton Erkoreka. The Spanish influenza pandemic in occidental Europe (1918–1920) and
victim age. Influenza and other respiratory viruses, 4(2):81–89, 2010.
[69] Gregory L Armstrong, Laura A Conn, and Robert W Pinner. Trends in infectious disease
mortality in the United States during the 20th century. JAMA, 281(1):61–66, 1999.
[70] Rossana Mastrandrea, Julie Fournet, and Alain Barrat. Contact patterns in a high school:
a comparison between data collected using wearable sensors, contact diaries and friendship
surveys. PLoS one, 10(9), 2015.
[71] Adam J Kucharski, Kin O Kwok, Vivian WI Wei, Benjamin J Cowling, Jonathan M Read,
Justin Lessler, Derek A Cummings, and Steven Riley. The contribution of social behaviour
to the transmission of influenza A in a human population. PLoS pathogens, 10(6), 2014.
[72] AL Braga, A Zanobetti, and J Schwartz. Do respiratory epidemics confound the association
between air pollution and daily deaths? European Respiratory Journal, 16(4):723–728, 2000.
[73] Karen Clay, Joshua Lewis, and Edson Severnini. Pollution, infectious disease, and mortality:
evidence from the 1918 Spanish influenza pandemic. The Journal of Economic History,
78(4):1179–1209, 2018.
[74] Enrique Moral-Benito. Determinants of economic growth: a Bayesian panel data approach.
Review of Economics and Statistics, 94(2):566–579, 2012.
[75] Moez Draief, Ayalvadi Ganesh, and Laurent Massoulié. Thresholds for virus spread on
networks. In Proceedings of the 1st international conference on Performance evaluation
methodologies and tools, pages 51–es, 2006.
[76] Maksim Kitsak, Lazaros K Gallos, Shlomo Havlin, Fredrik Liljeros, Lev Muchnik, H Eugene
Stanley, and Hernán A Makse. Identification of influential spreaders in complex networks.
Nature physics, 6(11):888–893, 2010.
[77] Michael Bailey, Rachel Cao, Theresa Kuchler, Johannes Stroebel, and Arlene Wong. Social
connectedness: Measurement, determinants, and effects. Journal of Economic Perspectives,
32(3):259–80, 2018.
[78] Geert Hofstede. Culture’s consequences: Comparing values, behaviors, institutions and
organizations across nations. Sage publications, 2001.
[79] Carmen Fernandez, Eduardo Ley, and Mark FJ Steel. Model uncertainty in cross-country
growth regressions. Journal of applied Econometrics, 16(5):563–576, 2001.
[80] Carmen Fernandez, Eduardo Ley, and Mark FJ Steel. Benchmark priors for Bayesian model
averaging. Journal of Econometrics, 100(2):381–427, 2001.
[81] Eduardo Ley and Mark FJ Steel. On the effect of prior assumptions in Bayesian model averaging
with applications to growth regression. Journal of applied econometrics, 24(4):651–674,
2009.
Supplementary material
S1 Data description
The data for the dependent variable are taken from Worldometer’s 2019-20 Coronavirus tracker.
The tracker offers live coverage of country coronavirus statistics, by collecting data from sources
which include Official Websites of Ministries of Health or other Government Institutions and Gov-
ernment authorities’ social media accounts. Because national aggregates often lag behind the re-
gional and local health departments’ data, an important part of the data collection process consists
in monitoring thousands of daily reports released by local authorities. The current results are
based on data gathered on 11th April 2020.
The data used for the calculation of the stringency index and the days since the first registered
COVID-19 case are gathered from Oxford’s COVID-19 government response tracker. Finally,
the data used for measuring the possible socio-economic determinants are gathered from 7
sources. In particular, 25 determinants are from the World Bank’s World Development
Indicators (WDI), 3 determinants each are from the Nationmaster database (NM) and the
World Governance Indicators (WGI), and 1 determinant each is from the International
Monetary Fund (IMF), the State of Global Air (SGA), the Global Footprint Network (GFN)
and the United Nations (UN) database. The list of sources together with links
to their websites is given in Table S1.
Table S1: Data sources and links to their websites.
Source                Link
Covid cases/deaths    www.worldometers.info/coronavirus
GFN                   data.footprintnetwork.org
Gov. Stringency       covidtracker.bsg.ox.ac.uk
IMF                   www.imf.org/en/data
NM                    www.nationmaster.com
SGA                   www.stateofglobalair.org/engage
UN                    data.un.org
WDI                   data.worldbank.org/
WGI                   info.worldbank.org/governance/wgi
To reduce the noise in the data, we restrict the sample to countries with a population above 1
million. In addition, we only use countries for which there are data on all of the potential socio-
economic determinants. Table S2 lists the countries for which all of these data were available.
Table S2: Countries included in the sample.
Country
Argentina         Finland       Malaysia       Slovakia
Australia         France        Mauritius      Slovenia
Austria           Germany       Mexico         South Africa
Bangladesh        Ghana         Morocco        Spain
Belgium           Greece        Myanmar        Switzerland
Bolivia           Guatemala     Netherlands    Tanzania
Brazil            Honduras      New Zealand    Thailand
Bulgaria          Hungary       Nigeria        Turkey
Cameroon          India         Norway         United Kingdom
Canada            Indonesia     Pakistan       United States
Chile             Iraq          Panama         Uganda
China             Ireland       Paraguay       Ukraine
Colombia          Israel        Peru           Venezuela
Costa Rica        Italy         Philippines    Vietnam
Croatia           Japan         Poland         Zambia
Czech Republic    Jordan        Portugal       Zimbabwe
Dominican Rep.    Kazakhstan    Russia
Ecuador           Kenya         Rwanda
El Salvador       Madagascar    Serbia
Altogether, we end up with data on 35 variables and 72 countries. Table S3 reports the summary
statistics of each variable. Unless otherwise stated in Table S3, each determinant is measured as
the log of its last observed value.
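The sample construction described in this section (population filter, complete data on every determinant, log of the last observed value) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the record layout, the field names `population` and `series`, and all toy values are hypothetical.

```python
import math

# Hypothetical raw records: one entry per country. Each determinant maps to a
# time series (oldest first) that may contain missing values (None).
raw = {
    "A": {"population": 5.2e6,
          "series": {"gdp_pc": [None, 41000.0, 42000.0], "life_exp": [80.1, 80.3, None]}},
    "B": {"population": 0.4e6,   # below the population threshold
          "series": {"gdp_pc": [9000.0], "life_exp": [71.0]}},
    "C": {"population": 30e6,    # one determinant is never observed
          "series": {"gdp_pc": [None, None], "life_exp": [65.0]}},
}

MIN_POPULATION = 1_000_000  # threshold stated in the text


def last_observed(values):
    """Return the most recent non-missing value, or None if there is none."""
    for v in reversed(values):
        if v is not None:
            return v
    return None


def build_sample(records):
    """Keep countries above the threshold with complete data and measure each
    determinant as the log of its last observed value."""
    sample = {}
    for country, rec in records.items():
        if rec["population"] <= MIN_POPULATION:
            continue
        latest = {name: last_observed(vals) for name, vals in rec["series"].items()}
        if any(v is None for v in latest.values()):
            continue  # drop countries lacking any determinant
        sample[country] = {name: math.log(v) for name, v in latest.items()}
    return sample


sample = build_sample(raw)
```

The sketch applies the log to every determinant; variables that Table S3 flags as not log-transformed would need to bypass that step.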
S2 Methods
S2.1 Stringency index
To calculate our government stringency measure we make use of Oxford’s daily government strin-
gency index. Oxford’s daily government stringency index measures on a scale of 1-100 the varia-
tion in daily government responses to COVID-19 by accumulating ordinal data on country social
distancing measures on school, workplace and public transport closure; cancellation of public
events; restrictions of internal movement; control of international travel and promotion of public
campaigns on prevention of coronavirus spread.
To calculate the overall stringency index $c_i(d_i)$ of country $i$ at a final date $d_i$ from the provided daily
indexes, we implement the following procedure. Let $c_i(t)$ represent the government stringency on
day $t$. Our index is then computed as
$$c_i(d_i) = \sum_{s=1}^{d_i} w_i(s)\, c_i(s), \tag{S1}$$
where $w_i(s)$ are the weights given to each day and $s = 1$ is the day of the first registered case. We
use a simple inverse weighting procedure that gives larger weights to earlier dates, i.e.,
$$w_i(s) = \frac{1}{s} \Bigg/ \sum_{k=1}^{d_i} \frac{1}{k}. \tag{S2}$$
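As a worked illustration of the weighting scheme, the sketch below computes the inverse-day weights (each day s receives weight 1/s, normalised by the sum of 1/k over all days) and the resulting weighted index. The function names and the daily values are hypothetical; only the weighting rule comes from the text.

```python
def inverse_day_weights(d):
    """Weights w(s) = (1/s) / sum_{k=1..d} (1/k) for s = 1..d."""
    harmonic = sum(1.0 / k for k in range(1, d + 1))
    return [(1.0 / s) / harmonic for s in range(1, d + 1)]


def overall_stringency(daily_index):
    """Weighted sum of the daily stringency values.

    daily_index[0] is the value on the day of the first registered case (s = 1).
    """
    weights = inverse_day_weights(len(daily_index))
    return sum(w * c for w, c in zip(weights, daily_index))


# Hypothetical daily stringency values for one country (illustrative only).
daily = [10.0, 10.0, 40.0, 80.0]
weights = inverse_day_weights(len(daily))   # approx. [0.48, 0.24, 0.16, 0.12]
index = overall_stringency(daily)           # approx. 23.2
```

Because the weights decay with s, measures introduced soon after the first case contribute more to the index than equally strict measures introduced later.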
It is clear that the posterior probability is proportional to the product of $f(y \mid \beta_m, M_m)$, the likelihood of seeing the
data under model $M_m$ with parameters $\beta_m$, and $g(\beta_m \mid M_m)$, the prior distribution of the parameters
included in the proposed model. By assuming a prior model probability P(Mm ) we can implement
the same rule to evaluate the posterior probability that model $M_m$ is the true one, as
$$P(M_m \mid y) \propto f(y \mid M_m)\, P(M_m). \tag{S4}$$
The term $f(y \mid M_m)$ is called the marginal likelihood of the model and is used to compare different
models to each other. The posterior model probability can also be written as
$$P(M_m \mid y) = \frac{B_{m0}\, P(M_m)}{\sum_{n=1}^{2^k} B_{n0}\, P(M_n)}, \tag{S5}$$
where $B_{m0}$ is the Bayes factor of model $M_m$ relative to the baseline model $M_0$. In
our case, the baseline is the model including government social distancing measures and the length of the
coronavirus crisis in the country.
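The normalisation in the posterior model probability can be illustrated with a small sketch: each model's weight relative to the baseline is multiplied by its prior probability and the products are rescaled to sum to one. The function name and the three-model toy values are assumptions made for the example.

```python
def posterior_model_probs(weights_vs_baseline, priors):
    """P(M_m | y) = B_m0 * P(M_m) / sum_n B_n0 * P(M_n)."""
    unnormalised = [b * p for b, p in zip(weights_vs_baseline, priors)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Hypothetical values for three candidate models. The first entry is the
# baseline itself, whose weight relative to itself is 1 by construction.
b_m0 = [1.0, 6.0, 3.0]
priors = [0.5, 0.25, 0.25]

post = posterior_model_probs(b_m0, priors)   # approx. [0.182, 0.545, 0.273]
```

With a model space of 2^k elements the weights can span many orders of magnitude, so practical implementations usually work with their logarithms; the sketch omits this for brevity.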
With this setup, we can define the posterior distribution of $\beta$ as a weighted average of the
posterior distributions of the parameters under each model, using the posterior model probabilities
as weights:
$$g(\beta \mid y) = \sum_{m=1}^{2^k} g(\beta \mid y, M_m)\, P(M_m \mid y). \tag{S6}$$
Here, we are interested only in some parameters of the posterior distribution, such as the posterior
mean and variance of each parameter. Using equation (S6) we can calculate the posterior
mean and the posterior variance as:
$$E[\beta \mid y] = \sum_{m=1}^{2^k} E[\beta \mid y, M_m]\, P(M_m \mid y), \tag{S7}$$
$$\operatorname{var}[\beta \mid y] = \sum_{m=1}^{2^k} \operatorname{var}[\beta \mid y, M_m]\, P(M_m \mid y) + \sum_{m=1}^{2^k} P(M_m \mid y) \left( E[\beta \mid y, M_m] - E[\beta \mid y] \right)^2. \tag{S8}$$
Since the posterior mean is a point estimate of the average marginal contribution, we use it as
our measure of the effect of the determinant on the COVID-19 impact.
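The model-averaged mean and variance can be sketched for a single coefficient as follows. The posterior variance splits into a within-model part (the average of the per-model variances) and a between-model part (the spread of the per-model means around the averaged mean). The function name and the numbers are hypothetical, and the example assumes the usual convention that a coefficient excluded from a model has mean and variance zero under that model.

```python
def bma_moments(means, variances, model_probs):
    """Model-averaged posterior mean and variance of one coefficient."""
    mean = sum(p * m for p, m in zip(model_probs, means))
    within = sum(p * v for p, v in zip(model_probs, variances))
    between = sum(p * (m - mean) ** 2 for p, m in zip(model_probs, means))
    return mean, within + between


# Hypothetical per-model results; the coefficient is excluded from model 0.
means = [0.0, 0.4, 0.6]
variances = [0.0, 0.01, 0.04]
probs = [0.2, 0.5, 0.3]

post_mean, post_var = bma_moments(means, variances, probs)
# post_mean is approx. 0.38 and post_var approx. 0.0606
```

In this toy case most of the variance (about 0.0436 of 0.0606) comes from disagreement between models rather than from estimation uncertainty within them.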
Another interesting statistic is the posterior inclusion probability PIPh of a variable h, which
measures the posterior probability that the variable is included in the ‘true’ model. Mathematically,
PIPh is defined as the sum of the posterior model probabilities for all of the models that include
the variable:
$$\mathrm{PIP}_h = P(\beta_h \neq 0 \mid y) = \sum_{m:\, \beta_h \neq 0} P(M_m \mid y). \tag{S9}$$
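A minimal sketch of the posterior inclusion probability, assuming the models are represented as sets of included variable indices: for each variable, the posterior probabilities of all models containing it are summed. The representation and the toy probabilities are assumptions for illustration.

```python
def posterior_inclusion_probs(models, model_probs, n_vars):
    """Sum the posterior model probabilities over the models including each variable."""
    pip = [0.0] * n_vars
    for included, p in zip(models, model_probs):
        for h in included:
            pip[h] += p
    return pip


# Hypothetical model space over two candidate variables (indices 0 and 1).
models = [set(), {0}, {1}, {0, 1}]
probs = [0.1, 0.4, 0.2, 0.3]

pip = posterior_inclusion_probs(models, probs, n_vars=2)   # approx. [0.7, 0.5]
```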
Posterior inclusion probabilities offer a more robust way of assessing the relevance of a variable
than p-values for the statistical significance of a model coefficient, because they incorporate
the uncertainty of model selection.
According to equations (S3) and (S4), we need to specify priors for the parameters of each model and for the
model probability itself. To keep the model simple and easy to implement, we use the most
commonly applied priors. For the parameter space, we elicit a prior on the error
variance that is proportional to its inverse, $p(\sigma^2) \propto 1/\sigma^2$, and a uniform prior on the intercept,
$p(\alpha) \propto 1$, while Zellner’s g-prior is used for the $\beta_m$ parameters. For the model space,
we utilise the Beta-Binomial prior. To estimate the posterior parameters we use a Markov Chain
Monte Carlo (MCMC) sampler, and report results from a run with 200 million recorded draws
after a burn-in of 100 million discarded draws. The theoretical background behind our
setup can be found in Refs. [18, 79–81].
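To make the model-space prior concrete, the sketch below evaluates a Beta-Binomial prior: each variable is included independently with a common probability that itself follows a Beta(a, b) distribution, so a specific model with k_m of K candidate variables has prior probability B(a + k_m, b + K - k_m) / B(a, b). The hyperparameters a = b = 1 and the value K = 10 are assumptions chosen for illustration; the text does not state the values used in the paper.

```python
import math


def log_beta(x, y):
    """Logarithm of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def model_prior(k_m, K, a=1.0, b=1.0):
    """Prior probability of ONE specific model with k_m of K candidate variables."""
    return math.exp(log_beta(a + k_m, b + K - k_m) - log_beta(a, b))


def size_prior(j, K, a=1.0, b=1.0):
    """Prior probability that the model contains exactly j variables."""
    return math.comb(K, j) * model_prior(j, K, a, b)


K = 10  # hypothetical number of candidate variables
sizes = [size_prior(j, K) for j in range(K + 1)]
# With a = b = 1 the prior over model size is uniform: each size has mass 1 / (K + 1).
```

The uniform prior over model size is what distinguishes this choice from a fixed inclusion probability, which would concentrate the prior mass around one model size.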