How bad data quality can turn a simulation into a dissimulation that shapes the future
DSA ADS Course, 2022
Data Quality, Black Box Models, Bad Models, Origins of SARS-CoV-2, Epidemiology, Non-pharmaceutical Interventions, Mitigation Strategies, COVID-19, Health Policy, Public Policy
Review data quality evaluations, bad model design, policy decision-making based on data science, black box models, unwarranted assumptions in models, and evidence-based policy making with near real-time data.
Discuss data collection processes, the importance of uniform and appropriate classifications and definitions in the subject-matter domain, data quality processes, data interpretation, and testing technologies with process quality controls and unknown variable time lags.
Discuss causal inference methods, Bayesian modeling, and Bayesian inference of parameters from the data. Review the potential dangers of combining dynamic modeling with Bayesian parameter estimation, and how parameter estimates may be significantly affected by priors and constrained to a bad fit. Beware of justifying a priori modeling assumptions. How can multi-scale effects and confounding variables be accounted for and measured?
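To make the point about priors concrete, here is a minimal sketch using a conjugate Normal-Normal update. It is an illustration only, not a method taken from the reading: the function name `posterior_normal`, the daily log-growth increments, the noise level, and both priors are hypothetical values chosen for the example. The same data are combined once with a wide prior and once with a tight prior centered on a much higher growth rate.

```python
def posterior_normal(prior_mean, prior_sd, data, noise_sd):
    """Conjugate Normal-Normal update for an unknown mean with known noise sd.

    Returns the posterior mean and posterior standard deviation.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = len(data) / noise_sd ** 2
    posterior_var = 1.0 / (prior_precision + data_precision)
    data_mean = sum(data) / len(data)
    posterior_mean = posterior_var * (
        prior_precision * prior_mean + data_precision * data_mean
    )
    return posterior_mean, posterior_var ** 0.5


# Hypothetical daily log-growth increments (sample mean is about 0.05 per day).
increments = [0.06, 0.04, 0.05, 0.03, 0.07, 0.05]
noise_sd = 0.05  # assumed known observation noise

# Same data, two priors that both expect a growth rate of 0.30 per day.
wide_mean, wide_sd = posterior_normal(0.30, 1.0, increments, noise_sd)
tight_mean, tight_sd = posterior_normal(0.30, 0.01, increments, noise_sd)

print(f"wide prior  -> posterior mean {wide_mean:.3f}")
print(f"tight prior -> posterior mean {tight_mean:.3f}")
```

With the wide prior the estimate follows the data. With the tight prior it stays close to the prior mean even though every observation lies far below it, which is one way an estimate ends up constrained to a bad fit.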
Discuss parameter degeneracy.
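As one possible illustration of parameter degeneracy, the sketch below assumes the early phase of a standard SIR model, where almost everyone is still susceptible and infections grow approximately as exp((beta - gamma) * t). The SIR setting, the function name `early_infections`, and the parameter values are assumptions made for this example and are not taken from the reading. Only the difference beta - gamma is constrained by early case counts, so very different parameter pairs produce the same curve.

```python
import math


def early_infections(beta, gamma, initial_infected, days):
    """Early-phase SIR approximation: I(t) = I0 * exp((beta - gamma) * t)."""
    rate = beta - gamma
    return [initial_infected * math.exp(rate * t) for t in range(days)]


# Two hypothetical parameter sets with the same difference beta - gamma = 0.2.
fast_turnover = early_infections(beta=0.5, gamma=0.3, initial_infected=10.0, days=15)
slow_turnover = early_infections(beta=0.3, gamma=0.1, initial_infected=10.0, days=15)

# The curves agree up to floating-point rounding, so early data alone
# cannot separate the transmission rate from the recovery rate.
indistinguishable = all(
    math.isclose(a, b, rel_tol=1e-9) for a, b in zip(fast_turnover, slow_turnover)
)
print("curves indistinguishable:", indistinguishable)
```

In a fit, such a degeneracy shows up as a ridge in the likelihood. Whatever separates the two parameters in the posterior then comes from the prior rather than from the data.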
Discuss the separation between the data-fitting and data-interpretation components of statistical modeling.
Examine COVID-19 policy decisions and mistakes as textbook examples of data science negligence and policy-making malpractice.
“Once fear has been installed into a community, as was the case with the media reporting on the pandemic (Bendau et al., 2021), rational argument seems to lose effectiveness due to an anxiety-induced hypersensitivity in recognizing, processing, and responding to threat-related information, even in the absence of actual threat and the presence of contradicting information (Bar-Haim, Lamy, Pergamin, Bakermans-Kranenburg, & van IJzendoorn, 2007).”
“The idea that we can plan and control our future is conceptually flawed, as any attempt to plan or control it will inevitably influence its outcome, often in unforeseen ways (Fuller, 2017).”
How bad data quality can turn a simulation into a dissimulation that shapes the future - December, 2021
- We show that an important simulation of SARS-CoV-2 growth rates, which shaped future policies, is in fact wrong.
- The simulation thereby becomes a dissimulation.
- The reason is that the wrong data were input into the model.
During the spread of SARS-CoV-2, Germany imposed various restrictions, including the closure of schools on March 16, 2020, and an extensive lockdown on March 23, 2020. In this paper, we show how the influential simulation of the purported beneficial effects of this lockdown in Germany was based on wrong data, but nevertheless played a decisive role in shaping the future by allegedly producing evidence for the effectiveness of these measures, lending scientific credibility to policies. We point out that the evaluation of the success of such policies depends critically on data quality. Using publicly reported confirmed cases for the calculation of time series statistics is apt to produce misleading results because these data come with unknown variable time lags. Using data on incident cases, i.e., dates of the onset of symptoms, produces results that are much more reliable. Using this method demonstrates that previous analyses stating that the mitigation strategies of the German government were necessary and effective are indeed flawed. This in turn shows that model simulations and dissimulations are very close neighbors.
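The effect of variable reporting lags can be sketched with a small deterministic toy example. It does not use the German data or the paper's method: the epidemic curve, the delay schedule, and all function names below are hypothetical. Cases are generated by symptom-onset date with a steady growth rate and then re-indexed by report date under an assumed delay that drops from 10 to 5 days partway through, for instance because testing speeds up.

```python
import math


def onset_curve(days=40, peak_day=15, growth=0.2, decline=0.1, start=100.0):
    """Hypothetical expected new cases per day, indexed by symptom-onset date."""
    peak = start * math.exp(growth * peak_day)
    curve = []
    for d in range(days):
        if d <= peak_day:
            curve.append(start * math.exp(growth * d))
        else:
            curve.append(peak * math.exp(-decline * (d - peak_day)))
    return curve


def reporting_delay(onset_day):
    """Assumed days from onset to report; the delay shortens from day 10 on."""
    return 10 if onset_day < 10 else 5


def by_report_date(onsets):
    """Re-index onset counts by the date on which they would be reported."""
    reported = [0.0] * (len(onsets) + 10)
    for onset_day, count in enumerate(onsets):
        reported[onset_day + reporting_delay(onset_day)] += count
    return reported


def max_daily_growth(series):
    """Largest day-over-day ratio among consecutive days that both have cases."""
    return max(b / a for a, b in zip(series, series[1:]) if a > 0 and b > 0)


onsets = onset_curve()
reported = by_report_date(onsets)

print("max daily growth factor, by onset date: ", round(max_daily_growth(onsets), 2))
print("max daily growth factor, by report date:", round(max_daily_growth(reported), 2))
print("peak day by onset date: ", onsets.index(max(onsets)))
print("peak day by report date:", reported.index(max(reported)))
```

By onset date, the daily growth factor never exceeds exp(0.2), roughly 1.22. By report date, the day on which the shorter delay takes effect shows a one-day jump of more than a factor of four. That jump comes entirely from the change in reporting delay, and the reported peak is offset from the onset peak by an amount that matches neither delay. Growth rates estimated from the report-date series would reflect the reporting process as much as the epidemic.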