# Statistical Modeling: The Two Cultures

DSA ADS Course, 2022

Statistical Analysis, Data Interpretation, Causal Analysis, Causality, Data Fusion, Missing Data, Counterfactuals

Discuss Leo Breiman's two cultures of statistical modeling in light of recent advances in machine learning and causal inference, and of the separation between the data-fitting and data-interpretation components of statistical modeling.

See also: Causally Colored Reflections on Leo Breiman’s “Statistical Modeling: The Two Cultures”

Abstract

There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.
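As a starting point for discussion, the contrast between the two cultures can be sketched in a few lines of Python. This is an illustrative toy, not an example from Breiman's paper. The simulated mechanism `y = sin(x) + noise`, the helper names (`simulate`, `fit_linear`, `fit_knn`, `mse`), and the choice of a straight line and k-nearest neighbors as stand-ins for "data model" and "algorithmic model" are all assumptions made here.

```python
import math
import random


def simulate(n, rng):
    """Draw (x, y) pairs from a hypothetical mechanism: y = sin(x) + Gaussian noise."""
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def fit_linear(xs, ys):
    """Data-modeling culture: assume y = a + b*x + error and estimate a, b by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_knn(xs, ys, k=5):
    """Algorithmic culture: no assumed mechanism; predict with the mean of the k nearest neighbors."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k

    return predict


def mse(predict, xs, ys):
    """Mean squared prediction error on held-out data."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)  # fixed seed so the run is reproducible
x_train, y_train = simulate(200, rng)
x_test, y_test = simulate(200, rng)

linear_mse = mse(fit_linear(x_train, y_train), x_test, y_test)
knn_mse = mse(fit_knn(x_train, y_train), x_test, y_test)

print(f"linear data model, test MSE:    {linear_mse:.3f}")
print(f"k-NN algorithmic model, test MSE: {knn_mse:.3f}")
```

The comparison is deliberately tilted: the assumed linear form is wrong for this simulated mechanism, so the algorithmic model predicts better on held-out data. Neither fit says anything about what would happen under an intervention on `x`, which is where the data-fitting versus data-interpretation distinction in the discussion prompt comes in.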