# The Causal Foundations of Structural Equation Modeling

June, 2021

Abstract

The role of causality in SEM research is widely perceived to be, on the one hand, of pivotal methodological importance and, on the other hand, confusing, enigmatic, and controversial. The confusion is vividly portrayed, for example, in the influential report of Wilkinson and Task Force (1999), on “Statistical Methods in Psychology Journals: Guidelines and Explanations.” In discussing SEM, the report starts with the usual warning: “Correlation does not prove causation,” but then ends with a startling conclusion: “The use of complicated causal-modeling software [read SEM] rarely yields any results that have any interpretation as causal effects.” The implication is that the entire enterprise of causal modeling, from Sewall Wright (1921) to Blalock (1964) and Duncan (1975), together with the entire literature in econometric research, including modern advances in graphical and nonparametric structural models, has been misguided, for researchers have been chasing parameters that have no causal interpretation.

The motives for such overstatements notwithstanding, readers may rightly ask: “If SEM methods do not ‘prove’ causation (a fact we all accept), how can they yield results that have a causal interpretation (a belief we all share in practice)?”

The answer is that a huge logical gap exists between “proving causation,” which requires careful manipulative experiments, and “interpreting parameters as causal effects,” which may be based on firm scientific knowledge or on previously conducted experiments, perhaps by other researchers. One can legitimately be in possession of a parameter that stands for a causal effect and still be unable, using statistical means alone, to determine the magnitude of that parameter given nonexperimental data. As a matter of fact, we know that no such statistical means exists; that is, causal effects in observational studies can only be substantiated from a combination of data and untested theoretical assumptions, not from the data alone.
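The last point can be made concrete with a small numerical sketch. The two linear models below are illustrative assumptions, not taken from the report or from any cited work: model A posits a direct effect of X on Y with independent errors, while model B posits no effect at all and attributes the association to correlated errors (for example, an unmeasured common cause). The coefficient value `b = 0.5` and the helper name `implied_covariance` are hypothetical choices made for this example.

```python
import numpy as np


def implied_covariance(B, Psi):
    """Covariance of x implied by the linear SEM x = B x + e with Cov(e) = Psi."""
    identity = np.eye(B.shape[0])
    total = np.linalg.inv(identity - B)  # maps the errors e to the variables x
    return total @ Psi @ total.T


b = 0.5  # assumed value, chosen only for illustration

# Model A: X -> Y with coefficient b, independent errors.
B_a = np.array([[0.0, 0.0],
                [b, 0.0]])
Psi_a = np.diag([1.0, 1.0 - b**2])

# Model B: no arrow from X to Y; the errors are correlated instead.
B_b = np.zeros((2, 2))
Psi_b = np.array([[1.0, b],
                  [b, 1.0]])

sigma_a = implied_covariance(B_a, Psi_a)
sigma_b = implied_covariance(B_b, Psi_b)

# The structural coefficient of X in the equation for Y is the causal effect.
effect_a = B_a[1, 0]
effect_b = B_b[1, 0]

print("same observed covariance:", np.allclose(sigma_a, sigma_b))
print("causal effect of X on Y, model A:", effect_a)
print("causal effect of X on Y, model B:", effect_b)
```

Both models imply the same covariance matrix for (X, Y), so with Gaussian errors no statistical procedure applied to nonexperimental data could tell them apart, yet they disagree on the causal effect. Choosing between them requires the kind of theoretical assumption the paragraph above describes.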