Causal Inference in Statistics: A Primer Paperback – 19 February 2016
From the Back Cover
CAUSAL INFERENCE IN STATISTICS
Causality is central to the understanding and use of data. Without an understanding of cause-effect relationships, we cannot use data to answer questions as basic as "Does this treatment harm or help patients?" But though hundreds of introductory texts are available on statistical methods of data analysis, until now, no beginner-level book has been written about the exploding arsenal of methods that can tease causal information from data.
Causal Inference in Statistics fills that gap. Using simple examples and plain language, the book lays out how to define causal parameters; the assumptions necessary to estimate causal parameters in a variety of situations; how to express those assumptions mathematically; whether those assumptions have testable implications; how to predict the effects of interventions; and how to reason counterfactually. These are the foundational tools that any student of statistics needs to acquire in order to use statistical methods to answer causal questions of interest.
This book is accessible to anyone with an interest in interpreting data, from undergraduates, professors, and researchers to the interested layperson. Examples are drawn from a wide variety of fields, including medicine, public policy, and law; a brief introduction to probability and statistics is provided for the uninitiated; and each chapter comes with study questions to reinforce the reader's understanding.
Most helpful customer reviews on Amazon.com
I would recommend this book to anyone who has at least a working knowledge of statistics. I would consider this book for an upper-level undergrad course, and certainly as one of the books for a graduate course on the topic. If Professor Pearl’s lectures are anything like this book, I would enjoy sitting in on any lecture he gives.
The book starts off by challenging the reader with the intriguing proposition that data, by themselves, lack sufficient information to permit proper causal analysis. What is required for sensible evaluations of data are causal hypotheses. Clever, simple examples are used to show that if we make the wrong scientific assumptions about how a system works, we can derive very incorrect conclusions from our data. This illustration sets the stage for the rest of the book by leaving us wondering, “What additional set of rules would we need in order to draw causal inferences from data?”
Chapter 1 does a rather brilliant job of providing the minimum essential set of background information for the task at hand. Basic concepts such as conditional probability and conditional independence are defined, along with essential quantities and relationships, to set the stage for later computations. After a few pages, the book then departs from conventional treatments by presenting the elements of graph theory as an equally important set of background information. Graphs, specifically probabilistic causal networks, represent one of the key pieces that has been missing from the field of statistics, but that is absolutely essential for representing and evaluating causal hypotheses for analysis. The reader simply needs to get to page 24 to begin to encounter the unique information in this highly readable treatment. As Chapter 1 continues, probability theory and graph theory are married through the combination of the “Structural Causal Model”, which specifies the variables and connecting functions, and the “Graphical Causal Model”, which summarizes the causal logic of network relations.
In Chapter 2, in only a few pages, the book presents the core “rules” that establish much of the logic for causal analysis within a graph-theoretic framework. Again, the book is truly outstanding in its capacity to distill fundamental ideas to their basics and clearly illustrate them with examples. Following this treatment, Chapter 3 then begins to move the reader into a thorough consideration of the interventionist perspective. In essence, causal modeling asks questions about the outcomes of interventions – “What would happen to Y if we were to change X?” Of course, sometimes we have information from manipulative experiments, but the greater challenge is to address this question using observational data and causal rules. In this chapter, the rules of engagement are presented. We encounter new mathematical concepts, like the “do” operator and formulae for adjusting for covariates and calculating causal effects. We also encounter rules like the backdoor and front-door criteria. A central feature of causal networks, mediation, is presented and described. Chapter 3 ends by transitioning from general rules that apply to models of all forms to illustrations obtained through reference to linear Gaussian systems. This final set of examples connects the graph-theoretic perspective with more traditional formulations and examples of structural equations. Here more direct comparisons between, for example, regression coefficients and structural coefficients are made. An elegant and crystal-clear introduction to instrumental variables ends the chapter and, in the process, links the new material presented in this book with yet another historical body of causal modeling literature. It is impressive to see this accomplished in such a compact fashion.
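For readers who want a feel for what "adjusting for covariates" means in practice, here is a minimal sketch of the backdoor adjustment formula, P(Y=1 | do(X=x)) = sum over z of P(Y=1 | X=x, Z=z) P(Z=z). It is not code from the book. The counts, the variable names, and the functions `naive` and `backdoor_adjust` are illustrative assumptions, and the sketch assumes that Z is the only confounder of X and Y.

```python
# Illustrative counts (an assumption of this sketch):
# (z, x) -> (number recovered, group size)
# z = confounder stratum, x = 1 for treated, 0 for untreated
COUNTS = {
    (0, 1): (81, 87),
    (0, 0): (234, 270),
    (1, 1): (192, 263),
    (1, 0): (55, 80),
}


def naive(x):
    """P(Y=1 | X=x): pool the strata and ignore the confounder Z."""
    recovered = sum(r for (_, xx), (r, _) in COUNTS.items() if xx == x)
    size = sum(n for (_, xx), (_, n) in COUNTS.items() if xx == x)
    return recovered / size


def backdoor_adjust(x):
    """P(Y=1 | do(X=x)) = sum_z P(Y=1 | X=x, Z=z) * P(Z=z)."""
    total = sum(n for (_, n) in COUNTS.values())
    result = 0.0
    for z in sorted({z for (z, _) in COUNTS}):
        # Marginal weight of stratum z over the whole population
        p_z = sum(n for (zz, _), (_, n) in COUNTS.items() if zz == z) / total
        recovered, size = COUNTS[(z, x)]
        result += (recovered / size) * p_z
    return result


naive_effect = naive(1) - naive(0)
adjusted_effect = backdoor_adjust(1) - backdoor_adjust(0)
print(f"naive difference:    {naive_effect:+.3f}")
print(f"adjusted difference: {adjusted_effect:+.3f}")
```

With these particular counts the pooled comparison and the adjusted comparison point in opposite directions, which is exactly the kind of reversal that makes the choice of causal assumptions matter.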
Chapter 4 turns to a topic that will be unfamiliar to many as a formal subject – counterfactuals. Simply put, counterfactuals are questions about, “What would have happened to individual i if they had not been exposed to treatment X=1 (if they had not received the drug treatment)?” This seemingly innocent question, as the chapter goes on to reveal, unlocks much additional power derived from the causal modeling system presented in the book. To begin with, counterfactuals lead us necessarily from the population to the individual level, since these are questions about what would have happened to an individual if a different choice or event had happened in the past. Considering the individual level, we begin to realize that all along we have had unique information about individuals that has been ignored via summarization. With counterfactuals, ignoring is no longer appropriate. At the outset, the reader will assume perhaps that the counterfactual question is an impossible question to answer, even with randomized experiments. If individual X(1) is included in the treatment group that received a placebo, how are we to know what might have happened if they had actually received the drug? Surprisingly, a general solution to this problem is offered using the logic of the Structural Causal Model and the fundamental law of counterfactuals. Following a series of illustrations developed for a variety of situations, the chapter ends with a summary of essential information in the form of a computational toolkit for causal analysis. Clearly, this book goes beyond an exposition of ideas to provide the reader with a functional knowledge of causal analysis principles.
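To illustrate how an individual-level counterfactual can be computed at all, here is a minimal sketch of the standard three-step recipe (abduction, action, prediction) in a toy linear structural model Y = b*X + U_Y. It is not code from the book. The coefficient `B`, the function `counterfactual_y`, and the observed values are hypothetical, and the sketch assumes the structural equation is known exactly.

```python
# Toy structural equation (an assumption of this sketch): Y = B * X + U_Y
B = 0.5  # hypothetical structural coefficient


def counterfactual_y(x_obs, y_obs, x_new):
    """Value Y would have taken for this individual had X been x_new."""
    # Abduction: recover the individual's unobserved term from the evidence
    u_y = y_obs - B * x_obs
    # Action: replace the observed X with the hypothetical value x_new
    # Prediction: re-evaluate the equation, holding u_y fixed
    return B * x_new + u_y


# An individual observed with X = 1 (treated) and Y = 2.0
print(counterfactual_y(x_obs=1.0, y_obs=2.0, x_new=0.0))
```

The key design point is that the individual's own unobserved term is held fixed between the factual and hypothetical worlds, which is what distinguishes this from a population-level prediction.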
Throughout, this lucid and concise book explains concepts through the presentation of multiple, simple examples – a strategy that works exceptionally well, making this the most accessible presentation of this material I have read. The reader will be well rewarded for buying and reading this book and I recommend it with enthusiasm for both practicing scientists and students of statistics.
The sections of the book which deal with graphical solutions to causal inference problems are particularly well written, as one would expect since Pearl was a pioneer in this approach.
However, as someone who makes use of causal inference theory on a daily basis, I was hoping for something that would add a theoretical grounding to my applied experience but found almost nothing applicable in this text. There is no discussion of techniques like Coarsened Exact Matching or Propensity Score Weighting and, a fortiori, no practical guidance on how to use these techniques. Dr. Pearl does mention that his approach is compatible with the alternative theoretical work by Rosenbaum, but he doesn't demonstrate how this is true. Those coming from a tradition which starts with Rubin and Rosenbaum's The Central Role of the Propensity Score will find in this text a completely different approach.
The book is certainly engaging; however, I wish I were theoretically sophisticated enough to connect this approach with more traditional approaches to causal inference. My intuition is that most people doing applied work in this field will come to a similar conclusion.