Biostatistics Department Seminar: Statistical Challenges in the Selection of Causal Graphical Models

Monday, April 29, 2024, 12:05 p.m. - 12:50 p.m. ET
Wolfe Street Building/W2008
Past Event

Biostatistics Department Seminar 

Title: Statistical Challenges in the Selection of Causal Graphical Models (“causal discovery”) and Post-Selection Estimation of Causal Effects

Abstract: Causal graphical models (e.g., DAGs) are used in many scientific domains to represent important causal assumptions about the processes that underlie collected data. The focus of this work is on graphical causal discovery, i.e., the data-driven model selection of graphs, for the “downstream” purpose of using the estimated graphs for subsequent causal inference tasks such as establishing the identifying formula for some causal effect of interest and then estimating it. An obstacle to having confidence in existing causal discovery algorithms in public health applications is that these algorithms tend to estimate structures that are overly sparse, i.e., missing too many edges. However, statistical “caution” (or “conservativism”) would err on the side of denser graphs rather than sparser graphs. We propose to reformulate the conditional independence hypothesis tests of classical constraint-based algorithms as equivalence tests: test the null hypothesis of association greater than some (user-chosen, sample-size dependent) threshold, rather than test the null of no association. We argue this addresses several important statistical issues in applied causal model selection and leads to procedures with desirable behaviors and properties. Time permitting, we will also discuss recent work on addressing a related issue: the problem of valid “post-selection” inference, i.e., constructing valid confidence intervals for causal effects that account for the model selection process.
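
To make the equivalence-test idea in the abstract concrete, here is a minimal sketch in Python. It is not the speaker's procedure. It assumes roughly Gaussian data, takes a sample partial correlation r as given, and uses the standard Fisher z approximation with two one-sided tests. The function names, the fixed threshold delta, and the significance level alpha are all hypothetical choices made for illustration. In particular, the abstract describes a sample-size dependent threshold, which this sketch leaves to the caller.

```python
import math
from statistics import NormalDist


def equivalence_test_pvalue(r, n, k, delta):
    """P-value for H0: |rho| >= delta against H1: |rho| < delta.

    r     : sample partial correlation of X and Y given a conditioning set
    n     : sample size
    k     : number of variables in the conditioning set
    delta : association threshold (hypothetical, chosen by the user)

    Assumption: the Fisher z-transform of r is approximately normal
    with standard error 1 / sqrt(n - k - 3).
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n - k - 3 <= 0:
        raise ValueError("need n > k + 3 for the Fisher z approximation")

    se = 1.0 / math.sqrt(n - k - 3)
    z = math.atanh(r)
    z_delta = math.atanh(delta)
    std_normal = NormalDist()

    # One-sided test of H0: rho >= delta (small p when r is well below delta).
    p_upper = std_normal.cdf((z - z_delta) / se)
    # One-sided test of H0: rho <= -delta (small p when r is well above -delta).
    p_lower = 1.0 - std_normal.cdf((z + z_delta) / se)

    # Both one-sided nulls must be rejected, so report the larger p-value.
    return max(p_upper, p_lower)


def keep_edge(r, n, k, delta, alpha=0.05):
    """Keep the edge unless the data support |rho| < delta at level alpha."""
    return equivalence_test_pvalue(r, n, k, delta) >= alpha
```

Under these assumptions, keep_edge(0.0, 20, 0, 0.2) keeps the edge because 20 observations cannot rule out an association of size 0.2, while keep_edge(0.0, 1000, 0, 0.2) removes it. A classical test of the null of no association would drop the edge in both cases, which is the overly sparse behavior the abstract describes.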


Daniel Malinsky, PhD

Daniel Malinsky, PhD, is an assistant professor of Biostatistics in the Mailman School of Public Health at Columbia University. His research focuses on causal inference: developing statistical methods and machine learning tools to support inference about the consequences of, e.g., medical decisions, environmental and social exposures, and policies.

Zoom Registration

If you would like to join via Zoom, please register here.

2023-2024 Monday Seminar Series

All seminars are held at 12:05 PM via Zoom and onsite in Room W2008. View all seminar information here.