
Biostatistics Dept Seminar: Proximal Causal Inference With Text Data

Department and Center Event
Monday, November 11, 2024, 12:05 p.m. - 1:00 p.m. ET
Location: Wolfe Street Building/W3030 (Hybrid)

For more information, visit the event page:
https://publichealth.jhu.edu/node/308611.

Johns Hopkins Bloomberg School of Public Health

Biostatistics Department Seminar 

Title: Proximal Causal Inference With Text Data

Abstract: Recent text-based causal methods attempt to mitigate confounding bias by estimating proxies of confounding variables that are partially or imperfectly measured from unstructured text data. These approaches, however, assume analysts have supervised labels of the confounders given text for a subset of instances, a constraint that is sometimes infeasible due to data privacy or annotation costs. In this work, we address settings in which an important confounding variable is completely unobserved. We propose a new causal inference method that uses multiple instances of pre-treatment text data, infers two proxies from two zero-shot models (e.g., large language models) on the separate instances, and applies these proxies in the proximal g-formula. We prove that our text-based proxy method satisfies identification conditions required by the proximal g-formula while other seemingly reasonable proposals do not. We evaluate our method in synthetic and semi-synthetic settings and find that it produces estimates with low bias. To address untestable assumptions associated with the proximal g-formula, we further propose an odds ratio falsification heuristic. This new combination of proximal causal inference and zero-shot classifiers expands the set of text-specific causal methods available to practitioners.
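For readers unfamiliar with the proximal g-formula mentioned in the abstract, the following is a minimal sketch of its discrete form, not the speaker's implementation. It assumes a toy model with a binary unobserved confounder U and two binary proxies Z and W, which stand in for the outputs of the two zero-shot models. All probabilities, variable names, and helper functions are hypothetical. For each treatment level a, the sketch solves E[Y | A=a, Z=z] = sum_w h(w, a) P(W=w | A=a, Z=z) for the bridge function h, then averages h(W, a) over the marginal distribution of W.

```python
import numpy as np

# Hypothetical toy model; every number below is an illustrative assumption.
p_u = np.array([0.6, 0.4])      # P(U=u): the unobserved confounder
p_a1_u = np.array([0.3, 0.7])   # P(A=1 | U=u): treatment
p_z1_u = np.array([0.3, 0.7])   # P(Z=1 | U=u): proxy inferred from text instance 1
p_w1_u = np.array([0.2, 0.8])   # P(W=1 | U=u): proxy inferred from text instance 2


def ey_au(a, u):
    """E[Y | A=a, U=u]; the true average treatment effect is 0.3."""
    return 0.1 + 0.3 * a + 0.4 * u


def bern(p1, v):
    """Probability that a Bernoulli(p1) variable equals v."""
    return p1 if v == 1 else 1.0 - p1


# Full joint P(U, A, Z, W). The proxies depend only on U, so
# W is independent of (A, Z) given U, and Z is independent of Y given (A, U).
joint = np.zeros((2, 2, 2, 2))
for u in range(2):
    for a in range(2):
        for z in range(2):
            for w in range(2):
                joint[u, a, z, w] = (
                    p_u[u] * bern(p_a1_u[u], a)
                    * bern(p_z1_u[u], z) * bern(p_w1_u[u], w)
                )

# Quantities an analyst could estimate from observed data (U is summed out).
p_azw = joint.sum(axis=0)                    # P(A, Z, W)
p_w = p_azw.sum(axis=(0, 1))                 # P(W)
p_uaz = joint.sum(axis=3)                    # P(U, A, Z), used only to build E[Y | A, Z]
ey_u = np.array([[ey_au(a, u) for u in range(2)] for a in range(2)])  # indexed [a, u]
ey_az = np.einsum("au,uaz->az", ey_u, p_uaz) / p_uaz.sum(axis=0)      # E[Y | A, Z]


def proximal_mean(a):
    """E[Y^a] via the discrete proximal g-formula."""
    # Rows index z, columns index w: P(W=w | A=a, Z=z).
    p_w_given_az = p_azw[a] / p_azw[a].sum(axis=1, keepdims=True)
    # Bridge function h(., a) solves p_w_given_az @ h = E[Y | A=a, Z=.].
    h = np.linalg.solve(p_w_given_az, ey_az[a])
    return h @ p_w


ate = proximal_mean(1) - proximal_mean(0)

# Naive contrast E[Y | A=1] - E[Y | A=0], which ignores confounding by U.
p_ua = p_uaz.sum(axis=2)
ey_a = np.einsum("au,ua->a", ey_u, p_ua) / p_ua.sum(axis=0)
naive = ey_a[1] - ey_a[0]

print("proximal ATE:", ate)
print("naive contrast:", naive)
```

In this toy setting the proximal estimate recovers the true effect of 0.3 up to floating-point error, while the naive contrast is biased upward. The sketch works from exact population probabilities; with real data, the conditional probabilities would be estimated and the proxies would come from model predictions on separate text instances, as the abstract describes.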

Speaker: Rohit Bhattacharya

Rohit Bhattacharya is an assistant professor in the Department of Computer Science at Williams College.

Zoom Registration

If you would like to join via Zoom, please register here.

2024-2025 Monday Seminar Series

All seminars are held at 12:05 p.m., both onsite and via Zoom. View all seminar information here.