- 2 Days
- In Person: Sidney Sussex College, University of Cambridge
- Stata, Python
Overview
This two-day interactive course on Causal Inference and Machine Learning offers a rigorous and comprehensive exploration of when, why, and how machine learning methods enhance causal inference. The program equips participants with both the technical foundations and the critical perspective needed to navigate the rapidly evolving landscape where econometric rigor meets algorithmic flexibility.
The course focuses on topics at the intersection of machine learning and econometrics, combining theory and applications. It provides a conceptual framework and the practical skills needed to make informed methodological choices, with an understanding of both the power and the limitations of machine learning in causal contexts.
"There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown."
Breiman [2001], p. 199.
Ideal for economists, industry professionals, data scientists and policymakers, the course provides essential knowledge for navigating the intersection of causality and prediction in the modern data era. Participants will gain a nuanced understanding of how principled approaches to causal machine learning can enhance empirical research when applied thoughtfully and rigorously.
Course Aims & Objectives
- Review the fundamental differences between machine learning and econometrics
- Introduce a range of 'regularisation' methods that seek to manage the problems that arise with high-dimensional data, where there are many candidate covariates relative to the number of observations
- Introduce Causal and Generalised Random Forests as a means of estimating heterogeneous treatment effects
- Gain the technical foundations and the critical perspective needed to navigate the rapidly evolving landscape where econometric rigor meets algorithmic flexibility
- Develop a sophisticated judgement about when machine learning methods enhance causal inference and when traditional econometric approaches may be preferable
- Explore average and heterogeneous treatment effects
- Examine the architecture of machine learning methods such as random forests, neural networks, and other algorithms
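As a rough illustration of the regularisation idea in the objectives above, the following sketch fits a lasso by cyclic coordinate descent on simulated data with more covariates than observations. It is not taken from the course materials: the function names `soft_threshold` and `lasso_cd`, the simulated design and the penalty level `lam=0.2` are all assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values with |z| <= lam become exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_sweeps=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_scale = (X ** 2).sum(axis=0) / n   # per-column mean square
    resid = y.astype(float).copy()         # residual y - X @ beta, with beta = 0
    for _ in range(n_sweeps):
        for j in range(p):
            resid += X[:, j] * beta[j]     # remove coordinate j from the fit
            rho = X[:, j] @ resid / n      # correlation with the partial residual
            beta[j] = soft_threshold(rho, lam) / col_scale[j]
            resid -= X[:, j] * beta[j]     # add the updated coordinate back
    return beta

# Hypothetical simulated data: 200 covariates, 100 observations, 3 true signals.
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.5 * rng.standard_normal(n)

beta_hat = lasso_cd(X, y, lam=0.2)
print("non-zero coefficients:", np.count_nonzero(beta_hat))
print("first three estimates:", np.round(beta_hat[:3], 2))
```

Ordinary least squares has no unique solution here because there are more covariates than observations. The L1 penalty sets most coefficients exactly to zero, at the cost of shrinking the retained ones towards zero.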
What Participants Will Gain
The workshop equips participants with both theoretical understanding and practical skills in causal inference and machine learning. Through a mix of lectures, discussions, and hands-on applications, participants will:
- Master the foundations of causal inference using the potential outcomes framework
- Understand how classical methods like doubly robust estimation connect to modern machine learning approaches
- Develop critical judgment about when machine learning methods enhance causal inference and when traditional approaches may be preferable
- Learn the mechanics and implicit assumptions of double/debiased machine learning for treatment effect estimation
- Explore methods for identifying heterogeneous treatment effects using causal forests while understanding their limitations
- Understand the learner selection problem and how to match methods to empirical contexts
- Recognise the role of sparsity, sample splitting, and orthogonalisation in preventing contamination
- Apply these techniques to complex scenarios while understanding the tradeoffs involved
- Gain practical experience with leading software implementations in Python and Stata
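To make the roles of sample splitting and orthogonalisation more concrete, here is a minimal sketch of a cross-fitted, partialling-out estimator in the spirit of double/debiased machine learning for a partially linear model. It is an illustration under stated assumptions, not the course's implementation: ridge regression stands in for whatever learner one would actually choose, and the names `ridge_fit_predict` and `dml_partialling_out`, the simulated data and the true effect of 0.5 are hypothetical.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Fit ridge regression (with intercept) on one fold and predict on another."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    # Closed-form ridge solution: (X'X + alpha * I)^{-1} X'y on centred data.
    beta = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(Xc.shape[1]),
                           Xc.T @ (y_train - y_mean))
    return y_mean + (X_test - x_mean) @ beta

def dml_partialling_out(y, d, X, n_folds=2, alpha=1.0, seed=0):
    """Cross-fitted residual-on-residual estimate of the effect of d on y."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Sample splitting: nuisance functions are fitted on the other folds only.
        y_res[test] = y[test] - ridge_fit_predict(X[train], y[train], X[test], alpha)
        d_res[test] = d[test] - ridge_fit_predict(X[train], d[train], X[test], alpha)
    # Orthogonalisation: regress outcome residuals on treatment residuals.
    theta = (d_res @ y_res) / (d_res @ d_res)
    eps = y_res - theta * d_res
    se = np.sqrt(np.sum(d_res ** 2 * eps ** 2)) / (d_res @ d_res)  # robust SE
    return theta, se

# Hypothetical simulated data with a true treatment effect of 0.5.
rng = np.random.default_rng(0)
n, p = 2000, 20
X = rng.standard_normal((n, p))
d = X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(n)
y = 0.5 * d + X[:, 0] - X[:, 2] + rng.standard_normal(n)

theta_hat, se_hat = dml_partialling_out(y, d, X)
print(f"estimate: {theta_hat:.3f}, standard error: {se_hat:.3f}")
```

Fitting the nuisance predictions on held-out folds keeps overfitting in the learner from leaking into the residuals. Using residuals for both outcome and treatment makes the final estimate less sensitive to small errors in either prediction.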
Agenda
Over the two days, material will be taken from the following list of sessions:
- Session 1: Introduction
- Session 2: The Best Predictor and The Conditional Expectation Function
- Session 3: Estimation and Inference for Causal Effects
- Session 4: High Dimensional Methods for Linear Models
- Session 5: Applications of Regularised Regression for Linear Models
- Session 6: Double Machine Learning
- Session 7: Treatment Effects and Double Robust Estimators
- Session 8: Random Forests
- Session 9: The Architecture of Causal Trees and Generalised Random Forests
- Session 10: Generalised Causal Forests
- Session 10b: Testing for Heterogeneity
- Session 11: An Introduction to Generative AI and Large Language Models (time permitting)
Recommended Reading
- L. Breiman. Statistical Modeling: The Two Cultures. Statistical Science, Vol. 16, No. 3, 2001, pp. 199-215.
- S. Athey. The Impact of Machine Learning on Economics. In The Economics of Artificial Intelligence: An Agenda, 2018. National Bureau of Economic Research. See http://bit.ly/2EENtvy
- S. Athey, G. Imbens. Machine Learning Methods Economists Should Know About. Working Paper, 2019, Graduate School of Business, Stanford University.
- S. Mullainathan, J. Spiess. Machine Learning: An Applied Econometric Approach. Journal of Economic Perspectives, Vol. 31, 2017, pp. 87-106.
- A. Belloni, V. Chernozhukov, C. Hansen. High-dimensional methods and inference on structural and treatment effects. Journal of Economic Perspectives, Vol. 28, No. 2, 2014, pp. 29-50.