        - Cambridge 2025
Causal Machine Learning: Principled Approaches for Econometric Analysis
The Cambridge Summer School on Causal Inference and Machine Learning offers an opportunity to understand a principled approach to modern empirical work through a comprehensive two-part program.
The course begins with an introduction to causal inference using the potential outcomes framework, the conceptual cornerstone upon which modern causal machine learning techniques are built. Participants will first be introduced to traditional estimation techniques including regression adjustment, inverse probability weighting, and matching methods before advancing to more sophisticated approaches.
This foundational knowledge is critical, as many cutting-edge machine learning techniques for causal inference represent extensions or adaptations of these classical methods. For instance, doubly robust methods, which combine outcome modeling and propensity score approaches, form the theoretical basis for many modern double machine learning techniques that we explore later in the course.
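To make the doubly robust idea concrete, here is a minimal sketch of an augmented inverse probability weighting (AIPW) estimator of the average treatment effect. It is an illustration under stated assumptions, not course material: the simulated data, the linear outcome models, the hand-written logistic regression, and the names `fit_logit` and `aipw_ate` are all hypothetical choices made for this example.

```python
import numpy as np

def fit_logit(X, d, n_iter=25):
    """Logistic regression by Newton-Raphson; X must include a constant column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (d - p)                          # score of the log-likelihood
        hess = (X * (p * (1.0 - p))[:, None]).T @ X   # negative Hessian
        beta += np.linalg.solve(hess, grad)
    return beta

def aipw_ate(y, d, X):
    """Doubly robust (AIPW) estimate of the ATE with a plug-in standard error."""
    # Outcome models: separate OLS fits for treated and control units.
    b1 = np.linalg.lstsq(X[d == 1], y[d == 1], rcond=None)[0]
    b0 = np.linalg.lstsq(X[d == 0], y[d == 0], rcond=None)[0]
    mu1, mu0 = X @ b1, X @ b0
    # Propensity score model, clipped away from 0 and 1 for stability.
    e = 1.0 / (1.0 + np.exp(-X @ fit_logit(X, d)))
    e = np.clip(e, 0.01, 0.99)
    # Regression adjustment plus inverse-probability-weighted residual corrections.
    psi = mu1 - mu0 + d * (y - mu1) / e - (1 - d) * (y - mu0) / (1 - e)
    return psi.mean(), psi.std(ddof=1) / np.sqrt(len(y))

# Simulated data (assumption): two confounders, true ATE equal to 2.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=(n, 2))
X = np.column_stack([np.ones(n), x])
p_true = 1.0 / (1.0 + np.exp(-(0.5 * x[:, 0] + 0.5 * x[:, 1])))
d = rng.binomial(1, p_true).astype(float)
y = 2.0 * d + x[:, 0] + x[:, 1] + rng.normal(size=n)

ate_hat, se_hat = aipw_ate(y, d, X)
naive = y[d == 1].mean() - y[d == 0].mean()   # ignores confounding
print(f"AIPW: {ate_hat:.3f} (se {se_hat:.3f}), naive difference in means: {naive:.3f}")
```

The estimator remains consistent if either the outcome models or the propensity score model is correctly specified, which is the property that double machine learning builds on.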
Beyond Prediction: The Principled Approach to Causal ML
As Belloni, Chernozhukov, and Hansen (2014) note in their influential work, modern causal machine learning represents “a principled search for ‘true’ predictive power that guards against false discovery and overfitting, does not erroneously equate in-sample fit to out-of-sample predictive ability, and accurately accounts for using the same data to examine many different hypotheses or models.”
Unlike traditional machine learning’s focus on pure prediction, causal machine learning combines standard identification conditions with a range of estimation methods to establish reliable causal effects. The foundation of this principled methodology lies in sample-splitting techniques, which create deliberate barriers between the nonparametric methods used to select a model and the subsequent estimation and inference process.
This deliberate separation is not merely a technical detail but a fundamental shift in how we approach data analysis. By partitioning the training data appropriately, we prevent the negative effects of “data mining”, allowing us to leverage the flexibility of machine learning methods while maintaining the reliability of our causal estimates.
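The separation described above is often implemented as cross-fitting. The sketch below illustrates one version for a partially linear model: nuisance functions are fitted on some folds and used to residualise the held-out fold, and the treatment coefficient is then estimated from the residuals. Everything here is an assumption made for illustration, including the simulated data, the polynomial least-squares fit standing in for a machine learning learner, and the names `poly_features`, `cross_fit_residuals`, and `dml_plm`.

```python
import numpy as np

def poly_features(x, degree=3):
    """Stand-in flexible basis: a constant plus powers of each covariate."""
    cols = [np.ones(len(x))]
    for k in range(1, degree + 1):
        cols.append(x ** k)
    return np.column_stack(cols)

def cross_fit_residuals(target, x, folds):
    """Out-of-fold residuals: fit on the other folds, predict the held-out fold."""
    resid = np.empty_like(target)
    for test_idx in folds:
        train = np.ones(len(target), dtype=bool)
        train[test_idx] = False
        coef = np.linalg.lstsq(poly_features(x[train]), target[train], rcond=None)[0]
        resid[test_idx] = target[test_idx] - poly_features(x[test_idx]) @ coef
    return resid

def dml_plm(y, d, x, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(x) + u."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    y_res = cross_fit_residuals(y, x, folds)   # y minus fitted E[y | x]
    d_res = cross_fit_residuals(d, x, folds)   # d minus fitted E[d | x]
    theta = (d_res @ y_res) / (d_res @ d_res)
    eps = y_res - theta * d_res
    # Heteroskedasticity-robust standard error for the residual regression.
    se = np.sqrt(np.mean(d_res**2 * eps**2) / len(y)) / np.mean(d_res**2)
    return theta, se

# Simulated data (assumption): nonlinear confounding, true theta equal to 1.5.
rng = np.random.default_rng(42)
n = 4000
x = rng.normal(size=(n, 3))
d = np.sin(x[:, 0]) + 0.5 * x[:, 1] ** 2 + rng.normal(size=n)
y = 1.5 * d + x[:, 1] ** 2 + x[:, 2] + rng.normal(size=n)

theta_hat, se_hat = dml_plm(y, d, x)
print(f"theta_hat = {theta_hat:.3f} (se {se_hat:.3f})")
```

Because each observation's residual comes from a model that never saw that observation, overfitting in the nuisance step does not leak into the final estimate in the way it would with in-sample fits.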
Course Design: Points of Departure
Building on classical methods like difference-in-differences, local average treatment effect (LATE) estimation, and control function approaches, the summer school provides several key “points of departure” that serve as bridges between traditional econometrics and machine learning.
Throughout the course, we contrast Parametric versus Nonparametric Methods, examining the trade-offs and complementarities between traditional modeling approaches and the flexibility of data-driven alternatives.
We begin with a familiar object, the Conditional Expectation Function, the fundamental building block of regression analysis. We will examine how this concept connects traditional regression methods to more complex machine learning approaches.
As datasets grow in complexity, our exploration extends to High Dimensional Methods in Statistics, including techniques like lasso and double (debiased) lasso that enable causal effect estimation when facing numerous potential control variables, effectively extending the doubly robust methods introduced in the first part of the course. The classical Frisch-Waugh-Lovell Theorem takes on new importance in the causal ML context, facilitating unbiased model selection for treatment effect estimation with high-dimensional controls. Participants will appreciate how the partialling out of control effects becomes a difficult problem when combined with uncertainty over the dimension of controls and the attendant use of regularisation techniques.
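As a small numerical illustration of the Frisch-Waugh-Lovell Theorem in the classical, low-dimensional case, the sketch below checks that the coefficient on a treatment variable from a full OLS regression equals the coefficient from regressing residualised outcome on residualised treatment. The simulated data and the helper name `residualize` are assumptions made for this example; with high-dimensional controls the OLS residualisation step would be replaced by a regularised learner such as lasso.

```python
import numpy as np

# Simulated data (assumption): four controls plus a constant, true effect 0.8.
rng = np.random.default_rng(1)
n = 500
W = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
d = W @ np.array([0.0, 1.0, -0.5, 0.0, 0.3]) + rng.normal(size=n)
y = 0.8 * d + W @ np.array([1.0, 0.5, 0.5, -1.0, 0.0]) + rng.normal(size=n)

def residualize(v, W):
    """Residual from the OLS projection of v on the columns of W."""
    return v - W @ np.linalg.lstsq(W, v, rcond=None)[0]

# Route 1: full regression of y on d and all controls; keep the d coefficient.
beta_full = np.linalg.lstsq(np.column_stack([d, W]), y, rcond=None)[0][0]

# Route 2: partial the controls out of both y and d, then regress residual on residual.
d_res, y_res = residualize(d, W), residualize(y, W)
beta_fwl = (d_res @ y_res) / (d_res @ d_res)

print(beta_full, beta_fwl)  # identical up to floating-point error
```

The two routes agree exactly under OLS. Once the partialling-out step involves model selection or shrinkage, that equivalence no longer holds automatically, which is where the debiasing ideas in double lasso come in.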
Building on the potential outcomes framework, we explore how machine learning can enhance our understanding of Treatment Effects, both average and heterogeneous. Finally, we examine the architecture of machine learning methods such as random forests, neural networks, and other algorithms, developing intuition for how these sophisticated tools can be adapted for causal questions. Together, these interconnected topics provide a comprehensive foundation for understanding how machine learning can be rigorously applied to causal inference problems.
The Angrist Perspective: ML as Robustness Check
Our approach aligns with recent Nobel Prize winner Joshua Angrist’s perspective on machine learning in economics. As discussed in his paper “Machine Labor,” machine learning results can serve as valuable robustness checks for parametric benchmarks. Rather than treating ML as a replacement for traditional econometric methods, we view it as a complementary tool that can validate or challenge conventional approaches.
This perspective maintains the interpretability and theoretical grounding that economists value while incorporating the flexibility and data-adaptive nature of machine learning. The result is a more robust empirical toolkit that can tackle increasingly complex causal questions in economic research.
What Participants Will Gain
The summer school equips participants with both theoretical understanding and practical skills in causal inference and machine learning. Through a mix of lectures, discussions, and hands-on applications, participants will:
- Master the foundations of causal inference using the potential outcomes framework
- Understand how classical methods like doubly robust estimation connect to modern machine learning approaches
- Learn to implement double/debiased machine learning for treatment effect estimation
- Explore methods for identifying heterogeneous treatment effects using causal forests
- Apply these techniques to complex scenarios including staggered adoption designs and settings with heterogeneous treatment effects
- Gain practical experience with leading software implementations in Python and Stata
Whether you’re an economist looking to expand your methodological toolkit, a data scientist interested in causal questions, or a policymaker seeking to evaluate program impacts more effectively, this course provides essential knowledge for navigating the intersection of causality and prediction in the big data era.
Join us in Cambridge this summer to explore how principled approaches to causal machine learning can transform your empirical research.


Dr. Melvyn Weeks, University of Cambridge
Dr Melvyn Weeks is a senior lecturer and fellow of Clare College, Cambridge University. Dr Weeks is an assistant editor of the Journal of Applied Econometrics, as well as an associate at Cambridge Econometrics. His work has been published in The Economic Journal, Journal of the American Statistical Association, Journal of Applied Econometrics, European Economic Review, and Computational & Economics.
Some of the literature we will review
- L. Breiman. Statistical Modeling: The Two Cultures. Statistical Science, 16(3):199-215, 2001.
- S. Athey. The Impact of Machine Learning on Economics. In The Economics of Artificial Intelligence: An Agenda, National Bureau of Economic Research, 2018. See http://bit.ly/2EENtvy
- S. Athey, G. Imbens. Machine Learning Methods Economists Should Know About. Working Paper, Graduate School of Business, Stanford University, 2019.
- S. Mullainathan, J. Spiess. Machine Learning: An Applied Econometric Approach. Journal of Economic Perspectives, 31(2):87-106, 2017.
- A. Belloni, V. Chernozhukov, C. Hansen. High-dimensional methods and inference on structural and treatment effects. Journal of Economic Perspectives, 28(2):29-50, 2014(a).