127 Corwin Hall

Much evidence in applied research is based on observational studies in which investigators assume there are no unobserved differences between the groups under comparison. Treatment effects are estimated after adjusting for observed confounders via statistical methods. However, even if the assumption of no unobserved confounding holds, bias from model misspecification may be substantial. Traditionally, regression models of various kinds have been used to adjust for confounders. Such models impose strong functional form assumptions and are therefore especially prone to misspecification. The causal inference literature has devoted considerable effort to developing more flexible adjustment methods; indeed, there has been an explosion in the number of methods available for adjusting for observed confounders. Investigators can now choose among many forms of matching, weighting, doubly robust estimators, and a variety of machine-learning-based estimators. The general trend has been toward flexible methods of estimation; in particular, most recent work has sought to combine machine learning with a doubly robust framework. While these methods have clear theoretical advantages, they see little use in the applied literature, and the development of guidelines for applied researchers has been limited.

In this presentation, I review key concepts related to functional form assumptions and how they can contribute to bias from model misspecification. I also review the logic behind why machine learning methods have been so widely proposed for estimation, along with the strengths and weaknesses of these methods. I then present two case studies in which I seek to recover experimental benchmarks using observational data, comparing the performance of a wide variety of methods for statistical adjustment. I find that several widely used methods are subject to bias from model misspecification. I also find that while machine learning methods are among the strongest performers, they are not always reliable.
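The core claim above — that regression adjustment with the wrong functional form can bias an effect estimate even with no unobserved confounding — can be seen in a short simulation. This is my own illustrative sketch (it assumes NumPy and is not taken from the talk): a single confounder affects both treatment and outcome through a quadratic term, so a linear adjustment is misspecified while a quadratic adjustment is correct.

```python
import numpy as np

# Illustrative sketch: one confounder x drives both treatment assignment and
# the outcome through x**2. There is no unobserved confounding, yet adjusting
# for x linearly misspecifies the model and biases the treatment estimate.
rng = np.random.default_rng(0)
n = 20_000
tau = 1.0                                   # true treatment effect

x = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-(x**2 - 1.0)))     # propensity depends on x**2
t = rng.binomial(1, p)
y = tau * t + x**2 + rng.normal(scale=0.5, size=n)

def treatment_coef(design, y):
    """OLS via least squares; return the coefficient on the treatment column."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

# Misspecified adjustment: linear in x only
tau_lin = treatment_coef(np.column_stack([np.ones(n), t, x]), y)
# Correctly specified adjustment: includes the quadratic term
tau_quad = treatment_coef(np.column_stack([np.ones(n), t, x, x**2]), y)

print(f"linear adjustment:    {tau_lin:.2f}")   # noticeably biased away from 1
print(f"quadratic adjustment: {tau_quad:.2f}")  # close to the true effect
```

The flexible estimators discussed in the talk (matching, weighting, doubly robust and machine-learning-based methods) can be thought of as ways to recover the role played here by the `x**2` term without the analyst having to specify it by hand.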