Date of Award
Quantitative Research Methods
Nicholas J. Cutforth
Deep neural networks, Logistic regression, Multilevel modeling, Observational data, Propensity scores, Treatment effects
This study used a propensity score approach to estimate treatment effects in a multilevel setting. The approach has two stages: estimating propensity scores for covariate balancing, and then estimating treatment effects. The study aimed to understand how propensity scores estimated through a simple logistic regression compare with propensity scores estimated through an optimized deep neural network model. It also examined how treatment effects estimated with propensity score weights from logistic regression compare with treatment effects estimated with propensity score weights from deep neural networks.
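The two stages described above can be illustrated with a minimal sketch. It is not the study's implementation: it assumes simulated single-level data with one covariate, a hand-rolled logistic regression fitted by gradient ascent, and inverse-probability-of-treatment weights (the abstract does not state which weighting scheme was used). All function and variable names are hypothetical, and the multilevel structure is ignored.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, t, lr=1.0, epochs=200):
    """Fit P(T=1 | X) by batch gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)  # intercept followed by one coefficient per covariate
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, ti in zip(X, t):
            err = ti - sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))
            grad[0] += err
            for j, x in enumerate(xi):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity_scores(X, beta):
    # Estimated probability of treatment for each unit.
    return [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi))) for xi in X]

def ipw_weights(t, e):
    # Assumed weighting scheme: 1/e for treated units, 1/(1-e) for untreated units.
    return [1.0 / ei if ti == 1 else 1.0 / (1.0 - ei) for ti, ei in zip(t, e)]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def mean_difference(y, t, w):
    # Weighted mean outcome of treated units minus that of untreated units.
    treated = [(yi, wi) for yi, ti, wi in zip(y, t, w) if ti == 1]
    control = [(yi, wi) for yi, ti, wi in zip(y, t, w) if ti == 0]
    return (weighted_mean(*zip(*treated)) - weighted_mean(*zip(*control)))

# Simulated data (an assumption for illustration): the covariate drives both
# treatment assignment and the outcome, and the true treatment effect is 2.0.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0)] for _ in range(1000)]
t = [1 if rng.random() < sigmoid(0.8 * x[0]) else 0 for x in X]
y = [2.0 * ti + 1.5 * x[0] + rng.gauss(0.0, 0.5) for ti, x in zip(t, X)]

beta_hat = fit_logistic(X, t)
e_hat = propensity_scores(X, beta_hat)
w_hat = ipw_weights(t, e_hat)

ate_naive = mean_difference(y, t, [1.0] * len(y))  # no weighting
ate_ipw = mean_difference(y, t, w_hat)             # propensity score weighting
print(f"unweighted: {ate_naive:.2f}  weighted: {ate_ipw:.2f}")
```

In this simulation the unweighted difference is confounded by the covariate, while the weighted difference should fall closer to the simulated effect. A deep neural network would replace only `fit_logistic` and `propensity_scores`; the weighting and effect estimation steps would stay the same.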
Few causal studies have been conducted in a multilevel setting with observational data, and very few have used deep neural networks to estimate propensity scores. This study sheds light on how to estimate causal effects in a multilevel setting in the absence of randomized experiments. Using deep neural networks to estimate propensity scores appears to have some advantages over a simple logistic regression. Deep neural networks are better than logistic regression at capturing non-linearity and complex relationships in the data. Moreover, deep neural networks can be optimized to detect interactions and relationships in the data automatically, eliminating the tedious task of manually respecifying propensity score models when covariate balance is not achieved.
This study used the Education Longitudinal Study of 2002 (ELS:2002) dataset, in which participants were selected through a stratified two-stage sampling design. The analysis sample consisted of 10,080 students from a cohort of high school sophomores (10th grade) who were followed through college and into adult careers. Of the 10,080 students, 48% were male and 52% were female; 35% were in the non-treatment group and 65% were in the treatment group.
The treatment variable in this study was parental involvement, and the outcome variable was “Standardized test composite score-math/reading”, a measure of student achievement. A total of 200 covariates (150 student-level and 50 school-level) were used to create ten datasets for this study.
The study's findings revealed that propensity scores estimated through logistic regression achieved better covariate balance than propensity scores estimated through deep neural networks, whereas propensity scores estimated through deep neural networks achieved better overlap (common support). Treatment effects estimated with propensity score weights from deep neural networks were positive, significantly higher, and appeared more reasonable than treatment effects estimated with propensity score weights from logistic regression. Treatment effects estimated with propensity score weights from logistic regression were mostly negative and similar to treatment effects estimated without propensity score weighting. Treatment effects estimated with propensity score weights from deep neural networks differed statistically significantly from treatment effects estimated without propensity score weighting.
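Covariate balance and overlap, the two diagnostics compared above, can be computed in several ways, and the abstract does not say which were used. The sketch below assumes two common choices: the weighted standardized mean difference for balance and the shared range of propensity scores across groups for common support. The function names and the toy values are hypothetical.

```python
import math

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def weighted_variance(values, weights):
    # Population-style weighted variance around the weighted mean.
    m = weighted_mean(values, weights)
    return sum(w * (v - m) ** 2 for v, w in zip(values, weights)) / sum(weights)

def standardized_mean_difference(x, t, w=None):
    """Treated-minus-control difference in means, in pooled standard deviation units."""
    if w is None:
        w = [1.0] * len(x)
    x1 = [xi for xi, ti in zip(x, t) if ti == 1]
    w1 = [wi for wi, ti in zip(w, t) if ti == 1]
    x0 = [xi for xi, ti in zip(x, t) if ti == 0]
    w0 = [wi for wi, ti in zip(w, t) if ti == 0]
    pooled_sd = math.sqrt((weighted_variance(x1, w1) + weighted_variance(x0, w0)) / 2.0)
    return (weighted_mean(x1, w1) - weighted_mean(x0, w0)) / pooled_sd

def common_support(e, t):
    """Range of propensity scores observed in both the treated and control groups."""
    e1 = [ei for ei, ti in zip(e, t) if ti == 1]
    e0 = [ei for ei, ti in zip(e, t) if ti == 0]
    return max(min(e1), min(e0)), min(max(e1), max(e0))

# Toy values for illustration only.
covariate = [1.0, 2.0, 3.0, 4.0]
treatment = [1, 1, 0, 0]
scores = [0.2, 0.9, 0.1, 0.6]

smd = standardized_mean_difference(covariate, treatment)
support = common_support(scores, treatment)
print(smd, support)
```

A standardized mean difference near zero after weighting indicates that the covariate is balanced across groups, and a wider common support range means more units have comparable counterparts in the other group.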
Copyright is held by the author. User is responsible for all copyright compliance.
Nfonsang, Neba, "Estimation of Treatment Effects with Multilevel Observational Data Using Deep Neural Networks, Logistic Regression, and Multilevel Modeling: A Propensity Score Approach" (2022). Electronic Theses and Dissertations. 2070.
Received from ProQuest
Educational tests and measurements, Education