The Hidden Costs of Complexity: Using Causal Inference and Double Machine Learning to Uncover Important Relationships in Higher Education Data Sets
Author
Akbarsharifi, Melika
Issue Date
2024
Keywords
Causal Inference
Curricular Complexity
Double Machine Learning
Generalized Propensity Score
Hierarchical Linear Models
Propensity Score Matching
Advisor
Heileman, Gregory
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
Graduation rates are a critical performance metric for higher education institutions, reflecting both student success and the effectiveness of educational programs and policies. Among various influencing factors, curricular complexity has emerged as a significant determinant. This study rigorously estimates the causal effect of curricular complexity on four-year graduation rates across 26 universities in the United States. To achieve this, we employ a multifaceted methodological framework integrating advanced causal inference techniques. We calculate the Generalized Propensity Score (GPS) to adjust for confounding variables and predict the treatment variable using Hierarchical Linear Modeling (HLM), accounting for the nested data structure (students within universities). The data are stratified into quintiles based on GPS values to ensure balanced comparison groups. Within each quintile, Double Machine Learning (DML) is used to estimate the causal effect of curricular complexity on four-year graduation rates, leveraging logistic regression for the binary outcome variable (four-year graduation) and linear regression for the continuous treatment variable (curricular complexity). Additionally, we construct a causal network using the PC Algorithm, refined by domain experts for plausibility and relevance. The Bayesian Information Criterion (BIC) score is used to select the optimal adjusted network. Sensitivity analysis assesses the robustness of our findings against potential unmeasured confounding factors. Our results indicate a significant causal relationship between curricular complexity and four-year graduation rates. Specifically, higher curricular complexity is associated with lower graduation rates, with an estimated causal effect of -3.879% per unit increase in complexity. Sensitivity analysis confirms the robustness of these findings, with a revised effect estimate of -3.763% per unit increase in complexity after accounting for potential unobserved confounders.
Detailed analysis across quintiles showed consistent results, indicating that higher curricular complexity within each stratified group reduces the likelihood of graduating in four years. The Average Treatment Effect (ATE) across quintiles ranged from -7.5% to -21.5% per unit increase in complexity. The implications of this study are far-reaching. By highlighting the impact of curricular complexity, our findings can inform university policies aimed at optimizing curricula to enhance student success. Moreover, the methodological framework presented here offers a comprehensive approach to causal inference in educational research, combining GPS, HLM, DML, and network analysis to provide robust and actionable insights.
Type
text
Electronic Thesis
Degree Name
M.S.
Degree Level
masters
Degree Program
Graduate College
Electrical & Computer Engineering