Date of Award


Document Type


Degree Name


Organizational Unit

Morgridge College of Education, Research Methods and Statistics

First Advisor

Antonio Olmos, Ph.D.

Second Advisor

Kathy Green

Third Advisor

Shimelis Assefa

Fourth Advisor

Janette Benson


Keywords

Classification, Institutional research, Logistic regression, Machine learning, Multilayer perceptron, Persistence


Abstract

Multilayer perceptron neural networks, Gaussian naïve Bayes, and logistic regression classifiers were compared as tools for making early predictions of one-year college student persistence. Two iterations of each model were built, using a grid search within 10-fold cross-validation to tune model parameters for optimal performance on the classification metrics F-Beta and F-1. Logistic regression, the historically favored approach in the domain, was compared with the alternative approaches of multilayer perceptron and naïve Bayes, based primarily on F-Beta and F-1 performance on a hold-out dataset. A single logistic regression model performed best on both F-1 and F-Beta, outperforming all four of the individual alternative models on these metrics. A majority voting ensemble and two additional ensembles with empirically derived weights were also applied to the hold-out set; the logistic regression model outperformed all three ensembles on the same metrics. A visualization technique for comparing and summarizing case-level classifier performance was introduced. The features used in the modeling process comprised traditional and non-traditional elements.
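
For readers unfamiliar with the scoring metrics and the majority voting ensemble named in the abstract, the following is a minimal illustrative sketch, not the author's implementation. The function names `f_beta` and `majority_vote`, the choice of 1 as the positive class, and the toy labels are all assumptions made for illustration; the abstract does not state the beta value used, so `beta` is left as a parameter.

```python
from collections import Counter


def f_beta(y_true, y_pred, beta=1.0):
    """F-beta score for binary labels, treating 1 as the positive class.

    beta=1 gives F-1; beta>1 weights recall more heavily than precision.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def majority_vote(*model_preds):
    """Return, for each case, the label predicted by most classifiers."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*model_preds)]


# Toy hold-out labels and predictions (invented, not data from the study).
y_true = [1, 1, 1, 0, 0, 1]
model_a = [1, 1, 0, 0, 1, 1]
model_b = [1, 0, 1, 0, 0, 1]
model_c = [0, 1, 1, 1, 0, 1]

ensemble = majority_vote(model_a, model_b, model_c)
print("F-1, model A:  ", f_beta(y_true, model_a, beta=1.0))
print("F-1, ensemble: ", f_beta(y_true, ensemble, beta=1.0))
```

Using an odd number of binary classifiers, as in this toy example, avoids tied votes; a weighted ensemble would replace the simple count with a weighted sum of votes.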

Publication Statement

Copyright is held by the author. User is responsible for all copyright compliance.

Rights Holder

Ben Siebrase


Received from ProQuest

File Format




File Size

153 p.


Statistics, Higher education