Publication Date
9-2-2024
Document Type
Article
Organizational Units
Daniels College of Business, Reiman School of Finance
Keywords
Machine learning, Out-of-sample predictability, Pooling, Ensembles, Return predictability
Abstract
We evaluate US market return predictability using a novel data set of several hundred aggregated firm-level characteristics. We apply LASSO, Elastic Net, Random Forest, Neural Net, Extreme Gradient Boosting, and Light Gradient Boosting Machine methods and find these models experience large prediction errors that lead to forecast failures. However, winsorizing and pooling machine learning model forecasts provide consistent out-of-sample predictability. To assess robustness, we apply machine learning methods to high-dimensional data for Canada, China, Germany, and the UK as well as the Goyal-Welch data. All machine learning models we consider, except for the ensemble pooled methods, fail to significantly predict returns across our samples, highlighting the importance of pooling, evaluating additional economies, and the fragility of individual machine learning methods. Our results shed light on the sparsity versus density debate as the degree of sparsity and variable importance evolve over time.
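As a rough illustration of the winsorizing-and-pooling idea named in the abstract, the sketch below clamps each model's forecast into a fixed band and then takes an equal-weighted average. This record does not describe the article's actual procedure, so the fixed bounds, the equal weighting, the function names, and the sample numbers are all assumptions made up for this example.

```python
from statistics import mean


def winsorize(values, lower, upper):
    """Clamp every value into the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def pooled_forecast(model_forecasts, lower, upper):
    """Winsorize the individual forecasts, then average them with equal weights."""
    return mean(winsorize(model_forecasts, lower, upper))


# Hypothetical one-period return forecasts from six models (invented numbers);
# two of them are extreme, mimicking large individual prediction errors.
forecasts = [0.004, -0.002, 0.250, 0.006, -0.180, 0.003]

raw_pool = mean(forecasts)                                     # plain average
pooled = pooled_forecast(forecasts, lower=-0.05, upper=0.05)   # bounds are assumed
```

With these invented inputs, the two extreme forecasts are pulled back to the band edges before averaging, so the pooled value depends far less on any single model than the plain average does.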
Copyright Date
9-11-2024
Copyright Statement / License for Reuse
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Rights Holder
Erik Mekelburg and Jack Strauss
Provenance
Received from Elsevier
File Format
application/pdf
Language
English (eng)
Extent
25 pgs
File Size
3.26 MB
Publication Statement
Copyright is held by the Authors. User is responsible for all copyright compliance. This article was originally published as:
Mekelburg, E., & Strauss, J. (2024). Pooling and Winsorizing Machine Learning Forecasts to Predict Stock Returns with High-Dimensional Data. Journal of Empirical Finance, 79, 101538. https://doi.org/10.1016/j.jempfin.2024.101538
Publication Title
Journal of Empirical Finance
Volume
79
First Page
101538
ISSN
0927-5398
Recommended Citation
Mekelburg, Erik and Strauss, Jack, "Pooling and Winsorizing Machine Learning Forecasts to Predict Stock Returns with High-Dimensional Data" (2024). Finance: Faculty Scholarship. 5.
https://digitalcommons.du.edu/finance_fac/5
https://doi.org/10.1016/j.jempfin.2024.101538
Included in
Artificial Intelligence and Robotics Commons, Business Analytics Commons, Corporate Finance Commons, Finance and Financial Management Commons, Statistics and Probability Commons