A machine learning approach to volatility forecasting
Abstract
We examine how accurately machine learning (ML) forecasts the realized variance of the Dow Jones Industrial Average index constituents. We compare several ML algorithms, including regularization, regression trees, and neural networks, to multiple Heterogeneous AutoRegressive (HAR) models. ML is implemented with minimal hyperparameter tuning. In spite of this, ML is competitive and beats the HAR lineage, even when the only predictors are the daily, weekly, and monthly lags of realized variance. The forecast gains are more pronounced at longer horizons. We attribute this to higher persistence in the ML models, which helps to approximate the long memory of realized variance. ML also excels at locating incremental information about future volatility from additional predictors. Lastly, we propose an ML measure of variable importance based on accumulated local effects. This shows that while there is agreement about the most important predictors, there is disagreement on their ranking, helping to reconcile our results.
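To make the baseline concrete, the following is a minimal sketch of a standard HAR regression built from the daily, weekly, and monthly lags of realized variance mentioned above. It is an illustration under stated assumptions, not the paper's implementation: the 5-day and 22-day windows are the conventional choices rather than values given in the abstract, the function names (`har_features`, `fit_har`, `forecast_next`) are hypothetical, and the data are synthetic. The ML models would take the same three regressors as inputs in place of the ordinary least squares fit.

```python
import numpy as np

def har_features(rv, weekly=5, monthly=22):
    """Build HAR regressors from a realized variance series.

    Each row is [1, RV_t, mean of last `weekly` RVs, mean of last `monthly` RVs];
    the target is RV_{t+1}. Window lengths are assumed, not taken from the paper.
    """
    rv = np.asarray(rv, dtype=float)
    rows, targets = [], []
    for t in range(monthly - 1, len(rv) - 1):
        daily = rv[t]
        week = rv[t - weekly + 1 : t + 1].mean()
        month = rv[t - monthly + 1 : t + 1].mean()
        rows.append([1.0, daily, week, month])
        targets.append(rv[t + 1])
    return np.array(rows), np.array(targets)

def fit_har(rv, weekly=5, monthly=22):
    """Estimate the HAR coefficients by ordinary least squares."""
    X, y = har_features(rv, weekly, monthly)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def forecast_next(rv, beta, weekly=5, monthly=22):
    """One-step-ahead forecast from the most recent observations."""
    rv = np.asarray(rv, dtype=float)
    x = np.array([1.0, rv[-1], rv[-weekly:].mean(), rv[-monthly:].mean()])
    return float(x @ beta)

# Synthetic, strictly illustrative data (not from the paper).
rng = np.random.default_rng(0)
rv = np.abs(rng.normal(1.0, 0.2, size=300))

beta = fit_har(rv)
print("coefficients:", beta)
print("next-day forecast:", forecast_next(rv, beta))
```

Longer-horizon variants would replace the target with an average of future realized variances; that extension is left out here because the abstract does not specify the horizons.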