From Model Explanation to Data Misinterpretation: A Cautionary Analysis of Post Hoc Explainers in Business Research

Abstract

Post hoc explainers such as SHAP and LIME are widely used in business research to interpret complex machine learning models. Although they were designed to explain model predictions, the explanations they generate are increasingly treated as evidence about underlying data relationships. Based on a systematic review of 181 studies, including 56 published in leading journals, we document that this interpretation of explanations is widespread and examine its validity. We also evaluate how reliably post hoc explanations recover the direction and relative importance of features in the true data-generating process. We introduce two metrics, direction alignment and strength alignment, and assess SHAP and LIME using simulated data with known ground truth. Although explanations often appear reasonable on average, they exhibit substantial heterogeneity in their alignment and can be misaligned even when the model's predictive accuracy is high. High predictive performance is therefore necessary but insufficient for reliable explanation. We further show that feature correlation and the Rashomon effect (where many equally accurate models rely on different feature attributions) are key drivers of misalignment. Agreement in explanations across such models provides a practical diagnostic of reliability. Overall, our findings caution against using post hoc explainers for hypothesis validation and instead position them as exploratory tools that generate, rather than confirm, substantive insights.
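
To make the idea of checking explanations against a known ground truth concrete, the following is a minimal sketch rather than the paper's procedure. The abstract does not define the two metrics, so the functions `direction_alignment` and `strength_alignment` below are hypothetical stand-ins: the first is the share of truly active features whose attribution moves in the true direction, the second is a rank correlation between mean absolute attribution and true coefficient magnitude. The simulated data, the linear model, and the closed-form attribution coef_j * (x_ij - mean_j), which is what SHAP assigns to a linear model when features are treated as independent, are likewise assumptions chosen to keep the example dependency-free.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data with known ground truth: y = X @ beta_true + noise.
n, p = 2000, 4
beta_true = np.array([2.0, -1.0, 0.5, 0.0])
cov = np.eye(p)
cov[0, 1] = cov[1, 0] = 0.8  # correlate features 0 and 1
X = rng.multivariate_normal(np.zeros(p), cov, size=n)
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Fit ordinary least squares as the model to be explained.
X1 = np.column_stack([np.ones(n), X])
coef = np.linalg.lstsq(X1, y, rcond=None)[0][1:]

# Per-observation attributions for a linear model (assumed closed form).
attr = coef * (X - X.mean(axis=0))


def direction_alignment(attr, X, beta_true):
    """Hypothetical metric: share of active features whose attribution
    rises or falls with the feature in the same direction as the truth."""
    active = np.flatnonzero(beta_true)
    Xc = X - X.mean(axis=0)
    Ac = attr - attr.mean(axis=0)
    slope_sign = np.sign((Xc * Ac).sum(axis=0))
    return float(np.mean(slope_sign[active] == np.sign(beta_true[active])))


def strength_alignment(attr, beta_true):
    """Hypothetical metric: Spearman-style rank correlation between mean
    absolute attribution and true coefficient magnitude (assumes no ties)."""
    est_rank = np.argsort(np.argsort(np.abs(attr).mean(axis=0)))
    true_rank = np.argsort(np.argsort(np.abs(beta_true)))
    return float(np.corrcoef(est_rank, true_rank)[0, 1])


direction = direction_alignment(attr, X, beta_true)
strength = strength_alignment(attr, beta_true)
print(f"direction alignment: {direction:.2f}")
print(f"strength alignment:  {strength:.2f}")
```

Because the fitted model here matches the data-generating process, both scores should come out high. Repeating the same comparison across several equally accurate but differently specified models is one way to probe the heterogeneity the abstract describes.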
