Could Large Language Models work as Post-hoc Explainability Tools in Credit Risk Models?
Abstract
Large language models (LLMs) have shown promise in translating model-based explanations into human-readable narratives. This study evaluates whether LLMs can serve as post-hoc explainability interfaces for credit risk models, focusing on their ability to preserve feature-importance rankings and to generate explanations autonomously. Using a LendingClub dataset, we compare LLM outputs with SHAP and coefficient-based attributions across three major LLMs: GPT-4-turbo, Claude-Sonnet-4.5, and Gemini-2.5-Flash. Results indicate that LLMs reliably reproduce reference rankings under controlled prompts but show limited alignment with those rankings when generating explanations autonomously. These findings suggest that LLMs are best deployed as narrative interfaces rather than as substitutes for formal attribution methods in credit risk governance.
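To make "preserving feature-importance rankings" concrete, the following is a minimal sketch of how agreement between a reference ranking (for example, one derived from SHAP values) and an LLM-stated ranking might be scored. The abstract does not name the alignment metric used, so Spearman rank correlation and top-k overlap are assumptions here, and the feature names and both rankings are hypothetical placeholders, not data or results from the study.

```python
from typing import Sequence


def spearman_rho(reference: Sequence[str], candidate: Sequence[str]) -> float:
    """Spearman rank correlation between two tie-free orderings of the same features."""
    n = len(reference)
    if n < 2:
        raise ValueError("need at least two features")
    if len(set(reference)) != n or len(candidate) != n or set(reference) != set(candidate):
        raise ValueError("rankings must be permutations of the same unique features")
    # Position of each feature in the candidate ordering.
    candidate_pos = {feature: i for i, feature in enumerate(candidate)}
    # Sum of squared rank differences, then the tie-free closed form.
    d_squared = sum((i - candidate_pos[f]) ** 2 for i, f in enumerate(reference))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


def top_k_overlap(reference: Sequence[str], candidate: Sequence[str], k: int) -> float:
    """Fraction of the reference top-k features that also appear in the candidate top-k."""
    if not 0 < k <= len(reference):
        raise ValueError("k must be between 1 and the number of features")
    return len(set(reference[:k]) & set(candidate[:k])) / k


# Hypothetical rankings for illustration only (most important feature first).
shap_ranking = ["int_rate", "dti", "annual_inc", "fico_score", "loan_amnt"]
llm_ranking = ["int_rate", "annual_inc", "dti", "fico_score", "loan_amnt"]

rho = spearman_rho(shap_ranking, llm_ranking)
overlap = top_k_overlap(shap_ranking, llm_ranking, k=3)
print(f"Spearman rho: {rho:.2f}, top-3 overlap: {overlap:.2f}")
```

A rho near 1 would indicate that the LLM's stated ordering closely tracks the reference attribution, while top-k overlap ignores order and checks only whether the same features are singled out as most important.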