HattriQ: Designing Integrated Gradients for Feature Attribution in Quantum Machine Learning
Abstract
Quantum machine learning (QML) algorithms have demonstrated early promise across hardware platforms, but remain difficult to interpret because of the inherent opacity of quantum state evolution. Widely used classical interpretability methods, such as integrated gradients and surrogate-based sensitivity analysis, are not directly compatible with quantum circuits due to measurement collapse and the exponential complexity of simulating state evolution. In this work, we introduce HattriQ, a general-purpose framework for computing amplitude-based input-attribution scores in circuit-based QML models. HattriQ supports the widely used amplitude embedding scheme for encoding input features and uses a Hadamard test-based construction to compute input gradients directly on quantum hardware, from which it derives integrated gradient attributions. We validate HattriQ on classification tasks across several datasets (Bars and Stripes, MNIST, FashionMNIST, and TFIM quantum data).
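For readers unfamiliar with integrated gradients, the following is a minimal classical sketch of the attribution rule itself, not HattriQ's quantum construction. It assumes a differentiable scalar model whose input gradient is available in closed form; the toy model `f`, the weight vector `w`, the all-zeros baseline, and the helper name `integrated_gradients` are all hypothetical choices made for illustration. In a hardware setting, the call to `grad_fn` would be the step that a gradient estimate from the quantum device stands in for.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=64):
    """Approximate integrated gradients with a midpoint Riemann sum.

    grad_fn  : callable returning the gradient of the model w.r.t. its input
    x        : input to explain
    baseline : reference input (illustrative choice, here all zeros)
    """
    # Midpoints of `steps` equal sub-intervals of [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    # Points on the straight-line path from baseline to x.
    path = baseline + alphas[:, None] * (x - baseline)
    # Average the input gradient along the path.
    avg_grad = np.mean([grad_fn(p) for p in path], axis=0)
    # Scale by the displacement from the baseline, feature by feature.
    return (x - baseline) * avg_grad

# Hypothetical toy model f(x) = (w . x)^2 with an analytic gradient.
w = np.array([0.6, -0.8, 0.0])
f = lambda v: float(np.dot(w, v) ** 2)
grad_f = lambda v: 2.0 * np.dot(w, v) * w

x = np.array([1.0, 0.5, 2.0])
baseline = np.zeros_like(x)

attributions = integrated_gradients(grad_f, x, baseline)
# Completeness check: attributions should sum to f(x) - f(baseline).
completeness_gap = attributions.sum() - (f(x) - f(baseline))
```

In this toy case the third feature has zero weight and therefore receives zero attribution, and the attributions sum to the change in model output between the baseline and the input, which is the completeness property that makes integrated gradients attractive as an attribution rule.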