Calibration Attention: Learning Reliability-Aware Representations for Vision Transformers

Weitong Chen

Abstract

Most calibration methods operate at the logit level, implicitly assuming that miscalibration can be corrected without changing the underlying representation. We challenge this assumption and propose Calibration Attention (CalAttn), a representation-aware calibration module for vision transformers that couples instance-wise temperature scaling to transformer token geometry under a proper scoring objective. CalAttn predicts a sample-specific temperature from the [CLS] token and backpropagates calibration gradients into the backbone, thereby reshaping the uncertainty structure of the representation rather than adjusting confidence post hoc. This yields token-conditioned uncertainty modulation with negligible overhead (\(<0.1\%\) additional parameters). Across multiple datasets with ViT/DeiT/Swin backbones, CalAttn consistently improves calibration while preserving accuracy, achieving relative ECE reductions of \(3.7\%\) to \(77.7\%\) over strong baselines across diverse training objectives. Our results indicate that treating calibration as a representation-level problem is a practical and effective direction for trustworthy uncertainty estimation in transformers. Code: [https://github.com/EagleAdelaide/CalibrationAttention-CalAttn-](https://github.com/EagleAdelaide/CalibrationAttention-CalAttn-)
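To make the idea of instance-wise temperature scaling concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a hypothetical linear head on the [CLS] features followed by a softplus to keep the temperature positive; the abstract does not specify the head's form, so `predict_temperature`, `weights`, `bias`, and `eps` are illustrative names only. The negative log-likelihood stands in as one example of a proper scoring objective.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)); output is always positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def predict_temperature(cls_token, weights, bias, eps=1e-3):
    # Hypothetical linear head (an assumption, not taken from the paper):
    # maps the [CLS] feature vector to a strictly positive scalar.
    score = sum(w * x for w, x in zip(weights, cls_token)) + bias
    return softplus(score) + eps


def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def calibrated_probs(logits, temperature):
    # Instance-wise temperature scaling: divide this sample's logits
    # by the temperature predicted for this sample.
    return softmax([z / temperature for z in logits])


def nll(probs, label):
    # Negative log-likelihood, a proper scoring rule.
    return -math.log(probs[label])


# Toy values, purely illustrative.
cls_token = [0.5, -1.0, 2.0]
weights = [0.4, 0.1, 0.6]
bias = 0.0
logits = [4.0, 1.0, 0.0]

temperature = predict_temperature(cls_token, weights, bias)
probs = calibrated_probs(logits, temperature)
loss = nll(probs, label=0)
```

In this toy case the predicted temperature is above 1, so the scaled distribution is softer than the unscaled softmax while the predicted class is unchanged. In an actual training setup with an autodiff framework, the loss would be differentiated with respect to both the head parameters and the backbone that produces `cls_token`, which is the step the abstract describes as backpropagating calibration gradients into the backbone.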
