OrthoGrad Improves Neural Calibration

Abstract

We study OrthoGrad, a geometry-aware modification to gradient-based optimization that constrains descent directions to address overconfidence, a key limitation of standard optimizers in uncertainty-critical applications. By enforcing orthogonality between gradient updates and weight vectors, OrthoGrad alters optimization trajectories without architectural changes. On CIFAR-10 with 10% labeled data, OrthoGrad matches SGD in accuracy while achieving statistically significant improvements in test loss (p=0.05), predictive entropy (p=0.001), and confidence measures. These effects show consistent trends across corruption levels and architectures. OrthoGrad is optimizer-agnostic, incurs minimal overhead, and remains compatible with post-hoc calibration techniques. Theoretically, we characterize convergence and stationary points for a simplified variant, revealing that orthogonalization constrains loss-reduction pathways so that they avoid confidence inflation and encourage decision-boundary improvements. Our findings suggest that geometric interventions in optimization can improve predictive uncertainty estimates at low computational cost.
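
The core operation described above, removing the part of a gradient that is parallel to the weight vector, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say whether the projection is applied per layer, per neuron, or with any rescaling, so the function names `orthogonalize` and `sgd_step`, the `eps` guard, and the choice to treat one flat vector as "the weights" are all assumptions made for illustration.

```python
# Illustrative sketch only; names and granularity are assumptions,
# not taken from the paper.

def orthogonalize(grad, weight, eps=1e-30):
    """Return grad with its component along weight removed.

    Computes g_perp = g - (<w, g> / <w, w>) * w, so that <w, g_perp> = 0.
    eps guards against division by zero for an all-zero weight vector.
    """
    w_dot_g = sum(w * g for w, g in zip(weight, grad))
    w_dot_w = sum(w * w for w in weight)
    scale = w_dot_g / (w_dot_w + eps)
    return [g - scale * w for g, w in zip(grad, weight)]


def sgd_step(weight, grad, lr=0.1):
    """Plain SGD update using the orthogonalized gradient (hypothetical helper)."""
    g_perp = orthogonalize(grad, weight)
    return [w - lr * g for w, g in zip(weight, g_perp)]


weight = [3.0, 4.0]
grad = [1.0, 2.0]
g_perp = orthogonalize(grad, weight)
new_weight = sgd_step(weight, grad)
```

Because the update is orthogonal to the weights, the squared norm after the step is the old squared norm plus `lr**2` times the squared norm of `g_perp`, so the weight norm changes only at second order in the learning rate. One plausible reading of the abstract is that this is how the method blocks loss reduction through simply scaling up the weights, but that interpretation is an inference here, not a statement from the paper.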
