Rethinking Multi-Modal Learning from Gradient Uncertainty

Abstract

Multi-Modal Learning (MML) integrates information from diverse modalities to improve predictive accuracy. While existing optimization strategies have made significant strides by mitigating gradient direction conflicts, we revisit MML from a gradient-based perspective to explore further improvements. Empirically, we observe an interesting phenomenon: performance fluctuations can persist in both conflict and non-conflict settings. Based on this, we argue that: beyond gradient direction, the intrinsic reliability of gradients acts as a decisive factor in optimization, necessitating the explicit modeling of gradient uncertainty. Guided by this insight, we propose Bayesian-Oriented Gradient Calibration for MML (BOGC-MML). Our approach explicitly models gradients as probability distributions to capture uncertainty, interpreting their precision as evidence within the framework of subjective logic and evidence theory. By subsequently aggregating these signals using a reduced Dempster's combination rule, BOGC-MML adaptively weights gradients based on their reliability to generate a calibrated update. Extensive experiments demonstrate the effectiveness and advantages of the proposed method.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…