SciFlow: Semantic Cross Interference for Self-Supervised Optical Flow Domain Generalization

Abstract

The motion of objects and scenes carries essential information for video understanding, offering rich cues for interpreting dynamic settings and interactions. However, because high-quality pixel-wise optical flow annotations or ground truth are costly and scarce, motion estimation models are typically trained in synthetic domains but deployed in real-world domains. Addressing this synthetic-to-real domain generalization challenge is crucial for developing practical solutions in diverse open-world use cases. This paper introduces SciFlow, a simple yet effective, network-agnostic, training-based approach that leverages self-supervised learning to generalize motion estimation across synthetic and open-world domains. Specifically, SciFlow imposes semantic interference from open-world images onto synthetic images during training, blending in-domain features with cross-domain interference, which enables the network to adapt to real-world domains. Additionally, SciFlow uses geometric consistency to ensure the validity of the self-supervision. Our experimental results show that SciFlow not only significantly enhances model robustness under domain variations, but also enables synthetic-to-real domain generalization without requiring any ground truth in the open world.
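The abstract does not specify how the blending or the geometric consistency is implemented, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes the blend is a simple convex combination of two feature vectors, and it uses horizontal-flip equivariance of optical flow as one possible stand-in for a geometric consistency check. The names `blend_features`, `hflip_flow`, `flip_consistency_error`, and the weight `alpha` are hypothetical.

```python
import math
from typing import List, Tuple

# A flow field as an H x W grid of (u, v) displacements (illustrative type).
Flow = List[List[Tuple[float, float]]]


def blend_features(in_domain: List[float],
                   cross_domain: List[float],
                   alpha: float) -> List[float]:
    """Convex blend of two equal-length feature vectors.

    ASSUMPTION: the abstract only says in-domain features are blended with
    cross-domain interference; the convex form and `alpha` are hypothetical.
    """
    if len(in_domain) != len(cross_domain):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * a + alpha * b
            for a, b in zip(in_domain, cross_domain)]


def hflip_flow(flow: Flow) -> Flow:
    """Flow of the horizontally mirrored image pair.

    Mirroring reverses each row and negates the horizontal component u.
    """
    return [[(-u, v) for (u, v) in reversed(row)] for row in flow]


def flip_consistency_error(flow: Flow, flow_of_flipped: Flow) -> float:
    """Mean endpoint error between the mirrored flow and the flow predicted
    on the mirrored pair.

    ASSUMPTION: flip equivariance stands in for the unspecified geometric
    consistency term.
    """
    expected = hflip_flow(flow)
    total, count = 0.0, 0
    for row_e, row_p in zip(expected, flow_of_flipped):
        for (ue, ve), (up, vp) in zip(row_e, row_p):
            total += math.hypot(ue - up, ve - vp)
            count += 1
    return total / count
```

In a training loop of this kind, a low consistency error on open-world image pairs could serve as a signal that a prediction is reliable enough to use for self-supervision; whether SciFlow does this is not stated in the abstract.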
