Social Context-aware GCN for Video Character Search via Scene-prior Enhancement

Abstract

With the increasing demand for intelligent services on online video platforms, the video character search task has attracted wide attention as a basis for downstream applications such as fine-grained retrieval and summarization. However, traditional solutions focus only on visual or coarse-grained social information, and thus perform poorly in complex scenes, such as those with changes in camera view or character posture. To address this, we leverage social information and scene context as prior knowledge for character search in complex scenes. Specifically, we propose a scene-prior-enhanced framework named SoCoSearch. We first integrate multimodal clues about the scene context to estimate the prior probability of social relationships, and then capture characters' co-occurrence to generate an enhanced social context graph. Afterwards, we design a social context-aware GCN framework that passes features between characters to obtain robust representations for the character search task. Extensive experiments validate the effectiveness of SoCoSearch across various metrics.
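The pipeline described above (scene-prior weighting of co-occurrence, followed by graph convolution over characters) can be illustrated with a minimal sketch. This is not the SoCoSearch implementation: the abstract does not specify how the prior and co-occurrence are combined, so the element-wise product used here is an assumption, and the function names `build_social_graph`, `normalize_adjacency`, `matmul`, and `gcn_layer` are hypothetical. The propagation step follows the standard GCN layer with symmetric normalization and self-loops.

```python
import math

def build_social_graph(cooccurrence, prior):
    """Weight co-occurrence counts by a scene-prior probability.
    ASSUMPTION: element-wise product; the paper's actual rule is not given here."""
    n = len(cooccurrence)
    return [[cooccurrence[i][j] * prior[i][j] if i != j else 0.0
             for j in range(n)] for i in range(n)]

def normalize_adjacency(adj):
    """Standard GCN normalization: D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # >= 1 because of the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product on nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gcn_layer(adj, features, weight):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weight)]

# Toy example with three characters (all values are made up for illustration).
cooccurrence = [[0, 4, 0], [4, 0, 1], [0, 1, 0]]             # shared-scene counts
prior = [[0.0, 0.5, 0.0], [0.5, 0.0, 1.0], [0.0, 1.0, 0.0]]  # assumed relation priors
graph = build_social_graph(cooccurrence, prior)
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]              # per-character features
identity = [[1.0, 0.0], [0.0, 1.0]]                          # placeholder for learned weights
updated = gcn_layer(graph, features, identity)
```

Each row of `updated` mixes a character's own feature with those of socially connected characters, which is the intuition behind using social context to stabilize representations when appearance alone is unreliable.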
