Towards Understanding Link Predictor Generalizability Under Distribution Shifts
Abstract
State-of-the-art link prediction (LP) models demonstrate impressive benchmark results. However, popular benchmark datasets often assume that training, validation, and testing samples are representative of the overall dataset distribution. In real-world situations, this assumption is often incorrect; uncontrolled factors lead new dataset samples to come from a different distribution than training samples. Additionally, the majority of recent work with graph dataset shift focuses on node- and graph-level tasks, largely ignoring link-level tasks. To bridge this gap, we introduce a novel splitting strategy, known as LPShift, which utilizes structural properties to induce a controlled distribution shift. We verify LPShift's effect through empirical evaluation of SOTA LP models on 16 LPShift variants of original dataset splits, with results indicating drastic changes to model performance. Additional experiments demonstrate graph structure has a strong influence on the success of current generalization methods. Source Code Available Here: https://github.com/revolins/LPShift
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.