Transfer Learning for Robust Structured Regression with Bi-level Source Detection

Abstract

High-dimensional data in modern applications, such as COVID-19 mortality, often span multiple domains. Leveraging auxiliary information from source domains to improve performance in a target domain motivates the use of transfer learning. However, a practical issue that has been overlooked is data contamination, which induces heterogeneity and can significantly degrade transfer learning performance. To address this challenge, we propose a novel approach that tackles transfer learning under data contamination within a structured regression setting. By employing the robust L2E criterion, we develop the TransL2E method that accounts for contamination in both target and source data while effectively transferring relevant information. Beyond robust estimation, TransL2E introduces a data-driven bi-level source detection mechanism, operating at both individual and cohort levels, which possesses multiple advantages over existing source detection approaches. Comprehensive simulation studies and a real data application demonstrate the superior performance of TransL2E in both robust estimation and structure recovery in the presence of data limitation and contamination.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…