A New Integrative Learning Framework for Integrating Multiple Secondary Outcomes into Primary Outcome Analysis: A Case Study on Liver Health
Abstract
In the era of big data, secondary outcomes have become increasingly important alongside primary outcomes. These secondary outcomes, which can be derived from traditional endpoints in clinical trials, compound measures, or risk prediction scores, hold the potential to enhance the analysis of primary outcomes. Our method is motivated by the challenge of utilizing multiple secondary outcomes, such as blood biochemistry markers and urine assays, to improve the analysis of the primary outcome related to liver health. Current integration methods often fall short, as they impose strong model assumptions or require prior knowledge to construct over-identified working functions. This paper addresses these statistical challenges and potentially opens a new avenue in data integration by introducing a novel integrative learning framework that is applicable in a general setting. The proposed framework allows for the robust, data-driven integration of information from multiple secondary outcomes, promotes the development of efficient learning algorithms, and ensures optimal use of available data. Extensive simulation studies demonstrate that the proposed method significantly reduces variance in primary outcome analysis, outperforming existing integration approaches. Additionally, applying this method to UK Biobank (UKB) reveals that cigarette smoking is associated with increased fatty liver measures, with these effects being particularly pronounced in the older adult cohort.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.