Error-Controlled Borrowing from External Data Using Wasserstein Ambiguity Sets
Abstract
Incorporating external data can improve the efficiency of clinical trials, but distributional mismatches between current and external populations threaten the validity of inference. While numerous dynamic borrowing methods exist, the calibration of their borrowing parameters relies mainly on ad hoc, simulation-based tuning. To overcome this, we propose BOND (Borrowing under Optimal Nonparametric Distributional robustness), a framework that formalizes data noncommensurability through Wasserstein ambiguity sets centered at the current-trial distribution. By deriving sharp, closed-form bounds on the worst-case mean drift for both continuous and binary outcomes, we construct a distributionally robust, bias-corrected Wald statistic that ensures asymptotic type I error control uniformly over the ambiguity set. Importantly, BOND determines the optimal borrowing strength by maximizing a worst-case power proxy, converting heuristic parameter tuning into a transparent, analytically tractable optimization problem. Furthermore, we demonstrate that many prominent borrowing methods can be reparameterized via an effective borrowing weight, rendering our calibration framework broadly applicable. Simulation studies and a real-world clinical trial application confirm that BOND preserves the nominal size under unmeasured heterogeneity while achieving efficiency gains over standard borrowing methods.
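To make the calibration idea concrete, the following is a minimal sketch under stated assumptions, not the paper's BOND procedure, whose exact statistic and power proxy are not given in the abstract. It assumes a one-sided test of a single mean, a fixed borrowing weight `w` applied to the external sample mean, and a Wasserstein radius `rho`. The one fact it relies on is that the means of two distributions differ by at most their Wasserstein distance, and that this bound is attained by a pure location shift on the real line. The function names, the grid search, and the particular proxy `(delta - 2 * w * rho) / se` are hypothetical choices made for illustration.

```python
import math


def blended_se(w, var_c, n_c, var_e, n_e):
    """Standard error of (1 - w) * ybar_c + w * ybar_e for independent samples."""
    return math.sqrt((1 - w) ** 2 * var_c / n_c + w ** 2 * var_e / n_e)


def robust_wald(ybar_c, ybar_e, w, rho, mu0, var_c, n_c, var_e, n_e):
    """Illustrative bias-corrected Wald statistic for H0: mu_c <= mu0."""
    estimate = (1 - w) * ybar_c + w * ybar_e
    # The external mean can exceed the current mean by at most rho, so the
    # blended estimate is biased upward by at most w * rho; subtract that.
    return (estimate - mu0 - w * rho) / blended_se(w, var_c, n_c, var_e, n_e)


def worst_case_power_proxy(w, rho, delta, var_c, n_c, var_e, n_e):
    """Illustrative proxy: smallest noncentrality over the ambiguity set."""
    # Least favourable alternative: the external mean sits rho BELOW the
    # current mean, so the corrected statistic loses 2 * w * rho of signal.
    return (delta - 2 * w * rho) / blended_se(w, var_c, n_c, var_e, n_e)


def best_weight(rho, delta, var_c, n_c, var_e, n_e, grid_size=1001):
    """Grid search over w in [0, 1] for the weight maximizing the proxy."""
    weights = [i / (grid_size - 1) for i in range(grid_size)]
    return max(
        weights,
        key=lambda w: worst_case_power_proxy(w, rho, delta, var_c, n_c, var_e, n_e),
    )
```

In this sketch, setting `rho = 0` (fully commensurate data) makes the search return the usual inverse-variance weight, while a large `rho` drives the selected weight to zero, so no external data are borrowed. Here `delta` denotes the assumed effect size under the alternative, which is also an assumption of the illustration.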