PRIMO: Private Regression in Multiple Outcomes

Abstract

We introduce a new private regression setting, Private Regression in Multiple Outcomes (PRIMO), motivated by the common situation in which a data analyst wants to run a set of l regressions while preserving privacy, where the features X are shared across all l regressions and each regression i ∈ [l] has its own vector of outcomes y_i. Naively applying existing private linear regression techniques l times leads to an l-fold multiplicative increase in error over the standard linear regression setting. We apply a variety of techniques, including sufficient statistics perturbation (SSP) and geometric projection-based methods, to develop scalable algorithms that outperform this baseline across a range of parameter regimes. In particular, when l is sufficiently large, the asymptotic error has no dependence on l. Empirically, on the task of genomic risk prediction with multiple phenotypes, we find that even for values of l far smaller than the theory would predict, our projection-based method improves accuracy relative to the variant that does not use the projection.
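To make the shared-feature setting concrete, here is a minimal sketch of sufficient statistics perturbation applied to several outcomes at once, assuming the outcomes are stacked as columns of a matrix Y. It is an illustration of the general SSP idea, not the paper's algorithm: the function name `ssp_multi_outcome`, the `ridge` term, and the noise scales `sigma_cov` and `sigma_xy` are all hypothetical, and the scales are not calibrated here to any privacy guarantee. Choosing them requires a sensitivity analysis (for example, under norm bounds on the rows of X and Y) that the abstract does not spell out.

```python
import numpy as np


def ssp_multi_outcome(X, Y, sigma_cov, sigma_xy, ridge=1e-3, rng=None):
    """Illustrative SSP sketch for l regressions that share the features X.

    X: (n, d) feature matrix shared by all regressions.
    Y: (n, l) matrix whose i-th column is the outcome vector y_i.
    sigma_cov, sigma_xy: Gaussian noise standard deviations (assumed to be
        calibrated by the caller; this function does not certify privacy).
    Returns a (d, l) matrix whose i-th column estimates regression i.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = X.shape[1]
    l = Y.shape[1]

    # Perturb X^T X once with symmetric Gaussian noise; because X is shared,
    # this statistic is reused for every outcome.
    noise = rng.normal(0.0, sigma_cov, size=(d, d))
    noise = np.triu(noise) + np.triu(noise, 1).T
    noisy_cov = X.T @ X + noise

    # Perturb the d x l cross term X^T Y, one column per outcome.
    noisy_xy = X.T @ Y + rng.normal(0.0, sigma_xy, size=(d, l))

    # A small ridge term (an assumption of this sketch) keeps the solve stable.
    return np.linalg.solve(noisy_cov + ridge * np.eye(d), noisy_xy)
```

With both noise scales set to zero and `ridge=0.0`, the function reduces to ordinary least squares solved jointly for all l outcomes, which is a convenient sanity check before adding noise.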
