Chi-square and normal inference in high-dimensional multi-task regression
Abstract
The paper proposes chi-square and normal inference methodologies for the unknown coefficient matrix $B^*$ of size $p\times T$ in a Multi-Task (MT) linear model with $p$ covariates, $T$ tasks and $n$ observations under a row-sparse assumption on $B^*$. The row-sparsity $s$, dimension $p$ and number of tasks $T$ are allowed to grow with $n$. In the high-dimensional regime $p \gg n$, in order to leverage row-sparsity, the MT Lasso is considered. We build upon the MT Lasso with a de-biasing scheme to correct for the bias induced by the penalty. This scheme requires the introduction of a new data-driven object, coined the interaction matrix, that captures effective correlations between noise vector and residuals on different tasks. This matrix is psd, of size $T\times T$ and can be computed efficiently. The interaction matrix lets us derive asymptotic normal and $\chi^2_T$ results under Gaussian design and $\frac{sT+s\log(p/s)}{n}\to 0$, which corresponds to consistency in Frobenius norm. These asymptotic distribution results yield valid confidence intervals for single entries of $B^*$ and valid confidence ellipsoids for single rows of $B^*$, for both known and unknown design covariance $\Sigma$. While previous proposals in grouped-variables regression require row-sparsity $s \lesssim \sqrt{n}$ up to constants depending on $T$ and logarithmic factors in $n,p$, the de-biasing scheme using the interaction matrix provides confidence intervals and $\chi^2_T$ confidence ellipsoids under the conditions $\min(T^2,\log^8 p)/n \to 0$ and $\frac{sT+s\log(p/s)+\|\Sigma^{-1}e_j\|_0\log p}{n}\to 0$, $\frac{\min(s,\|\Sigma^{-1}e_j\|_0)}{\sqrt{n}}\sqrt{[T+\log(p/s)]\log p}\to 0$, allowing row-sparsity $s \gg \sqrt{n}$ when $\|\Sigma^{-1}e_j\|_0\sqrt{T} \ll \sqrt{n}$ up to logarithmic factors.
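For orientation, the setting described in the abstract can be written out as below. This is a sketch in standard multi-task notation, not the paper's own display: the symbols $Y$, $X$, $E$, $\hat B$ and $\lambda$ are introduced here for illustration, and the normalization of the loss and penalty is an assumption that may differ from the scaling used in the paper.

```latex
% Multi-task linear model: n observations, p covariates, T tasks.
% Y, X, E are notation assumed here; only B^*, n, p, T, s come from the abstract.
\[
  Y = X B^* + E, \qquad
  Y \in \mathbb{R}^{n\times T},\quad
  X \in \mathbb{R}^{n\times p},\quad
  B^* \in \mathbb{R}^{p\times T}.
\]
% Row-sparsity: at most s rows of B^* are nonzero.
\[
  \bigl|\{\, j \in \{1,\dots,p\} : (B^*)^\top e_j \neq 0 \,\}\bigr| \le s .
\]
% MT Lasso: least squares with a penalty on the Euclidean norms of the rows.
% The factor 1/(2n) and the tuning parameter \lambda are assumed conventions.
\[
  \hat B \in \operatorname*{arg\,min}_{B \in \mathbb{R}^{p\times T}}
  \left\{ \frac{1}{2n}\,\| Y - X B \|_F^2
  + \lambda \sum_{j=1}^{p} \| B^\top e_j \|_2 \right\}.
\]
```

The row-wise penalty sets entire rows of $\hat B$ to zero, which is how the estimator exploits row-sparsity. It also shrinks the retained rows, and that shrinkage is the bias the de-biasing scheme is meant to correct.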