Gap-Aware Exact Nonnegative Matrix Factorization: A Two-Sided SVD Gauge and a Three-Regime W-Rank Taxonomy
Abstract
We extend the cone-ray exact-NMF pipeline of Ramteke (arXiv:2606.22451) from the uniform-support regime r+ = r to the gap regime r+ > r, and classify recoverable nonnegative factorisations by the rank of the W-factor into a three-regime taxonomy. Regime A (rank(W) = r+, full column rank) is handled by a two-sided SVD-gauge cone-ray pipeline W = U_{r+}(G) Q, H = P V_{r+}(K)^T, with G, K on Stiefel manifolds and the square consistency condition Q P = diag(S_r, 0). On 10x10 dense random gap matrices it gives 100/100 recovery at r+ = 5 and 6. We explain this via two geometric facts: slack enclosure (the data cone has codimension r+ - r in the outer cone) and NRF-variety thickness (valid gauges form a positive-measure set, so the blind SVD lands on one with probability one). Regime B (rank(W) = r, W a column subset of M) is handled by a rank-deficient-W branch that enumerates r+-subsets of M's columns and applies per-column LP tests. On the block-diagonal family diag(C, J_k), where additivity of nonnegative rank collapses the valid gauges to a single point and the blind SVD pipeline fails, the column-subset branch restores recovery in milliseconds. Regime C (r < rank(W) < r+, W not a column subset) is exposed by the regular octagon's slack matrix. An exact size-6 NRF exists and is reachable by the symmetric formulation at an oracle gauge derived from a known factorisation (residual 1.5e-10), but the blind problem is open: 50 Haar-random restarts and Riemannian gradient descent on the Stiefel/Grassmann gauge all stall, because the alt-LP residual is piecewise constant on cells of gauge space, so local descent cannot cross cell walls. A combined toolkit (Regime B, then Regime A) covers regimes A and B with no regression on dense draws; Regime C remains open, with the regular octagon as the cleanest unsolved test case.
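As an illustration of the Regime B idea only, the sketch below enumerates fixed-size column subsets of M and, for each candidate W, runs one feasibility LP per column of M to check that the column lies in the cone generated by W. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `column_in_cone` and `column_subset_nmf`, the tolerance, and the toy matrix are hypothetical, and the toy matrix is a small block-diagonal example chosen for readability rather than a member of the diag(C, J_k) family with a genuine gap.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog


def column_in_cone(W, m, tol=1e-7):
    """Feasibility LP: return h >= 0 with W @ h == m, or None if none exists.

    Hypothetical helper; the zero objective makes this a pure feasibility test.
    """
    k = W.shape[1]
    res = linprog(
        c=np.zeros(k),
        A_eq=W,
        b_eq=m,
        bounds=[(0, None)] * k,
        method="highs",
    )
    if res.status != 0:
        return None
    h = res.x
    # Guard against solver tolerance by re-checking the residual directly.
    if np.linalg.norm(W @ h - m) > tol:
        return None
    return h


def column_subset_nmf(M, size):
    """Search for W = M[:, subset] with |subset| = size and H >= 0, M = W @ H.

    Hypothetical brute-force search: the number of subsets grows
    combinatorially, so this is only practical for small matrices.
    """
    n = M.shape[1]
    for subset in combinations(range(n), size):
        W = M[:, list(subset)]
        coeffs = []
        for j in range(n):
            h = column_in_cone(W, M[:, j])
            if h is None:
                break  # column j is outside cone(W); try the next subset
            coeffs.append(h)
        else:
            return list(subset), W, np.column_stack(coeffs)
    return None


# Toy block-diagonal matrix diag(I_2, J_2) (an assumption for illustration).
M = np.array(
    [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 1.0],
        [0.0, 0.0, 1.0, 1.0],
    ]
)
result = column_subset_nmf(M, 3)
```

On this toy input a size-3 subset suffices because the last two columns coincide, while no size-2 subset can work since M has rank 3. The per-column LPs are independent, so in a larger search they could be run in parallel or stopped at the first infeasible column, as the inner loop does here.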