Structured Sparse Non-negative Matrix Factorization with ℓ2,0-Norm for scRNA-seq Data Analysis

Abstract

Non-negative matrix factorization (NMF) is a powerful tool for dimensionality reduction and clustering. However, clustering results from NMF are difficult to interpret, especially for high-dimensional biological data without effective feature selection. In this paper, we first introduce a row-sparse NMF with an ℓ2,0-norm constraint (NMF_20), in which the basis matrix W is constrained by the ℓ2,0-norm so that W has a row-sparsity pattern that performs feature selection. Solving the model is challenging because the ℓ2,0-norm is non-convex and non-smooth. We prove that the ℓ2,0-norm satisfies the Kurdyka-Łojasiewicz property. Based on this finding, we present a proximal alternating linearized minimization algorithm and its monotone accelerated version to solve the NMF_20 model. We also present an orthogonal NMF with an ℓ2,0-norm constraint (ONMF_20), which enhances clustering performance through a non-negative orthogonality constraint. We propose an efficient algorithm that solves ONMF_20 by transforming it into a series of constrained and penalized matrix factorization problems. Results on numerical and scRNA-seq datasets demonstrate the efficiency of our methods in comparison with existing methods.
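
To make the row-sparsity idea concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the ℓ2,0-norm of a matrix counts its non-zero rows and that the constraint takes the form "W is non-negative with at most k non-zero rows"; the abstract does not state the exact formulation. The function names `l20_norm` and `project_rows` and the parameter `k` are hypothetical. Under that assumption, the Euclidean projection onto the constraint set clips negative entries and keeps the k rows with the largest norm, which is the kind of step a proximal method would apply to W.

```python
import numpy as np

def l20_norm(W, tol=0.0):
    """Count the rows of W whose Euclidean norm exceeds tol (assumed definition)."""
    W = np.asarray(W, dtype=float)
    return int(np.sum(np.linalg.norm(W, axis=1) > tol))

def project_rows(W, k):
    """Project W onto {W >= 0 with at most k non-zero rows} (hypothetical helper).

    Negative entries are clipped to zero first; then only the k rows with the
    largest Euclidean norm (after clipping) are kept and the rest are zeroed.
    """
    P = np.maximum(np.asarray(W, dtype=float), 0.0)
    out = np.zeros_like(P)
    if k <= 0:
        return out
    row_norms = np.linalg.norm(P, axis=1)
    keep = np.argsort(row_norms)[-k:]  # indices of the k largest row norms
    out[keep] = P[keep]
    return out

# Toy matrix: 4 features (rows) by 2 components (columns).
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [-1.0, 0.5],
              [1.0, 1.0]])
W_sparse = project_rows(W, k=2)
```

In this toy case the rows kept in `W_sparse` correspond to the selected features; every other row is exactly zero, which is what makes the factorization easier to interpret.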
