Learning rates of lq coefficient regularization learning with Gaussian kernel
Abstract
Regularization is a well-recognized, powerful strategy for improving the performance of a learning machine, and lq regularization schemes with 0<q<∞ are central in practice. It is known that different values of q lead to different properties of the resulting estimators: for example, l2 regularization leads to smooth estimators, while l1 regularization leads to sparse estimators. How, then, do the generalization capabilities of lq regularization learning vary with q? In this paper, we study this problem in the framework of statistical learning theory and show that implementing lq coefficient regularization schemes in the sample-dependent hypothesis space associated with the Gaussian kernel can attain the same almost optimal learning rates for all 0<q<∞. That is, the upper and lower bounds of the learning rates for lq regularization learning are asymptotically identical for all 0<q<∞. Our finding tentatively reveals that, in some modeling contexts, the choice of q might not have a strong impact on the generalization capability. From this perspective, q can be specified arbitrarily, or specified merely by other, non-generalization criteria such as smoothness, computational complexity, sparsity, etc.
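To make the object of study concrete, the following is a minimal sketch of an lq coefficient regularization objective over the sample-dependent hypothesis space spanned by Gaussian kernel functions centered at the training inputs. It is an illustration under stated assumptions, not the paper's implementation: the function names (gaussian_kernel, lq_objective, fit_l2), the kernel convention exp(-||x - z||^2 / sigma^2), the 1/m scaling of the empirical error, and the parameter name lam are all hypothetical choices. Only the q = 2 case is solved here, because it has a closed form; other values of q would need an iterative solver.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=1.0):
    # Pairwise squared Euclidean distances between rows of X and rows of Z.
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Z**2, axis=1)[None, :] - 2.0 * X @ Z.T
    # Assumed convention: K(x, z) = exp(-||x - z||^2 / sigma^2).
    return np.exp(-np.maximum(sq, 0.0) / sigma**2)

def lq_objective(a, K, y, lam, q):
    # Empirical squared error of f = sum_i a_i K(x_i, .) on the sample,
    # plus an lq penalty on the coefficient vector a (assumed scaling).
    residual = K @ a - y
    return np.mean(residual**2) + lam * np.sum(np.abs(a)**q)

def fit_l2(K, y, lam):
    # Closed-form minimizer for q = 2 only:
    # setting the gradient to zero gives (K^T K + lam * m * I) a = K^T y.
    m = K.shape[0]
    return np.linalg.solve(K.T @ K + lam * m * np.eye(m), K.T @ y)

# Small synthetic example (made-up data, for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 1))
y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.standard_normal(30)

K = gaussian_kernel(X, X, sigma=0.5)
a_hat = fit_l2(K, y, lam=1e-3)
print(lq_objective(a_hat, K, y, lam=1e-3, q=2.0))
```

The same lq_objective function can be evaluated for any q in (0, ∞), which is the family of schemes the abstract compares; for q < 1 the penalty is non-convex, so a minimizer would have to be approximated numerically.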