Structured First-Layer Initialization Pre-Training Techniques to Accelerate Training Process Based on $\varepsilon$-Rank

Quanhui Zhu

Abstract

Training deep neural networks for scientific computing remains computationally expensive because diverse feature representations form slowly in the early stages of training. Recent studies identify a staircase phenomenon in training dynamics, in which decreases in the loss are closely correlated with increases in the $\varepsilon$-rank, a measure of the effective number of linearly independent neuron functions. Motivated by this observation, this work proposes a structured first-layer initialization (SFLI) pre-training method that enhances the diversity of neural features at initialization by constructing $\varepsilon$-linearly independent neurons in the input layer. We present systematic initialization schemes compatible with various activation functions and integrate the strategy into multiple neural architectures, including modified multi-layer perceptrons and physics-informed residual adaptive networks. Extensive numerical experiments on function approximation and PDE benchmarks demonstrate that SFLI significantly improves the initial $\varepsilon$-rank, accelerates convergence, mitigates spectral bias, and enhances prediction accuracy. With SFLI, only one line of code needs to be added to existing algorithms.
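The abstract does not give a formula for the $\varepsilon$-rank, so the following is only a minimal sketch under one plausible reading: evaluate the first-layer neurons at sample points, form their empirical Gram matrix, and count the eigenvalues larger than $\varepsilon$. The function names, the threshold `eps`, and the "spread" initialization (equal slopes with transition points on a uniform grid) are assumptions made for illustration; they are not the paper's SFLI scheme.

```python
import numpy as np


def epsilon_rank(features, eps=1e-6):
    """Count eigenvalues of the empirical Gram matrix that exceed eps.

    features has shape (n_samples, n_neurons); column j holds neuron j
    evaluated at the sample points. Assumption: the Gram matrix is the
    sample average of pairwise neuron products.
    """
    n_samples = features.shape[0]
    gram = features.T @ features / n_samples
    eigvals = np.linalg.eigvalsh(gram)  # Gram matrix is symmetric
    return int(np.sum(eigvals > eps))


def first_layer_features(x, weights, biases):
    """Outputs of a 1D-input tanh layer: tanh(w_j * x + b_j) for each neuron j."""
    return np.tanh(np.outer(x, weights) + biases)


x = np.linspace(-1.0, 1.0, 200)
n_neurons = 16

# Baseline: tiny random weights, so every neuron is nearly affine in x
# and the neurons are almost linearly dependent.
rng = np.random.default_rng(0)
small_w = 1e-3 * rng.standard_normal(n_neurons)
small_b = 1e-3 * rng.standard_normal(n_neurons)

# Hypothetical structured choice (not the paper's scheme): equal slopes,
# with each neuron's transition point placed on a uniform grid in [-1, 1].
spread_w = np.full(n_neurons, 8.0)
spread_b = -spread_w * np.linspace(-1.0, 1.0, n_neurons)

rank_small = epsilon_rank(first_layer_features(x, small_w, small_b))
rank_spread = epsilon_rank(first_layer_features(x, spread_w, spread_b))
print("epsilon-rank, small random init:", rank_small)
print("epsilon-rank, spread init:", rank_spread)
```

Under these assumptions the spread layer should report a higher count than the small-weight baseline, which illustrates what "improving the initial $\varepsilon$-rank" could mean in practice. The count depends on the threshold and on the scaling of the features, so the numbers are only meaningful for comparing initializations under the same settings.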
