Learning to Control Stabilization in Column Generation
Abstract
Column generation is a widely used decomposition technique for large-scale linear programs, but it often suffers from slow convergence due to poor initial dual estimates and dual oscillations. Stabilization techniques such as smoothing and penalization can mitigate these issues, but their effectiveness depends heavily on parameter selection, which requires careful tuning to avoid degrading performance. This paper presents a common framework for smoothing and penalization, showing that despite their different mechanisms, both are governed by two design choices: a reference point in the dual space and stabilization parameters that regulate how strongly that reference influences pricing. Within this framework, we derive parameter bounds that ensure progress, analyze predicted duals as reference points, and establish convergence guarantees for both methods. These results motivate and guide the design of RLSCG, a reinforcement learning-guided framework that adaptively selects stabilization parameters at each iteration. Computational experiments on the Cutting Stock Problem show that RLSCG substantially reduces iteration count and computation time on most synthetic and benchmark instances relative to traditional column generation, rule-based adaptive stabilization, and learning-based column selection, with the largest gains on large-scale instances.
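As a minimal illustration of the two design choices named above (a reference point in the dual space and a parameter regulating its influence on pricing), the sketch below shows the classical smoothing rule, in which the duals passed to the pricing problem are a convex combination of the reference point and the current restricted master duals. The function name, argument names, and the fixed value of `alpha` are assumptions made for illustration only; the abstract does not specify the paper's formulation, and penalization is not shown.

```python
from typing import List, Sequence


def smoothed_duals(
    reference: Sequence[float],
    rmp_duals: Sequence[float],
    alpha: float,
) -> List[float]:
    """Return the dual vector used for pricing under smoothing.

    Hypothetical helper, not taken from the paper.
    reference: reference point in the dual space (e.g. a stability center
               or a predicted dual vector).
    rmp_duals: duals of the current restricted master problem.
    alpha:     smoothing parameter in [0, 1); alpha = 0 recovers
               unstabilized column generation.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if len(reference) != len(rmp_duals):
        raise ValueError("dual vectors must have the same length")
    # Convex combination: larger alpha keeps pricing closer to the reference.
    return [alpha * r + (1.0 - alpha) * d for r, d in zip(reference, rmp_duals)]


if __name__ == "__main__":
    center = [1.0, 0.0]
    current = [0.0, 1.0]
    print(smoothed_duals(center, current, 0.5))
```

In an adaptive scheme of the kind the abstract describes, `alpha` would be chosen anew at each iteration rather than fixed in advance; how RLSCG makes that choice is not stated in the abstract.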