LoMETab: Beyond Rank-1 Ensembles for Tabular Deep Learning
Abstract
Recent tabular learning benchmarks increasingly show a tight performance cluster rather than a clear hierarchy among leading methods, spanning gradient boosted decision trees, attention-based architectures, and implicit ensembles such as TabM. As benchmark gains plateau, a complementary goal is to understand and control the mechanisms that make simple neural tabular models competitive. We propose LoMETab, a rank-r generalization of multiplicative implicit ensembles. LoMETab lifts the rank-1 BatchEnsemble/TabM modulation to a rank-r identity-residual Hadamard family by parameterizing each member weight as Wk = W (1 + AkBk), where W is shared and (Ak, Bk) are member-specific low-rank factors. This exposes two practical diversity-control axes: the adapter rank r and the initialization scale σinit, and we prove that for r 2 this generalization strictly enlarges BatchEnsemble's hypothesis class. Empirically, we show that this added capacity manifests as measurable predictive diversity after training: on representative classification datasets, LoMETab sustains higher pairwise KL than an additive low-rank ablation, and (r, σinit) provides broad control over pairwise KL, varying by up to several orders of magnitude across configurations. The induced diversity is reflected in task-appropriate output-level measures: argmax disagreement for classification and ambiguity for regression, indicating that the control extends beyond pairwise KL to decision- and output-level member variation. Finally, experiments sweeping over adapter rank r and initialization scale σinit reveal that predictive performance is dataset-dependent over the (r, σinit) grid, supporting LoMETab as a controllable family of implicit ensembles rather than a fixed rank-1 construction.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.