Chem-GMNet: A Sphere-Native Geometric Transformer for Molecular Property Prediction
Abstract
Modern SMILES-based chemical language models obtain strong MoleculeNet performance by treating SMILES as generic text and compensating with self-supervised pretraining on millions of molecules. We ask: when a domain carries structural priors as rich as chemistry's, does it warrant a domain-native transformer rather than a generic one rescued by scale? We answer affirmatively with GM-Net (Geometric Measure Network), a transformer family in which every module is replaced by a sphere-native counterpart, and instantiate it as Chem-GMNet. The architecture comprises three blocks: SH-Embedding (tokens as learnable directions on S^{k-1}, lifted through a Gegenbauer feature map); DualSKA (a per-head fusion of two branches: a linear-time gated Sphere-Flow recurrence, whose persistent state we prove is the truncated multipole expansion of the input distribution, and a softmax Sphere-Kernel branch over the same Schoenberg-valid kernel); and SH-FFN (sphere projection → Gegenbauer lift → moment readout). On canonical DeepChem scaffold splits, against same-shape ChemBERTa-2 baselines under the chemberta3-faithful protocol: (i) randomly initialised, Chem-GMNet wins on 7 of 10 MoleculeNet endpoints with ~35% fewer parameters; (ii) pretrained on the same 10M-SMILES ZINC corpus as ChemBERTa-2 MLM-10M, it matches or beats the public release on 6 of 8 shared endpoints (5 of 7 excluding a known ClinTox release anomaly). A (k, L) ablation shows that increasing the sphere dimension from k = 8 to k = 10 at fixed L = 3 lowers ESOL RMSE to 0.938 when trained from scratch, beating pretrained ChemBERTa-2 MLM-10M on this endpoint without any pretraining.
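To make the "directions on S^{k-1}" and "Schoenberg-valid kernel" ideas concrete, the following is a minimal sketch in plain Python, not the paper's implementation. It assumes the kernel is a truncated Gegenbauer series in the inner product of two unit vectors, with index alpha = (k - 2) / 2 and non-negative weights (the condition in Schoenberg's theorem for positive-definite zonal kernels on the sphere). The function names `normalize`, `gegenbauer`, and `sphere_kernel`, the default uniform weights, and the normalisation by the value at t = 1 are all illustrative assumptions; the abstract does not specify them.

```python
import math
import random


def normalize(v):
    """Project a nonzero vector onto the unit sphere S^{k-1}."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def gegenbauer(n, alpha, t):
    """Evaluate C_n^{(alpha)}(t) with the standard three-term recurrence:
    m * C_m = 2t(m + alpha - 1) * C_{m-1} - (m + 2*alpha - 2) * C_{m-2}.
    """
    if n == 0:
        return 1.0
    prev, curr = 1.0, 2.0 * alpha * t
    for m in range(2, n + 1):
        nxt = (2.0 * t * (m + alpha - 1.0) * curr
               - (m + 2.0 * alpha - 2.0) * prev) / m
        prev, curr = curr, nxt
    return curr


def sphere_kernel(x, y, L, weights=None):
    """Truncated zonal kernel sum_{l=0..L} w_l * C_l(<x,y>) / C_l(1).

    Assumptions (hypothetical, for illustration only): x and y are unit
    vectors in R^k with k >= 3, and every weight w_l is non-negative.
    """
    k = len(x)
    if k < 3:
        raise ValueError("this sketch assumes k >= 3 so that alpha > 0")
    alpha = (k - 2) / 2.0
    if weights is None:
        weights = [1.0] * (L + 1)  # uniform weights: an arbitrary choice
    # Clip to [-1, 1] to absorb floating-point drift in the inner product.
    t = max(-1.0, min(1.0, sum(a * b for a, b in zip(x, y))))
    return sum(
        w * gegenbauer(l, alpha, t) / gegenbauer(l, alpha, 1.0)
        for l, w in enumerate(weights)
    )


# Two random "token directions" on S^7 (k = 8, as in the ablation's baseline).
random.seed(0)
u = normalize([random.gauss(0.0, 1.0) for _ in range(8)])
v = normalize([random.gauss(0.0, 1.0) for _ in range(8)])
similarity = sphere_kernel(u, v, L=3)
```

With this normalisation each term equals 1 at t = 1, so `sphere_kernel(u, u, L)` returns the sum of the weights, which gives a quick sanity check on the recurrence. In a trained model the directions and weights would be learned parameters rather than random draws.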