Inferring identified hadron production in pp collisions with physics-informed machine learning at the LHC
Abstract
Machine learning has become a powerful tool in high-energy collider experiments, enabling data-driven approaches to complex reconstruction and regression tasks. Identified hadron spectra in pseudorapidity regions beyond detector acceptance, which is limited to mid-rapidity, carry important information about particle production yet remain unmeasured. In this work, we develop a physics-informed neural network, trained on PYTHIA8 pp collisions at $\sqrt{s} = 13.6$ TeV, to infer the $p_{\mathrm{T}}$ spectra of $\pi$, $K$, $p/\bar{p}$, /, and $K^0_S$ in different rapidity regions. Physics-motivated constraints on particle yield ratios, spectral shape, and smoothness are incorporated into the loss function. A staged hyperparameter optimization strategy is used to ensure stable training. The model achieves yield uncertainties of 1.5%, 1.8%, and 5.83% in the training, interpolation, and extrapolation regimes, respectively, outperforming XGBoost and LightGBM. It further reproduces key observables such as particle yield ratios, the multiplicity dependence of $p_{\mathrm{T}}$, and kinetic freeze-out parameters, indicating that the model captures the underlying physics and provides reliable predictions beyond the measured phase space.
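To make the idea of a physics-informed loss concrete, the following is a minimal sketch, not the authors' implementation. It assumes spectra are stored as arrays of shape (species, $p_{\mathrm{T}}$ bins) and combines a data term with a yield-ratio penalty and a smoothness penalty. The function name `physics_informed_loss`, the weights `w_ratio` and `w_smooth`, and the specific functional forms are all hypothetical; the spectral-shape term is omitted because the abstract does not specify its form.

```python
import numpy as np

def physics_informed_loss(pred, target, pt_widths, ratio_pair=(1, 0),
                          w_ratio=0.1, w_smooth=0.01):
    """Illustrative composite loss (all names and weights are assumptions).

    pred, target: positive arrays of shape (n_species, n_pt_bins), dN/dpT.
    pt_widths:    array of shape (n_pt_bins,), the pT bin widths.
    ratio_pair:   (numerator, denominator) species indices for the yield ratio.
    """
    # Data term: mean squared error between log spectra.
    data_term = np.mean((np.log(pred) - np.log(target)) ** 2)

    # Yield-ratio term: integrate each spectrum over pT, then compare ratios.
    yields_pred = (pred * pt_widths).sum(axis=1)
    yields_true = (target * pt_widths).sum(axis=1)
    num, den = ratio_pair
    ratio_term = (yields_pred[num] / yields_pred[den]
                  - yields_true[num] / yields_true[den]) ** 2

    # Smoothness term: penalize the second finite difference of the
    # log spectrum along the pT axis.
    curvature = np.diff(np.log(pred), n=2, axis=1)
    smooth_term = np.mean(curvature ** 2)

    return data_term + w_ratio * ratio_term + w_smooth * smooth_term

# Toy usage with two made-up exponential spectra (not data from the paper).
pt = np.linspace(0.1, 3.0, 30)
widths = np.full_like(pt, pt[1] - pt[0])
target = np.stack([np.exp(-pt / 0.3), 0.1 * np.exp(-pt / 0.4)])
wiggly = target * (1.0 + 0.1 * np.sin(5.0 * pt))

loss_exact = physics_informed_loss(target, target, widths)
loss_wiggly = physics_informed_loss(wiggly, target, widths)
```

In this toy setup the exact prediction gives a loss that is zero up to rounding, while the oscillating prediction is penalized by all three terms. In a real training loop the same terms would be written in an automatic-differentiation framework so that gradients flow through them.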