Populating Galaxies Into Halos Via Machine Learning on the Simba Simulation
Abstract
We present a machine-learning framework, Machine Inferred Galaxy (MIG), to populate dark-matter haloes with galaxies in N-body simulations. MIG predicts stellar mass ($M_*$), star-formation rate (SFR), atomic and molecular gas masses ($M_{\rm HI}$ and $M_{\rm H_2}$), and metallicity, and can be extended to other properties and simulations. The pipeline first separates haloes into centrals and satellites, then uses classifiers to distinguish star-forming (SF) from quenched (Q) systems, followed by regressors trained on the SF subsets for both centrals and satellites. Trained on the $(100\,h^{-1}\,{\rm Mpc})^3$ SIMBA galaxy-formation simulation at $z=0$, MIG achieves high accuracy for key baryonic properties (e.g. $R^2 \approx 0.9$ for $M_{\rm HI}$ of central galaxies), and remains robust at $z=1$ and $z=2$. Training on fractional quantities (e.g. $M_{\rm HI}/M_*$) and rescaling by predicted $M_*$ improves performance over direct predictions across properties and redshifts. MIG also reproduces galaxy mass distribution functions with higher fidelity, enabling accurate predictions of integrated tracers such as H I intensity maps. MIG therefore provides an efficient, physically consistent route to generate mock galaxy catalogues and baryonic tracers in large cosmological volumes for upcoming surveys.
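The staged design described in the abstract (split by central or satellite, classify SF versus Q, regress on the SF subset, then rescale a predicted gas fraction by the predicted stellar mass) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not name the learning algorithms or the input features, so a halo-mass cut stands in for the classifier, straight-line fits stand in for the regressors, and the toy haloes follow made-up scaling relations rather than SIMBA data. All function and field names are hypothetical. Working in log space turns the rescaling step into an addition, log M_HI = log(M_HI/M_*) + log M_*.

```python
import random

def fit_line(x, y):
    """Least-squares fit y ~ slope * x + intercept (stand-in for a real regressor)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_threshold(x, quenched):
    """Stand-in classifier: the cut on x that best separates Q (above) from SF."""
    def accuracy(cut):
        return sum((xi > cut) == q for xi, q in zip(x, quenched)) / len(x)
    return max(sorted(x), key=accuracy)

def make_toy_haloes(n, seed=0):
    """Synthetic haloes with invented scaling relations; NOT SIMBA data."""
    rng = random.Random(seed)
    haloes = []
    for _ in range(n):
        log_mh = rng.uniform(11.0, 14.0)            # log10 halo mass
        central = rng.random() < 0.7
        quenched = log_mh > (13.0 if central else 12.5)
        log_mstar = 10.0 + 0.8 * (log_mh - 12.0) + rng.gauss(0.0, 0.1)
        log_fhi = -0.3 - 0.5 * (log_mh - 12.0) + rng.gauss(0.0, 0.1)
        haloes.append({"log_mh": log_mh, "central": central, "quenched": quenched,
                       "log_mstar": log_mstar, "log_mhi": log_mstar + log_fhi})
    return haloes

def train(haloes):
    """One classifier and two regressors per population (central, satellite)."""
    models = {}
    for central in (True, False):
        pop = [h for h in haloes if h["central"] == central]
        cut = fit_threshold([h["log_mh"] for h in pop], [h["quenched"] for h in pop])
        sf = [h for h in pop if not h["quenched"]]  # regress on the SF subset only
        x = [h["log_mh"] for h in sf]
        mstar_fit = fit_line(x, [h["log_mstar"] for h in sf])
        # The target is the fraction log10(M_HI / M_*), not M_HI itself.
        frac_fit = fit_line(x, [h["log_mhi"] - h["log_mstar"] for h in sf])
        models[central] = {"cut": cut, "mstar": mstar_fit, "frac": frac_fit}
    return models

def predict(models, log_mh, central):
    """Classify first; for SF systems, rescale the fraction by predicted M_*."""
    m = models[central]
    if log_mh > m["cut"]:
        # Simplification of this sketch: no regression output for Q systems.
        return {"quenched": True, "log_mstar": None, "log_mhi": None}
    log_mstar = m["mstar"][0] * log_mh + m["mstar"][1]
    log_frac = m["frac"][0] * log_mh + m["frac"][1]
    return {"quenched": False, "log_mstar": log_mstar, "log_mhi": log_frac + log_mstar}

haloes = make_toy_haloes(2000)
models = train(haloes)
example = predict(models, 12.0, central=True)
```

Here `example` holds the prediction for a central halo with log10 halo mass 12. In a realistic setting the two stand-in models would be replaced by trained classifiers and regressors taking several halo features as input, while the control flow (population split, SF/Q gate, fraction target, rescaling) would keep the same shape.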