Towards LLM-Powered Automation of a Dark Matter Constraint Repository

Abstract

Dark matter constraint repositories are critical community infrastructure, giving experimentalists and theorists a shared landscape of existing bounds. Yet the most widely used repositories are maintained by individual volunteers, creating a sustainability risk as the pace of new results accelerates. We present a large language model (LLM) pipeline that monitors arXiv, extracts limit curves from papers, integrates them as code, and opens pull requests (PRs) for human review. On a 346-paper benchmark whose ground truth is the upstream-curated repository itself, the pipeline classifies the coupling type correctly for 90.5% of papers and reaches a median coupling residual of 0.33 dex (within a factor of two for 48% of curves), with 76% mean mass-range coverage. This performance rests on three components: treating each extraction as a noisy sample reconciled through consensus voting, a physics convention canonicalization layer built with the agentic physics assistant Get Physics Done (GPD), and a scoring methodology that separates genuine extraction error from non-comparability. The remaining difficulty is concentrated in rare coupling types with idiosyncratic conventions (macro-averaged residual 1.1 dex). The pipeline is deployed and has generated limit proposals; none have merged. Governance of AI-generated scientific data is itself an unsolved problem.
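To make the dex figures concrete, the following is a minimal sketch of how a residual in dex and a consensus over repeated extractions could be computed. It is illustrative only: the abstract does not specify the voting rule or the scoring code, so the pointwise log-space median used here, the function names, and the toy numbers are all assumptions, not the pipeline's implementation.

```python
import math
from statistics import median


def dex_residual(extracted, reference):
    """Absolute log10 ratio between two positive coupling values, in dex."""
    return abs(math.log10(extracted / reference))


def consensus_curve(samples):
    """Pointwise median in log10 space across repeated extractions.

    ASSUMPTION: every sample is a list of coupling values on the same
    mass grid. The paper's actual consensus rule is not given in the
    abstract; a median is used here as a simple stand-in.
    """
    return [10 ** median(math.log10(v) for v in point) for point in zip(*samples)]


def median_residual(curve, reference):
    """Median dex residual of a curve against a reference curve."""
    return median(dex_residual(e, r) for e, r in zip(curve, reference))


# Toy data (hypothetical): three noisy extractions of the same limit curve,
# each wrong at a different grid point, the last one by three orders of magnitude.
reference = [1e-10, 1e-11, 1e-12]
samples = [
    [2e-10, 1e-11, 1e-12],
    [1e-10, 2e-11, 1e-12],
    [1e-10, 1e-11, 1e-9],
]
consensus = consensus_curve(samples)
```

In this toy case each individual extraction is off at one grid point, while the consensus curve agrees with the reference at every point because the outliers never form a majority. For orientation, a residual of 0.33 dex corresponds to a multiplicative factor of 10^0.33, roughly 2.1, and 1.1 dex to roughly a factor of 13.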
