Omnibenchmark: transparent, reproducible, extensible and standardized orchestration of solo and collaborative benchmarks

Abstract

Benchmarking involves designing, running and disseminating rigorous performance assessments of methods, most often data analysis methods and software tools, although the process can also be applied to experimental systems. Ideally, a benchmarking system facilitates this process by providing a structured entry point to design, coordinate, execute, and store standardized benchmarks. We describe a novel benchmarking system, Omnibenchmark, that facilitates benchmark formalization and execution in both solo and community efforts. Omnibenchmark provides a flexible benchmark plan syntax (i.e., a configuration YAML file), dynamic workflow generation based on Snakemake, S3-compatible storage handling, and reproducible software environments using environment modules, Apptainer or Conda. This setup offers unprecedented flexibility: existing benchmark designs can be forked and extended, and run separately or collaboratively, yielding versioned and standardized result outputs and therefore much-needed transparency in the analysis and interpretation of benchmark results. Tutorials and installation instructions are available from https://omnibenchmark.org.
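To illustrate the idea of generating a workflow from a declarative benchmark plan, the following minimal Python sketch expands a plan into one output path per combination of modules across stages. It is an illustration under stated assumptions only: the plan structure, the stage and module names, the `expand_plan` helper and the path layout are all hypothetical and do not reproduce the actual Omnibenchmark YAML schema or its Snakemake-based implementation.

```python
from itertools import product

# Hypothetical benchmark plan, written as a Python dict for self-containment.
# This is NOT the real Omnibenchmark YAML schema; all names are illustrative.
PLAN = {
    "name": "toy_benchmark",
    "stages": [
        {"id": "data", "modules": ["D1", "D2"]},
        {"id": "methods", "modules": ["M1", "M2", "M3"]},
        {"id": "metrics", "modules": ["accuracy"]},
    ],
}


def expand_plan(plan):
    """Return one output path per combination of modules (one module per stage)."""
    stage_ids = [stage["id"] for stage in plan["stages"]]
    module_lists = [stage["modules"] for stage in plan["stages"]]
    paths = []
    # The Cartesian product enumerates every dataset x method x metric chain.
    for combo in product(*module_lists):
        parts = [f"{sid}/{mod}" for sid, mod in zip(stage_ids, combo)]
        paths.append("/".join([plan["name"], *parts]))
    return paths


if __name__ == "__main__":
    for path in expand_plan(PLAN):
        print(path)
```

In this sketch, adding a module to any stage extends the set of outputs without touching the expansion logic, which is the kind of extensibility a plan-driven design aims for.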
