RAGRouter-Bench: A Dataset and Benchmark for Adaptive RAG Routing
Abstract
Retrieval-augmented generation (RAG) has evolved into a family of paradigms with distinct performance profiles and resource demands, turning paradigm selection into a multi-criteria, context-dependent decision problem. Yet existing studies largely focus on isolated method improvements or query-only benchmarking, without systematically examining how RAG paradigms behave across diverse query-corpus contexts and effectiveness-efficiency trade-offs. In this work, we introduce RAGRouter-Bench, the first dataset and benchmark for adaptive RAG routing. Grounded in query-corpus compatibility, the benchmark integrates three canonical query types, fine-grained corpus indicators capturing structural and semantic properties, and a unified protocol for evaluating both generation quality and resource consumption. We then implement standardized RAG paradigms with multiple backbone LLMs across all query-corpus combinations, constructing a comprehensive benchmark with quantitative metrics and LLM-as-a-Judge evaluations to inform context-aware and cost-effective RAG routing decisions. We further formulate routing as context-dependent paradigm selection and benchmark a range of query-corpus routers on the constructed dataset. Extensive experiments demonstrate that no single paradigm is best across all query-corpus pairs, and that adaptive routing yields more favorable effectiveness-efficiency trade-offs than fixed paradigm selection. These findings establish query-corpus compatibility as a central principle for adaptive RAG routing and position RAGRouter-Bench as a systematic testbed for next-generation RAG systems.
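To make "routing as context-dependent paradigm selection" concrete, the following is a minimal sketch and not the paper's formulation. It assumes each paradigm's outcome on a query-corpus context can be summarized by a quality score and a cost, both on a hypothetical [0, 1] scale, and that the two are combined by a linear utility with a hypothetical `cost_weight`. The paradigm names, context names, and numbers are invented for illustration only. The router shown is an oracle that reads the known outcomes; the routers benchmarked in the paper would have to predict the choice from query and corpus features.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    quality: float  # hypothetical generation-quality score in [0, 1]
    cost: float     # hypothetical normalized resource cost in [0, 1]


def utility(outcome, cost_weight):
    # Assumed linear trade-off: reward quality, penalize cost.
    return outcome.quality - cost_weight * outcome.cost


def route(outcomes, cost_weight=0.5):
    # Oracle routing: pick the paradigm with the highest utility in this context.
    return max(outcomes, key=lambda name: utility(outcomes[name], cost_weight))


def fixed_vs_adaptive(table, cost_weight=0.5):
    # table maps context -> paradigm -> Outcome.
    # Returns (mean utility of the best single fixed paradigm,
    #          mean utility of per-context oracle routing).
    paradigms = list(next(iter(table.values())))
    n = len(table)
    best_fixed = max(
        sum(utility(table[ctx][p], cost_weight) for ctx in table) / n
        for p in paradigms
    )
    adaptive = sum(
        utility(table[ctx][route(table[ctx], cost_weight)], cost_weight)
        for ctx in table
    ) / n
    return best_fixed, adaptive


# Illustrative numbers only; they are not results from the benchmark.
table = {
    "context_1": {"paradigm_a": Outcome(0.80, 0.2), "paradigm_b": Outcome(0.85, 0.9)},
    "context_2": {"paradigm_a": Outcome(0.40, 0.2), "paradigm_b": Outcome(0.90, 0.8)},
}

best_fixed, adaptive = fixed_vs_adaptive(table)
print(route(table["context_1"]), route(table["context_2"]))
print(best_fixed, adaptive)
```

In this toy table the cheaper paradigm wins in one context and the costlier one in the other, so per-context selection scores higher on average than any single fixed choice. Under this utility, oracle routing can never score below the best fixed paradigm; how close a learned router gets is an empirical question.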