EviRerank: Adaptive Evidence Construction for Long-Document LLM Reranking
Abstract
Decoder-only LLM rerankers struggle with long documents: inference is costly and relevance signals can be diluted by irrelevant context. Motivated by a diagnostic attention analysis suggesting that appended irrelevant context can weaken query-focused interactions, we propose EviRerank, an evidence-based long-document reranking framework for decoder-only LLMs. EviRerank first scores document blocks with a lightweight selector, such as BM25, a bi-encoder, or a cross-encoder. It then constructs a compact reranking context under a hard token cap by dynamically budgeting evidence blocks with Adaptive Evidence Budgeting (AEB) and adding a compact global cue via Summary Augmentation (SA). Finally, the compact evidence context is reranked with a decoder-only LLM. Across TREC DL'19, DL'22, DL'23, and MLDR-zh, EviRerank consistently outperforms full-document LLM reranking and strong block-selection baselines while reducing input length. RankZephyr-7B validation further confirms transfer to listwise reranking. On TREC DL'19, EviRerank reaches up to 0.744 nDCG@10 and 0.307 MAP, improving over RankLLaMA while using a compact evidence context.
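To make the pipeline in the abstract concrete, here is a minimal sketch of selector-scored evidence construction under a hard token cap. It is an illustration under stated assumptions, not the paper's method: the abstract does not specify how AEB allocates its budget, so the greedy fill below is a stand-in. The function names (`split_into_blocks`, `overlap_score`, `build_evidence_context`), the whitespace token count, and the term-overlap selector (used in place of BM25, a bi-encoder, or a cross-encoder) are all hypothetical.

```python
from typing import Callable, List


def split_into_blocks(text: str, block_size: int) -> List[str]:
    """Split text into blocks of at most `block_size` whitespace tokens."""
    words = text.split()
    return [" ".join(words[i:i + block_size])
            for i in range(0, len(words), block_size)]


def overlap_score(query: str, block: str) -> float:
    """Toy selector (assumption): fraction of query terms found in the block."""
    q_terms = set(query.lower().split())
    b_terms = set(block.lower().split())
    return len(q_terms & b_terms) / len(q_terms) if q_terms else 0.0


def build_evidence_context(
    query: str,
    document: str,
    token_cap: int,
    block_size: int = 8,
    summary: str = "",
    score_fn: Callable[[str, str], float] = overlap_score,
) -> str:
    """Greedy stand-in for budgeted evidence selection (not the paper's AEB).

    Tokens are approximated by whitespace-separated words (assumption).
    """
    # Reserve room for the global cue first, truncating it to the cap if needed.
    summary_words = summary.split()[:token_cap]
    budget = token_cap - len(summary_words)

    blocks = split_into_blocks(document, block_size)
    # Rank block indices by selector score, best first (ties keep document order).
    ranked = sorted(range(len(blocks)),
                    key=lambda i: score_fn(query, blocks[i]),
                    reverse=True)

    chosen: List[int] = []
    for i in ranked:
        cost = len(blocks[i].split())
        if cost <= budget:
            chosen.append(i)
            budget -= cost

    # Restore document order so the reranker sees evidence in reading order.
    chosen.sort()
    parts = ([" ".join(summary_words)] if summary_words else [])
    parts += [blocks[i] for i in chosen]
    return "\n".join(parts)
```

The returned string would then be paired with the query and passed to the decoder-only reranker. In practice the word count would be replaced by the reranker's own tokenizer, and the fixed greedy rule by whatever adaptive budgeting policy the paper defines.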