PAC-MCTS: Bias-Aware Pruning for Robust LLM-Guided Search and Planning
Abstract
As search depth increases in autonomous reasoning and embodied planning, candidate action spaces expand exponentially, often exhausting computational budgets. While heuristic pruning is a critical countermeasure, existing approaches lack formal safety guarantees when guided by surrogate evaluators such as Large Language Models (LLMs), which exhibit systematic biases. We formulate node expansion as a localized Best-Arm Identification (BAI) problem under bounded bias L and derive a sample complexity upper bound of O((-4L)-2), identifying > 4L as the regime where safe elimination is feasible. We further establish an information-theoretic lower bound of ((-2L)-2) that characterizes the structural limits of biased exploration. Motivated by these results, we propose PAC-MCTS, a bias-aware pruning framework that dynamically adapts confidence bounds during search. Experiments on Blocksworld and ALFWorld demonstrate that PAC-MCTS consistently improves robustness and search efficiency over strong pruning baselines, achieving up to 78\% fewer API evaluations and over 3× higher sample efficiency under strict compute budgets. Ablation studies further validate the predicted degradation behavior as evaluator bias increases.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.