Compositional Consistency-Guided Decoding for Three-Way Logical Question Answering

Abstract

Three-way logical question answering (QA) assigns one of True, False, or Unknown to a hypothesis H given a premise set S. We study this task as a compact compositional inference problem: predictions for H and for a mechanically negated hypothesis ¬H should agree under a deterministic negation map. Despite this simple structure, large language models (LLMs) can exhibit two practical failure modes: (i) negation inconsistency, where answers to H and ¬H violate the required label mapping, and (ii) epistemic Unknown, where the model abstains even when one side is entailed. We introduce CGD-PD, a lightweight, training-free test-time layer that combines neural 3-way classification, symbolic negation-consistency projection, and targeted binary entailment probes. On one validation split of FOLIO's first-order logic fields, CGD-PD improves accuracy by 4.4 points on GPT-5.2 and 6.8 points on Claude Sonnet 4.5, while reducing Unknown predictions and epistemic abstention. These results provide a controlled proof of concept that simple logical composition at inference time can help evaluate and improve LLM reasoning reliability; they do not, by themselves, establish robustness beyond this formal benchmark setting.
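The negation-consistency constraint described above can be illustrated with a minimal Python sketch. It assumes the deterministic negation map swaps True and False and leaves Unknown fixed, which is the standard reading for three-way entailment labels. The names `Label`, `negate`, `is_consistent`, and `project` are hypothetical, and the repair rule in `project` is an illustrative assumption only: the abstract does not specify how CGD-PD resolves inconsistent pairs, and the binary entailment probes are not modeled here.

```python
from enum import Enum


class Label(Enum):
    TRUE = "True"
    FALSE = "False"
    UNKNOWN = "Unknown"


# Assumed negation map: True <-> False, Unknown stays Unknown.
NEGATION_MAP = {
    Label.TRUE: Label.FALSE,
    Label.FALSE: Label.TRUE,
    Label.UNKNOWN: Label.UNKNOWN,
}


def negate(label: Label) -> Label:
    """Label that the negated hypothesis should receive, given the label for H."""
    return NEGATION_MAP[label]


def is_consistent(label_h: Label, label_not_h: Label) -> bool:
    """True when the predictions for H and its negation respect the negation map."""
    return label_not_h == negate(label_h)


def project(label_h: Label, label_not_h: Label) -> Label:
    """Return a label for H after a simple consistency repair.

    Hypothetical rule, not taken from the paper: a definite answer on one
    side overrides Unknown on the other; two contradictory definite answers
    fall back to Unknown.
    """
    if is_consistent(label_h, label_not_h):
        return label_h
    if label_h is Label.UNKNOWN:
        # Only the negated side is definite: map it back to a label for H.
        return negate(label_not_h)
    if label_not_h is Label.UNKNOWN:
        # Only the original side is definite: keep it.
        return label_h
    # Both sides are definite but contradictory (both True or both False).
    return Label.UNKNOWN
```

For example, under these assumptions `project(Label.UNKNOWN, Label.FALSE)` returns `Label.TRUE`: the model abstained on H but judged the negated hypothesis False, so the repair recovers a definite label for H. A pair such as (True, True) is a negation inconsistency with no obvious winner, which is the kind of case where a further check, such as the targeted binary entailment probes the abstract mentions, would be needed.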
