DISTINCT: A Description-Guided Branch-Consistency Analysis Framework for Non-Regressive Test Case Generation

Kainan Li

Abstract

Automated test-generation research overwhelmingly assumes that focal methods are correct, yet practitioners routinely face non-regression scenarios in which the focal method may be defective. A baseline evaluation of EVOSUITE and two leading Large Language Model (LLM)-based generators, CHATTESTER and CHATUNITEST, on defective focal methods reveals that, despite achieving up to 83% branch coverage, none of the generated tests expose defects, because the generators lack awareness of developer intent. To address this problem, we first construct two new benchmarks, Defects4J-Desc and QuixBugs-Desc, in which each focal method is paired with a Natural Language Description (NLD) of its intended functionality. We then propose DISTINCT, a description-guided branch-consistency analysis framework that transforms LLMs into fault-aware test generators. DISTINCT comprises three components: (1) a Generator that derives initial tests from the NLD and the focal method, (2) a Validator that iteratively fixes uncompilable tests using compiler diagnostics, and (3) an Analyzer that iteratively aligns test behavior with NLD semantics via branch-level analysis. Extensive experiments confirm the effectiveness of our approach. Compared to state-of-the-art approaches, DISTINCT achieves an average improvement of 14.64% in Compilation Success Rate (CSR), 6.66% in Passing Rate (PR), and, most notably, 95.22% in Defect Detection Rate (DDR) across both benchmarks. In terms of code coverage, DISTINCT improves Statement Coverage (SC) by an average of 3.77% and Branch Coverage (BC) by 5.36%. These results set a new baseline for non-regressive test generation and highlight how description-driven reasoning enables LLMs to move beyond coverage chasing toward effective defect detection.
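The three-stage flow described in the abstract can be pictured with a small control-flow sketch. This is an illustration only, not the authors' implementation: the names `fix_until`, `run_pipeline`, `Outcome`, and `max_rounds`, along with the callables passed in (`generate`, `compile_check`, `repair`, `branch_check`, `realign`), are hypothetical stand-ins for the LLM calls, the compiler, and the branch-level consistency analysis, whose actual interfaces the abstract does not specify. The sketch also simplifies by not re-validating compilation after the Analyzer stage.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Outcome:
    test_code: str    # final test text
    compiled: bool    # True if the Validator stage ended with no diagnostics
    consistent: bool  # True if the Analyzer stage ended with no mismatch


def fix_until(
    test: str,
    check: Callable[[str], Optional[str]],
    fix: Callable[[str, str], str],
    max_rounds: int,
) -> Tuple[str, bool]:
    """Repeat check/fix until check reports no problem or rounds run out."""
    for _ in range(max_rounds):
        problem = check(test)
        if problem is None:
            return test, True
        test = fix(test, problem)
    # Re-check once so the last fix is not left unverified.
    return test, check(test) is None


def run_pipeline(nld, focal_method, generate, compile_check, repair,
                 branch_check, realign, max_rounds: int = 3) -> Outcome:
    # (1) Generator: initial test from the description and the focal method.
    test = generate(nld, focal_method)
    # (2) Validator: repair the test using compiler diagnostics.
    test, compiled = fix_until(test, compile_check, repair, max_rounds)
    if not compiled:
        return Outcome(test, False, False)
    # (3) Analyzer: align test behavior with the description (hypothetical
    # branch_check returns a mismatch report, or None when consistent).
    test, consistent = fix_until(test, branch_check, realign, max_rounds)
    return Outcome(test, True, consistent)
```

Passing the stages in as callables keeps the loop structure visible without assuming anything about the prompts, the compiler invocation, or how branch consistency is actually judged.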
