CodeGENCAT: Generative Computerized Adaptive Testing for Open-ended Coding Problems
Abstract
Existing Computerized Adaptive Testing (CAT) frameworks typically select questions based on the predicted likelihood that the student will answer correctly. This design ignores information contained in students' open-ended responses, especially in domains such as programming education, where code structures and bugs contain rich information on student knowledge. In this work, we propose Code GENerative CAT (CodeGENCAT), a generative CAT framework that selects questions using predicted student code responses. First, we develop a Generative Item Response Theory (GIRT) model that generates code responses conditioned on estimated student knowledge, trained with supervised fine-tuning followed by direct preference optimization for knowledge-response alignment. Second, we introduce three question-selection algorithms that measure uncertainty, coding style diversity, and information from predicted student code responses. Experiments on two real-world programming education datasets show that CodeGENCAT outperforms all CAT baselines, achieving an AUC improvement of up to 4.32% over the strongest baseline in the early stages of adaptive testing.
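To make the idea of selecting questions from predicted code responses concrete, the following is a minimal sketch of an uncertainty-style selection rule. It is not the authors' algorithm, which the abstract does not specify. It assumes a hypothetical sampler `sample_responses(question, k)` standing in for a generative response model, and it scores each candidate question by the Shannon entropy of the sampled responses under exact string matching, a deliberate simplification. All function names, question identifiers, and the toy sampler are illustrative assumptions.

```python
import math
from collections import Counter
from typing import Callable, List


def response_entropy(responses: List[str]) -> float:
    """Shannon entropy (in nats) of the empirical distribution over distinct responses."""
    counts = Counter(responses)
    total = len(responses)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_question(
    candidates: List[str],
    sample_responses: Callable[[str, int], List[str]],
    k: int = 8,
) -> str:
    """Pick the candidate whose sampled responses are most uncertain (highest entropy)."""
    scores = {q: response_entropy(sample_responses(q, k)) for q in candidates}
    return max(scores, key=scores.get)


def toy_sampler(question: str, k: int) -> List[str]:
    """Hypothetical stand-in for a generative model: cycles through canned responses."""
    pool = {
        "q_loops": ["for i in range(n): total += i", "while i < n: total += i"],
        "q_print": ["print(x)"],
    }[question]
    return [pool[i % len(pool)] for i in range(k)]


best = select_question(["q_loops", "q_print"], toy_sampler)
print(best)
```

In this toy setup the sampler produces two distinct responses for `q_loops` and only one for `q_print`, so the rule prefers `q_loops`, the question whose predicted response is less certain. A practical variant would likely compare responses by structure or behavior rather than by exact text.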