CornerCase: Automated Extremal Testing of Protocol Implementations using LLMs

Abstract

Many software bugs in network protocol implementations arise near specification boundaries, such as inputs just within or outside allowed ranges, or messages that are valid in isolation but invalid in a given state. From the SSL Heartbleed exploit to TCP Christmas Tree packets, boundary inputs have repeatedly exposed critical weaknesses, yet they remain under-tested by existing techniques such as fuzzing and model-based testing. We present CornerCase, an automated extremal testing approach that systematically targets such boundary behaviors. Our key idea is to decompose test generation into two stages: first, large language models (LLMs) extract explicit validity constraints from protocol specifications (e.g., RFCs) in a structured, section-by-section manner; second, extremal test cases are generated at or near the boundary of each constraint. These tests are executed across multiple implementations, and differential testing identifies inconsistencies. We evaluate CornerCase on widely used implementations of HTTP, DNS, BGP, SMTP, and QUIC, uncovering many previously unknown bugs. For example, the HTTP server h2o enters a redirect loop when processing URLs containing encoded null bytes. Overall, we used CornerCase to identify and file 42 anomalies; to date, 26 have been acknowledged as bugs and 18 fixed, with others under active investigation.
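
To make the second stage and the differential comparison concrete, here is a minimal Python sketch. It is an illustration only, not the CornerCase implementation: the `RangeConstraint` type, the 16-bit `port` field, and the two toy validators are all hypothetical, and the constraint is written by hand where CornerCase would use LLM-extracted constraints. The sketch generates values at and just outside each bound of a range constraint, runs them through every implementation, and reports any value on which the implementations disagree.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class RangeConstraint:
    """Hypothetical stand-in for a validity constraint taken from a specification."""
    field: str
    lo: int  # smallest valid value (inclusive)
    hi: int  # largest valid value (inclusive)


def extremal_values(c: RangeConstraint) -> List[int]:
    """Return the values at, and one step outside, each bound of the range."""
    return sorted({c.lo - 1, c.lo, c.hi, c.hi + 1})


def differential_test(
    c: RangeConstraint, impls: Dict[str, Callable[[int], bool]]
) -> List[dict]:
    """Run each extremal value through every implementation and
    record the values on which the accept/reject verdicts differ."""
    anomalies = []
    for value in extremal_values(c):
        verdicts = {name: impl(value) for name, impl in impls.items()}
        if len(set(verdicts.values())) > 1:
            anomalies.append(
                {"field": c.field, "value": value, "verdicts": verdicts}
            )
    return anomalies


# Toy validators (assumptions for illustration, not real protocol code).
def strict_impl(port: int) -> bool:
    return 0 <= port <= 65535


def buggy_impl(port: int) -> bool:
    return 0 <= port <= 65536  # off-by-one: accepts one value past the bound


constraint = RangeConstraint(field="port", lo=0, hi=65535)
report = differential_test(
    constraint, {"strict": strict_impl, "buggy": buggy_impl}
)
for anomaly in report:
    print(anomaly)
```

Only the value just past the upper bound separates the two validators here, which is the kind of input a random fuzzer is unlikely to hit but a boundary-directed generator produces by construction. A disagreement flags an anomaly for investigation; it does not by itself say which implementation is wrong.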
