SecGoal: A Benchmark for Extracting Formalizable Security Goals from Protocol Documents
Abstract
Formal verification provides rigorous guarantees for cryptographic security, yet extracting formalizable security goals from natural-language protocol documents remains largely manual. We introduce SecGoal, a dedicated expert-annotated dataset and benchmark for extracting formalizable security goal statements from protocol documents, covering 15 widely deployed protocols, together with AIFG, a schema- and flow-conditioned framework for structured formal security property generation. Our evaluation shows that frontier and large LLMs achieve high property recall but low extraction precision because they often fail to distinguish formalizable security goals from non-goal protocol content. In contrast, SecGoal fine-tuning makes smaller open-source LLMs substantially more selective extractors of formalizable security goals. On the held-out test protocols, Gemma2-9B-FT improves extraction precision from 24.0\% to 66.6\% and reaches 97.6\% property recall, outperforming larger prompted LLMs and encoder baselines. In a controlled setting, AIFG shows that concise goal inputs can support high-recall structured property generation, while expert-vetted extracted inputs reveal over-generation as the main remaining bottleneck. Together, SecGoal and AIFG provide a dataset, benchmark, and framework for specification-grounded security goal extraction and property generation.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.