GitReq: A Gold Standard Dataset for Software Quality Requirements
Abstract
GitHub issue trackers contain millions of developer-written quality concerns, including performance bottlenecks and security vulnerabilities, yet no publicly available GitHub dataset classifies these into fine-grained software quality categories. We construct and release GitReq (GitHub Requirement Issue), comprising 6,302 expert-validated requirements mined from 55,588 raw GitHub candidates across 4,080 repositories, labeled across eight ISO/IEC 25010:2011-aligned categories: Performance, Security, Portability, Availability, Fault-tolerance, Scalability, Maintainability, and a Functional baseline. Dataset construction involved category-specific triple-signal GitHub mining, separate non-functional requirement (NFR) and functional requirement (FR) preprocessing pipelines with per-category parameters, and expert human annotation achieving substantial inter-annotator agreement (Fleiss' Kappa = 0.72). Zero-shot evaluation with four large language models (LLMs) establishes baselines, with GPT-5.2 reaching the highest macro-averaged F1 of 0.641. GitReq is publicly released with full materials to advance research in automated requirement classification and software quality analysis.
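The two summary statistics quoted above, Fleiss' Kappa and macro-averaged F1, follow standard textbook definitions. The sketch below is a minimal illustration of those definitions and is not the authors' evaluation code: the function names `fleiss_kappa` and `macro_f1` are hypothetical, the toy labels are invented for demonstration and are not taken from GitReq, and it assumes every item is rated by the same number of annotators.

```python
from collections import Counter

# The eight category names listed in the abstract.
CATEGORIES = [
    "Performance", "Security", "Portability", "Availability",
    "Fault-tolerance", "Scalability", "Maintainability", "Functional",
]


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each a list of annotator labels.

    Assumption: every item carries the same number of labels (raters).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    totals = Counter()      # label counts pooled over all items
    agreement_sum = 0.0     # running sum of per-item agreement P_i
    for item in ratings:
        if len(item) != n_raters:
            raise ValueError("each item needs the same number of raters")
        counts = Counter(item)
        totals.update(counts)
        squares = sum(c * c for c in counts.values())
        agreement_sum += (squares - n_raters) / (n_raters * (n_raters - 1))
    p_bar = agreement_sum / n_items
    # Chance agreement: sum of squared overall category proportions.
    p_e = sum((totals[c] / (n_items * n_raters)) ** 2 for c in categories)
    if p_e == 1.0:
        return 1.0  # every rating is the same single category
    return (p_bar - p_e) / (1.0 - p_e)


def macro_f1(y_true, y_pred, categories):
    """Unweighted mean of per-category F1 scores.

    Convention assumed here: a category with no true or predicted
    instances contributes an F1 of 0.
    """
    pairs = list(zip(y_true, y_pred))
    scores = []
    for c in categories:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy data only: three annotators labelling three invented issues.
    toy_ratings = [
        ["Security", "Security", "Security"],
        ["Performance", "Performance", "Scalability"],
        ["Functional", "Functional", "Functional"],
    ]
    print("kappa:", fleiss_kappa(toy_ratings, CATEGORIES))

    toy_true = ["Security", "Performance", "Functional", "Scalability"]
    toy_pred = ["Security", "Scalability", "Functional", "Scalability"]
    print("macro-F1:", macro_f1(toy_true, toy_pred, CATEGORIES))
```

Because the macro average weights every category equally, a model that does well on frequent categories but poorly on rare ones is penalised more than it would be under accuracy or micro-averaged F1.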