Psyzkaller: Learning from Historical and On-the-Fly Execution Data for Smarter Seed Generation in OS Kernel Fuzzing
Abstract
OS kernel fuzzers such as Syzkaller often struggle to generate syscall sequences that respect intrinsic Syscall Dependency Relations (SDRs), resulting in seeds that either violate kernel constraints or fail to reach deep execution paths. We propose leveraging an N-gram model to learn SDRs from both kernel execution history and ongoing fuzzing results. This enables the fuzzer to capture dependencies shared across similar kernel versions while adapting to target-specific behaviors, thereby improving the validity of generated seeds. Additionally, we introduce a bidirectional Random Walk strategy to enhance the diversity of generated seeds. We implement this approach in a prototype, Psyzkaller, built on top of Syzkaller. Experiments show that, when trained on the large-scale DongTing dataset and continuously updated with ongoing fuzzing results, Psyzkaller improves Syzkaller's code coverage by 4.6%-7.0%, triggers 110.4%-187.2% more crashes, and discovers eight previously unknown kernel vulnerabilities. Furthermore, Psyzkaller outperforms state-of-the-art fuzzers such as ACTOR and SyzDescribe in both coverage and crashes.
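To make the idea of learning SDRs with an N-gram model concrete, here is a minimal sketch that assumes the simplest case, a bigram (N = 2) over syscall names. The abstract does not state the N-gram order, any smoothing, or how historical and on-the-fly data are weighted, so the function names, the toy traces, and the equal-weight merge below are illustrative assumptions rather than Psyzkaller's actual implementation.

```python
from collections import Counter, defaultdict


def learn_bigram_counts(traces):
    """Count how often each syscall is immediately followed by another."""
    counts = defaultdict(Counter)
    for trace in traces:
        for prev_call, next_call in zip(trace, trace[1:]):
            counts[prev_call][next_call] += 1
    return counts


def merge_counts(historical, on_the_fly):
    """Combine two count tables (assumption: both sources weighted equally)."""
    merged = defaultdict(Counter)
    for source in (historical, on_the_fly):
        for prev_call, followers in source.items():
            merged[prev_call].update(followers)
    return merged


def to_probabilities(counts):
    """Normalize follower counts into conditional probabilities P(next | prev)."""
    probs = {}
    for prev_call, followers in counts.items():
        total = sum(followers.values())
        probs[prev_call] = {call: n / total for call, n in followers.items()}
    return probs


# Hypothetical toy traces standing in for a historical corpus and live fuzzing results.
historical_traces = [
    ["open", "read", "close"],
    ["open", "write", "close"],
    ["open", "read", "read", "close"],
]
fuzzing_traces = [["open", "mmap", "close"]]

merged = merge_counts(
    learn_bigram_counts(historical_traces),
    learn_bigram_counts(fuzzing_traces),
)
probs = to_probabilities(merged)
print(probs["open"])
```

A generator could then pick the next syscall in a seed by sampling from `probs[prev_call]`, so that frequently observed orderings such as `open` followed by `read` are favored over arbitrary ones.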