Reducing Information Overload: Because Even Security Experts Need to Blink
Abstract
Computer Emergency Response Teams (CERTs) face increasing challenges in processing the growing volume of security-related information. Daily manual analysis of threat reports, security advisories, and vulnerability announcements leads to information overload, contributing to burnout and attrition among security professionals. This work evaluates 196 combinations of clustering algorithms and embedding models across five security-related datasets to identify optimal approaches for automated information consolidation. We demonstrate that clustering can reduce information processing requirements by over 90% while maintaining semantic coherence, with deep clustering achieving a homogeneity of 0.88 on security bug report (SBR) data and partition-based clustering reaching 0.51 on advisory data. Our solution requires minimal configuration, preserves all data points, and processes new information within five minutes on consumer hardware. The findings suggest that clustering approaches can significantly enhance CERT operational efficiency, potentially saving over 3,750 work hours annually per analyst while maintaining analytical integrity. However, complex threat reports require careful parameter tuning to achieve acceptable performance, indicating areas for future optimization. The code is made available at https://github.com/PEASEC/reducing-information-overload.
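The two headline quantities in the abstract, homogeneity and the reduction in items to review, can be illustrated with a minimal sketch. It uses the standard definition of homogeneity, 1 - H(C|K)/H(C), where C are the ground-truth classes and K the predicted clusters. The function names, the toy labels, and the assumption that an analyst reviews one representative per cluster are illustrative and are not taken from the paper or its repository.

```python
import math
from collections import Counter


def homogeneity(labels_true, labels_pred):
    """Homogeneity = 1 - H(C|K) / H(C).

    Equals 1.0 when every cluster contains members of a single class,
    and 0.0 when the clusters carry no information about the classes.
    """
    n = len(labels_true)
    class_counts = Counter(labels_true)
    cluster_counts = Counter(labels_pred)
    joint_counts = Counter(zip(labels_true, labels_pred))

    # Entropy of the ground-truth classes, H(C).
    h_c = -sum((c / n) * math.log(c / n) for c in class_counts.values())
    if h_c == 0.0:
        return 1.0  # a single class is trivially homogeneous

    # Conditional entropy of classes given clusters, H(C|K).
    h_c_given_k = -sum(
        (n_ck / n) * math.log(n_ck / cluster_counts[k])
        for (_, k), n_ck in joint_counts.items()
    )
    return 1.0 - h_c_given_k / h_c


def reduction(n_items, n_clusters):
    """Share of items saved, assuming one representative is read per cluster."""
    return 1.0 - n_clusters / n_items


# Hypothetical toy labels, for illustration only.
truth = ["phishing", "phishing", "ransomware", "ransomware", "ddos", "ddos"]
clusters = [0, 0, 1, 1, 1, 2]  # one "ddos" report lands in the ransomware cluster

print(f"homogeneity: {homogeneity(truth, clusters):.2f}")
print(f"reduction:   {reduction(n_items=1000, n_clusters=80):.0%}")
```

Under this reading, a reduction of over 90% means that fewer than one in ten items needs a direct look, while homogeneity indicates how safely the remaining members of a cluster can be treated as covered by its representative.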