A Novel Approach to Identify Security Controls in Source Code
Abstract
Secure by Design has become the mainstream development approach ensuring that software systems are not vulnerable to cyberattacks. Architectural security controls need to be carefully monitored over the software development life cycle to avoid critical design flaws. Unfortunately, functional requirements usually get in the way of the security features, and the development team may not correctly address critical security requirements. Identifying tactic-related code pieces in a software project enables an efficient review of the security controls' implementation as well as a resilient software architecture. This paper enumerates a comprehensive list of commonly used security controls and creates a dataset for each one of them by pulling related and unrelated code snippets from the open API of the StackOverflow question and answer platform. It uses the state-of-the-art NLP technique Bidirectional Encoder Representations from Transformers (BERT) and the Tactic Detector from our prior work to show that code pieces that implement security controls could be identified with high confidence. The results show that our model trained on tactic-related and unrelated code snippets derived from StackOverflow is able to identify tactic-related code pieces with F-Measure values above 0.9.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.