Recommending Code Improvements Based on Stack Overflow Answer Edits
Abstract
Background: Sub-optimal code is prevalent in software systems. Developers may write low-quality code due to many reasons, such as lack of technical knowledge, lack of experience, time pressure, management decisions, and even unhappiness. Once sub-optimal code is unknowingly (or knowingly) integrated into the codebase of software systems, its accumulation may lead to large maintenance costs and technical debt. Stack Overflow is a popular website for programmers to ask questions and share their code snippets. The crowdsourced and collaborative nature of Stack Overflow has created a large source of programming knowledge that can be leveraged to assist developers in their day-to-day activities. Objective: In this paper, we present an exploratory study to evaluate the usefulness of recommending code improvements based on Stack Overflow answers' edits. Method: We propose Matcha, a code recommendation tool that leverages Stack Overflow code snippets with version history and code clone search techniques to identify sub-optimal code in software projects and suggest their optimised version. By using SOTorrent and GitHub datasets, we will quali-quantitatively investigate the usefulness of recommendations given by Matcha to developers using manual categorisation of the recommendations and acceptance of pull-requests to open-source projects.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.