Handgun detection using combined human pose and weapon appearance

Abstract

Closed-circuit television (CCTV) systems are essential nowadays to prevent security threats or dangerous situations, in which early detection is crucial. Novel deep learning-based methods have allowed to develop automatic weapon detectors with promising results. However, these approaches are mainly based on visual weapon appearance only. For handguns, body pose may be a useful cue, especially in cases where the gun is barely visible. In this work, a novel method is proposed to combine, in a single architecture, both weapon appearance and human pose information. First, pose keypoints are estimated to extract hand regions and generate binary pose images, which are the model inputs. Then, each input is processed in different subnetworks and combined to produce the handgun bounding box. Results obtained show that the combined model improves the handgun detection state of the art, achieving from 4.23 to 18.9 AP points more than the best previous approach.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…