Attentioned Convolutional LSTM InpaintingNetwork for Anomaly Detection in Videos
Abstract
We propose a semi-supervised model for detecting anomalies in videos inspiredby the Video Pixel Network [van den Oord et al., 2016]. VPN is a probabilisticgenerative model based on a deep neural network that estimates the discrete jointdistribution of raw pixels in video frames. Our model extends the Convolutional-LSTM video encoder part of the VPN with a novel convolutional based attentionmechanism. We also modify the Pixel-CNN decoder part of the VPN to a frameinpainting task where a partially masked version of the frame to predict is given asinput. The frame reconstruction error is used as an anomaly indicator. We test ourmodel on a modified version of the moving mnist dataset [Srivastava et al., 2015]. Our model is shown to be effective in detecting anomalies in videos. This approachcould be a component in applications requiring visual common sense.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.