Semi-Supervised Hierarchical Semantic Object Parsing

Abstract

Models based on Convolutional Neural Networks (CNNs) have been proven very successful for semantic segmentation and object parsing that yield hierarchies of features. Our key insight is to build convolutional networks that take input of arbitrary size and produce object parsing output with efficient inference and learning. In this work, we focus on the task of instance segmentation and parsing which recognizes and localizes objects down to a pixel level base on deep CNN. Therefore, unlike some related work, a pixel cannot belong to multiple instances and parsing. Our model is based on a deep neural network trained for object masking that supervised with input image and follow incorporates a Conditional Random Field (CRF) with end-to-end trainable piecewise order potentials based on object parsing outputs. In each CRF unit we designed terms to capture the short range and long range dependencies from various neighbors. The accurate instance-level segmentation that our network produce is reflected by the considerable improvements obtained over previous work at high APr thresholds. We demonstrate the effectiveness of our model with extensive experiments on challenging dataset subset of PASCAL VOC2012.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…