Attribute-aware Pedestrian Detection in a Crowd
Abstract
Pedestrian detection is an initial step to perform outdoor scene analysis, which plays an essential role in many real-world applications. Although having enjoyed the merits of deep learning frameworks from the generic object detectors, pedestrian detection is still a very challenging task due to heavy occlusion and highly crowded group. Generally, the conventional detectors are unable to differentiate individuals from each other effectively under such a dense environment. To tackle this critical problem, we propose an attribute-aware pedestrian detector to explicitly model people's semantic attributes in a high-level feature detection fashion. Besides the typical semantic features, center position, target's scale and offset, we introduce a pedestrian-oriented attribute feature to encode the high-level semantic differences among the crowd. Moreover, a novel attribute-feature-based Non-Maximum Suppression~(NMS) is proposed to distinguish the person from a highly overlapped group by adaptively rejecting the false-positive results in a very crowd settings. Furthermore, a novel ground truth target is designed to alleviate the difficulties caused by the attribute configuration and extremely class imbalance issues during training. Finally, we evaluate our proposed attribute-aware pedestrian detector on two benchmark datasets including CityPersons and CrowdHuman. The experimental results show that our approach outperforms state-of-the-art methods at a large margin on pedestrian detection.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.