Human Detection
Progressive Refinement Network for Occluded Pedestrian Detection (PRNet)
Research: Our work is inspired by recent advances in both two-stage and single-stage detectors. Two-stage detectors (such as Faster R-CNN) are popular but slow, because generating proposals is computationally expensive. Single-stage detectors (such as SSD and YOLO) are faster but less accurate, because their predefined anchors tend to be dense and produce many false positives. To achieve detection that is both fast and accurate, we design an end-to-end trainable single-stage framework that uses adaptive anchor initialization. Specifically, we initialize anchors with a visible-part detector; these anchors are then calibrated and fine-tuned into full-body predictions. Compared to a regular full-body detector, which must predict boxes that include occluded regions, the visible-part detector offers more confident predictions and thus a more reliable initialization. We found that this design substantially improves detection across occlusion levels from light to heavy. It achieves a 10% relative improvement in miss rate over state-of-the-art two-stage detectors while matching the speed of state-of-the-art single-stage detectors. See our ECCV 2020 paper and code.
Application: Our detection model generalizes to pedestrians with varying occlusion ratios, in scenes such as a busy street or a crowded mall, whether indoors or outdoors.
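To make the idea of starting from a visible-part box and refining it toward a full-body box more concrete, here is a minimal sketch. It is not the PRNet implementation: it uses the standard (dx, dy, dw, dh) box-offset parameterization common to anchor-based detectors, and the function name `decode_box`, the example box, and the offset values are all hypothetical, chosen only for illustration.

```python
import math


def decode_box(anchor, deltas):
    """Apply (dx, dy, dw, dh) offsets to an anchor given as (x1, y1, x2, y2).

    This is the usual center/size offset parameterization of anchor-based
    detectors, assumed here for illustration only.
    """
    x1, y1, x2, y2 = anchor
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h

    dx, dy, dw, dh = deltas
    new_cx = cx + dx * w          # shift the center by a fraction of the width
    new_cy = cy + dy * h          # shift the center by a fraction of the height
    new_w = w * math.exp(dw)      # rescale width
    new_h = h * math.exp(dh)      # rescale height

    return (
        new_cx - 0.5 * new_w,
        new_cy - 0.5 * new_h,
        new_cx + 0.5 * new_w,
        new_cy + 0.5 * new_h,
    )


# Hypothetical visible-part detection: only the upper body is visible.
visible_box = (100.0, 50.0, 140.0, 110.0)

# Hypothetical offsets: keep the width, double the height, and move the
# center down so the top edge stays fixed while the box extends downward.
full_body_box = decode_box(visible_box, (0.0, 0.5, 0.0, math.log(2.0)))
print(full_body_box)
```

In this toy case the visible box serves as the anchor, and the offsets stretch it downward to cover the occluded lower body. In a trained detector such offsets would be predicted by the network rather than set by hand.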
AB-Net: Enhancing Anchor-Free Human Detection through Cascade Design and Bi-Center Strategy
The quality and quantity of object proposals greatly impact the effectiveness of human detection. While anchor-based methods have shown superior performance via dense anchors or two-stage designs, anchor-free detectors often struggle with sparse proposals and, in the absence of anchors, lack cascade designs. In this paper, we propose the Alternating Bi-center Network (AB-Net), a cascade design for anchor-free human detection. AB-Net leverages a "bi-center" strategy that combines head and body centers to generate diverse proposals, and an alternating architecture with three new task aggregation modules to enhance the complementary interaction between the head and body centers. To address the issue of box suppression in conventional Greedy NMS, we present an identity-based NMS guided by a pairwise identity loss that incorporates both head and body information. Through comprehensive ablation studies on the CityPersons and CrowdHuman datasets, we demonstrate that our approach achieves comparable or better performance at 2-8x lower computation cost than anchor-based detectors. We have submitted this work to Pattern Recognition.
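The box-suppression issue with conventional Greedy NMS mentioned above can be seen in a small example. The sketch below implements plain Greedy NMS only (not the identity-based NMS proposed in AB-Net). The boxes, scores, and the 0.5 IoU threshold are hypothetical values chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, visiting boxes in descending score order.

    A box is dropped if its IoU with any already-kept box exceeds the threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two hypothetical pedestrians standing close together, plus one far away.
boxes = [(0, 0, 10, 20), (3, 0, 13, 20), (50, 0, 60, 20)]
scores = [0.9, 0.8, 0.7]

kept = greedy_nms(boxes, scores)
print(kept)  # [0, 2]: the second pedestrian is suppressed (IoU with box 0 is about 0.54)
```

Because Greedy NMS looks only at box overlap, it cannot tell a duplicate detection from a second person standing next to the first. This is the failure mode in crowded scenes that motivates using identity information instead of overlap alone.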