Crowd Counting
Research: Crowd counting is challenging because congestion levels vary widely, so the model must detect heads of very different sizes. Our research shows that widening and deepening the ConvNet significantly improves the robustness of detection across diverse crowd conditions. Specifically, we widen the ConvNet by aggregating features from multi-channel convolutions, and we deepen it by incorporating dilated convolutions tailored to the ConvNet architecture. Our model achieves a Mean Absolute Error (MAE) of 10 in scenes with approximately 200 people while using only 0.82 million parameters, roughly 20 times fewer than the 16.2 million parameters used by state-of-the-art (SoTA) methods. The idea has been granted a patent in China.
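To make the quantities in the paragraph above concrete, here is a minimal sketch rather than the actual model. It shows how dilation enlarges the area a convolution covers without adding weights, how a layer's parameter count is tallied, and how MAE is computed over a set of images. The function names, the 64-channel layer, and the sample counts are hypothetical and chosen only for illustration.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in pixels) covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv2d_param_count(in_channels: int, out_channels: int,
                       kernel_size: int, bias: bool = True) -> int:
    """Learnable parameters of a square 2D convolution.

    Dilation does not appear here: it changes the span, not the weights.
    """
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def mean_absolute_error(predicted, actual) -> float:
    """Average absolute difference between predicted and true counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    # A 3x3 kernel with dilation 1, 2, 4 spans 3, 5, 9 pixels per axis.
    for d in (1, 2, 4):
        print("dilation", d, "-> span", effective_kernel_size(3, d))

    # Hypothetical 64 -> 64 channel layer: same size for any dilation.
    print("parameters:", conv2d_param_count(64, 64, 3))

    # Hypothetical predictions for two scenes of about 200 people.
    print("MAE:", mean_absolute_error([190, 215], [200, 205]))
```

Because the parameter count is independent of dilation, stacking dilated layers is one way to cover both small and large heads while keeping the model compact.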
Application: The preliminary model has been implemented and deployed on the Docomo AI platform. Please refer to the following demo, where the numbers show the estimated number of people in each scene and each Gaussian in the map is centered on a localized head. Our crowd counting model can be applied to different crowd densities, such as a sparse scene on the beach, a very dense scene at the famous Shibuya crossing, and a packed shopping mall.
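As a side note on the Gaussian maps mentioned above, the sketch below shows one common way such a density map can be built: each head location contributes a Gaussian normalized to sum to 1, so summing the whole map recovers the number of people. This is an illustrative assumption, not the deployed implementation. The function names, the fixed `sigma`, and the sample head coordinates are hypothetical.

```python
import math


def gaussian_density_map(height, width, heads, sigma=2.0):
    """Build a density map as nested lists of shape height x width.

    heads is a list of (row, col) head locations inside the image.
    Each head adds a Gaussian that is normalized over the image,
    so it contributes exactly 1 to the total even near the border.
    """
    density = [[0.0] * width for _ in range(height)]
    for head_row, head_col in heads:
        kernel = [
            [
                math.exp(-((r - head_row) ** 2 + (c - head_col) ** 2)
                         / (2.0 * sigma ** 2))
                for c in range(width)
            ]
            for r in range(height)
        ]
        total = sum(map(sum, kernel))
        for r in range(height):
            for c in range(width):
                density[r][c] += kernel[r][c] / total
    return density


def count_from_density(density):
    """Estimated number of people: the sum over the density map."""
    return sum(map(sum, density))


if __name__ == "__main__":
    # Three hypothetical head locations in a 16 x 32 image.
    heads = [(5, 5), (10, 20), (0, 0)]
    density = gaussian_density_map(16, 32, heads)
    print(round(count_from_density(density), 6))
```

In practice the spread of each Gaussian is often adapted to the local head size, which matters when the same model has to handle both sparse and very dense scenes.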