Estimating the total number of people in a crowded scene is a challenging task because of the numerous occlusions and perspective changes in crowd images. To address this issue, the authors propose a new deep learning framework for accurate and efficient crowd counting. Inspired by the multi-column convolutional neural network (MCNN) and the contextual pyramid convolutional neural network (CP-CNN), the authors combine a two-branch convolutional neural network (CNN) with transposed convolutional layers to generate a high-quality density map. The two-branch CNN used for feature extraction generates a density map that is only a quarter of the size of the original image. A set of transposed convolutional layers and convolutional layers is then appended to the network to compensate for the loss of detail in the density map caused by stacked pooling. Compared with MCNN and CP-CNN, the authors' approach employs fewer branches and a simpler architecture. Experimental results show that their approach achieves an MAE of 80.7 and an MSE of 131.2 on the ShanghaiTech Part A dataset, an MAE of 15.6 and an MSE of 26.8 on the ShanghaiTech Part B dataset, and an average MAE of 7.1 on the WorldExpo'10 dataset.
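
The size arithmetic behind the quarter-size density map and its restoration can be sketched as follows. This is only an illustration under stated assumptions, not the authors' configuration: it assumes "a quarter of the size" means a quarter along each side (two 2x2 pooling layers with stride 2), and that each transposed convolution uses kernel 4, stride 2, and padding 1. The abstract specifies none of these values, and the function names and the 768x1024 input are hypothetical.

```python
def pool_out(size, kernel=2, stride=2):
    """Output length of an unpadded pooling layer (floor mode)."""
    return (size - kernel) // stride + 1


def conv_transpose_out(size, kernel=4, stride=2, padding=1, output_padding=0):
    """Output length of a transposed convolution (standard formula)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def trace_sizes(height, width, num_pools=2, num_upsamples=2):
    """Follow an image's spatial size through pooling, then upsampling.

    num_pools and num_upsamples are assumed values, not taken from the paper.
    Returns (size after pooling, size after transposed convolutions).
    """
    h, w = height, width
    for _ in range(num_pools):
        h, w = pool_out(h), pool_out(w)
    reduced = (h, w)
    for _ in range(num_upsamples):
        h, w = conv_transpose_out(h), conv_transpose_out(w)
    return reduced, (h, w)


# Hypothetical 768x1024 input image.
reduced, restored = trace_sizes(768, 1024)
print(reduced)   # (192, 256): a quarter of the input along each side
print(restored)  # (768, 1024): back to the input resolution
```

With these assumed settings, each stride-2 transposed convolution exactly doubles the spatial size, so two of them undo the downsampling introduced by two pooling layers.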