Point cloud processing plays an increasingly essential role in three-dimensional (3D) computer vision tasks such as target detection, scene parsing, and environmental perception. Compared with methods that require aligned point cloud data for classification and segmentation, strictly rotation-invariant representations offer much stronger robustness. Inspired by the success of deep learning, we propose a novel neural network for multi-head attentional point cloud classification and segmentation using strictly rotation-invariant representations. Our research focuses on processing point clouds rotated in any direction both effectively and precisely. First, strictly rotation-invariant point cloud representations are obtained through point projection. Then a multi-head attentional convolution layer (MACL) with attention coding is applied to improve point cloud feature extraction. Finally, the network assigns different responses to points through a key-point descriptor, which is added to the global feature so that the overall geometry is recognized well. Attention pooling and a multi-layer perceptron (MLP) built on an advanced DenseNet allow our method to exploit deeper information for higher accuracy. Our network achieves 90.63% and 87.50% classification accuracy on ModelNet10 and ModelNet40, respectively, and a 75.15% mean intersection over union (mIoU) on the ShapeNet Part dataset, with these results remaining stable under arbitrary rotations. Experiments with rotated inputs indicate that our framework achieves better point cloud classification and segmentation performance than most state-of-the-art methods.
In this paper, we propose a deep-learning-based approach for facial action unit (AU) detection by enhancing and cropping regions of interest in face images. The approach adds two novel nets (i.e., layers) to a pretrained convolutional neural network (CNN) model: enhancing layers and cropping layers. For the enhancing layers (denoted E-Net), we design an attention map based on facial landmark features and apply it to the pretrained network to conduct enhanced learning. For the cropping layers (denoted C-Net), we crop facial regions around the detected landmarks and design individual convolutional layers to learn deeper features for each facial region. We then combine the E-Net and the C-Net to construct the Enhancing and Cropping Net (EAC-Net), which learns both feature-enhancing and region-cropping functions effectively. The EAC-Net integrates three important elements, i.e., transfer learning, attention coding, and region-of-interest processing, making our AU detection approach more efficient and more robust to changes in facial position and orientation. Our approach shows a significant performance improvement over state-of-the-art methods when tested on the BP4D and DISFA AU datasets. With a slight modification, the EAC-Net also shows its potential for estimating AU intensities accurately. We have also studied the performance of the proposed EAC-Net under two very challenging conditions: (1) faces with partial occlusion and (2) faces with large head pose variations. Experimental results show that (1) the EAC-Net learns AU correlations effectively and predicts AUs reliably even when only half of a face is visible, especially the lower half; and (2) the EAC-Net also works well under very large head poses, significantly outperforming a baseline approach. It further works much better without face frontalization than with frontalization through image warping.