In recent years, deep learning has advanced face synthesis, but large pose variation remains one of the factors that are difficult to overcome. The introduction and development of the generative adversarial network have taken face frontalization to new heights. In this paper, we propose a deep generative adversarial network based on a multi-attention mechanism for multi-pose face frontalization. Specifically, we add to the generator a deep feature encoder based on the attention mechanism and residual blocks. To capture both global and local facial information, the discriminator of our model consists of four independent discriminators. Quantitative and qualitative experiments on the CAS-PEAL-R1 dataset show that our model is effective. At some angles, the recognition rate of our model exceeds or equals the highest recognition rate of other models, for example 100% at β = 0°, α = 15° and 99.78% at β = 30°, α = 0°.
ISBN: (print) 9798350360875; 9798350360868
Background: Skin feature tracking enables quantification of human motion in an explainable way, making it suitable for clinical assessments. Accuracy is crucial, but no study has investigated state-of-the-art deep neural network-based point tracking models such as Cotracker. Cotracker jointly tracks points and has been shown to have better 3-pixel accuracy than five other state-of-the-art deep learning methods on the two datasets most commonly used to evaluate single-target point tracking. In 2021, Chang and Nordling introduced the deep feature encoder (DFE) and demonstrated skin feature tracking so accurate that, based on a χ²-test, it cannot be excluded that the errors stem from the manual labeling of the videos. Problem: How accurately can different methods track skin features, and how can the intrinsic weaknesses of the methods be avoided? Methods: We use videos of the Unified Parkinson's Disease Rating Scale postural tremor test recorded at two hospitals for benchmarking. DFE uses the encoder part of an autoencoder, a five-layer convolutional neural network trained without supervision to reproduce skin crops. The residual squared error between the encoder's latent features for the tracked feature and for candidate crops is then used to obtain a predicted position. We also propose Cotracker-DFE, which uses Cotracker to obtain an approximate position and then crops a small area that is fed to DFE to obtain a predicted position with a lower mean pixel error. Results: The mean Euclidean distance errors of Cotracker, DFE, and Cotracker-DFE are 1.2, 0.8, and 0.8 pixels, respectively. DFE requires time-consuming computations, making it 35 times slower than Cotracker. Conclusion: The old-school DFE provided more accurate skin feature tracking, while combining DFE with Cotracker provides the best overall performance, circumventing the lack of labeled data and computational resources required to fine-tune Cotracker.
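The coarse-to-fine idea behind Cotracker-DFE can be illustrated with a minimal sketch. It assumes a grayscale frame stored as a NumPy array and replaces the trained convolutional encoder with a stand-in (2×2 average pooling), so it is not the authors' implementation. The names `encode`, `refine_position`, and `mean_euclidean_error`, as well as the crop size and search radius, are hypothetical choices made for illustration.

```python
import numpy as np


def encode(crop):
    """Stand-in encoder (assumption): 2x2 average pooling, flattened.

    In the abstract this role is played by the encoder half of a trained
    convolutional autoencoder; any function mapping a crop to a feature
    vector could be substituted here.
    """
    h, w = crop.shape
    h2, w2 = h - h % 2, w - w % 2
    pooled = crop[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))
    return pooled.ravel()


def refine_position(frame, template, approx_xy, half=4, radius=3):
    """Refine an approximate (x, y) position by latent-feature matching.

    Every candidate centre within `radius` pixels of `approx_xy` is cropped
    (size 2*half x 2*half), encoded, and scored by the residual squared
    error against the encoded template; the lowest-error centre is returned.
    """
    target = encode(template)
    ax, ay = approx_xy
    best_xy, best_err = None, np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = ax + dx, ay + dy
            # Skip candidates whose crop would fall outside the frame.
            if x - half < 0 or y - half < 0:
                continue
            if x + half > frame.shape[1] or y + half > frame.shape[0]:
                continue
            crop = frame[y - half:y + half, x - half:x + half]
            err = float(np.sum((encode(crop) - target) ** 2))
            if err < best_err:
                best_xy, best_err = (x, y), err
    return best_xy, best_err


def mean_euclidean_error(predicted, labeled):
    """Mean Euclidean distance (in pixels) between predicted and labeled points."""
    predicted = np.asarray(predicted, dtype=float)
    labeled = np.asarray(labeled, dtype=float)
    return float(np.linalg.norm(predicted - labeled, axis=1).mean())
```

In this sketch, `approx_xy` plays the role of the approximate position supplied by a point tracker such as Cotracker, and the exhaustive search over a small window is what keeps the comparatively expensive feature matching cheap. `mean_euclidean_error` corresponds to the metric in which the abstract reports its results.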