Alzheimer’s disease (AD), recognized as the second-most impactful neurological disorder and currently incurable, stands as the leading cause of dementia. An imperative research focus is efficiently diagnosing the sta...
This work proposes a novel generative model, FF-GAN (Frontal Face Generative Adversarial Network), for generating high-quality and diverse frontal faces. FF-GAN utilizes contrastive learning to effectively learn the underlying representation of frontal faces from unlabeled image datasets. This approach allows the model to capture the essential characteristics of frontal faces without the need for explicit pose annotations. We evaluate the performance of FF-GAN using established metrics like FID (Fréchet Inception Distance), IS (Inception Score), and SSIM (Structural Similarity Index Measure). The results demonstrate that FF-GAN achieves superior performance compared to existing methods, generating highly realistic and visually appealing frontal faces with exceptional structural coherence. This research contributes to the field of facial image generation by introducing an effective unsupervised learning approach based on contrastive learning for generating high-quality frontal faces.
In this manuscript, an 8 × 1 rectangular 'U' slotted patch antenna array designed for future 5G communications is presented. The proposed antenna is designed using the 3D EM CSTv18 microwave studio based ...
ISBN: 9781665427159 (print)
The evolution of the coding unit module from the High Efficiency Video Coding (HEVC) standard to the Joint Exploration Model (JEM) substantially enhanced compression performance while severely increasing coding complexity, which is caused by the brute-force search built on Rate Distortion Optimization (RDO). Compared to its predecessor HEVC, which uses the quad-tree (QT) block partitioning module, the quad-tree binary-tree (QTBT) block partitioning process proposed within the JEM encoder uses additional block sizes and shapes, which adds flexibility. In this paper, we suggest a deep convolutional neural network (CNN) based method to reduce the complexity of the block partitioning module for both HEVC and JEM encoders in the all-intra configuration. The results show that the CNN approach leads to better optimization performance with the HEVC encoder, reaching up to 59%. However, the CNN model is more robust with several JEM versions.
Human voice is an ideal data source for identifying people in many applications. Because of the increasing need for security in public places, voice biometrics may be a good solution, as voice recordings are easy to capture. This paper provides a brief overview of the approaches used in speaker recognition, and then presents a novel approach for recognizing speakers in degraded smart-home conditions. The suggested approach includes a pre-processing phase, a feature extraction phase, and a classification phase. The feature extraction phase consists of formant extraction to obtain the spectrum energy maxima of the speech audio, dynamic time warping (DTW) to find an optimal alignment between two given temporal sequences under definite restrictions, and a refinement process to improve the output of the DTW system. The experiments are carried out on a database containing 1,248 samples in order to validate the suggested approach. The approach compares well with the state of the art, reaching 94.5% accuracy.
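The DTW step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `dtw_distance`, the absolute-difference local cost, and the optional band constraint `window` (one common way to express "definite restrictions") are all assumptions made for illustration.

```python
import math

def dtw_distance(a, b, window=None):
    """Dynamic time warping cost between two 1-D sequences.

    `window` is a hypothetical band constraint: index i of `a` may only be
    aligned with indices of `b` that are at most `window` steps away.
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)          # no effective constraint
    window = max(window, abs(n - m))  # band must be wide enough to reach the end

    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated `2` is absorbed by the warping path, whereas a point-by-point comparison would not even be defined for sequences of different lengths.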
Since the smartphone is adapted to ambient intelligence and smart home systems are remotely accessed through smartphones, there is a need for a secure authentication system based on biometric properties that can be captured from a smartphone. The identification of persons through ear and voice prints is one of the basic biometric problems. Earlier research in ear recognition has shown that the human ear is a representative human biometric with uniqueness and stability. Indeed, the human voice is an ideal source of data for person identification in many applications. In this paper, we propose a fusion of the ear and voice biometrics in degraded conditions in a smart home context at three levels (feature, score, and decision). The experiments are conducted on the EVDDC database and a chimeric database (TIMIT and USTB-I). The best results are obtained with feature-level fusion (95.8%) using the KNN classifier.
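As a rough illustration of what feature-level fusion followed by a KNN classifier can look like, here is a small sketch. The abstract does not describe the feature vectors, the normalization, or the value of k, so the per-modality L2 normalization, the helper names `fuse` and `knn_predict`, and `k=3` are all assumptions.

```python
import math
from collections import Counter

def l2_normalize(vec):
    """Scale a vector to unit length so neither modality dominates the distance."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def fuse(ear_vec, voice_vec):
    """Feature-level fusion: normalize each modality, then concatenate."""
    return l2_normalize(ear_vec) + l2_normalize(voice_vec)

def knn_predict(train_vectors, train_labels, query, k=3):
    """Majority vote among the k training vectors closest to `query`."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy, made-up feature vectors (not taken from EVDDC, TIMIT, or USTB-I).
train = [
    fuse([1.0, 0.0], [1.0, 0.0, 0.0]),
    fuse([0.9, 0.1], [0.9, 0.0, 0.1]),
    fuse([0.0, 1.0], [0.0, 0.0, 1.0]),
    fuse([0.1, 0.9], [0.0, 0.1, 0.9]),
]
labels = ["alice", "alice", "bob", "bob"]
predicted = knn_predict(train, labels, fuse([0.95, 0.05], [1.0, 0.0, 0.05]))
```

Score-level and decision-level fusion differ in where the modalities are combined: the former merges per-modality match scores, the latter merges per-modality class decisions.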
Real-time stereo matching with high accuracy is a dynamic research topic; it is attractive in diverse computer vision applications. This paper presents a stereo-matching algorithm that produces a high-quality disparity map while maintaining real-time performance. The proposed stereo-matching method is based on three per-pixel difference measurements with adjustment elements. The absolute differences and the gradient matching are combined with a colour-weighted extension of the complete rank transform to reduce the effect of radiometric distortion. The disparity calculation is realized using improved dynamic programming that optimizes along and across all scanlines. It solves the inter-scanline inconsistency problem and increases the matching accuracy. The proposed algorithm is implemented on parallel high-performance graphics hardware using the Compute Unified Device Architecture to reach over 240 million disparity evaluations per second. The processing speed of our algorithm reaches 98 frames per second on 240 × 320-pixel images with 32 disparity levels. Our method ranks fourth in terms of accuracy and runtime for quarter-resolution images in the Middlebury stereo benchmark.
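The throughput figure follows from the frame rate, image size, and disparity range quoted in the abstract. The short check below assumes that one disparity evaluation means one (pixel, disparity level) pair, which is the usual definition but is not stated explicitly in the text.

```python
# Figures taken from the abstract.
width, height = 320, 240
disparity_levels = 32
frames_per_second = 98

# Assumed definition: one evaluation per pixel per disparity level.
evals_per_frame = width * height * disparity_levels
evals_per_second = evals_per_frame * frames_per_second

print(f"{evals_per_second / 1e6:.1f} million disparity evaluations per second")
# prints "240.8 million disparity evaluations per second"
```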
ISBN: 9781728175133 (digital)
ISBN: 9781728175140 (print)
Diabetic retinopathy is one of the most frequent causes of visual damage and vision loss. It can cause blindness in the absence of diagnosis and treatment. The automatic detection of hard exudates in color fundus retinal images is an important task for the early diagnosis of diabetic retinopathy. In this paper, a hard exudate detection algorithm is proposed. It is based on applying a learning method to retinal images with the optic disk removed. This paper proposes the use of the Random Forest algorithm with a specific parameter, from which a binary mask of exudates is obtained after intensity thresholding. It achieves a sensitivity of 91.40% and an accuracy of 94.38%.
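For readers unfamiliar with the two reported metrics, the sketch below shows how sensitivity and accuracy are conventionally computed from a predicted binary mask and a ground-truth mask. The helper names and the tiny example masks are illustrative assumptions; the abstract does not say whether the authors evaluate per pixel or per image.

```python
def confusion_counts(predicted, truth):
    """Count TP, TN, FP, FN over two equal-length sequences of 0/1 labels."""
    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)
    return tp, tn, fp, fn

def sensitivity(tp, fn):
    """Fraction of true exudate samples that were detected."""
    return tp / (tp + fn)

def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were labelled correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

# Made-up flattened masks, not data from the paper.
truth     = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
tp, tn, fp, fn = confusion_counts(predicted, truth)
sens = sensitivity(tp, fn)       # 3 of 4 positives found
acc = accuracy(tp, tn, fp, fn)   # 8 of 10 labels correct
```

Because exudate pixels are typically a small share of a fundus image, accuracy alone can look high even when many lesions are missed, which is why sensitivity is reported alongside it.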