Facial expression recognition using deep neural networks has become very popular owing to its strong performance. However, the datasets used to develop and test these methods lack a balanced distribution of races among the sample images. This leaves open the possibility that the methods are biased toward certain races. A concern about fairness therefore arises, and the lack of research investigating racial bias only increases that concern. Moreover, such bias would decrease real-world performance because of poor generalization. For these reasons, in this study, we investigated racial bias within popular state-of-the-art facial expression recognition methods such as Deep Emotion, Self-Cure Network, ResNet50, InceptionV3, and DenseNet121. We compiled an elaborate dataset with images of different races and cross-checked the bias of methods that were trained and then tested on images of people of other races. We observed that the methods are inclined towards the races included in the training data. Moreover, when the training dataset is imbalanced, an increase in performance also increases the bias. Some methods can make up for the bias if enough variance is provided in the training set. However, this does not mitigate the bias completely. Our findings suggest that unbiased performance can be obtained by adding the missing races to the training data in equal proportion.
Brain skull stripping is an essential step before performing the segmentation. It leads to better performance and less computational load on the model due to the elimination of redundant features from the image. However, the preparation of the skull-stripped ground truth brain images by the experts is a very tedious task and may lead to human errors. In this article, a fully unsupervised approach to brain extraction has been proposed. The cascaded loss function is used and is tuned for better segmentation. The number of connected components is also tuned in each type of dataset. The cascaded loss function is a combination of focal loss (FL) and dice loss (DL). To address the class imbalance issue, the leaky ReLU activation function is used. Enhancement of the brain image has been performed before extraction which yielded better performance. In comparison with other methods, the proposed backpropagation-based convolutional neural network (BCNN) work gives better qualitative and quantitative outcomes on four out of seven parameters. The dice similarity coefficient (DSC) of the proposed model is 0.89 which is the highest as compared to the other models for brain extraction. The specificity of the model is 0.998 with 97.21 % accuracy. The undersegmentation error is reduced to 1.2002 using the cascaded loss function. The proposed work has been evaluated on four brain image datasets. It has been found that the proposed model extracts the brain from the skull and also the white matter, gray matter, and cerebrospinal fluid. Therefore, it has been noted that the proposed model will be beneficial for efficient brain skull stripping without the availability of ground truth data.
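The combination of focal loss and dice loss mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the cascaded loss, so the weighted sum below, the `weight`, `gamma`, `alpha`, and `smooth` parameters, and all function names are assumptions chosen for illustration, not the authors' implementation. The `targets` list stands for whatever reference mask the training procedure uses.

```python
import math


def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over flat lists of probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)            # clip to avoid log(0)
        p_t = p if t == 1 else 1.0 - p             # probability of the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha # class-balancing factor
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft dice loss: 1 - (2|P∩T| + s) / (|P| + |T| + s)."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def combined_focal_dice_loss(probs, targets, weight=0.5):
    """Assumed combination: a weighted sum of the focal and dice terms."""
    return weight * focal_loss(probs, targets) + (1.0 - weight) * dice_loss(probs, targets)
```

The focal term down-weights pixels that are already classified confidently, while the dice term scores overlap over the whole mask, which is why the two are often paired for imbalanced segmentation. Tuning `weight` would correspond to the loss tuning the abstract mentions, under the assumption above.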
Authors:
Wang, Ruizhe; Pang, Jiaojiao; Han, Xiaole; Xiang, Min; Ning, Xiaolin
Beihang Univ, Sch Instrumentat & Optoelect Engn, Key Lab Ultraweak Magnet Field Measurement Technol, Minist Educ, Beijing 100191, Peoples R China
Beihang Univ, Hangzhou Innovat Inst, Zhejiang Prov Key Lab Ultraweak Magnet Field Space, Hangzhou 310051, Zhejiang, Peoples R China
Beihang Univ, Hangzhou Inst Natl Extremely Weak Magnet Field Inf, Hangzhou 310028, Zhejiang, Peoples R China
Shandong Univ, Inst Magnet Field Free Med & Funct Imaging, Shandong Key Lab Magnet Field Free Med & Funct Ima, Jinan, Peoples R China
Shandong Univ, Shandong Prov Clin Res Ctr Emergency & Crit Care M, Dept Emergency Med, Qilu Hosp, Jinan, Peoples R China
Shandong Univ, Natl Innovat Platform Ind Educ Intearat Med Engn I, Jinan, Peoples R China
Hefei Natl Lab, Hefei 230088, Anhui, Peoples R China
Objective: This study developed a fast and accurate automated method for magnetocardiography (MCG) classification. Approach: We propose a deformable convolutional block attention module (DCBAM)-based method for classifying coronary artery disease (CAD) using MCG. After preprocessing, the raw MCG data were segmented into individual heartbeat segments and encoded into image representations using the Hilbert curve to convert the temporal features into spatial image features. We combined DCBAM with convolutional neural networks (CNNs) for MCG classification. DCBAM incorporated a deformable convolutional architecture along with temporal and spatial attention mechanisms to capture representative and correlative features of the image representation MCG along the temporal and spatial multichannel dimensions. We performed ablation experiments to evaluate the rationality and validity of the proposed model structure. Additionally, we performed an interpretability analysis to investigate the model's region of interest for CAD diagnosis. Results: The proposed method achieved an average accuracy of 93.57%, precision of 94.71%, sensitivity of 92.56%, specificity of 94.68%, and average F1-score of 93.60%. In contrast to existing methods, our proposed model achieved superior diagnostic classification results in MCG with fewer parameters. Significance: Integrating DCBAM with image-representation MCG establishes a novel feature extraction method that enhances the clinical utility of MCG and effectively addresses long-range dependencies and spatiotemporal inconsistencies in time-series signal analysis.
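The Hilbert-curve encoding step described in this abstract can be sketched as follows. This is a minimal illustration of mapping a one-dimensional segment onto a square image along a Hilbert curve; it assumes a single-channel segment that has already been resampled to exactly n*n samples with n a power of two. The function names, the resampling assumption, and the curve orientation are hypothetical and are not taken from the paper.

```python
def hilbert_d2xy(n, d):
    """Map index d along a Hilbert curve to (x, y) on an n x n grid (n a power of 2)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:              # rotate the current quadrant
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_encode(signal, n):
    """Place a 1D signal of length n*n into an n x n image along the Hilbert curve."""
    if len(signal) != n * n:
        raise ValueError("signal length must equal n * n")
    image = [[0.0] * n for _ in range(n)]
    for d, value in enumerate(signal):
        x, y = hilbert_d2xy(n, d)
        image[y][x] = value
    return image
```

Because consecutive indices on a Hilbert curve always land on neighboring pixels, samples that are close in time stay close in the image, which is the property that lets a CNN treat temporal structure as spatial structure.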
A hyperspectral image (HSI) is characterized by a large number of bands with high spectral resolution, where a continuous spectrum is measured for each pixel. This high data volume leads to challenges in processing the dataset. The objective of dimensionality reduction (DR) algorithms is to identify and eliminate statistical redundancies in hyperspectral data while keeping as much spectral information as possible. Combining spectral and spatial information offers a more comprehensive classification approach. A convolutional neural network (CNN) has the potential to extract complex spatial and spectral features embedded in hyperspectral data. The wavelet transform belongs to the family of multi-scale transformations, in which the input signal is analyzed at different levels of granularity. An attention mechanism is a method in neural networks that guides the algorithm to focus on the important information in the data. In this paper, we use the multi-head Transformer-based attention technique (Vaswani et al., Attention Is All You Need, 2017) for channel attention, which captures long-range spectral dependencies. The experimental results show that the proposed algorithm, MT-CW, which combines band selection based on a multi-head transformer for dimensionality reduction with a wavelet CNN-based algorithm for feature extraction, yields impressive results in terms of information conservation and class separability.
In recent years, remote sensing image classification tasks have garnered widespread attention and have been extensively studied by researchers. Most current studies focus on improving classification accuracy, leading to overly large and complex networks with high computational costs that are challenging to deploy for real-time remote sensing tasks. To address this issue, neural network pruning has emerged as an effective solution. However, existing pruning methods typically prune along a single dimension, and as the pruning ratio increases, important weights in that dimension often suffer from over-pruning, resulting in significant accuracy loss. This paper proposes a novel pruning method for remote sensing scene classification: Multidimensional Space Pruning (MSP). MSP performs stereoscopic pruning of filters along both the channel and depth dimensions, removing redundant information across the two dimensions simultaneously. This prevents excessive pruning of important weights in a single dimension, thereby significantly reducing model complexity while maintaining accuracy. At a pruning ratio of 0.4, MSP-pruned VGG-16 and ResNet-34 models on the NWPU-RESISC45 dataset show accuracy drops of only 1.05% and 0.71%, respectively, while achieving compression ratios of 92.52% and 93.19%. Similarly, on the AID dataset, the accuracy drops are merely 0.26% and 0.54%, with compression ratios reaching 96.23% and 88.56%, respectively. Experimental results on these two public remote sensing image datasets demonstrate that, compared to existing methods, MSP achieves higher compression ratios while maintaining model accuracy.
ISBN: 9798350353006 (print)
The image signal processing (ISP) pipeline, which converts raw Bayer sensor data to RGB images, plays a fundamental role in digital cameras. However, ISP-generated images usually suffer from imperfections due to the compounded degradations that stem from sensor noise, demosaicing noise, compression artifacts, and possibly the adverse effects of erroneous ISP hyperparameter settings such as ISO and gamma values. In a general sense, these ISP imperfections can be considered degradations. The highly complex mechanisms of ISP degradations, some of which are even unknown, pose great challenges to the generalization capability of deep neural networks (DNNs) for image restoration and to their adaptability to downstream tasks. To tackle these issues, we propose a novel DNN approach that learns degradation-independent representations (DiR) by refining a baseline representation learned through self-supervision. The proposed DiR learning technique has remarkable domain generalization capability and, consequently, outperforms state-of-the-art methods across various downstream tasks, including blind image restoration, object detection, and instance segmentation, as verified in our experiments.
Although image deblurring methods based on deep learning have achieved relatively good results, existing methods still restore edges unclearly. Therefore, in this paper, a brain-inspired image restoration model based on human attention and "fine vision" is proposed to improve blind image restoration quality, following the response mechanisms of different cerebral cortices to high and low spatial resolutions. The designed brain-inspired model consists of a dual-channel network that merges low- and high-resolution features and is used to extract image edges with detailed information filtered out. Confirmatory experiments are carried out on blurred images from the GOPRO, LIVE, and Set14 datasets. The results show that the proposed model achieves relatively good restoration of blurred images and super-resolution, as well as visually pleasing results on inspection.
With the advancement of deep learning, the accuracy of image classification continues to rise. However, the generalization capability of deep learning methods, such as those applied in image classification, lags signi...
Bilevel optimization enjoys a wide range of applications in emerging machine learning and signal processing problems such as hyper-parameter optimization, image reconstruction, meta-learning, adversarial training, and reinforcement learning. However, bilevel optimization problems are traditionally known to be difficult to solve. Recent progress on bilevel algorithms mainly focuses on bilevel optimization problems through the lens of the implicit-gradient method, where the lower-level objective is either strongly convex or unconstrained. In this work, we tackle a challenging class of bilevel problems through the lens of the penalty method. We show that under certain conditions, the penalty reformulation recovers the (local) solutions of the original bilevel problem. Further, we propose the penalty-based bilevel gradient descent (PBGD) algorithm and establish its finite-time convergence for the constrained bilevel problem with lower-level constraints yet without lower-level strong convexity. Experiments on synthetic and real datasets showcase the efficiency of the proposed PBGD algorithm. The code for implementing this algorithm is publicly available on GitHub.
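The idea of a penalty reformulation can be illustrated on a toy problem. The sketch below is not the authors' PBGD code: the abstract does not state which penalty is used, so the function-value-gap penalty g(x, y) - min_y g(x, y), the quadratic toy objectives, the penalty constant `gamma`, the step sizes, and all function names are assumptions made for illustration. The toy problem is unconstrained and has a strongly convex lower level, which is simpler than the setting the paper targets.

```python
def f_grad(x, y):
    """Gradient of the assumed upper-level objective f(x, y) = (x - 1)^2 + (y + 1)^2."""
    return 2.0 * (x - 1.0), 2.0 * (y + 1.0)


def g_grad(x, y):
    """Gradient of the assumed lower-level objective g(x, y) = (y - x)^2."""
    return -2.0 * (y - x), 2.0 * (y - x)


def inner_solve(x, y0, steps=20, lr=0.25):
    """Approximate argmin_y g(x, y) with a few gradient steps."""
    y = y0
    for _ in range(steps):
        y -= lr * g_grad(x, y)[1]
    return y


def penalty_gradient_descent(gamma=50.0, steps=5000):
    """Gradient descent on f(x, y) + gamma * (g(x, y) - min_y g(x, y))."""
    lr = 1.0 / (2.0 + 4.0 * gamma)   # safe step for this quadratic toy problem
    x, y = 2.0, -2.0
    for _ in range(steps):
        y_hat = inner_solve(x, y)
        fx, fy = f_grad(x, y)
        gx, gy = g_grad(x, y)
        vx = g_grad(x, y_hat)[0]     # estimate of the gradient of min_y g(x, y)
        x, y = x - lr * (fx + gamma * (gx - vx)), y - lr * (fy + gamma * gy)
    return x, y
```

For this toy problem the bilevel solution is x = y = 0, since the lower level forces y = x and the upper level then minimizes 2x^2 + 2. The penalized iterate lands within a distance of order 1/gamma of that point, which is the sense in which a penalty reformulation approximately recovers the original solution.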
ISBN: 9798350349405; 9798350349399 (print)
Emerging Learned Image Compression (LC) achieves significant improvements in coding efficiency by end-to-end training of neural networks for compression. An important benefit of this approach over traditional codecs is that any optimization criterion can be applied directly to the encoder-decoder networks during training. Perceptual optimization of LC to comply with the Human Visual System (HVS) is one such criterion, and it has not yet been fully explored. This paper addresses this gap by proposing a novel framework to integrate Just Noticeable Distortion (JND) principles into LC. Leveraging existing JND datasets, three perceptual optimization methods are proposed to integrate JND into the LC training process: (1) Pixel-Wise JND Loss (PWL) prioritizes pixel-by-pixel fidelity in reproducing JND characteristics, (2) Image-Wise JND Loss (IWL) emphasizes overall imperceptible degradation levels, and (3) Feature-Wise JND Loss (FWL) aligns the reconstructed image features with perceptually significant features. Experimental evaluations demonstrate the effectiveness of JND integration, highlighting improvements in rate-distortion performance and visual quality compared to baseline methods. The proposed methods add no extra complexity after training.