Medical image segmentation plays an important role in medical diagnosis and has received extensive attention in recent years. A large number of convolutional neural network based methods have been proposed to achieve accurate segmentation results. Dice loss is the most popular loss function for medical image segmentation tasks. However, we found that Dice loss suffers from abnormal gradient changes, which makes training unstable and difficult to converge. Therefore, we propose a gradient-optimized Dice loss (GODC) to solve this problem. GODC corrects the abnormal gradient changes in the segmentation loss, which accelerates model convergence and achieves better segmentation performance. Next, we propose a lateral feature alignment module (LFAM). LFAM adopts a deformable convolutional network to align the features of different layers on the shortcut connections of U-Net to improve segmentation performance. Finally, our method achieves state-of-the-art results on the LiTS dataset as well as on the pancreatic tumor datasets we collected.
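As a point of reference for the gradient behaviour discussed above, the following is a minimal PyTorch sketch of the standard soft Dice loss, the function GODC modifies; the GODC correction itself is not specified in the abstract, so only the baseline formulation and the source of the gradient instability are shown.

```python
import torch

def soft_dice_loss(pred, target, eps=1e-6):
    """Standard soft Dice loss for binary segmentation.

    pred   : (N, H, W) predicted probabilities in [0, 1] (after sigmoid)
    target : (N, H, W) binary ground-truth masks

    The gradient w.r.t. `pred` scales with the inverse square of the
    denominator below, so it can change abruptly when the predicted
    foreground is very small -- the instability the abstract attributes
    to Dice loss.
    """
    pred = pred.flatten(1)
    target = target.flatten(1)
    intersection = (pred * target).sum(dim=1)
    denom = pred.sum(dim=1) + target.sum(dim=1)
    dice = (2.0 * intersection + eps) / (denom + eps)
    return 1.0 - dice.mean()
```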
Multi-image super-resolution (MISR) refers to the task of enhancing the spatial resolution of a stack of low-resolution (LR) images representing the same scene. Although many deep learning-based single image super-resolution (SISR) techniques have recently been developed, deep learning has not been widely exploited for MISR, even though it can achieve higher reconstruction accuracy because more information can be extracted from the stack of LR images. One of the primary obstacles encountered by deep networks when addressing the MISR problem is the variability in the number of LR images that act as input to the network. This impedes the feasibility of adopting an end-to-end learning approach, because the varying number of input images makes it difficult to construct a training dataset for the network. Another challenge arises from the requirement to align the LR input images to generate a high-quality high-resolution (HR) image, which requires complex and sophisticated methods. In this paper, we propose a self-learning based method that can simultaneously perform super-resolution and sub-pixel registration of multiple LR images. The proposed method trains a neural network with only the LR images as input and without any true target HR images; i.e., the proposed method requires no extra training dataset. Therefore, it is easy to use the proposed method to deal with different numbers of input images. To our knowledge, this is the first time that a neural network has been trained using only LR images to perform joint MISR and sub-pixel registration. Experimental results confirmed that the HR images generated by the proposed method achieved better results in both quantitative and qualitative evaluations than those generated by other deep learning-based methods.
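The abstract does not detail the network or the self-supervised objective; the sketch below illustrates one plausible reading, assuming the HR estimate is warped by learnable per-frame sub-pixel shifts and downsampled to reproduce each observed LR frame. The class name, layer sizes, and loss are illustrative, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfMISR(nn.Module):
    """Sketch of joint MISR + sub-pixel registration trained only on the
    LR stack (no HR ground truth)."""

    def __init__(self, num_frames, scale=2):
        super().__init__()
        self.scale = scale
        # learnable sub-pixel shift (dx, dy) for each LR frame
        self.shifts = nn.Parameter(torch.zeros(num_frames, 2))
        self.net = nn.Sequential(
            nn.Conv2d(num_frames, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),                    # (N, 1, s*H, s*W)
        )

    def forward(self, lr_stack):                       # lr_stack: (N, K, H, W)
        return self.net(lr_stack)

    def self_supervised_loss(self, lr_stack):
        hr = self.forward(lr_stack)                    # current HR estimate
        n, k, h, w = lr_stack.shape
        loss = 0.0
        for i in range(k):
            # translate the HR estimate by the i-th learned sub-pixel shift
            theta = torch.eye(2, 3, device=hr.device).unsqueeze(0).repeat(n, 1, 1)
            theta[:, :, 2] = self.shifts[i]
            grid = F.affine_grid(theta, hr.shape, align_corners=False)
            shifted = F.grid_sample(hr, grid, align_corners=False)
            # downsample back to LR and compare with the observed frame
            lr_hat = F.avg_pool2d(shifted, self.scale)
            loss = loss + F.l1_loss(lr_hat, lr_stack[:, i:i + 1])
        return loss / k
```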
ISBN (print): 9789464593617; 9798331519773
Advanced machine learning methods, and most prominently neural networks, have become standard for solving inverse problems in recent years. However, theoretical recovery guarantees for such methods are still scarce and difficult to achieve. Only recently did unsupervised methods such as the Deep Image Prior (DIP) get equipped with convergence and recovery guarantees for generic loss functions when trained through gradient flow with an appropriate initialization. In this paper, we extend these results by proving that the guarantees hold true when using gradient descent with an appropriately chosen step size/learning rate. We also show that the discretization only affects the overparametrization bound for a two-layer DIP network by a constant, and thus that the guarantees found for the gradient flow also hold for gradient descent.
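A minimal sketch of the setting these guarantees concern: a two-layer, heavily overparametrized DIP-style network fitted to a single observation with plain gradient descent at a fixed step size (the discretization of gradient flow). The width, step size, and iteration count are illustrative only, not the bounds from the paper.

```python
import torch

torch.manual_seed(0)
n = 32 * 32                        # flattened image size (illustrative)
width = 4096                       # overparametrized hidden width
step = 1e-3                        # fixed step size for the discretized flow

y = torch.randn(n)                 # stand-in for the corrupted observation
z = torch.randn(n)                 # fixed random input of the DIP network

# two-layer DIP network: x(theta) = W2 @ relu(W1 @ z)
W1 = (torch.randn(width, n) / n ** 0.5).requires_grad_()
W2 = (torch.randn(n, width) / width ** 0.5).requires_grad_()

for it in range(500):
    x = W2 @ torch.relu(W1 @ z)    # current reconstruction
    loss = 0.5 * torch.sum((x - y) ** 2)
    loss.backward()
    with torch.no_grad():          # plain gradient descent update
        W1 -= step * W1.grad
        W2 -= step * W2.grad
        W1.grad.zero_()
        W2.grad.zero_()
```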
Vital signs such as blood pressure, heart rate, and respiration rate are continuously monitored in intensive care unit patients to assess their condition. Various methods are available for the continuous monitoring of these vital parameters. To extract the parameters, current techniques place multiple sensors on the patient's body. Patients dealing with medical issues may find it challenging and uncomfortable to have multiple electrodes placed on their bodies. To avoid placing multiple sensors on a patient's body, the proposed method aims to extract three vital parameters, namely respiration rate (RR), blood pressure, and heart rate, from a single photoplethysmography sensor, using a unified deep learning model to analyze the photoplethysmographic (PPG) signal. The proposed deep learning framework combines a Convolutional Neural Network (CNN) with Bidirectional Long Short-Term Memory (Bi-LSTM) and an attention mechanism. This model effectively extracts features by integrating spatial and temporal correlations within the signal, focusing on the most relevant features necessary for estimating multiple parameters from a PPG signal. Optimized through hyperparameter tuning, the CNN-Bi-LSTM architecture achieved a prediction accuracy of 95.67%. The performance of the proposed method is evaluated using the publicly available Multiparameter Intelligent Monitoring in Intensive Care database and compared to existing methods. The model demonstrated an average mean absolute error (MAE) +/- standard deviation (SD) of 0.084 +/- 0.20 for heart rate, 0.034 +/- 0.23 for blood pressure, and 0.009 +/- 0.05 for respiration rate.
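A compact PyTorch sketch of a CNN + Bi-LSTM + attention regressor of the kind described; the layer sizes, window length, and the three-output head are assumptions, since the abstract does not specify the exact architecture.

```python
import torch
import torch.nn as nn

class PPGVitalsNet(nn.Module):
    """Illustrative CNN + Bi-LSTM + attention regressor for a PPG window."""

    def __init__(self, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(                 # morphological / spatial features
            nn.Conv1d(1, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.bilstm = nn.LSTM(64, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)      # additive attention scores
        self.head = nn.Linear(2 * hidden, 3)      # [heart rate, blood pressure, respiration rate]

    def forward(self, x):                         # x: (N, 1, T) raw PPG window
        h = self.cnn(x).transpose(1, 2)           # (N, T', 64)
        h, _ = self.bilstm(h)                     # (N, T', 2*hidden) temporal features
        w = torch.softmax(self.attn(h), dim=1)    # (N, T', 1) attention weights
        ctx = (w * h).sum(dim=1)                  # attention-pooled context vector
        return self.head(ctx)                     # (N, 3) vital-sign estimates
```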
Remote sensing (RS) image change detection (CD) methods based on deep learning, such as convolutional neural networks (CNNs) and transformers, are still spatial domain-based image processing methods by nature, and their detection accuracy is strongly affected by chromatic aberration due to imaging time, shadows caused by lighting conditions, object confusion, and other disturbances. In this study, we revisit CD from a signal processing perspective, framing it as the task of detecting consistency between the distributional features of two 2-D signals. We aim to extract the primary components of the two signals while suppressing interfering noise. To this end, we propose a novel CD method called DFNet, which leverages a dual-frequency learnable encoder. First, we construct a dual-frequency feature encoder Siamese framework that captures local high-frequency signals and global low-frequency signals using CNN and attention mechanisms after dividing the input RS image signals into two channels. Second, we introduce the frequency explicit visual center module as part of the multifrequency-domain dense interaction (MFDDI) module at the decoder stage, allowing long-distance dependencies to be established between high- and low-frequency components in the same layer, as well as signal aggregation in regions of small edge variation. In addition, the MFDDI module adopts a layer-by-layer interactive fusion approach to synthesize discriminative information over a wide frequency-domain range, enhancing the characterization capability of frequency-domain signals. We conduct comparison experiments with current mainstream methods on the land cover dataset SYSU-CD and two building datasets, LEVIR-CD and WHU-CD, and the results show that our method is not only resistant to interference but also outperforms all the compared methods.
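The following sketch illustrates the general idea of a dual-frequency encoder block, assuming a low-frequency component obtained by pooling (handled by global attention) and a high-frequency residual (handled by a local CNN); DFNet's actual layer configuration is not given in the abstract, so names and sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualFrequencyEncoder(nn.Module):
    """Sketch of a dual-frequency split: a blurred low-frequency path with
    global attention plus a residual high-frequency path with a local CNN."""

    def __init__(self, channels=64, heads=4):
        super().__init__()
        self.local = nn.Sequential(               # high-frequency / local branch
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):                          # x: (N, C, H, W)
        low = F.avg_pool2d(x, 4)                   # coarse, low-frequency component
        high = x - F.interpolate(low, size=x.shape[-2:], mode='bilinear',
                                 align_corners=False)
        n, c, h, w = low.shape
        tokens = low.flatten(2).transpose(1, 2)    # (N, h*w, C) for global attention
        glob, _ = self.attn(tokens, tokens, tokens)
        glob = glob.transpose(1, 2).reshape(n, c, h, w)
        glob = F.interpolate(glob, size=x.shape[-2:], mode='bilinear',
                             align_corners=False)
        return self.local(high) + glob             # fused dual-frequency features
```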
Can deep convolutional neural networks (CNNs) for image classification be interpreted as utility maximizers with information costs? By performing set-valued system identification for Bayesian decision systems, we demonstrate that deep CNNs behave equivalently (in terms of necessary and sufficient conditions) to rationally inattentive Bayesian utility maximizers, a generative model used extensively in economics for human decision-making. Our claim is based on approximately 500 numerical experiments on 5 widely used neural network architectures. The parameters of the resulting interpretable model are computed efficiently via convex feasibility algorithms. As a practical application, we also illustrate how the reconstructed interpretable model can predict the classification performance of deep CNNs with high accuracy. The theoretical foundation of our approach lies in Bayesian revealed preference, studied in microeconomics. All our results are available on GitHub and completely reproducible.
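The abstract states that the interpretable model is recovered via convex feasibility algorithms; as a rough, hypothetical illustration, the snippet below tests a NIAS-style linear feasibility condition from Bayesian revealed preference with SciPy, using made-up posteriors. The paper's actual test (including NIAC-type constraints and the CNN-derived statistics) is richer than this.

```python
import numpy as np
from scipy.optimize import linprog

X, A = 3, 3                                # number of states and actions (toy)
rng = np.random.default_rng(0)
post = rng.dirichlet(np.ones(X), size=A)   # post[a] ~ p(x | action a), rows sum to 1
eps = 1e-3                                 # margin ruling out the trivial constant utility

rows, rhs = [], []
for a in range(A):
    for b in range(A):
        if a == b:
            continue
        row = np.zeros(X * A)              # utility u(x, a) flattened as u[x * A + a]
        for x in range(X):
            row[x * A + a] -= post[a, x]   # -p(x|a) u(x, a)
            row[x * A + b] += post[a, x]   # +p(x|a) u(x, b)
        rows.append(row)                   # encodes sum_x p(x|a)(u(x,b) - u(x,a)) <= -eps
        rhs.append(-eps)

res = linprog(c=np.zeros(X * A), A_ub=np.array(rows), b_ub=np.array(rhs),
              bounds=[(0.0, 1.0)] * (X * A), method="highs")
print("consistent with a rationally inattentive utility maximizer:", res.success)
```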
Surface electromyography-based gesture recognition is widely applied in human-computer interaction, hand rehabilitation, prosthetic control, and other fields. Gesture classification based on electromyography (EMG) signals usually relies on handcrafted feature extraction, which is highly subjective, or on convolutional neural networks with redundant structures to extract features. This paper converts the raw EMG signals into Gramian Angular Difference Field (GADF) and Gramian Angular Summation Field (GASF) images. Four models were used to classify the images: K-Nearest Neighbors (KNN), Generalized Learning Systems, Binary Trees, and a Convolutional Neural Network based on MobileNetV1, and the proposed method was verified on the public dataset NinaproDB2. Experimental results: when the window size is 300 ms, the step size is 10 ms, and KNN is used as the classification model, the average accuracy of EMG signal classification based on the GADF method is 98.17%, and the accuracy on exercises B, C, and D is 96.65%, 95.53%, and 98.02%, respectively. The recognition accuracy is 7.92%, 14.25%, and 4.279% higher than the provided baseline.
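A minimal end-to-end sketch of the GADF-plus-KNN pipeline using pyts and scikit-learn on synthetic windows; the window length, neighbour count, and random signals stand in for the NinaproDB2 segments and are not the paper's settings.

```python
import numpy as np
from pyts.image import GramianAngularField
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
windows = rng.standard_normal((200, 60))          # 200 EMG windows, 60 samples each
labels = rng.integers(0, 4, size=200)             # 4 gesture classes (illustrative)

gadf = GramianAngularField(method='difference')   # Gramian Angular Difference Field
images = gadf.fit_transform(windows)              # (200, 60, 60) GADF images

X = images.reshape(len(images), -1)               # flatten images for KNN
knn = KNeighborsClassifier(n_neighbors=5).fit(X[:150], labels[:150])
print("hold-out accuracy:", knn.score(X[150:], labels[150:]))
```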
Single image super-resolution aims to restore high-resolution images from low-resolution images. Recently, many methods have tackled image super-resolution by leveraging local or global features to boost performance. However, they fail to combine both feature types and often have high parameter counts. We propose a Lightweight Self-Attention Guidance Network (LSAGNet) to address these issues. We design a simple and efficient dynamic local attention (DLA) module to effectively extract local features. Existing Transformer networks often rely on query-key similarities for feature aggregation. However, blindly using these similarities hinders super-resolution reconstruction, because strong correlations are not preserved and weak ones are introduced. To address this issue, we propose a global self-attention (GSA) mechanism based on a soft-thresholding operation, designed to retain strongly correlated information. Experimental results demonstrate that the proposed LSAGNet achieves an excellent balance between performance and parameter efficiency while achieving competitive accuracy compared to state-of-the-art methods.
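One way to read the soft-thresholding GSA is sketched below: attention weights are shrunk by a learnable threshold so weakly correlated token pairs are discarded before aggregation. The exact placement of the threshold in LSAGNet is not specified in the abstract, so this is illustrative only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftThresholdAttention(nn.Module):
    """Sketch of self-attention with soft thresholding of the attention
    weights, so weakly correlated pairs are suppressed."""

    def __init__(self, dim, threshold=0.01):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        self.tau = nn.Parameter(torch.tensor(threshold))  # learnable threshold
        self.scale = dim ** -0.5

    def forward(self, x):                        # x: (N, L, dim) token features
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        sim = (q @ k.transpose(-2, -1)) * self.scale       # (N, L, L) similarities
        attn = torch.softmax(sim, dim=-1)
        # soft thresholding: weights below tau are discarded, the rest renormalized
        attn = F.relu(attn - self.tau)
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-8)
        return self.proj(attn @ v)
```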
Road damage detection is a crucial task in road inspection systems. Although traditional object detection models achieve promising performance, the presence of shadows exacerbates the difficulty of road damage detection in practical scenarios. To tackle these challenges, we introduce a novel shadow-image enhancement network, the global-local enhancement network, and combine it with a YOLOv7-tiny detection network augmented with our proposed components to craft an end-to-end detection framework. We integrate deep neural networks with conventional methods and propose a global statistical texture enhancement module to enhance global statistical texture information. We propose a local enhancement module to enhance road damage edge information in shadow regions. Furthermore, we craft a shadow region loss to optimize the enhancement models and employ dynamic snake convolution to replace certain standard convolutions in the detection network. We evaluate our method on the shadowed linear road damage datasets SRoad and DRoad, which comprise road images captured from different perspectives in Beijing, China. The results demonstrate that our approach surpasses the performance of low-light enhancement models and low-light detection models. The method achieves an mAP of 71.2% at 98.8 FPS on the SRoad dataset and an mAP of 79.7% at 103.2 FPS on the DRoad dataset. The proposed model balances performance and model size, meeting the requirements for real-time processing in industrial applications.
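The shadow region loss is not defined in the abstract; one simple interpretation, shown below, is an enhancement loss whose per-pixel weight is increased inside a shadow mask. The weighting scheme and function names are assumptions, not the paper's formulation.

```python
import torch

def shadow_region_loss(enhanced, target, shadow_mask, shadow_weight=2.0):
    """Illustrative shadow-region loss: an L1 enhancement loss weighted
    more heavily inside the shadow mask, so the enhancement network
    focuses on shadowed road damage.

    enhanced, target : (N, 3, H, W) enhanced and reference images
    shadow_mask      : (N, 1, H, W) binary mask, 1 inside shadow regions
    """
    per_pixel = (enhanced - target).abs()
    weights = 1.0 + (shadow_weight - 1.0) * shadow_mask
    return (weights * per_pixel).mean()
```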
Glioblastoma is the most common subtype of malignant tumor of the central nervous system. Segmentation of brain tumor images is crucial for accelerating the diagnosis and treatment of a patient. In this paper, an advanced neural network ensemble based on a fuzzy ranking approach for tumor segmentation is presented, using a combination of convolutional neural network (CNN) architectures, namely SegResNet, UNETR, and SwinUNETR. The proposed method uses fuzzy rank-based unification of deep learners by considering two nonlinear functions in decision-making, which takes into account the confidence of the predictions of the three base models. The proposed method is evaluated on the BRATS 2023 MRI dataset and outperforms state-of-the-art methods, achieving an average Dice score of 0.885 +/- 0.134. The statistical significance of the differences between the individual models and the ensemble is confirmed by the Wilcoxon signed-rank test (p < 0.005).
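The two nonlinear rank functions are not given in the abstract; the sketch below uses a pair of functions common in fuzzy rank-based ensembles (a tanh-based and an exponential-based re-ranking of each model's confidence) purely for illustration, with a toy three-class example.

```python
import numpy as np

def fuzzy_rank_fusion(probs_per_model):
    """Sketch of fuzzy rank-based fusion of per-class confidences.
    `probs_per_model` is a list of (n_classes,) confidence vectors, one
    per base model (e.g. SegResNet, UNETR, SwinUNETR)."""
    fused = np.zeros_like(probs_per_model[0], dtype=float)
    for p in probs_per_model:
        r1 = 1.0 - np.tanh(((p - 1.0) ** 2) / 2.0)    # rank score 1 (1 at full confidence)
        r2 = 1.0 - np.exp(-((p - 1.0) ** 2) / 2.0)    # rank score 2 (0 at full confidence)
        fused += r1 * r2                              # small product = confident prediction
    return int(np.argmin(fused))                      # class with the best fused rank

# toy per-voxel example with three base models and three classes
print(fuzzy_rank_fusion([np.array([0.7, 0.2, 0.1]),
                         np.array([0.6, 0.3, 0.1]),
                         np.array([0.5, 0.4, 0.1])]))   # -> 0
```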