Coronavirus Disease 2019 (COVID-19) spread worldwide in early 2020, threatening global health. Automated lung infection detection from chest X-ray images has considerable potential to enhance the traditional COVID-19 treatment strategy. However, detecting infected regions in chest X-ray images poses several challenges, including significant variance in infected features, similar spatial characteristics, and multi-scale variations in the texture, shape, and size of infected regions. Moreover, the large parameter counts of transfer-learning models are a constraint on deploying deep convolutional neural network (CNN) models in real-time environments. To address these challenges, a novel lightweight CNN method (LW-CovidNet) is proposed to automatically detect COVID-19 infected regions in chest X-ray images. The proposed hybrid method integrates standard and depth-wise separable convolutions to aggregate high-level features and to compensate for information loss by increasing the receptive field of the model. The detection boundaries of the disease region representations are then enhanced by an Edge-Attention method that applies heatmaps for accurate detection of disease regions. Extensive experiments indicate that the proposed LW-CovidNet surpasses most cutting-edge detection methods and contributes to advancing state-of-the-art performance. It is envisaged that, with reliable accuracy, this method can be introduced into clinical practice in the future.
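To illustrate why mixing depth-wise separable convolutions with standard ones reduces model size, the following minimal sketch compares weight counts for a single layer. The layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) are hypothetical and are not taken from LW-CovidNet; biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depth-wise convolution followed by a 1 x 1 point-wise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

For these assumed sizes the separable layer needs 8,768 weights against 73,728 for the standard layer, roughly 12% of the original.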
The challenge of handling vast amounts of high-resolution satellite imagery is driven by onboard memory and bandwidth limitations. As spatial and spectral resolutions increase, image compression, particularly deep-learning-based methods, is essential to overcome these limitations. This paper presents hybrid autoencoder models that combine convolutional neural networks, long short-term memory networks, and attention mechanisms for spatial and spectral feature extraction. The proposed architectures, including sparse and variational autoencoder counterparts, form a comprehensive image compression framework with quantization, and various entropy coders are applied to the EuroSat dataset (RGB and multispectral). The experimental results show the models' superiority over the JPEG family and recent state-of-the-art methods, achieving up to 3.3, 1.4, and 0.6% improvements in the peak signal-to-noise ratio, structural similarity index, and multiscale structural similarity index, respectively. Moreover, performance analysis in terms of computational complexity, processing time, and memory usage highlights the efficiency of the proposed models. A case study conducted on a real scene from the Sentinel-2 satellite further validates the compatibility of the proposed models with modern artificial intelligence chipsets.
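As a reminder of what the peak signal-to-noise ratio measures, here is a minimal sketch of its standard definition for flattened pixel lists. The function name and the 8-bit default peak value of 255 are assumptions for illustration; the abstract does not state the bit depth used in the experiments.

```python
import math


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by one grey level, so the mean squared error is 1.
print(round(psnr([10, 20, 30], [11, 21, 31]), 2))
```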
In recent years, models based on deep convolutional neural networks have attracted considerable attention in the image denoising field due to their favorable performance. However, many existing deep-neural-network-based image denoising models lack flexibility for spatially variant or real-world noise, which restricts their application in real denoising scenes. In this paper, we propose a flexible and effective U-shaped network (FEUNet), which is effective across a wide range of noise levels and can deal with spatially variant noise. An adjustable noise level map is used as an input to FEUNet to enhance its flexibility, and the U-Net architecture is used to enhance its effectiveness. Experimental results verify that the proposed FEUNet achieves competitive denoising performance on many denoising tasks compared with state-of-the-art denoising methods, which makes it well suited for practical image denoising tasks.
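One plausible way to feed a noise level map to a denoising network is to stack it with the image as an extra input channel. The sketch below shows only that input construction; the function names are hypothetical, and the abstract does not say how FEUNet actually consumes the map.

```python
def noise_level_map(height, width, sigma):
    """Build a per-pixel noise level map.

    sigma is either a number (uniform noise) or a function
    (row, col) -> number (spatially variant noise).
    """
    if callable(sigma):
        return [[float(sigma(r, c)) for c in range(width)] for r in range(height)]
    return [[float(sigma)] * width for _ in range(height)]


def stack_input(image_channels, noise_map):
    """Append the noise map to the image channels as one extra channel (an assumption)."""
    return list(image_channels) + [noise_map]


# Uniform noise level over a 2 x 3 image.
uniform = noise_level_map(2, 3, 25)
# Spatially variant noise: weaker in the first column, stronger elsewhere.
variant = noise_level_map(2, 2, lambda r, c: 10 if c == 0 else 50)

grey_image = [[0.1, 0.2], [0.3, 0.4]]
network_input = stack_input([grey_image], variant)
print(len(network_input))  # image channel plus noise map channel
```

Changing the map at inference time is what would make such a model adjustable without retraining.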
With the development of audio playback devices and fast data transmission, the demand for high sound quality is rising for both entertainment and communications. In this quest for better sound quality, challenges emerge from distortions and interferences originating at the recording side or caused by an imperfect transmission pipeline. To address this problem, audio restoration methods aim to recover clean sound signals from the corrupted input data. We present here audio restoration algorithms based on diffusion models, with a focus on speech enhancement and music restoration tasks. Traditional approaches, often grounded in handcrafted rules and statistical heuristics, have shaped our understanding of audio signals. In the past decades, there has been a notable shift toward data-driven methods that exploit the modeling capabilities of deep neural networks (DNNs). Deep generative models, and among them diffusion models, have emerged as powerful techniques for learning complex data distributions. However, relying solely on DNN-based learning approaches carries the risk of reducing interpretability, particularly when employing end-to-end models. Nonetheless, data-driven approaches allow more flexibility in comparison to statistical model-based frameworks, whose performance depends on distributional and statistical assumptions that can be difficult to guarantee. Here, we aim to show that diffusion models can combine the best of both worlds and offer the opportunity to design audio restoration algorithms with a good degree of interpretability and a remarkable performance in terms of sound quality. In this article, we review the use of diffusion models for audio restoration. We explain the diffusion formalism and its application to the conditional generation of clean audio signals. We believe that diffusion models open an exciting field of research with the potential to spawn new audio restoration algorithms that are natural-sounding and remain robust in difficult acoust
ISBN: (Print) 9798350349405; 9798350349399
While standardized codecs like JPEG and HEVC-intra represent the industry standard in image compression, neural Learned Image Compression (LIC) codecs represent a promising alternative. In particular, integrating attention mechanisms from Vision Transformers into LIC models has shown improved compression efficiency. However, extra efficiency often comes at the cost of aggregating redundant features. This work proposes a Graph-based Attention Block for Image Compression (GABIC), a method to reduce feature redundancy based on a k-Nearest Neighbors enhanced attention mechanism. Our experiments show that GABIC outperforms comparable methods, particularly at high bit rates, enhancing compression performance.
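The general idea of restricting attention to the k most similar keys can be sketched as follows. This is a generic top-k attention over plain Python lists, assuming scaled dot-product similarity; it is not GABIC's graph-based block, whose details the abstract does not give.

```python
import math


def knn_attention(queries, keys, values, k):
    """For each query, attend only to the k keys with the highest scaled dot-product score."""
    outputs = []
    for q in queries:
        scale = math.sqrt(len(q))
        scores = [sum(qi * ki for qi, ki in zip(q, key)) / scale for key in keys]
        # Indices of the k nearest neighbours by similarity.
        top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
        # Softmax over the selected neighbours only (shifted for numerical stability).
        peak = max(scores[i] for i in top)
        weights = [math.exp(scores[i] - peak) for i in top]
        total = sum(weights)
        dim = len(values[0])
        outputs.append(
            [sum(w / total * values[i][d] for w, i in zip(weights, top)) for d in range(dim)]
        )
    return outputs


queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
print(knn_attention(queries, keys, values, k=1))
```

With k equal to the number of keys this reduces to ordinary softmax attention; smaller k discards the less similar, potentially redundant features.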
Emotion recognition, which draws on body language, speech, and facial expressions, is one of the most interesting subjects in machine learning and computer vision. Automatic emotion recognition is used in a variety of applications. In practice, recognizing human emotions with high accuracy is a challenging task. In this paper, we recognize emotion from facial images using a convolutional neural network architecture that uses inception modules and dense blocks. The proposed architecture is called GA-Dense-FaceliveNet, in which a genetic algorithm is used to tune the hyperparameters of the deep convolutional neural network. The proposed model is evaluated using three well-known datasets: CK+ (extended Cohn-Kanade), JAFFE (Japanese Female Facial Expression), and KDEF (Karolinska Directed Emotional Faces). In the experiments, the accuracy on the CK+, JAFFE, and KDEF datasets is 99.96%, 98.92%, and 99.17%, respectively. The results demonstrate that the proposed method outperforms state-of-the-art methods.
This study addresses the critical difficulty of burn assessment in medical image retrieval from grafted burn specimens, particularly in resource-constrained contexts where speedy and precise diagnoses are required. Our solution combines machine learning techniques, namely an Artificial Neural Network (ANN), with the Contrast Limited Adaptive Histogram Equalisation (CLAHE) algorithm in an image reclamation system. The kurtosis of the CLAHE image (K-CLAHE=144.83) compared to that of the query image (K-query=131.17) indicates a distribution with more pronounced tails in the CLAHE image, enhancing specific image features. Additionally, the increased skewness of the CLAHE image (S-CLAHE=5.92) suggests a shift toward higher intensity levels compared to the query image (S-query=4.47), further enhancing discernible image features. Through this combination, we carefully retain image boundaries, boost local contrast, and minimize noise, hence enhancing burn diagnostic accuracy. Statistical analyses, such as kurtosis and skewness analysis, verify the improvements in visible image features, offering significant insights into fundamental texture properties. We increase image retrieval efficiency by using Bhattacharyya coefficients and unique bin analysis, resulting in substantial improvements in the retrieval score of matched images. The ANN successfully differentiates between images that require grafts and those that do not, providing a speedy and accurate diagnosis for acute burn injuries. This comprehensive technique greatly improves burn diagnosis, especially during emergencies, and shows promise for improving medical procedures. Our study helps to raise patient care standards in difficult medical situations by combining automated evaluation tools, powerful image processing methods, and machine learning.
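The statistics named above can be sketched with their textbook definitions. The code below assumes population moments and non-excess kurtosis (a normal distribution scores 3); the abstract does not say which convention the authors used, and the function names are illustrative.

```python
import math


def skewness_and_kurtosis(pixels):
    """Population skewness and (non-excess) kurtosis of a sequence of intensities."""
    n = len(pixels)
    mean = sum(pixels) / n
    m2 = sum((x - mean) ** 2 for x in pixels) / n
    if m2 == 0:
        raise ValueError("constant input has undefined skewness and kurtosis")
    m3 = sum((x - mean) ** 3 for x in pixels) / n
    m4 = sum((x - mean) ** 4 for x in pixels) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def bhattacharyya_coefficient(hist_a, hist_b):
    """Overlap between two histograms with the same bins: 1 for identical shapes, 0 for disjoint ones."""
    total_a, total_b = sum(hist_a), sum(hist_b)
    return sum(
        math.sqrt((a / total_a) * (b / total_b)) for a, b in zip(hist_a, hist_b)
    )


print(skewness_and_kurtosis([1, 2, 3]))
print(bhattacharyya_coefficient([4, 4], [1, 1]))
```

In a retrieval setting, a coefficient closer to 1 between the query histogram and a stored image histogram would indicate a better match.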
Medical image segmentation (MIS) is a key technique in computer-aided diagnosis. With the development of deep learning, especially convolutional neural networks, the performance of MIS has improved significantly; however, some mainstream convolution-based methods still suffer from inaccurate target boundaries and imprecise segmentation results. At the same time, transformer-based methods have gradually achieved better segmentation results. To overcome the challenges of traditional methods, an accurate MIS model (CascadeMedSeg) is proposed in this paper, which combines a pyramid vision transformer (PVT) and multi-scale fusion. The network follows a standard encoder-decoder segmentation architecture, with PVT as the encoder. PVT, designed as a pure Transformer backbone for pixel-level dense prediction tasks, can consistently generate a global receptive field and, as an encoder, flexibly learn multi-scale features of medical images. Two additional modules, namely Enhanced Attention Fusion (EAF) and Edge-Enhanced Segmentation (EES), are introduced. The EAF module fuses up-sampled and skip-connected features using an attention mechanism that enhances the perception of channel and positional information. The EES module enhances the boundary features of the network by aggregating multi-level encoder features and applying a dynamic boundary detection operator to obtain a boundary mask, which is embedded into the decoder. Extensive experiments on five datasets show that CascadeMedSeg exhibits improved performance over several state-of-the-art methods. The MIoU values for the Kvasir-SEG, CVC-ClinicDB, ISIC 2018, and BUSI datasets are 88.16, 89.79, 86.32, and 66.69%, respectively.
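For reference, mean intersection over union (MIoU) can be computed as in the following sketch, which works on flattened label lists. Skipping classes absent from both prediction and ground truth is one common convention and is an assumption here; the abstract does not specify the exact protocol.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes present in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # ignore classes that appear in neither mask
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class from range(num_classes) appears in the inputs")
    return sum(ious) / len(ious)


# Binary example: class 0 has IoU 1/2, class 1 has IoU 2/3.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```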
ISBN: (Print) 9798400717499
Medical image segmentation is a critical task in healthcare diagnostics and treatment planning. Among various approaches, the U-Net neural network has emerged as a revolutionary and efficient method, particularly well-suited for medical imaging applications. Its unique architecture facilitates robust feature extraction and demonstrates superior adaptability to limited datasets, which often present challenges such as indistinct anatomical boundaries. This study proposes an enhanced U-Net model that incorporates a residual module and a comprehensive skip connection mechanism. This innovative design enables the decoder to integrate multi-scale feature maps: fine-grained information from the encoder, equivalent-scale features, and broader contextual data from the decoder itself. Experimental results demonstrate that this modified U-Net architecture achieves higher segmentation accuracy and improved training efficiency compared to conventional methods.
Improper exposures greatly degrade the visual quality of images. Correcting various exposure errors in a unified framework is challenging, as it requires simultaneously handling global attributes and local details under different exposure conditions. In this paper, we propose a conditional Laplacian pyramid network (CLPN) for correcting different exposure errors in the same framework. It applies a Laplacian pyramid to decompose an improperly exposed image into a low-frequency (LF) component and several high-frequency (HF) components, and then enhances the decomposed components in a coarse-to-fine manner. To consistently correct a wide range of exposure errors, a conditional feature extractor is designed to extract the conditional feature from the given image. Afterwards, the conditional feature is used to guide the refinement of LF features, so that a precise correction of illumination, contrast, and color tone can be obtained. As different frequency components exhibit pixel-wise correlations, the frequency components in lower pyramid layers are used to support the reconstruction of the HF components in higher layers. By doing so, fine details can be effectively restored, while noise can be well suppressed. Extensive experiments show that our method is more effective than state-of-the-art methods at correcting various exposure conditions, ranging from severe underexposure to intense overexposure.