Signal representation in the time-frequency (TF) domain is valuable in many applications, including radar imaging and inverse synthetic aperture radar. A TF representation allows us to identify signal components or features in a joint time-frequency plane. Several well-known tools exist for this purpose, such as the Wigner-Ville distribution (WVD), the short-time Fourier transform, and their variants. The main requirement for a TF representation tool is a high-resolution view of the signal in which the signal components or features are identifiable. A commonly used method is the reassignment process, which reduces cross-terms by artificially moving smoothed WVD values from their actual location to the center of gravity of that region. In this article, we propose a novel reassignment method based on a conditional generative adversarial network (CGAN), which we train to perform the reassignment process. Examples show that the method generates high-resolution TF representations that are better than those of current reassignment methods.
Most existing infrared image enhancement algorithms focus on detail and contrast enhancement of ordinary infrared images; when applied to low-light infrared images, they often lose much of the detail and target texture. The reason is that most algorithms process images at a single scale and have difficulty coping with the degradation of image features while enhancing brightness. To solve this problem, we propose a multi-layer and multi-scale feature fusion network (MMFF-Net). It improves the brightness of low-light infrared images in the absence of normal-light reference samples and keeps the image details consistent with the source image. Features at different layers of the image are extracted using an adaptively modified deep network. A multi-scale adaptive feature fusion module (MAFFM) is designed to preserve and fuse multi-scale information from different convolutional layer features. The fused features are passed to an iterative function as pixel-wise parameters for image brightness enhancement. We also propose a local feature fusion module (LFFM), which reconstructs images after fusing multiple features, including the brightness-enhanced images and the source images. Finally, a set of loss functions is designed to train the whole network. Extensive experiments show that the proposed algorithm effectively enhances low-light infrared images and performs well in subjective visual tests and quantitative tests compared with existing methods.
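The abstract does not give the form of the iterative function, so the following is only a sketch of how pixel-wise parameters can drive iterative brightness enhancement. It assumes a quadratic curve, x + a * x * (1 - x), of the kind used in curve-estimation methods for low-light enhancement; the function name `enhance`, the curve itself, and the iteration count are illustrative assumptions, not details of MMFF-Net.

```python
def enhance(pixels, alphas, iterations=4):
    """Iteratively brighten pixels in [0, 1] with one parameter per pixel.

    Assumed curve (not taken from the abstract): x <- x + a * x * (1 - x),
    with a in [-1, 1]. The curve keeps 0 and 1 fixed and maps [0, 1] into
    [0, 1], so repeated application cannot overflow the intensity range.
    """
    if len(pixels) != len(alphas):
        raise ValueError("need exactly one parameter per pixel")
    out = list(pixels)
    for _ in range(iterations):
        out = [x + a * x * (1.0 - x) for x, a in zip(out, alphas)]
    return out


# Hypothetical usage: a dark pixel receives a strong parameter,
# an already bright pixel is left unchanged with a = 0.
brightened = enhance([0.2, 0.9], [1.0, 0.0])
```

In a learned pipeline the `alphas` would come from the network (here, presumably from the fused features); the sketch fixes them by hand.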
ISBN: (print) 9798350351491; 9798350351484
In recent years, the number of skin cancer cases has increased owing to human exposure to UV radiation; accurate detection of malignant skin cancer at an early stage is therefore considered crucial for patient therapy and for increasing survival rates. Melanoma is considered the most frequent and dangerous type of skin cancer. Although a large number of classification methods based on deep learning (DL) and machine learning (ML) have been introduced in the literature, uncertain cases still arise during the clinical diagnosis of malignant lesions. This paper investigates various DL-based models for the accurate diagnosis and detection of malignant and benign skin lesions. Transfer learning (TL) is applied to efficient and accurate models pre-trained on the ImageNet dataset, mainly EfficientNet-B0-V2 and the Vision Transformer ViT-b16. Furthermore, a modified convolutional neural network (CNN) model is adopted and trained from scratch. A publicly available benchmark dataset is used to evaluate the performance of the proposed models and to compare their effectiveness with existing state-of-the-art methods. The obtained results are 79.70%, 86.52%, and 86.97% for the CNN, EfficientNet-B0-V2, and ViT-b16 models, respectively. The experiments show the effectiveness of the proposed models compared with existing DL and ML models for classifying skin lesions as benign or malignant.
X-ray imaging is widely used in security inspection systems for nondestructive testing, aiding inspection staff in identifying dangerous goods. A higher signal-to-noise ratio and resolution in radiation images can significantly enhance the staff's ability to detect potential threats. This article introduces a novel data augmentation (DA) method aimed at improving the performance of neural network models for noise reduction and super-resolution (SR) reconstruction, such as EDSR and RCAN. The proposed method makes the models region-aware, allowing adaptive processing of regions with varying degrees of noise or blurriness. Compared with traditional DA methods, the proposed method effectively prevents the output image from being too smooth or containing artifacts while improving the performance of the models. Experiments indicate that models trained with the proposed method show consistent improvements, with a maximum gain of 0.26 dB in peak signal-to-noise ratio (PSNR).
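To put the reported PSNR gain in context, the sketch below computes PSNR from the standard definition, 10 * log10(peak^2 / MSE), and converts a gain in dB into the equivalent ratio of mean squared errors. The function names and the 8-bit peak value of 255 are assumptions made for illustration; the abstract does not state the bit depth of the radiation images.

```python
import math


def psnr(reference, test, peak=255.0):
    """PSNR in dB between two equally sized images given as flat sequences."""
    if len(reference) != len(test):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Ratio MSE_new / MSE_old that corresponds to a PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)


# A 0.26 dB gain corresponds to an MSE ratio of about 0.942,
# i.e. roughly a 6 percent reduction in mean squared error.
ratio = mse_ratio_for_gain(0.26)
```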
ISBN: (print) 9798350344868; 9798350344851
In prior studies, domain adversarial neural networks (DANNs) have been used to align image-level features regardless of foreground and background. However, the conventional discriminator in DANNs may lead the feature extractor to disregard cross-domain features rather than align them, which degrades classifier performance. We propose a novel loss reweighting technique that mitigates the optimization conflict between the discriminator and the classifier. The classification loss is reweighted according to the prediction uncertainty, which is measured by two different bottleneck layers. This reweighting guides the model in determining which features should be activated or aligned, resulting in significantly improved adaptation performance. Additionally, to support our approach, we introduce a novel method for constructing the bottleneck layers, based on target-domain pseudo-labels and differentiable architecture search. Our method is rigorously evaluated on multiple benchmark datasets and outperforms state-of-the-art (SOTA) methods.
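The abstract does not specify how the uncertainty is computed or how it enters the loss, so the following is only one plausible reading, sketched under stated assumptions: two prediction heads (standing in for the two bottleneck layers) produce class probabilities, their total-variation distance serves as the uncertainty, and the classification loss is scaled by 1 + uncertainty. All names, the choice of distance, and the direction of the weighting are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def uncertainty_weight(p, q):
    """Assumed weight: 1 + total-variation distance between two predictions.

    The distance lies in [0, 1], so the weight lies in [1, 2].
    """
    tv = 0.5 * sum(abs(a - b) for a, b in zip(p, q))
    return 1.0 + tv


def reweighted_cross_entropy(logits_a, logits_b, label):
    """Average cross-entropy of both heads, scaled by their disagreement."""
    p, q = softmax(logits_a), softmax(logits_b)
    ce = -0.5 * (math.log(p[label]) + math.log(q[label]))
    return uncertainty_weight(p, q) * ce
```

When both heads agree, the weight is 1 and the loss reduces to ordinary cross-entropy; samples on which the heads disagree contribute more.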
Objective: Computer methods related to the diagnosis of COVID-19 disease have progressed significantly in recent years. Chest X-ray analysis supported by artificial intelligence is one of the most important parts of the diagnosis. Unfortunately, there is no digital tool dedicated to post-acute pulmonary changes related to COVID-19, and modern diagnostic tools are needed. Methods: This paper introduces a novel neural network architecture for chest X-ray analysis, which consists of two parts. The first is an Inception architecture that captures global features, and the second is a combination of Inception modules and a Vision Transformer network that analyzes local features. Because several diseases can occur together in X-ray images, a loss function designed for multilabel classification, the asymmetric loss function (ASL), was applied and modified for our purpose. In contrast to other works, we focus only on the subgroup of 9 diseases from the ChestX-ray14 dataset that can appear as a consequence of COVID-19. Results: This work demonstrates the effectiveness of the proposed neural network architecture combined with the asymmetric loss function on post-COVID-related diseases. The results were compared with several well-known classification architectures, such as VGG19, DenseNet121, EfficientNetB4, InceptionV3, and ResNet101. According to the results, the proposed method outperforms these models, with an AUC of 0.819, an accuracy of 0.736, a sensitivity of 0.7683, and a specificity of 0.7221. Significance: Our work is the first to focus on the deep-learning-based diagnosis of post-COVID-19-related pulmonary diseases from X-ray images. The proposed neural network reaches better accuracy than existing well-known architectures.
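For readers unfamiliar with the asymmetric loss, the sketch below implements its commonly published form for multilabel classification: positives use a focal term with exponent gamma_pos, while negatives use a larger exponent gamma_neg and a probability margin that discards very easy negatives. This is the standard form, not the authors' modified variant, which the abstract does not describe; the default values are illustrative assumptions.

```python
import math


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Sum of per-label asymmetric losses for one multilabel sample.

    probs   : predicted probability for each label (after a sigmoid)
    targets : 1 if the label is present, 0 otherwise
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive label: focal-style down-weighting of easy positives.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative label: shift the probability by the margin so that
            # negatives with p <= margin contribute exactly zero loss.
            p_m = max(p - margin, 0.0)
            total += -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))
    return total
```

Setting gamma_pos, gamma_neg, and margin to zero recovers ordinary binary cross-entropy, which is a convenient sanity check.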
Recent years have witnessed the success of deep networks in compressed sensing (CS), which allows for a significant reduction in sampling cost and has gained growing attention since its inception. In this paper, we propose a new practical and compact network, dubbed PCNet, for general image CS. Specifically, PCNet uses a novel collaborative sampling operator that consists of a deep conditional filtering step and a dual-branch fast sampling step. The former encodes an implicit representation of a linear transformation matrix in a few convolutions and first performs adaptive local filtering on the input image, while the latter then uses a discrete cosine transform and a scrambled block-diagonal Gaussian matrix to generate under-sampled measurements. For reconstruction, PCNet is equipped with an unrolled network based on an enhanced proximal gradient descent algorithm. Once trained, it offers flexibility, interpretability, and strong recovery performance for arbitrary sampling rates. Additionally, we provide a deployment-oriented extraction scheme for single-pixel CS imaging systems, which allows any linear sampling operator to be converted conveniently to its matrix form and loaded onto hardware such as digital micro-mirror devices. Extensive experiments on natural image CS, quantized CS, and self-supervised CS demonstrate the superior reconstruction accuracy and generalization ability of PCNet compared with existing state-of-the-art methods, particularly for high-resolution images.
Recently, single image super-resolution based on convolutional neural networks (CNNs) has achieved considerable improvements over traditional methods. However, it is still challenging for most CNN-based methods to obtain satisfactory reconstruction quality for large scale factors. To address this issue, we propose a progressive residual multi-dilated aggregation network (PRMAN), which performs multi-level ×2 upsampling to reconstruct images with large scale factors. Specifically, we design a residual multi-dilated aggregation block to simplify the model and supply enriched features with different receptive fields. In addition, a channel attention mechanism is adopted to select informative features. Furthermore, to speed up convergence and attain better performance, we train the model with a two-stage training strategy. Extensive experimental results show that the proposed PRMAN exceeds state-of-the-art methods in most cases.
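The idea of reaching a large scale factor through repeated ×2 stages can be sketched as follows. A scale factor of 8, for example, is reached in three stages. Nearest-neighbour duplication stands in for the learned upsampling layers, which the abstract does not detail; the function names are hypothetical.

```python
import math


def num_x2_stages(scale):
    """Number of x2 stages needed for a power-of-two scale factor."""
    stages = int(round(math.log2(scale))) if scale > 0 else 0
    if scale < 2 or 2 ** stages != scale:
        raise ValueError("scale must be a power of two and at least 2")
    return stages


def upsample_x2_nearest(image):
    """Double height and width by pixel duplication (stand-in for a learned layer)."""
    out = []
    for row in image:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def progressive_upsample(image, scale):
    """Apply x2 upsampling repeatedly until the target scale is reached."""
    for _ in range(num_x2_stages(scale)):
        image = upsample_x2_nearest(image)
    return image
```

In a progressive network each stage would also refine the features before the next doubling, so intermediate outputs at ×2 and ×4 are available along the way.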
Low-latency configurable speech transmission presents significant challenges in modern communication systems. Traditional methods rely on separate source and channel coding, which often degrades performance under low-latency constraints. Moreover, non-configurable systems require separate training for each condition, limiting their adaptability in resource-constrained scenarios. This paper proposes a configurable low-latency deep joint source-channel coding (JSCC) system for speech transmission. The system can be configured for varying signal-to-noise ratios (SNRs), wireless channel conditions, or bandwidths. A joint source-channel encoder based on deep neural networks (DNNs) compresses and transmits analog-coded information, while a configurable decoder reconstructs speech from noisy compressed signals. The system latency adapts to the input speech length and reaches a minimum of 2 ms, with a lightweight architecture of 25 k parameters, significantly fewer than state-of-the-art systems. Simulation results demonstrate that the proposed system outperforms conventional separate source-channel coding systems in speech quality and intelligibility, particularly under low-latency and noisy channel conditions. It is also robust in fixed-configuration scenarios, although higher-latency conditions and better channel environments favor traditional coding systems.
Magnetic resonance imaging (MRI) is essential for high-resolution soft-tissue imaging but suffers from long acquisition times, limiting its clinical efficiency. Accelerating MRI by undersampling k-space data leads to ill-posed inverse problems, introducing noise and artifacts that degrade image quality. Conventional deep learning models, including conditional and unconditional approaches, often face challenges in generalization, particularly with variations in imaging operators or domain shifts. In this study, we propose PINN-DADif, a Physics-Informed Neural Network integrated with deep adaptive diffusion priors, to address these challenges in MRI reconstruction. PINN-DADif employs a two-phase inference strategy: an initial rapid-diffusion phase for fast preliminary reconstructions, followed by an adaptive phase in which the diffusion prior is refined to ensure consistency with MRI physics and data fidelity. The inclusion of physics-based regularization through PINNs enhances the model's adherence to k-space constraints and gradient smoothness, leading to more accurate reconstructions. This adaptive approach reduces the number of iterations required compared with traditional diffusion models, improving both speed and image quality. We validated PINN-DADif on a private MRI dataset and the public fastMRI dataset, where it outperformed state-of-the-art methods. On the private dataset at R = 4x, the model achieved PSNR values of 41.2, 39.5, and 41.5 and SSIM values of 98.7, 98.0, and 98.5 for T1-, T2-, and proton-density-weighted images, respectively. Similarly high performance was observed on the fastMRI dataset, even in scenarios involving domain shifts. PINN-DADif marks a significant advancement in MRI reconstruction by providing an efficient, adaptive, and physics-informed solution.
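The notions of k-space undersampling and data fidelity mentioned in the abstract can be illustrated with a minimal single-coil NumPy sketch. It is not PINN-DADif: the uniform row mask, the function names, and the hard replacement of measured k-space samples are simplifying assumptions chosen to show the basic operations at R = 4.

```python
import numpy as np


def undersample_mask(n_rows, n_cols, accel=4):
    """Boolean mask that keeps every accel-th k-space row (uniform pattern)."""
    mask = np.zeros((n_rows, n_cols), dtype=bool)
    mask[::accel, :] = True
    return mask


def data_consistency(estimate, measured_kspace, mask):
    """Overwrite the estimate's k-space with measured samples where available."""
    k_est = np.fft.fft2(estimate)
    k_dc = np.where(mask, measured_kspace, k_est)
    return np.fft.ifft2(k_dc)


# Toy example: simulate a 4x-accelerated acquisition of a synthetic image.
image = np.arange(256, dtype=float).reshape(16, 16)
mask = undersample_mask(16, 16, accel=4)
measured = np.fft.fft2(image) * mask          # unmeasured rows are zero
zero_filled = np.fft.ifft2(measured)          # aliased baseline reconstruction
refined = data_consistency(zero_filled, measured, mask)
```

In an iterative reconstruction, `estimate` would be the output of a learned prior at each step, and the data-consistency projection keeps it faithful to the acquired samples.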