Biological slices are an effective tool for studying the physiological structure and evolution mechanism of biological ***, due to the complexity of the preparation technology and the presence of many uncontrollable factors during preparation, which leads to problems such as difficulty in preparing slice images and breakage of slice ***, we proposed an interpretable algorithm for inpainting small-scale corruption in biological slice images, based on multi-layer deep sparse representation, achieving high-fidelity reconstruction of slice *** further discussed the relationship between deep convolutional neural networks and sparse representation, first ensuring the high-fidelity characteristic of the algorithm. A novel deep wavelet dictionary is proposed that can better capture the image prior and possesses learnable *** multi-layer deep sparse representation is used to implement dictionary learning, acquiring better signal *** with methods such as NLABH, Shearlet, Partial Differential Equation (PDE), K-Singular Value Decomposition (K-SVD), Convolutional Sparse Coding, and Deep Image Prior, the proposed algorithm gives better subjective reconstruction and objective evaluation results, realizing high-fidelity inpainting under the condition of small-scale image *** the O(n^2)-level time complexity makes the proposed algorithm *** proposed algorithm can be effectively extended to other cross-sectional image inpainting problems, such as magnetic resonance images and computed tomography images.
Given how easily audio data can be obtained, audio recordings are subject to both malicious and non-malicious tampering and manipulation that can compromise the integrity and reliability of audio data. Because audio recordings can be used in many strategic areas, detecting such tampering and manipulation of audio data is critical. Although the literature demonstrates the lack of any accurate, integrated system for detecting copy-move forgery, the field shows great promise for research. Thus, our proposed method seeks to support the passive detection of audio copy-move forgery. For our study, forged audio data were generated from the TIMIT dataset, and 4378 audio recordings were used: 2189 of original audio and 2189 of audio created by copy-move forgery. After the voiced and unvoiced regions in the audio signal were determined by the Yet Another Algorithm for Pitch Tracking (YAAPT), features were obtained from the signals using Mel frequency cepstrum coefficients (MFCCs), delta (Delta) MFCCs, and Delta Delta MFCCs together, along with linear prediction coefficients (LPCs). In turn, those features were classified using artificial neural networks. Our experimental results demonstrate that the best results were 75.34% detection with the MFCC method, 73.97% detection with the Delta MFCC method, 72.37% detection with the Delta Delta MFCC method, 76.48% detection with the MFCC + Delta MFCC + Delta Delta MFCC method, and 74.77% detection with the LPC method. Using the MFCC + Delta MFCC + Delta Delta MFCC method, in which the features are used together, we determined that the models give far superior results even with relatively few epochs. The proposed method is also more robust than other methods in the literature because it does not use threshold values.
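To make the combined MFCC + Delta MFCC + Delta Delta MFCC feature concrete, the following is a minimal NumPy sketch of how first- and second-order deltas are commonly derived from an MFCC matrix and stacked. It is not the authors' code: the regression-style delta formula, the window `width=2`, the edge padding, and the function names `delta` and `stack_mfcc_deltas` are all assumptions made for illustration, and the MFCC matrix is assumed to have been computed already.

```python
import numpy as np


def delta(features, width=2):
    """Regression-style delta along the time axis.

    features: array of shape (frames, coefficients).
    width: number of frames on each side used in the regression (assumed value).
    """
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    # Repeat the first and last frames so every frame has neighbours.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + n_frames]
        behind = padded[width - n : width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom


def stack_mfcc_deltas(mfcc, width=2):
    """Concatenate MFCC, Delta MFCC and Delta Delta MFCC per frame."""
    d1 = delta(mfcc, width)
    d2 = delta(d1, width)
    return np.hstack([np.asarray(mfcc, dtype=float), d1, d2])
```

For an MFCC matrix with 13 coefficients per frame, `stack_mfcc_deltas` would return 39 values per frame, which could then be passed to a classifier such as the neural network mentioned in the abstract.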
Diabetic retinopathy, an eye complication that causes retinal damage, can impair vision and even result in blindness if not treated in time. Regular eye screening is essential for patients with diabetes because diabetic retinopathy advances significantly without symptoms. Exudates are a primary symptom of diabetic retinopathy, and their automatic recognition can help in early diagnosis. The convolution operation, which concentrates mostly on extracting local features, places less emphasis on global information, so long-range dependencies can only be captured by traversing multiple layers. The proposed segmentation model utilizes both channel and spatial attention mechanisms to effectively establish long-range dependencies at various levels of feature extraction. The proposed methodology also utilizes the convolutional long short-term memory (ConvLSTM) algorithm in the input-to-state and state-to-state transitions to take into account spatiotemporal dependencies, and a residual extended skip block to widen the network's receptive field. Drawing on the capabilities of neural networks, the model identifies complex patterns and minute features in retinal images. The effectiveness of the proposed method has been verified by experiments on various retinal image datasets, such as IDRiD, MESSIDOR, DIARETDB0, and DIARETDB1, which indicate the superiority of this method over other existing methods across a wide range of evaluation metrics, namely specificity, F1-score, accuracy, sensitivity, and intersection-over-union. Additionally, the model's overall accuracy of 97.7% makes it a viable application that can provide clinicians with important insights into the diagnosis and treatment of diabetic retinopathy.
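The evaluation metrics named above (specificity, F1-score, accuracy, sensitivity, and intersection-over-union) can all be derived from the pixel-wise confusion counts of a binary segmentation mask. The sketch below shows the standard definitions; the function name `segmentation_metrics` and the convention of returning 0.0 when a denominator is empty are assumptions for illustration, not details taken from the paper.

```python
import numpy as np


def segmentation_metrics(pred, target):
    """Pixel-wise metrics for binary masks (nonzero = exudate pixel)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    tp = int(np.sum(pred & target))    # exudate predicted, exudate present
    fp = int(np.sum(pred & ~target))   # exudate predicted, background present
    fn = int(np.sum(~pred & target))   # background predicted, exudate present
    tn = int(np.sum(~pred & ~target))  # background predicted, background present

    def ratio(num, den):
        # Assumed convention: an undefined ratio is reported as 0.0.
        return num / den if den > 0 else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }
```

Because exudates usually occupy a small fraction of a fundus image, accuracy and specificity tend to be dominated by background pixels, which is one reason overlap measures such as F1 and intersection-over-union are reported alongside them.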
Synthetic Aperture Radar (SAR) altimeter can provide highly accurate terrain data. In complex environments such as mountainous regions, terrain classification can improve data accuracy and reliability. However, classi...
ISBN: (print) 9798350350463; 9798350350456
Perceptual quality metrics derived from deep features have led to a boost in modelling the Human Visual System (HVS) to perceive the quality of visual content. In this work, we study the effectiveness of fine-tuning three standard convolutional neural networks (CNNs) viz. ResNet50, VGG16 and MobileNetV2 to predict the quality of stereoscopic images in the no-reference setting. This work also aims to understand the impact of using disparity maps for quality prediction. Interestingly, our experiments demonstrate that disparity maps do not significantly contribute to improving perceptual quality estimation in the deep learning framework. To the best of our knowledge, this is the first study that explores the impact of disparity along with the chosen models for Stereoscopic image Quality Assessment. We present a detailed study of our experiments with various architectural configurations on the LIVE Phase I and II datasets. Further, our results demonstrate the innate capability of deep features for quality prediction. Finally, the simple fine-tuning of the models results in solutions that compete with state-of-the-art patch-based stereoscopic image quality assessment methods.
Intelligent transportation systems (ITS) with surveillance cameras capture traffic images or videos. However, images or videos in ITS are often blurred for various reasons. Although recent technologies have made progress in image deblurring, resource limitations still pose two challenges to applying image-deblurring models in practical transportation systems: the model size and the running time. This work proposes a variant-depth network (VDN) to address these challenges. We design variant-depth sub-networks in a coarse-to-fine manner to improve the deblurring effect. We also adopt a new connection, namely the stack connection, to connect all sub-networks and reduce the running time and model size while maintaining high deblurring quality. We evaluate the proposed VDN against state-of-the-art (SOTA) methods on several typical datasets. Results on Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) show that the VDN outperforms SOTA image-deblurring methods. Furthermore, the VDN also has the shortest running time and the smallest model size.
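As a reference for the PSNR figure of merit used above, here is a minimal NumPy sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). The function name `psnr` and the default peak value of 255 (8-bit images) are assumptions for illustration; the abstract does not state the authors' evaluation settings.

```python
import numpy as np


def psnr(reference, restored, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two same-shaped images."""
    # Convert to float first so uint8 subtraction cannot wrap around.
    ref = np.asarray(reference, dtype=np.float64)
    out = np.asarray(restored, dtype=np.float64)
    mse = np.mean((ref - out) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Higher values indicate that the deblurred image is closer to the sharp reference. SSIM, the other metric mentioned, compares local luminance, contrast, and structure and is not covered by this sketch.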
People in all countries, developed and developing alike, suffer from fatal cancer-related diseases. The rate of breast cancer in women is increasing, partly due to lack of awareness and misdiagnosis in the early stages. Accurate diagnosis of breast cancer during its earlier stages of development can lead to proper initial treatment. Artificial intelligence can aid in the acceleration and automation of breast cancer detection. Deep learning plays a decisive role in effectively recognizing and classifying cancer on large datasets of medical images. In this paper, we propose a novel computer-aided classification approach, Mammo-Light, for breast cancer prediction. Preprocessing strategies are used to remove noise and enhance mammogram lesions. Photometric augmentation techniques are applied to the preprocessed classes to balance and increase the size of the dataset. After that, a lightweight yet intuitive convolutional neural network is applied to classify breast cancer on the publicly available dataset CBIS-DDSM. For further validation of the proposed approach, we have used the MIAS dataset. Mammo-Light attained test accuracies of 99.17% and 98.42% on the CBIS-DDSM and MIAS datasets, respectively, and outperformed state-of-the-art methods in terms of accuracy and other metrics. Being a lightweight model, Mammo-Light performs exceptionally well with fewer parameters and less computational time, which can potentially contribute to the early diagnosis of breast cancer and enable fast treatment.
Producing deep neural network (DNN) models with calibrated confidence is essential for applications in many fields, such as medical image analysis, natural language processing, and robotics. Modern neural networks have been reported to be poorly calibrated compared with those from a decade ago. The stochastic gradient Langevin dynamics (SGLD) algorithm offers a tractable approximate Bayesian inference method applicable to DNNs, providing a principled way of learning the uncertainty. A recent benchmark study showed that SGLD could produce a model more robust to covariate shifts than other competing methods. However, vanilla SGLD is also known to be slow, and preconditioning can improve SGLD efficacy. This paper proposes eigenvalue-corrected Kronecker factorization (EKFAC) preconditioned SGLD (EKSGLD), in which a novel second-order gradient approximation is employed as a preconditioner for the SGLD algorithm. This approach is expected to bring together the advantages of both second-order optimization and the approximate Bayesian method. Experiments were conducted to compare the performance of EKSGLD with existing preconditioning methods and showed that it could achieve higher predictive accuracy and better calibration on the validation set. EKSGLD improved the best accuracy by 3.06% on CIFAR-10 and 4.15% on MNIST, improved the best negative log-likelihood by 16.2% on CIFAR-10 and 11.4% on MNIST, and improved the best thresholded adaptive calibration error by 4.05% on CIFAR-10.
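To illustrate what a preconditioned SGLD update looks like, the sketch below applies the generic rule theta <- theta + (step/2) * G * grad log p(theta | data) + N(0, step * G) to a toy one-dimensional standard normal target. This is not EKSGLD: the preconditioner `precond` here is a fixed diagonal vector chosen by hand (an assumption standing in for the EKFAC approximation), the gradient is exact rather than a minibatch estimate, and the curvature correction term that a state-dependent preconditioner requires is omitted because it vanishes when G is constant. The function names are hypothetical.

```python
import numpy as np


def preconditioned_sgld_step(theta, grad_log_post, step, precond, rng):
    """One Langevin update with a fixed diagonal preconditioner (assumed form)."""
    noise = rng.standard_normal(theta.shape)
    drift = 0.5 * step * precond * grad_log_post
    diffusion = np.sqrt(step * precond) * noise
    return theta + drift + diffusion


def sample_standard_normal(n_steps=20000, step=0.1, seed=0):
    """Toy run: the target posterior is N(0, 1), so grad log p(theta) = -theta."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(1)
    precond = np.ones(1)  # identity preconditioner, i.e. vanilla Langevin
    samples = np.empty(n_steps)
    for t in range(n_steps):
        grad = -theta
        theta = preconditioned_sgld_step(theta, grad, step, precond, rng)
        samples[t] = theta[0]
    return samples
```

With a small step size, the sample mean and variance returned by `sample_standard_normal` should be close to 0 and 1. In a DNN setting the gradient would come from a minibatch and the preconditioner from a curvature estimate, which is where an approximation such as EKFAC would enter.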
The coarse-to-fine image defogging strategy has been widely used in the structural design of single-image defogging networks. In the traditional method, subnets with multi-scale input images are stacked, so that the sharpness of the image is gradually improved from the bottom subnet to the top subnet, which inevitably leads to the loss of image details. Toward a fast and accurate dehazing network design, we revisit the coarse-to-fine strategy and present a multi-input and multi-scale U-Net (MIMS-UNet). The MIMS-UNet has two distinct features. On the one hand, the single encoder of MIMS-UNet takes multi-input and multi-scale images, which increases the amount of computation but greatly improves the network performance. On the other hand, encoder-decoder structures with context blocks are used to capture context information and recover more details. The experimental results show that the proposed method achieves good results in both quantitative and visual evaluation. Compared with existing methods, the proposed network achieves good defogging results and effectively avoids color distortion after defogging.
ISBN: (print) 9798350344868; 9798350344851
In recent years, many fusion algorithms based on multi-scale transforms or neural networks have been proposed to improve medical image fusion (MIF) performance. However, there is still enormous potential to explore the combination of different fusion theories. In this paper, we propose a novel MIF framework that integrates the powerful feature representation abilities of deep learning models with the accurate frequency decomposition characteristics of the discrete wavelet transform (DWT). Firstly, a multi-scale encoder-decoder network is trained to extract feature information at different scales and achieve efficient image reconstruction. In particular, DWT is introduced at each scale to decompose the extracted features into high- and low-frequency sub-bands for information preservation during down-sampling. An elaborate feature fusion process is designed to achieve multi-scale fusion while merging different frequency sub-bands. Experimental results on benchmark datasets demonstrate that the proposed fusion framework outperforms current state-of-the-art methods with comparable time complexity in both objective and subjective evaluation.
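The role of the DWT in preserving information during down-sampling can be illustrated with a single-level 2D Haar transform, which halves the spatial resolution while keeping four sub-bands from which the input is exactly recoverable. The sketch below is only an illustration: the choice of the Haar wavelet, the sub-band names `ll`, `lh`, `hl`, `hh`, and the function names are assumptions, since the abstract does not specify which wavelet or implementation the authors use. Both input dimensions are assumed to be even.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar DWT; returns four half-resolution sub-bands."""
    x = np.asarray(x, dtype=float)
    s = np.sqrt(2.0)
    # Filter and down-sample along the vertical axis (pairs of rows).
    lo = (x[0::2, :] + x[1::2, :]) / s
    hi = (x[0::2, :] - x[1::2, :]) / s
    # Filter and down-sample along the horizontal axis (pairs of columns).
    ll = (lo[:, 0::2] + lo[:, 1::2]) / s  # low-frequency approximation
    lh = (lo[:, 0::2] - lo[:, 1::2]) / s  # detail band
    hl = (hi[:, 0::2] + hi[:, 1::2]) / s  # detail band
    hh = (hi[:, 0::2] - hi[:, 1::2]) / s  # detail band
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2; reconstructs the full-resolution array."""
    s = np.sqrt(2.0)
    h, w = ll.shape
    lo = np.empty((h, 2 * w))
    hi = np.empty((h, 2 * w))
    lo[:, 0::2] = (ll + lh) / s
    lo[:, 1::2] = (ll - lh) / s
    hi[:, 0::2] = (hl + hh) / s
    hi[:, 1::2] = (hl - hh) / s
    x = np.empty((2 * h, 2 * w))
    x[0::2, :] = (lo + hi) / s
    x[1::2, :] = (lo - hi) / s
    return x
```

Unlike strided convolution or pooling, this kind of down-sampling discards nothing: the three detail bands carry the high-frequency content that a fusion rule can treat separately from the low-frequency band.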