Search Results - Inner Mongolia University Library

Multi-Layer Deep Sparse Representation for Biological Slice Image Inpainting

Computers, Materials & Continua, 2023, Vol. 76, No. 9, pp. 3813-3832

Authors: Haitao Hu; Hongmei Ma; Shuli Mei. Affiliations: College of Information and Electrical Engineering, China Agricultural University, Beijing 100000, China; Yantai Research Institute, China Agricultural University, Yantai 264670, China

Biological slices are an effective tool for studying the physiological structure and evolution mechanism of biological *** , due to the complexity of preparation technology and the presence of many uncontrollable factors during the preparation processing, leads to problems such as difficulty in preparing slice images and breakage of slice *** , we proposed a biological slice image small-scale corruption inpainting algorithm with interpretability based on multi-layer deep sparse representation, achieving the high-fidelity reconstruction of slice *** further discussed the relationship between deep convolutional neural networks and sparse representation, ensuring the high-fidelity characteristic of the algorithm first. A novel deep wavelet dictionary is proposed that can better obtain image prior and possess learnable *** multi-layer deep sparse representation is used to implement dictionary learning, acquiring better signal *** with methods such as NLABH, Shearlet, Partial Differential Equation (PDE), K-Singular Value Decomposition (K-SVD), Convolutional Sparse Coding, and Deep Image Prior, the proposed algorithm has better subjective reconstruction and objective evaluation with small-scale image data, which realized high-fidelity inpainting, under the condition of small-scale image *** the O(n^2)-level time complexity makes the proposed algorithm *** proposed algorithm can be effectively extended to other cross-sectional image inpainting problems, such as magnetic resonance images and computed tomography images.

Keywords: Deep sparse representation; image inpainting; convolutional sparse modelling; deep neural network

Detecting audio copy-move forgery with an artificial neural network

Signal, Image and Video Processing, 2024, Vol. 18, No. 3, pp. 2117-2133

Authors: Akdeniz, Fulya; Becerikli, Yasar. Affiliation: Kocaeli Univ, Dept Comp Engn, Kocaeli, Turkiye

Given how easily audio data can be obtained, audio recordings are subject to both malicious and non-malicious tampering and manipulation that can compromise the integrity and reliability of audio data. Because audio recordings can be used in many strategic areas, detecting such tampering and manipulation of audio data is critical. Although the literature demonstrates the lack of any accurate, integrated system for detecting copy-move forgery, the field shows great promise for research. Thus, our proposed method seeks to support the detection of the passive technique of audio copy-move forgery. For our study, forgery audio data were obtained from the TIMIT dataset, and 4378 audio recordings were used: 2189 of original audio and 2189 of audio created by copy-move forgery. After the voiced and unvoiced regions in the audio signal were determined by the Yet Another Algorithm for Pitch Tracking (YAAPT), the features were obtained from the signals using Mel frequency cepstrum coefficients (MFCCs), delta MFCCs, and delta-delta MFCCs together, along with linear prediction coefficients (LPCs). In turn, those features were classified using artificial neural networks. Our experimental results demonstrate that the best results were 75.34% detection with the MFCC method, 73.97% detection with the delta MFCC method, 72.37% detection with the delta-delta MFCC method, 76.48% detection with the MFCC + delta MFCC + delta-delta MFCC method, and 74.77% detection with the LPC method. Using the MFCC + delta MFCC + delta-delta MFCC method, in which the features are used together, we determined that the models give far superior results even with relatively few epochs. The proposed method is also more robust than other methods in the literature because it does not use threshold values.

Keywords: Digital multimedia forensics; Audio forensics; Audio tampering; Audio copy-move forgery; Artificial neural network; Digital multimedia security
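
The abstract above combines MFCCs with their delta and delta-delta coefficients. As a rough illustration only, the sketch below computes delta features with the commonly used regression formula and stacks them with the static features. The function names `delta` and `stack_features`, the window `width=2`, and the edge handling (repeating the first and last frame) are assumptions made for this example; the abstract does not state how the authors computed their deltas, and the MFCC extraction itself is not shown.

```python
def delta(frames, width=2):
    """First-order delta of a sequence of feature vectors.

    Uses the regression formula
        d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum n^2),
    repeating the first/last frame at the sequence edges (an assumption).
    """
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[0])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]   # clamp at the last frame
                behind = frames[max(t - n, 0)][k]     # clamp at the first frame
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(frames, width=2):
    """Concatenate static, delta, and delta-delta features frame by frame."""
    d1 = delta(frames, width)
    d2 = delta(d1, width)
    return [static + a + b for static, a, b in zip(frames, d1, d2)]


if __name__ == "__main__":
    # Hypothetical one-coefficient "MFCC" track that rises linearly.
    track = [[0.0], [1.0], [2.0], [3.0], [4.0]]
    for frame in stack_features(track):
        print(frame)
```

Each stacked frame is three times as long as the static frame, which is the kind of combined vector that would then be passed to a classifier.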

An attention enriched encoder-decoder architecture with CLSTM and RES unit for segmenting exudate in retinal images

Signal, Image and Video Processing, 2024, Vol. 18, No. 4, pp. 3329-3339

Authors: Maiti, Souvik; Maji, Debasis; Dhara, Ashis Kumar; Sarkar, Gautam. Affiliations: Univ Engn & Management, Inst Engn & Management, Newtown, Kolkata 700160, India; Haldia Inst Technol, Dept Elect Engn, Haldia 721657, West Bengal, India; Natl Inst Technol, Dept Elect Engn, Durgapur 713209, West Bengal, India; Jadavpur Univ, Dept Elect Engn, Kolkata 700032, India

Diabetic retinopathy, an eye complication that causes retinal damage, can impair vision and even result in blindness if not treated in time. Regular eye screening is essential for patients with diabetes because diabetic retinopathy advances significantly without symptoms. Exudates are a primary symptom of diabetic retinopathy, and their automatic recognition can help in early diagnosis. The convolution operation concentrates mostly on extracting local features and places less emphasis on global information, leaving long-range dependencies to be addressed only while traversing multiple layers. The proposed segmentation model uses both channel and spatial attention mechanisms to establish long-range dependencies at various levels of feature extraction. It also uses convolutional long short-term memory (CLSTM) in the input-to-state and state-to-state transitions to account for spatiotemporal dependencies, and a residual extended skip (RES) block to widen the network's receptive field. Exploiting the potential of neural networks, the model identifies complex patterns and minute features in retinal images. The effectiveness of the proposed method has been verified by experiments on several retinal image datasets (IDRiD, MESSIDOR, DIARETDB0, and DIARETDB1), which indicate its superiority over existing methods across a wide range of evaluation metrics, namely specificity, F1-score, accuracy, sensitivity, and intersection-over-union. Additionally, the model's overall accuracy of 97.7% makes it a viable application that can give clinicians important insights into the diagnosis and treatment of diabetic retinopathy.

Keywords: Fundus image; Diabetic retinopathy; Exudate; Residual extended skip

On the Value of Terrain Classification Using SAR Altimeter Delay Doppler Image

2nd IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024

Authors: Liu, Lin; Wei, Yu; Hu, Fengming; Xu, Feng. Affiliation: Shanghai 200433, China

ISBN: (print) 9798331515669

Synthetic Aperture Radar (SAR) altimeter can provide highly accurate terrain data. In complex environments such as mountainous regions, terrain classification can improve data accuracy and reliability. However, classical methods rely on altimetry sequences, which may result in inaccuracies in environments with interference or when there are deviations in the prior Digital Elevation Model (DEM). This paper proposes a terrain classification method using SAR altimeter Delay Doppler Image (DDI). First, we introduce the SAR altimeter echo model, which can be used to generate DDIs. Then, typical terrain factors are selected for initial classification, and a Principal Component Analysis (PCA)-based model is constructed to select the main features, which are then used to choose samples. Finally, a Convolutional Neural Network (CNN) model is trained to achieve classification of DDIs. This process enables precise terrain classification. Experiments using simulation data verify that the proposed method can effectively classify DDIs. This approach has the potential for improving terrain assessment in various applications. © 2024 IEEE.

Keywords: Convolutional neural networks

Are Standard CNNs Good Enough for No-Reference Stereoscopic Image Quality Assessment?

15th International Conference on Signal Processing and Communications (SPCOM)

Authors: Bardhan, Ishita; Channappayya, Sumohana; Banerjee, Abhik; Forkan, Abdur Rahim Mohammad; Jayaraman, Prem; Kumar, Abhinav. Affiliations: Indian Inst Technol Hyderabad, Dept Artificial Intelligence, Kandi, Telangana, India; Swinburne Univ Technol, Dept Comp Technol, Melbourne, Vic, Australia

ISBN: (print) 9798350350463; 9798350350456

Perceptual quality metrics derived from deep features have led to a boost in modelling the Human Visual System (HVS) to perceive the quality of visual content. In this work, we study the effectiveness of fine-tuning three standard convolutional neural networks (CNNs), viz. ResNet50, VGG16 and MobileNetV2, to predict the quality of stereoscopic images in the no-reference setting. This work also aims to understand the impact of using disparity maps for quality prediction. Interestingly, our experiments demonstrate that disparity maps do not significantly contribute to improving perceptual quality estimation in the deep learning framework. To the best of our knowledge, this is the first study that explores the impact of disparity along with the chosen models for Stereoscopic Image Quality Assessment. We present a detailed study of our experiments with various architectural configurations on the LIVE Phase I and II datasets. Further, our results demonstrate the innate capability of deep features for quality prediction. Finally, the simple fine-tuning of the models results in solutions that compete with state-of-the-art patch-based stereoscopic image quality assessment methods.

Keywords: Stereocenters

Variant-Depth Neural Networks for Deblurring Traffic Images in Intelligent Transportation Systems

IEEE Transactions on Intelligent Transportation Systems, 2023, Vol. 24, No. 6, pp. 5792-5802

Authors: Wang, Qian; Guo, Cai; Dai, Hong-Ning; Xia, Min. Affiliations: Macau Univ Sci & Technol, Fac Innovat Engn, Taipa, Macau, Peoples R China; Hanshan Normal Univ, Sch Comp & Informat Engn, Chaozhou 521000, Peoples R China; Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Peoples R China; Univ Lancaster, Dept Engn, Lancaster LA1 4YW, England

Intelligent transportation systems (ITS) with surveillance cameras capture traffic images or videos. However, images or videos in ITS are often blurred for various reasons. Although recent technologies have made progress in image deblurring, resource limitations still pose two challenges to applying image-deblurring models in practical transportation systems: the model size and the running time. This work proposes an artful variant-depth network (VDN) to address these challenges. We design variant-depth sub-networks in a coarse-to-fine manner to improve the deblurring effect. We also adopt a new connection, namely the stack connection, to connect all sub-networks and reduce the running time and model size while maintaining high deblurring quality. We evaluate the proposed VDN against state-of-the-art (SOTA) methods on several typical datasets. Results on Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) show that the VDN outperforms SOTA image-deblurring methods. Furthermore, the VDN also has the shortest running time and the smallest model size.

Keywords: Cameras; Image restoration; Videos; Surveillance; Kernel; Image recognition; Traffic control; Intelligent transportation systems (ITS); traffic image processing; image deblurring; variant-depth neural networks
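
The abstract above reports results in terms of Peak Signal-to-Noise Ratio (PSNR). For readers unfamiliar with the metric, here is a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE), for grayscale images stored as nested lists. The function names, the 8-bit default `max_value=255.0`, and the toy pixel values are assumptions for illustration; the paper's own evaluation code is not part of this record.

```python
import math


def mean_squared_error(reference, restored):
    """Mean squared error between two equally sized 2-D grayscale images."""
    if len(reference) != len(restored) or any(
        len(r) != len(s) for r, s in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")
    total = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        for ref_px, res_px in zip(ref_row, res_row):
            total += (ref_px - res_px) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    return total / count


def psnr(reference, restored, max_value=255.0):
    """PSNR in decibels; returns infinity when the images are identical."""
    err = mean_squared_error(reference, restored)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


if __name__ == "__main__":
    sharp = [[100, 100], [100, 100]]      # hypothetical ground-truth patch
    deblurred = [[110, 90], [100, 100]]   # hypothetical model output
    print(f"PSNR: {psnr(sharp, deblurred):.2f} dB")
```

A higher PSNR means the restored image is closer to the reference; SSIM, the other metric named in the abstract, is not covered by this sketch.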

Mammo-Light: A lightweight convolutional neural network for diagnosing breast cancer from mammography images

Biomedical Signal Processing and Control, 2024, Vol. 94

Authors: Raiaan, Mohaimenul Azam Khan; Fahad, Nur Mohammad; Mukta, Md Saddam Hossain; Shatabda, Swakkhar. Affiliations: United Int Univ, Dept Comp Sci & Engn, Plot 2, United City, Madani Ave, Dhaka 1212, Bangladesh; Lappeenranta Lahti Univ Technol LUT, Sch Engn Sci, Lappeenranta 53850, Finland

People of all countries, developed and developing alike, endure cancer-related fatal diseases. The rate of breast cancer in females is increasing daily, partly due to ignorance and misdiagnosis in the early stages. Accurate diagnosis of breast cancer during its earlier stages of development can lead to proper initial treatment. Artificial intelligence can aid in the acceleration and automation of breast cancer detection. Deep learning is decisive in effectively recognizing and classifying cancer on large datasets of medical images. In this paper, we propose a novel computer-aided classification approach, Mammo-Light, for breast cancer prediction. Preprocessing strategies have been used to remove noise and enhance mammogram lesions. Photometric augmentation techniques were applied to the preprocessed classes to balance and increase the size of the dataset. After that, a lightweight yet intuitive convolutional neural network is applied to classify breast cancer on the publicly available dataset CBIS-DDSM. For further validation of the proposed approach, we have used the MIAS dataset. Mammo-Light attained test accuracies of 99.17% and 98.42% on the CBIS-DDSM and MIAS datasets, respectively, and outperformed state-of-the-art methods in terms of accuracy and other metrics. Because it is a lightweight model, Mammo-Light performs exceptionally well with fewer parameters and less computational time, which can potentially contribute to early diagnosis of breast cancer and enable fast treatment.

Keywords: Mammograms; Image preprocessing; Breast cancer classification; Data augmentation; Lightweight CNN; Transfer learning models

On Performance and Calibration of Natural Gradient Langevin Dynamics

IEEE Access, 2023, Vol. 11, pp. 53919-53931

Authors: Robbani, Hanif Amal; Bustamam, Alhadi; Adnan, Risman; Ahmad, Shandar. Affiliations: Univ Indonesia, Dept Math, Depok 16424, Indonesia; Kalbe Digital Lab, Jakarta 10510, Indonesia; Jawaharlal Nehru Univ, Sch Computat & Integrat Sci, Delhi 110067, Delhi, India

Producing deep neural network (DNN) models with calibrated confidence is essential for applications in many fields, such as medical image analysis, natural language processing, and robotics. Modern neural networks have been reported to be poorly calibrated compared with those from a decade ago. The stochastic gradient Langevin dynamics (SGLD) algorithm offers a tractable approximate Bayesian inference applicable to DNN, providing a principled method for learning the uncertainty. A recent benchmark study showed that SGLD could produce a more robust model to covariate shifts than other competing methods. However, vanilla SGLD is also known to be slow, and preconditioning can improve SGLD efficacy. This paper proposes eigenvalue-corrected Kronecker factorization (EKFAC) preconditioned SGLD (EKSGLD), in which a novel second-order gradient approximation is employed as a preconditioner for the SGLD algorithm. This approach is expected to bring together the advantages of both second-order optimization and the approximate Bayesian method. Experiments were conducted to compare the performance of EKSGLD with existing preconditioning methods and showed that it could achieve higher predictive accuracy and better calibration on the validation set. EKSGLD improved the best accuracy by 3.06% on CIFAR-10 and 4.15% on MNIST, improved the best negative log-likelihood by 16.2% on CIFAR-10 and 11.4% on MNIST, and improved the best thresholded adaptive calibration error by 4.05% on CIFAR-10.

Keywords: Calibration; Benchmark testing; Bayes methods; Uncertainty; Deep learning; Approximation algorithms; Optimization; Predictive methods; Natural gradient; second-order optimization; Bayesian deep learning; Langevin dynamics; confidence calibration; predictive uncertainty
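
The abstract above refers to the SGLD update and to preconditioning it, without giving formulas. For orientation, the standard SGLD step is written out below in its usual textbook form, followed by a generic preconditioned variant. The symbols (step size $\epsilon_t$, dataset size $N$, minibatch size $n$, preconditioner $G$) are notation chosen for this note, and the second equation assumes a constant preconditioner $G$; how EKFAC constructs its preconditioner is not stated in the abstract and is not reproduced here.

```latex
% Vanilla SGLD: a stochastic-gradient step on the log-posterior plus Gaussian noise
\theta_{t+1} = \theta_t
  + \frac{\epsilon_t}{2}\Big( \nabla \log p(\theta_t)
  + \frac{N}{n} \sum_{i=1}^{n} \nabla \log p(x_{t_i} \mid \theta_t) \Big)
  + \eta_t,
\qquad \eta_t \sim \mathcal{N}(0,\, \epsilon_t I)

% Preconditioned SGLD with a constant positive-definite matrix G (assumption):
% the same G scales the drift and shapes the noise covariance
\theta_{t+1} = \theta_t
  + \frac{\epsilon_t}{2}\, G \Big( \nabla \log p(\theta_t)
  + \frac{N}{n} \sum_{i=1}^{n} \nabla \log p(x_{t_i} \mid \theta_t) \Big)
  + \eta_t,
\qquad \eta_t \sim \mathcal{N}(0,\, \epsilon_t G)
```

When the preconditioner depends on $\theta$, an additional correction term involving the derivatives of $G$ appears in the drift; it is omitted here for brevity.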

Image defogging based on multi-input and multi-scale UNet

Signal, Image and Video Processing, 2023, Vol. 17, No. 4, pp. 1143-1151

Authors: Lin, Zhengchun; Luo, Qingxing; Jiang, Yunzhi; Wang, Jing; Li, Siyuan; Cheng, Gongwen; Genrang, Zheng. Affiliations: Guangdong Polytech Normal Univ, Guangzhou 510000, Peoples R China; Guangdong Prov Key Lab Intellectual Property & Bi, Guangzhou 510000, Peoples R China; Shenzhen Chace Network Informat Technol Co Ltd, Shenzhen 518000, Peoples R China; Zhongshan Polytech, Zhongshan 528400, Peoples R China

The coarse-to-fine image defogging strategy has been widely used in the structural design of individual image defogging networks. In the traditional method, multi-scale input image subnets are superimposed, so that the sharpness of the image is gradually improved from the bottom subnet to the top subnet, which inevitably leads to the loss of image details. Toward a fast and accurate dehazing network design, we revisit the coarse-to-fine strategy and present a multi-input and multi-scale U-Net (MIMS-UNet). The MIMS-UNet has two distinct features. On the one hand, the single-encoder of MIMS-UNet adopts multi-input and multi-scale image, which increases the computation amount but greatly improves the network performance. On the other hand, codec structures with context blocks are used to capture context information and recover more details. The experimental results show that the proposed method achieves good results in both quantification and visualization. Compared with the existing methods, the proposed network can achieve ideal results of defogging and effectively avoid color distortion after defogging.

Keywords: Image dehazing; Convolutional neural network; Feature fusion; Image restoration; Image processing

A NOVEL MEDICAL IMAGE FUSION FRAMEWORK INTEGRATING MULTI-SCALE ENCODER-DECODER WITH DISCRETE WAVELET DECOMPOSITION

49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Liu, Renhe; Liu, Yu; Wang, Han; Hu, Kai; Du, Shan. Affiliations: Tianjin Univ, Sch Microelect, Tianjin, Peoples R China; Univ British Columbia, Dept Comp Sci Math Phys & Stat, Okanagan Campus, Vancouver, BC, Canada

ISBN: (print) 9798350344868; 9798350344851

In recent years, many fusion algorithms based on multi-scale transform or neural networks have been proposed to improve medical image fusion (MIF) performance. However, there is still enormous potential to explore the combination of different fusion theories. In this paper, we propose a novel MIF framework to integrate powerful feature representation abilities of the deep learning model and accurate frequency decomposition characteristics of discrete wavelet transform (DWT). Firstly, a multi-scale encoder-decoder network is well-trained to extract feature information in different scales and achieve efficient image reconstruction. In particular, DWT is introduced into each scale to decompose the extracted features into high- and low-frequency sub-bands for information preservation during down-sampling. An elaborate feature fusion process is designed to achieve multi-scale fusion while merging different frequency sub-bands. Experiment results on benchmark datasets demonstrate that the proposed fusion framework outperforms current state-of-the-art methods with comparable time complexity in both objective and subjective evaluation.

Keywords: medical image fusion; discrete wavelet transform; multi-scale encoder-decoder; high- and low-frequency sub-bands; feature fusion
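
The abstract above uses the discrete wavelet transform (DWT) to split features into low- and high-frequency sub-bands so that information survives down-sampling. The sketch below illustrates that idea on a 1-D signal with a single-level Haar transform and its exact inverse. The Haar wavelet, the 1-D setting, and the function names `haar_dwt` and `haar_idwt` are assumptions chosen to keep the example short; the abstract does not say which wavelet the authors use, and their framework operates on multi-scale feature maps rather than on a plain list.

```python
import math


def haar_dwt(signal):
    """Single-level Haar DWT: returns (low, high) sub-bands at half length."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return low, high


def haar_idwt(low, high):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    scale = math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.append((a + d) / scale)
        out.append((a - d) / scale)
    return out


if __name__ == "__main__":
    samples = [4.0, 6.0, 10.0, 12.0]   # hypothetical feature row
    low, high = haar_dwt(samples)
    print("low :", low)                # smooth, half-resolution content
    print("high:", high)               # detail removed by the averaging
    print("back:", haar_idwt(low, high))
```

Because the low and high bands together allow exact reconstruction, halving the resolution this way discards nothing, which is the property the abstract relies on for information preservation.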
