With the rise of deep learning, numerous methods based on convolutional neural networks have emerged in various fields of image restoration. Neural networks and recent advancements like transformers have performed exc...
Neural surveillance video compression methods have demonstrated significant improvements over traditional video compression techniques. In current surveillance video compression frameworks, the first frame in a Group of Pictures (GOP) is usually compressed fully as an I frame, and the subsequent P frames are compressed by referencing this I frame in Low Delay P (LDP) encoding mode. However, this compression approach overlooks background information, which limits its adaptability to different scenarios. In this paper, we propose a novel Adaptive Surveillance Video Compression framework based on a background hyperprior, dubbed ASVC. The background hyperprior serves as side information to assist coding in both the temporal and spatial domains. Our method consists of two main components. First, the background information of a GOP is extracted, modeled as a hyperprior, and compressed with existing methods. This hyperprior is then used as side information to compress both I frames and P frames. ASVC effectively captures the temporal dependencies in the latent representations of surveillance videos by leveraging the background hyperprior for auxiliary video encoding. Experimental results demonstrate that applying ASVC to both traditional and learning-based methods significantly improves performance.
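The abstract does not specify how the background hyperprior is obtained. As a hedged illustration, the sketch below uses a per-pixel temporal median over a GOP as the background model and forms a foreground residual that a codec could encode with the background as side information; the function names and the median-based model are assumptions, not the ASVC implementation.

```python
# Minimal sketch of one plausible background-extraction step for a GOP, using a
# per-pixel temporal median as the background estimate. Illustrative assumption
# only; the paper's hyperprior modeling and compression are not reproduced here.
import numpy as np

def extract_background(gop_frames: np.ndarray) -> np.ndarray:
    """gop_frames: (T, H, W, C) uint8 frames of one Group of Pictures."""
    # The temporal median suppresses moving foreground objects and keeps the
    # static surveillance background.
    return np.median(gop_frames.astype(np.float32), axis=0).astype(np.uint8)

def residual_against_background(frame: np.ndarray, background: np.ndarray) -> np.ndarray:
    """Foreground residual that a codec could encode with the background as side information."""
    return frame.astype(np.int16) - background.astype(np.int16)

if __name__ == "__main__":
    gop = np.random.randint(0, 256, size=(16, 128, 128, 3), dtype=np.uint8)  # stand-in GOP
    bg = extract_background(gop)
    res = residual_against_background(gop[0], bg)
    print(bg.shape, res.dtype)
```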
Most existing infrared image enhancement algorithms focus on detail and contrast enhancement of ordinary infrared images; when applied to low-light infrared images, detail and target texture are often severely lost. The reason is that most algorithms process images at a single scale and have difficulty coping with the degradation of image features while enhancing brightness. To solve this problem, we propose a multi-layer and multi-scale feature fusion network (MMFF-Net). It improves the brightness of low-light infrared images in the absence of normal-light reference samples while keeping the image details consistent with the source image. Features at different layers of the image are extracted using an adaptively modified deep network. A multi-scale adaptive feature fusion module (MAFFM) is designed to preserve and fuse multi-scale information from different convolutional layer features. The fused features are passed to an iterative enhancement function as pixel-wise parameters for image brightness enhancement. We also propose a local feature fusion module (LFFM), which reconstructs images after fusing multiple features, including the brightness-enhanced image and the source image. Finally, to train the whole network, a set of loss functions is carefully designed. Extensive experiments show that the proposed algorithm effectively enhances low-light infrared images and performs well in both subjective visual and quantitative evaluations compared with existing methods.
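The iterative, pixel-wise enhancement described above resembles curve-estimation enhancers. The sketch below assumes a quadratic light-enhancement curve driven by per-pixel parameters predicted from the fused features; this formulation is an assumption for illustration, not the exact MMFF-Net iteration.

```python
# Hedged sketch of an iterative, pixel-wise brightness enhancement step.
# The quadratic curve x <- x + a * x * (1 - x) is a common curve-estimation
# formulation and only an assumption about how fused features could act as
# pixel-wise parameters.
import torch

def iterative_enhance(image: torch.Tensor, curve_params: torch.Tensor, n_iter: int = 8) -> torch.Tensor:
    """
    image:        (B, 1, H, W) low-light infrared image scaled to [0, 1]
    curve_params: (B, n_iter, H, W) pixel-wise parameters predicted from fused features
    """
    x = image
    for i in range(n_iter):
        a = curve_params[:, i:i + 1]          # (B, 1, H, W) parameters for this iteration
        x = x + a * x * (1.0 - x)             # brightens dark pixels more than bright ones
    return x.clamp(0.0, 1.0)

if __name__ == "__main__":
    img = torch.rand(2, 1, 64, 64)
    params = torch.rand(2, 8, 64, 64) * 0.5
    out = iterative_enhance(img, params)
    print(out.shape, out.min().item(), out.max().item())
```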
In this letter, an improved gated linear unit (GLU) structure for end-to-end (E2E) speech enhancement is proposed. In the U-Net structure, which is widely used as the foundational architecture for E2E deep neural network-based speech denoising, the input noisy speech signal undergoes multiple layers of encoding and is compressed into an essential latent representation at the bottleneck. This latent information is then passed to the decoder stage to restore the target clean speech. Among these approaches, CleanUNet, a prominent state-of-the-art (SOTA) method, enhances temporal attention in the latent space by employing multi-head self-attention. In contrast to applying the attention mechanism only to the compressed representation at the bottleneck layer, the proposed method assigns an attention module to the GLU of each encoder/decoder block. The proposed method is validated by measuring short-term objective speech intelligibility and sound quality. The objective evaluation results indicate that the proposed residual-attention GLU outperforms existing SOTA models such as FAIR-denoiser and CleanUNet across signal-to-noise ratios ranging from 0 to 15 dB. The conventional GLU uses half of the signal as a gating signal, applying it to the main signal portion corresponding to the speech feature map of the other half. In contrast, the proposed residual-attention GLU employs a residual-attention network to improve the channel and temporal context within the signal, enhancing the noise-robust feature map in the main signal part.
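As a rough illustration of placing attention inside the GLU of each encoder/decoder block, the PyTorch sketch below refines the gating branch with residual channel and temporal attention before the sigmoid gate. The specific attention design (squeeze-excitation plus a depthwise temporal convolution) is an assumption, not the letter's exact residual-attention GLU.

```python
# Minimal PyTorch sketch of a GLU whose gating branch is refined by a residual
# attention block; the attention design is an illustrative assumption.
import torch
import torch.nn as nn

class ResidualAttentionGLU(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.conv = nn.Conv1d(channels, 2 * channels, kernel_size=1)  # main / gate split
        self.channel_attn = nn.Sequential(
            nn.AdaptiveAvgPool1d(1),
            nn.Conv1d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv1d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        self.temporal_attn = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=7, padding=3, groups=channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        main, gate = self.conv(x).chunk(2, dim=1)
        # Residual attention on the gating signal: refine channel and temporal context.
        gate = gate + gate * self.channel_attn(gate) + gate * self.temporal_attn(gate)
        return main * torch.sigmoid(gate)

if __name__ == "__main__":
    block = ResidualAttentionGLU(channels=32)
    y = block(torch.randn(2, 32, 16000))   # (batch, channels, time)
    print(y.shape)
```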
The categories of diabetic retinopathy (DR) are interrelated, and different ophthalmologists often give different results for the same fundus image. Automatic cross-image retrieval of DR can provide an effective diagnostic solution for ophthalmologists and is of great significance in clinical practice. Cross-image information (i.e., the left and right fundus images of a patient) is highly correlated and complementary and can be harnessed to improve various computer vision tasks such as image classification, object detection, image segmentation, and image retrieval. Previous studies did not explore the correlation between lesion areas in the left and right fundus images of patients, limiting the effective diagnosis of DR. In this study, we propose a cross-image siamese graph convolutional network (CIS-GCN) to retrieve fine-grained diabetic retinopathy fundus images. First, we construct a global-specific structure to obtain the specific features of the left and right eyes. Then, we pass the specific features through a pathological localization network to obtain the location features of the lesions. Finally, a graph convolutional neural network is introduced to construct node sets for the left and right eyes, representing relatively consistent regions in the patient's fundus images and learning their correlations. We tested our method on the Diabetic Retinopathy Detection datasets, and the results show that our algorithm outperforms other state-of-the-art methods by 2.2% to 3.7% in image retrieval.
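To make the siamese-plus-graph idea concrete, the sketch below shares one backbone between the left and right fundus images, treats spatial grid cells as graph nodes, and applies a single graph-convolution step over a given adjacency. The node construction and adjacency are illustrative assumptions rather than the CIS-GCN architecture.

```python
# Conceptual sketch: shared (siamese) encoder feeding one graph-convolution
# layer over left/right fundus-image nodes; all sizes are toy values.
import torch
import torch.nn as nn

class SiameseGraphLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.backbone = nn.Sequential(          # shared weights for both eyes
            nn.Conv2d(3, in_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),            # 4x4 grid -> 16 nodes per eye
        )
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, left: torch.Tensor, right: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        nodes = []
        for eye in (left, right):
            f = self.backbone(eye)                          # (B, C, 4, 4)
            nodes.append(f.flatten(2).transpose(1, 2))      # (B, 16, C)
        h = torch.cat(nodes, dim=1)                         # (B, 32, C): left + right node sets
        # One GCN propagation step over the (row-normalized) adjacency.
        h = torch.relu(adj @ self.weight(h))
        return h.mean(dim=1)                                # graph-level embedding for retrieval

if __name__ == "__main__":
    layer = SiameseGraphLayer(in_dim=32, out_dim=64)
    adj = torch.full((32, 32), 1.0 / 32)                    # toy fully connected, normalized adjacency
    emb = layer(torch.randn(2, 3, 128, 128), torch.randn(2, 3, 128, 128), adj)
    print(emb.shape)                                        # (2, 64)
```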
Pneumonia is a common and sometimes fatal lung infection that continues to be a major global health concern. The prediction of pneumonia has become a crucial factor in saving people's lives and improving their quality of life. For this purpose, traditional clinical procedures are considered time-consuming. In addition, researchers have used various algorithms to forecast pneumonia owing to advances in image processing techniques. However, these algorithms have proven ineffective in terms of feature extraction, which negatively impacts prediction rates. This research aims to predict pneumonia worldwide and address the problem of low accuracy. This work introduces a novel method for pneumonia prediction using a deep convolutional neural network (CNN) with an InceptionV3 model for feature extraction. Additionally, it introduces an entropy-normalized Neighbourhood Component Analysis (NCA) technique, complemented by Ensemble-Modified Classifiers (EMC) with Naive Bayes, XGBoost, and Random Forest for classification to enhance predictive accuracy. Accurate pneumonia diagnosis is crucial for patient care, but misdiagnoses and delays in diagnosis are not uncommon. This research establishes a robust deep learning framework for pneumonia prediction, capable of identifying both normal and atypical pneumonia patterns in medical images. To enhance feature extraction and improve model generalization, the proposed approach combines entropy normalization with NCA-based dimensionality reduction, resulting in more efficient and discriminative feature representations. Furthermore, an ensemble-modified classifier is introduced to refine predictions and improve the model's ability to differentiate between pneumonia and non-pneumonia cases. Experimental results demonstrate that the proposed model surpasses existing methods in terms of accuracy, sensitivity, and specificity. The effectiveness of the proposed system has been confirmed b
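A hedged sketch of the classification stage is given below: pre-extracted features are reduced with scikit-learn's Neighbourhood Components Analysis and classified by a soft-voting ensemble of Naive Bayes, Random Forest, and XGBoost. The entropy normalization and the Ensemble-Modified Classifiers are approximated here by standard scaling and plain soft voting, and the feature array is a synthetic stand-in for InceptionV3 outputs.

```python
# Sketch of the NCA + ensemble classification stage on stand-in features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import NeighborhoodComponentsAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 256))           # stand-in for InceptionV3 pooled features (really 2048-dim)
y = rng.integers(0, 2, size=600)          # 0 = normal, 1 = pneumonia

ensemble = VotingClassifier(
    estimators=[
        ("nb", GaussianNB()),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("xgb", XGBClassifier(n_estimators=200, eval_metric="logloss")),
    ],
    voting="soft",
)

model = make_pipeline(
    StandardScaler(),
    NeighborhoodComponentsAnalysis(n_components=64, random_state=0),
    ensemble,
)
model.fit(X[:500], y[:500])
print("held-out accuracy:", model.score(X[500:], y[500:]))
```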
ISBN (print): 9798350344868; 9798350344851
Scene Text Image Super-Resolution (STISR) aims to enhance the resolution and legibility of text within low-resolution (LR) images, consequently elevating recognition accuracy in Scene Text Recognition (STR). Previous methods predominantly employ discriminative convolutional neural networks (CNNs) augmented with diverse forms of text guidance to address this issue. Nevertheless, they remain deficient when confronted with severely blurred images, owing to their insufficient generation capability when little structural or semantic information can be extracted from the original images. Therefore, we introduce RGDiffSR, a Recognition-Guided Diffusion model for Scene Text Image Super-Resolution, which exhibits great generative diversity and fidelity even in challenging scenarios. Moreover, we propose a Recognition-Guided Denoising Network to guide the diffusion model in generating LR-consistent results through succinct semantic guidance. Experiments on the TextZoom dataset demonstrate the superiority of RGDiffSR over prior state-of-the-art methods in both text recognition accuracy and image fidelity.
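The toy sketch below illustrates one way recognition guidance could condition a diffusion denoiser: the network predicts noise from the noisy high-resolution image, the upsampled LR image, and an embedding of recogniser logits, trained with the standard epsilon-prediction loss. The tiny ConvNet and the recogniser stub are assumptions, not the RGDiffSR architecture.

```python
# Toy sketch of recognition-guided conditioning for a diffusion denoiser.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GuidedDenoiser(nn.Module):
    def __init__(self, num_classes: int = 37, width: int = 32):
        super().__init__()
        self.guide = nn.Linear(num_classes, width)           # embed recogniser output
        self.net = nn.Sequential(
            nn.Conv2d(3 + 3, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, 3, 3, padding=1),
        )

    def forward(self, noisy_hr, lr, rec_logits):
        lr_up = F.interpolate(lr, size=noisy_hr.shape[-2:], mode="bilinear", align_corners=False)
        h = self.net[0](torch.cat([noisy_hr, lr_up], dim=1))
        # Inject the semantic guidance as a per-channel bias.
        h = h + self.guide(rec_logits)[:, :, None, None]
        for layer in list(self.net)[1:]:
            h = layer(h)
        return h                                             # predicted noise

if __name__ == "__main__":
    model = GuidedDenoiser()
    hr, lr = torch.rand(2, 3, 32, 128), torch.rand(2, 3, 16, 64)
    logits = torch.rand(2, 37)                               # stand-in recogniser output
    noise = torch.randn_like(hr)
    alpha_bar = torch.tensor(0.7)                            # one diffusion timestep
    noisy_hr = alpha_bar.sqrt() * hr + (1 - alpha_bar).sqrt() * noise
    loss = F.mse_loss(model(noisy_hr, lr, logits), noise)    # epsilon-prediction loss
    print(loss.item())
```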
This study presents the RBP-CNN model, a convolutional neural network specifically designed for the precise classification of brain tumors in medical imaging. Conventional methods often encounter difficulties in extracting image noise and texture features, which has led to the incorporation of regional binary pattern (RBP) and Gray Standard Normalization (GSN) preprocessing techniques into the CNN. The research addresses fundamental questions regarding the model's impact on accuracy, false classifications, and efficiency. The novelty of RBP-CNN lies in its distinctive approach to extracting texture features, which involves optimizing pixel values through GSN preprocessing and generating regional binary patterns based on integral images. The objective of this research is to bridge a critical gap by providing a more accurate and efficient model for classifying brain tumors. The key findings reveal the exceptional performance of RBP-CNN, achieving a classification accuracy of 96% with a reduced false classification ratio of 7% across a dataset of 3000 samples. Comparative analyses position RBP-CNN as superior to alternative models in terms of accuracy, false classification rates, and efficiency. The structural insights and hyperparameter values of the model, as well as its application to the FigShare dataset, demonstrate its robustness and scalability. RBP-CNN emerges as an innovative and effective solution, advancing the field of medical image categorization. The findings of this study contribute a novel methodology, paving the way for future exploration in hyperspectral image applications and positioning RBP-CNN as a potential state-of-the-art tool for medical image analysis.
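As an illustration of "regional binary patterns based on integral images", the sketch below computes a multi-block LBP-style code: the sum of each of eight neighbouring blocks is compared against the central block, with block sums read from an integral image. This generic formulation is an assumption; the paper's exact RBP definition and GSN preprocessing are not reproduced.

```python
# Sketch of a regional (multi-block) binary pattern computed with an integral image.
import numpy as np

def block_sum(ii: np.ndarray, r: int, c: int, size: int) -> float:
    """Sum of the size x size block with top-left corner (r, c), via integral image."""
    return ii[r + size, c + size] - ii[r, c + size] - ii[r + size, c] + ii[r, c]

def regional_binary_pattern(img: np.ndarray, block: int = 3) -> np.ndarray:
    # Integral image with a zero row/column prepended so ii[r, c] = sum(img[:r, :c]).
    ii = np.pad(img.astype(np.float64), ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    h, w = img.shape
    codes = np.zeros((h - 3 * block + 1, w - 3 * block + 1), dtype=np.uint8)
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    for r in range(codes.shape[0]):
        for c in range(codes.shape[1]):
            centre = block_sum(ii, r + block, c + block, block)
            bits = [block_sum(ii, r + dr * block, c + dc * block, block) >= centre
                    for dr, dc in offsets]
            codes[r, c] = sum(int(b) << i for i, b in enumerate(bits))
    return codes

if __name__ == "__main__":
    image = np.random.randint(0, 256, size=(64, 64))
    print(regional_binary_pattern(image).shape)   # texture map that could feed the CNN
```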
Biometric facial identification presents a distinct and reliable method for distinguishing individuals based on unique physical or behavioral characteristics. Unlike traditional security measures such as passwords, facial features offer a level of security that cannot be shared, replicated, or forgotten. This study focuses on the application of facial biometrics for person identification, leveraging the advantages of non-contact biometrics like facial features over other methods such as fingerprint or palm recognition. Facial recognition in this work is predicated on the geometric shapes of facial characteristics. Emphasis is placed on three fundamental views of the face: upward, frontal, and downward. For each of these views, specific regions are extracted for processing, including the right-eye region and its width; the dimensions of the mouth, both height and width, are extracted in a similar manner. Training and evaluation of the proposed system are accomplished using three soft computing models: an Artificial Neural Network (ANN), a Particle Swarm Optimization Neural Network (PSO-NN), and an Adaptive Neuro-Fuzzy Inference System (ANFIS), each trained on a dataset constructed for each view. The models are optimized by adjusting parameters such as the number of neurons in the hidden layer for the neural network-based procedures. Performance is evaluated by computing the mean square error obtained under random data division. The models demonstrated a training set accuracy of 97.20% and a testing set accuracy of 90.86%. These results indicate the effectiveness of the proposed system for both individual and combined face views, underscoring the potential of facial biometrics in secure identification applications.
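A minimal sketch of the ANN-based variant is given below: a small multilayer perceptron over geometric measurements (right-eye width, mouth height and width). The synthetic feature table and the hidden-layer size are placeholders for the per-view datasets described above.

```python
# Minimal sketch of an MLP identifying people from geometric facial measurements.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n_people, n_samples = 10, 400
X = rng.normal(size=(n_samples, 3))        # columns: eye width, mouth height, mouth width
y = rng.integers(0, n_people, size=n_samples)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=1),
)
model.fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))
print("test accuracy:", model.score(X_test, y_test))
```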
In electronic warfare, radar signal deinterleaving is a critical task. While many researchers have applied deep learning and utilised known radar classes to construct interleaved pulse sequence training sets for deinterleaving models, these models face challenges in distinguishing between known and unknown radar classes in open-set scenarios. To address this challenge, the authors propose a novel model, the Reconstruction Bidirectional Recurrent Neural Network (RBi-RNN). RBi-RNN utilises input reconstruction and employs a joint training strategy incorporating cross-entropy loss, reconstruction loss, and centre loss. These strategies aim to maximise inter-class latent representation distances while minimising intra-class disparities. By incorporating an open-set recognition method based on extreme value theory, RBi-RNN adapts to open-set scenarios. Simulation results demonstrate the superiority of RBi-RNN over conventional models in both closed-set and open-set scenarios. In open-set scenarios, it successfully discriminates between known and unknown radar signals within interleaved pulse sequences, deinterleaving known radar classes with high stability. The authors lay the foundation for future unsupervised deinterleaving methods designed specifically for unknown radar pulses.
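The joint objective can be sketched as follows, assuming a bidirectional GRU backbone whose per-pulse features feed a classifier, an input-reconstruction head, and a centre loss; the layer sizes and loss weights are arbitrary, and the extreme-value open-set step is omitted.

```python
# Sketch of the joint training objective: cross-entropy + reconstruction + centre loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RBiRNNSketch(nn.Module):
    def __init__(self, in_dim=4, hidden=64, n_classes=6):
        super().__init__()
        self.rnn = nn.GRU(in_dim, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, n_classes)      # per-pulse class
        self.decoder = nn.Linear(2 * hidden, in_dim)             # input reconstruction
        self.centres = nn.Parameter(torch.zeros(n_classes, 2 * hidden))

    def forward(self, pulses):                                    # (B, T, in_dim)
        feats, _ = self.rnn(pulses)                               # (B, T, 2*hidden)
        return self.classifier(feats), self.decoder(feats), feats

def joint_loss(model, pulses, labels, w_rec=1.0, w_centre=0.01):
    logits, recon, feats = model(pulses)
    ce = F.cross_entropy(logits.flatten(0, 1), labels.flatten())  # classification
    rec = F.mse_loss(recon, pulses)                               # input reconstruction
    centre = ((feats - model.centres[labels]) ** 2).sum(-1).mean()  # intra-class compactness
    return ce + w_rec * rec + w_centre * centre

if __name__ == "__main__":
    model = RBiRNNSketch()
    pulses = torch.randn(8, 50, 4)                    # interleaved pulse descriptor words
    labels = torch.randint(0, 6, (8, 50))             # per-pulse radar class
    loss = joint_loss(model, pulses, labels)
    loss.backward()
    print(loss.item())
```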