Spectral unmixing is central when analyzing hyperspectral data. To accomplish this task, physics-based methods have become popular because, with their explicit mixing models, they can provide a clear interpretation. Nevertheless, because of their limited modeling capabilities, especially when analyzing real scenes with unknown, complex physical properties, these methods may not be accurate. On the other hand, data-driven methods, deep learning in particular, have developed rapidly in recent years, thanks to their superior capability in modeling complex nonlinear systems. Simply transferring these methods as black boxes to perform unmixing may lead to low interpretability and poor generalization ability. To bring together the best of both worlds, recent research efforts have focused on combining the advantages of physics-based models and data-driven methods. In this article, we present an overview of recent advances on this topic from various perspectives, including deep neural network (DNN) design, prior capturing, and loss selection. We summarize these methods within a common optimization framework and discuss ways of enhancing our understanding of these methods. The related source codes are made publicly available at http://***/xiuheng-wang/awesome-hyperspectral-image-unmixing.
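The explicit mixing model the abstract refers to is, in its simplest form, the linear mixing model: each pixel spectrum is a non-negative, sum-to-one combination of endmember spectra. A minimal sketch (all matrices and values here are synthetic placeholders, not from the article) recovers abundances with non-negative least squares, adding a weighted ones-row to softly enforce the sum-to-one constraint:

```python
import numpy as np
from scipy.optimize import nnls

# Linear mixing model y = E @ a + noise, with abundances a >= 0 and sum(a) = 1.
rng = np.random.default_rng(0)
E = rng.random((50, 3))              # 50 spectral bands, 3 endmembers (synthetic)
a_true = np.array([0.6, 0.3, 0.1])  # ground-truth abundances (sum to 1)
y = E @ a_true + 0.001 * rng.standard_normal(50)

# NNLS enforces non-negativity; the augmented row delta * ones ~ delta
# softly enforces the sum-to-one constraint.
delta = 10.0
E_aug = np.vstack([E, delta * np.ones((1, 3))])
y_aug = np.append(y, delta)
a_hat, _ = nnls(E_aug, y_aug)
print(a_hat)  # close to [0.6, 0.3, 0.1]
```

Physics-based unmixing methods build on this model; the hybrid approaches the article surveys replace or regularize parts of it with learned components.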
Human activity recognition (HAR) using radar technology is becoming increasingly valuable for applications in areas such as smart security systems, healthcare monitoring, and interactive computing. This study investigates the integration of convolutional neural networks (CNNs) with conventional radar signal processing methods to improve the accuracy and efficiency of HAR. Three distinct two-dimensional radar processing techniques, specifically range-fast Fourier transform (FFT)-based time-range maps, time-Doppler-based short-time Fourier transform (STFT) maps, and smoothed pseudo-Wigner-Ville distribution (SPWVD) maps, are evaluated in combination with four state-of-the-art CNN architectures: VGG-16, VGG-19, ResNet-50, and MobileNetV2. This study positions radar-generated maps as a form of visual data, bridging the radar signal processing and image representation domains while ensuring privacy in sensitive applications. In total, twelve CNN and preprocessing configurations are analyzed, focusing on the trade-offs between preprocessing complexity and recognition accuracy, all of which are essential for real-time applications. Among these results, MobileNetV2 combined with STFT preprocessing showed an ideal balance, achieving high computational efficiency and an accuracy rate of 96.30%, with a spectrogram generation time of 220 ms and an inference time of 2.57 ms per sample. The comprehensive evaluation underscores the importance of interpretable visual features for resource-constrained environments, expanding the applicability of radar-based HAR systems to domains such as augmented reality, autonomous systems, and edge computing.
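The time-Doppler preprocessing step can be sketched as follows (the sampling rate and chirp signal are toy assumptions standing in for real radar returns, not the study's data): an STFT turns a 1-D signal into a 2-D log-magnitude map that a CNN such as MobileNetV2 can consume as an image.

```python
import numpy as np
from scipy.signal import stft

fs = 1000.0                                      # assumed sampling rate, Hz
t = np.arange(0, 2.0, 1 / fs)
signal = np.cos(2 * np.pi * (50 + 30 * t) * t)   # toy micro-Doppler-like chirp

# STFT -> complex time-frequency matrix; log magnitude gives a spectrogram image.
f, tau, Z = stft(signal, fs=fs, nperseg=128, noverlap=96)
spectrogram_db = 20 * np.log10(np.abs(Z) + 1e-12)
print(spectrogram_db.shape)  # (frequency bins, time frames)
```

The window length (`nperseg`) and overlap control the resolution trade-off that, per the abstract, determines preprocessing cost versus recognition accuracy.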
Conventional feature extraction methods for speech emotion recognition often suffer from unidimensionality and inadequacy in capturing the full range of emotional cues, limiting their effectiveness. To address these challenges, this paper introduces a novel network model named Multi-Modal Speech Emotion Recognition Network (MMSERNet). This model leverages the power of multimodal and multiscale feature fusion to significantly enhance the accuracy of speech emotion recognition. MMSERNet is composed of three specialized sub-networks, each dedicated to the extraction of distinct feature types: cepstral coefficients, spectrogram features, and textual features. It integrates audio features derived from Mel-frequency cepstral coefficients and Mel spectrograms with textual features obtained from word vectors, thereby creating a rich, comprehensive representation of emotional content. The fusion of these diverse feature sets facilitates a robust multimodal approach to emotion recognition. Extensive empirical evaluations of the MMSERNet model on benchmark datasets such as IEMOCAP and MELD demonstrate not only significant improvements in recognition accuracy but also an efficient use of model parameters, ensuring scalability and practical applicability.
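The multimodal fusion idea can be illustrated with a late-fusion sketch (the embedding sizes, class count, and all names here are illustrative assumptions, not MMSERNet's actual architecture): each branch yields an embedding, and concatenation followed by a linear head produces emotion logits.

```python
import numpy as np

rng = np.random.default_rng(1)
mfcc_emb = rng.standard_normal(128)   # cepstral-branch embedding (placeholder)
spec_emb = rng.standard_normal(128)   # spectrogram-branch embedding (placeholder)
text_emb = rng.standard_normal(300)   # word-vector branch embedding (placeholder)

# Concatenate the three modality embeddings, then project to class logits.
fused = np.concatenate([mfcc_emb, spec_emb, text_emb])   # shape (556,)
W = rng.standard_normal((4, fused.size)) * 0.01          # 4 emotion classes
logits = W @ fused
probs = np.exp(logits - logits.max())
probs /= probs.sum()                                     # softmax
print(probs)
```

In practice the fusion and the projection head would be trained jointly with the three sub-networks; this sketch only shows the data flow.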
ISBN: (Print) 9798350351491; 9798350351484
This work deals with the task of land use and land cover (LULC) change detection using multi-temporal multispectral remote sensing images. In the last few years, deep learning-based change detection methods have been successfully implemented for automatic LULC change detection. Although these elaborate methodologies achieve very high detection accuracy, they provide only a binary change map delineating change and non-change regions, without specifying the nature of the ground classes. In order to obtain a multiclass change detection leading to more accurate change mapping, this paper proposes a fully unsupervised three-step methodology. In the first step, a binary change map is obtained using k-MAD (kernel multivariate alteration detection) components combined with chi-squared test thresholding. In the second step, the change and non-change regions are iteratively classified with the AP (affinity propagation) clustering algorithm to obtain a multiclass non-change area and a "from-to" change area. Finally, in the third step, samples from the changed and unchanged classes are fed to a DNN (deep neural network) architecture to produce a multiclass land change map. Co-registered bi-temporal multispectral images acquired over the northeastern region of Algiers, Algeria, between 1997 and 2001 by the American LANDSAT-TM satellite are used to verify the effectiveness of the proposed scheme. The obtained "from-to" land change map, validated by means of spectral signature analysis, is more informative on the change and non-change pixels than the binary land change map.
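The first step rests on a standard property: under the no-change hypothesis, the sum of squared, standardized MAD components follows a chi-squared distribution with degrees of freedom equal to the number of bands, so a quantile of that distribution gives the binary change threshold. A minimal sketch (the MAD components here are simulated standard-normal noise, not real imagery):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
n_pixels, n_bands = 10000, 6
mad = rng.standard_normal((n_pixels, n_bands))   # standardized MAD components

# Per-pixel chi-squared statistic and quantile-based change threshold.
chi2_stat = (mad ** 2).sum(axis=1)
threshold = chi2.ppf(0.99, df=n_bands)           # 99% quantile
change_mask = chi2_stat > threshold              # True = change pixel
print(change_mask.mean())  # about 0.01 for pure-noise input
```

On real bi-temporal data, genuinely changed pixels produce statistics far beyond the threshold, which is what makes the thresholded mask a usable binary change map.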
With the rapid urbanization process, waste management has become a significant environmental issue globally. Waste sorting, as an effective method of resource recycling and environmental protection, has gradually become a key solution to the waste pollution problem. Traditional waste classification methods rely on manual labor, which is inefficient and prone to errors, making them inadequate for modern urban waste management. In recent years, image recognition and artificial intelligence (AI)-based methods for waste classification have gained widespread attention, with deep learning techniques, particularly Convolutional Neural Networks (CNNs), showing great potential in waste sorting. However, existing research on waste classification models faces challenges such as imperfect network structures, insufficient training data, and poor environmental adaptability, which limit their application in complex environments. This study proposes a waste classification model based on image recognition and AI to enhance classification accuracy and efficiency. First, improved PCANet and SDenseNet network structures are combined to propose a new feature extraction and representation method, enhancing the model's feature learning ability. Secondly, a layered learning strategy, combined with the traditional backpropagation algorithm, is used to optimize the training process and improve learning efficiency. Finally, experimental results demonstrate that the proposed waste classification model significantly outperforms traditional models in classification accuracy and processing capability in various environments, providing a new solution for the advancement of waste classification technologies.
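For context, the first stage of a generic PCANet (sketched from the general PCANet idea, not this paper's improved variant; image sizes and filter counts are arbitrary) learns its convolution filters as the top principal components of mean-removed image patches:

```python
import numpy as np

rng = np.random.default_rng(3)
images = rng.random((10, 32, 32))    # toy grayscale images (placeholder data)
k = 5                                # patch size

# Collect mean-removed k x k patches from all images.
patches = []
for img in images:
    for i in range(0, 32 - k + 1, 4):
        for j in range(0, 32 - k + 1, 4):
            p = img[i:i + k, j:j + k].ravel()
            patches.append(p - p.mean())
X = np.array(patches)                # (num_patches, k*k)

# Top eigenvectors of the patch covariance become the stage-1 filter bank.
cov = X.T @ X / len(X)
eigvals, eigvecs = np.linalg.eigh(cov)
filters = eigvecs[:, ::-1][:, :8].T.reshape(8, k, k)  # 8 leading filters
print(filters.shape)  # (8, 5, 5)
```

The appeal for waste sorting is that these filters are learned without backpropagation, keeping the early feature extraction stage cheap.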
Despite the great success of deep neural networks in brain tumor segmentation, it is challenging to obtain sufficient annotated images due to the requirement of clinical expertise. Masked image modeling recently achieved competitive performance compared with supervised training by learning rich representations from unlabeled data. However, it is originally designed for vision transformers and its effectiveness has not been well-studied in the medical domain, usually for limited unlabeled data and small convolutional network scenarios. In this paper, we propose a self-supervised learning framework to pre-train U-Net for brain tumor segmentation. Our goal is to learn modality-specific and modality-invariant representations from multimodality magnetic resonance images. This is motivated by the fact that different modalities indicate the same organs and tissues but have various appearances. To achieve this, we design a new pretext task that reconstructs the masked patches of each modality based on the partial observation of other modalities. We evaluate our method by transfer performance on BraTS 2020 dataset. The experimental results demonstrate our method outperforms other self-supervised learning methods and improves the performance of a strong fully supervised baseline. The source codes are available at https://***/mobiletomb/IS-MIM.
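The pretext task can be sketched as follows (shapes, patch size, and masking ratio are illustrative assumptions, not the paper's settings): patches of one modality are hidden and become reconstruction targets, while the other modalities stay fully visible as context.

```python
import numpy as np

rng = np.random.default_rng(4)
volume = rng.random((4, 64, 64))   # 4 MRI modalities (e.g., T1, T1ce, T2, FLAIR)
patch = 16
masked = volume.copy()

# Randomly mask patches in modality 0; keep them as regression targets.
targets = []
for i in range(0, 64, patch):
    for j in range(0, 64, patch):
        if rng.random() < 0.5:                       # mask ~50% of patches
            targets.append(volume[0, i:i + patch, j:j + patch].copy())
            masked[0, i:i + patch, j:j + patch] = 0  # zero out in modality 0

# The network sees `masked` (modality 0 partially hidden, others intact)
# and regresses the patches stored in `targets`.
print(len(targets), "patches to reconstruct")
```

Cycling the masked modality over all four inputs is what pushes the encoder toward both modality-specific and modality-invariant features.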
A synthetic aperture radar (SAR) system is a notable source of information, recognized for its capability to operate day and night and in all weather conditions, making it essential for various applications. SAR image formation is a pivotal step in radar imaging, essential for transforming complex raw radar data into interpretable and utilizable imagery. Nowadays, advancements in SAR sensor design, resulting in very wide swaths, generate a massive volume of data, necessitating extensive processing. Traditional methods of SAR image formation often involve resource-intensive and time-consuming postprocessing. There is a vital need to automate this process in near-real-time, enabling fast responses for various applications, including image classification and object detection. We present an SAR processing pipeline comprising a complex 2D autofocus SARNet, followed by a CNN-based classification model. The complex 2D autofocus SARNet is employed for image formation, utilizing an encoder-decoder architecture, such as U-Net and a modified version of ResU-Net. Meanwhile, the image classification task is accomplished using a CNN-based classification model. This framework allows us to obtain near real-time results, specifically for quick image viewing and scene classification. Several experiments were conducted using real SAR raw data collected by the European Remote Sensing satellite to validate the proposed pipeline. The performance evaluation of the processing pipeline is conducted through visual assessment as well as quantitative assessment using standard metrics, such as the structural similarity index and the peak signal-to-noise ratio. The experimental results demonstrate the processing pipeline's robustness, efficiency, reliability, and responsiveness in providing an integrated neural network-based SAR processing pipeline.
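One of the quantitative metrics mentioned, PSNR, reduces to a one-line formula; a minimal sketch (the reference image and noise level here are synthetic, not the pipeline's data):

```python
import numpy as np

def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((reference - test) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(5)
ref = rng.random((128, 128))
noisy = np.clip(ref + 0.01 * rng.standard_normal(ref.shape), 0, 1)
print(round(psnr(ref, noisy), 1))  # roughly 40 dB for 1% Gaussian noise
```

Structural similarity (SSIM), the other cited metric, additionally compares local luminance, contrast, and structure rather than raw per-pixel error.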
Electromagnetic imaging methods mainly utilize converted sampling, dimensional transformation, and coherent processing to obtain spatial images of targets, which often suffer from accuracy and efficiency problems. Deep neural network (DNN)-based high-resolution imaging methods have achieved impressive results in improving resolution and reducing computational costs. However, previous works exploit single modality information from electromagnetic data; thus, the performances are limited. In this article, we propose an electromagnetic image generation network (EMIG-Net), which translates electromagnetic data of multiview 1-D range profiles (1DRPs) directly into bird-view 2-D high-resolution images under cross-modal supervision. We construct an adversarial generative framework with visual images as supervision to significantly improve the imaging accuracy. Moreover, the network structure is carefully designed to optimize computational efficiency. Experiments on self-built synthetic data and experimental data in the anechoic chamber show that our network has the ability to generate high-resolution images, whose visual quality is superior to that of traditional imaging methods and DNN-based methods, while consuming less computational cost. Compared with the backprojection (BP) algorithm, the EMIG-Net gains a significant improvement in entropy (72%), peak signal-to-noise ratio (PSNR; 150%), and structural similarity (SSIM; 153%). Our work shows the broad prospects of deep learning in radar data representation and high-resolution imaging and provides a path for researching electromagnetic imaging based on learning theory.
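Image entropy, the first metric in the comparison, is commonly computed as the Shannon entropy of the intensity histogram; in radar imaging, a well-focused image concentrates energy and therefore has lower entropy. A small sketch (images are synthetic stand-ins):

```python
import numpy as np

def image_entropy(img, bins=256):
    """Shannon entropy (bits) of an image's intensity histogram, values in [0, 1]."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(6)
diffuse = rng.random((64, 64))                        # spread-out energy
focused = np.zeros((64, 64)); focused[32, 32] = 1.0   # single point target
print(image_entropy(diffuse) > image_entropy(focused))  # True
```

This is why the reported entropy improvement over backprojection is read as sharper, better-focused imagery.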
When processing text images with traditional binarization methods, the image background noise often causes the results to become blurred or leads to the loss of edge details. To solve this problem, this paper proposes...
详细信息
Fault diagnosis in rotating machinery faces significant challenges in strong noise environments. Especially under extremely high noise intensity and unknown noise types, existing methods struggle to maintain accuracy. We propose the Improved Residual Attention Convolutional Neural Network (IRA-CNN) to address the strong-noise problem. IRA-CNN integrates an interconnected multi-branch structure and a mixed attention mechanism specially designed for vibration signals. Unlike previous studies that only consider Gaussian noise and signal-to-noise ratios above -6 dB, we evaluate the model's noise robustness through extensive experiments across three datasets, three noise types, and six noise intensity levels. The results reveal that the noise type significantly impacts model performance, a factor that has often been overlooked in previous studies. IRA-CNN outperforms state-of-the-art models in both accuracy and generalization. These findings establish a highly effective solution for fault diagnosis in challenging strong noise environments.
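The noise-injection protocol behind such robustness studies can be sketched in a few lines (the vibration signal here is a toy sinusoid, and only Gaussian noise is shown; the study also covers other noise types): scale the noise so the resulting SNR matches a target in dB.

```python
import numpy as np

def add_noise(signal, snr_db, rng):
    """Add Gaussian noise scaled to reach the target SNR (dB)."""
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10 ** (snr_db / 10))
    noise = rng.standard_normal(signal.shape) * np.sqrt(p_noise)
    return signal + noise

rng = np.random.default_rng(7)
t = np.linspace(0, 1, 2048, endpoint=False)
vibration = np.sin(2 * np.pi * 50 * t)      # toy vibration signal
noisy = add_noise(vibration, snr_db=-6.0, rng=rng)

# Verify: at -6 dB, noise power is about four times the signal power.
measured = 10 * np.log10(np.mean(vibration ** 2) / np.mean((noisy - vibration) ** 2))
print(round(measured, 1))  # close to -6 dB
```

Sweeping `snr_db` well below -6 dB and swapping the noise generator is what distinguishes this evaluation from the Gaussian-only settings of prior work.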