Medical image segmentation plays an important role in medical diagnosis and has received extensive attention in recent years. A large number of convolutional neural network based methods have been proposed to achieve accurate segmentation results. Dice loss is the most popular loss function for medical image segmentation tasks. However, we found that Dice loss suffers from abnormal gradient changes, which makes training unstable and difficult to converge. Therefore, we propose a gradient-optimized Dice loss (GODC) to solve this problem. GODC corrects the abnormal gradient changes in the segmentation loss, which accelerates model convergence and achieves better segmentation performance. Next, we propose a lateral feature alignment module (LFAM). LFAM adopts a deformable convolutional network to align the features of different layers on the shortcut connections of U-Net to improve segmentation performance. Finally, our method achieves state-of-the-art results on the LiTS dataset as well as on the pancreatic tumor datasets we collected.
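As a point of reference for the gradient behaviour discussed above, the following is a minimal PyTorch sketch of the standard soft Dice loss, the function GODC modifies; the GODC correction itself is not specified in the abstract, so only the baseline formulation and the source of the gradient instability are shown.

```python
import torch

def soft_dice_loss(pred, target, eps=1e-6):
    """Standard soft Dice loss for binary segmentation.

    pred   : (N, H, W) predicted probabilities in [0, 1] (after sigmoid)
    target : (N, H, W) binary ground-truth masks

    The gradient w.r.t. `pred` scales with the inverse square of the
    denominator below, so it can change abruptly when the predicted
    foreground is very small -- the instability the abstract attributes
    to Dice loss.
    """
    pred = pred.flatten(1)
    target = target.flatten(1)
    intersection = (pred * target).sum(dim=1)
    denom = pred.sum(dim=1) + target.sum(dim=1)
    dice = (2.0 * intersection + eps) / (denom + eps)
    return 1.0 - dice.mean()
```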
Multi-image super-resolution (MISR) refers to the task of enhancing the spatial resolution of a stack of low-resolution (LR) images representing the same scene. Although many deep learning-based single image super-resolution (SISR) techniques have recently been developed, deep learning has not been widely exploited for MISR, even though it can achieve higher reconstruction accuracy because more information can be extracted from the stack of LR images. One of the primary obstacles encountered by deep networks when addressing the MISR problem is the variability in the number of LR images that act as input to the network. This impedes the feasibility of adopting an end-to-end learning approach, because the varying number of input images makes it difficult to construct a training dataset for the network. Another challenge arises from the requirement to align the LR input images to generate a high-quality high-resolution (HR) image, which requires complex and sophisticated methods. In this paper, we propose a self-learning based method that can simultaneously perform super-resolution and sub-pixel registration of multiple LR images. The proposed method trains a neural network with only the LR images as input and without any true target HR images; i.e., the proposed method requires no extra training dataset. Therefore, it is easy to use the proposed method to deal with different numbers of input images. To our knowledge, this is the first time that a neural network has been trained using only LR images to perform joint MISR and sub-pixel registration. Experimental results confirmed that the HR images generated by the proposed method achieved better results in both quantitative and qualitative evaluations than those generated by other deep learning-based methods.
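The abstract does not detail the network or the self-supervised objective; the sketch below illustrates one plausible reading, assuming the HR estimate is warped by learnable per-frame sub-pixel shifts and downsampled to reproduce each observed LR frame. The class name, layer sizes, and loss are illustrative, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfMISR(nn.Module):
    """Sketch of joint MISR + sub-pixel registration trained only on the
    LR stack (no HR ground truth)."""

    def __init__(self, num_frames, scale=2):
        super().__init__()
        self.scale = scale
        # learnable sub-pixel shift (dx, dy) for each LR frame
        self.shifts = nn.Parameter(torch.zeros(num_frames, 2))
        self.net = nn.Sequential(
            nn.Conv2d(num_frames, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),                    # (N, 1, s*H, s*W)
        )

    def forward(self, lr_stack):                       # lr_stack: (N, K, H, W)
        return self.net(lr_stack)

    def self_supervised_loss(self, lr_stack):
        hr = self.forward(lr_stack)                    # current HR estimate
        n, k, h, w = lr_stack.shape
        loss = 0.0
        for i in range(k):
            # translate the HR estimate by the i-th learned sub-pixel shift
            theta = torch.eye(2, 3, device=hr.device).unsqueeze(0).repeat(n, 1, 1)
            theta[:, :, 2] = self.shifts[i]
            grid = F.affine_grid(theta, hr.shape, align_corners=False)
            shifted = F.grid_sample(hr, grid, align_corners=False)
            # downsample back to LR and compare with the observed frame
            lr_hat = F.avg_pool2d(shifted, self.scale)
            loss = loss + F.l1_loss(lr_hat, lr_stack[:, i:i + 1])
        return loss / k
```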
ISBN (print): 9789464593617; 9798331519773
Advanced machine learning methods, and most prominently neural networks, have become standard for solving inverse problems in recent years. However, theoretical recovery guarantees for such methods are still scarce and difficult to achieve. Only recently did unsupervised methods such as the Deep Image Prior (DIP) get equipped with convergence and recovery guarantees for generic loss functions when trained through gradient flow with an appropriate initialization. In this paper, we extend these results by proving that the guarantees hold true when using gradient descent with an appropriately chosen step size/learning rate. We also show that the discretization only affects the overparametrization bound for a two-layer DIP network by a constant, and thus that the guarantees found for the gradient flow also hold for gradient descent.
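A minimal sketch of the setting these guarantees concern: a two-layer, heavily overparametrized DIP-style network fitted to a single observation with plain gradient descent at a fixed step size (the discretization of gradient flow). The width, step size, and iteration count are illustrative only, not the bounds from the paper.

```python
import torch

torch.manual_seed(0)
n = 32 * 32                        # flattened image size (illustrative)
width = 4096                       # overparametrized hidden width
step = 1e-3                        # fixed step size for the discretized flow

y = torch.randn(n)                 # stand-in for the corrupted observation
z = torch.randn(n)                 # fixed random input of the DIP network

# two-layer DIP network: x(theta) = W2 @ relu(W1 @ z)
W1 = (torch.randn(width, n) / n ** 0.5).requires_grad_()
W2 = (torch.randn(n, width) / width ** 0.5).requires_grad_()

for it in range(500):
    x = W2 @ torch.relu(W1 @ z)    # current reconstruction
    loss = 0.5 * torch.sum((x - y) ** 2)
    loss.backward()
    with torch.no_grad():          # plain gradient descent update
        W1 -= step * W1.grad
        W2 -= step * W2.grad
        W1.grad.zero_()
        W2.grad.zero_()
```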
Vital signs such as blood pressure, heart rate, and respiration rate are continuously monitored in intensive care unit patients to assess their condition. Various methods are available for the continuous monitoring of these vital parameters. To extract the parameters, current techniques place multiple sensors on the patient's body. Patients dealing with medical issues may find it challenging and uncomfortable to have multiple electrodes placed on their bodies. To avoid placing multiple sensors on a patient's body, the proposed method aims to extract three vital parameters, namely respiration rate (RR), blood pressure, and heart rate, from a single photoplethysmography sensor, using a unified deep learning model to analyze the photoplethysmographic (PPG) signal. The proposed deep learning framework combines a Convolutional Neural Network (CNN) with Bidirectional Long Short-Term Memory (Bi-LSTM) and an attention mechanism. This model effectively extracts features by integrating spatial and temporal correlations within the signal, focusing on the most relevant features necessary for estimating multiple parameters from a PPG signal. Optimized through hyperparameter tuning, the CNN-Bi-LSTM architecture achieved a prediction accuracy of 95.67%. The performance of the proposed method is evaluated using the publicly available Multiparameter Intelligent Monitoring in Intensive Care database and compared to existing methods. The model demonstrated an average mean absolute error (MAE) +/- standard deviation (SD) of 0.084 +/- 0.20 for heart rate, 0.034 +/- 0.23 for blood pressure, and 0.009 +/- 0.05 for respiration rate.
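A compact PyTorch sketch of a CNN + Bi-LSTM + attention regressor of the kind described; the layer sizes, window length, and the three-output head are assumptions, since the abstract does not specify the exact architecture.

```python
import torch
import torch.nn as nn

class PPGVitalsNet(nn.Module):
    """Illustrative CNN + Bi-LSTM + attention regressor for a PPG window."""

    def __init__(self, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(                 # morphological / spatial features
            nn.Conv1d(1, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.bilstm = nn.LSTM(64, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)      # additive attention scores
        self.head = nn.Linear(2 * hidden, 3)      # [heart rate, blood pressure, respiration rate]

    def forward(self, x):                         # x: (N, 1, T) raw PPG window
        h = self.cnn(x).transpose(1, 2)           # (N, T', 64)
        h, _ = self.bilstm(h)                     # (N, T', 2*hidden) temporal features
        w = torch.softmax(self.attn(h), dim=1)    # (N, T', 1) attention weights
        ctx = (w * h).sum(dim=1)                  # attention-pooled context vector
        return self.head(ctx)                     # (N, 3) vital-sign estimates
```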
Remote sensing (RS) image change detection (CD) methods based on deep learning, such as convolutional neural networks (CNNs) and transformers, are still spatial domain-based image processing methods by nature, and their detection accuracy is strongly affected by chromatic aberration due to imaging time, shadows caused by lighting conditions, object confusion, and other disturbances. In this study, we revisit CD from a signal processing perspective, framing it as the task of detecting consistency between the distributional features of two 2-D signals. We aim to extract the primary components of the two signals while suppressing interfering noise. To this end, we propose a novel CD method called DFNet, which leverages a dual-frequency learnable encoder. First, we construct a dual-frequency feature encoder Siamese framework that captures local high-frequency signals and global low-frequency signals using CNN and attention mechanisms after dividing the input RS image signals into two channels. Second, we introduce the frequency explicit visual center module as part of the multifrequency-domain dense interaction (MFDDI) module at the decoder stage, allowing long-distance dependencies to be established between high- and low-frequency components in the same layer, as well as signal aggregation in regions of small edge variation. In addition, the MFDDI module adopts a layer-by-layer interactive fusion approach to synthesize discriminative information over a wide frequency-domain range, enhancing the characterization capability of frequency-domain signals. We conduct comparison experiments with current mainstream methods on the land cover dataset SYSU-CD and two building datasets, LEVIR-CD and WHU-CD, and the results show that our method is not only resistant to interference but also outperforms all the compared methods.
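The following sketch illustrates the general idea of a dual-frequency encoder block, assuming a low-frequency component obtained by pooling (handled by global attention) and a high-frequency residual (handled by a local CNN); DFNet's actual layer configuration is not given in the abstract, so names and sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualFrequencyEncoder(nn.Module):
    """Sketch of a dual-frequency split: a blurred low-frequency path with
    global attention plus a residual high-frequency path with a local CNN."""

    def __init__(self, channels=64, heads=4):
        super().__init__()
        self.local = nn.Sequential(               # high-frequency / local branch
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):                          # x: (N, C, H, W)
        low = F.avg_pool2d(x, 4)                   # coarse, low-frequency component
        high = x - F.interpolate(low, size=x.shape[-2:], mode='bilinear',
                                 align_corners=False)
        n, c, h, w = low.shape
        tokens = low.flatten(2).transpose(1, 2)    # (N, h*w, C) for global attention
        glob, _ = self.attn(tokens, tokens, tokens)
        glob = glob.transpose(1, 2).reshape(n, c, h, w)
        glob = F.interpolate(glob, size=x.shape[-2:], mode='bilinear',
                             align_corners=False)
        return self.local(high) + glob             # fused dual-frequency features
```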
Can deep convolutional neural networks (CNNs) for image classification be interpreted as utility maximizers with information costs? By performing set-valued system identification for Bayesian decision systems, we demonstrate that deep CNNs behave equivalently (in terms of necessary and sufficient conditions) to rationally inattentive Bayesian utility maximizers, a generative model used extensively in economics for human decision-making. Our claim is based on approximately 500 numerical experiments on 5 widely used neural network architectures. The parameters of the resulting interpretable model are computed efficiently via convex feasibility algorithms. As a practical application, we also illustrate how the reconstructed interpretable model can predict the classification performance of deep CNNs with high accuracy. The theoretical foundation of our approach lies in Bayesian revealed preference, studied in microeconomics. All our results are available on GitHub and completely reproducible.
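The abstract states that the interpretable model is recovered via convex feasibility algorithms; as a rough, hypothetical illustration, the snippet below tests a NIAS-style linear feasibility condition from Bayesian revealed preference with SciPy, using made-up posteriors. The paper's actual test (including NIAC-type constraints and the CNN-derived statistics) is richer than this.

```python
import numpy as np
from scipy.optimize import linprog

X, A = 3, 3                                # number of states and actions (toy)
rng = np.random.default_rng(0)
post = rng.dirichlet(np.ones(X), size=A)   # post[a] ~ p(x | action a), rows sum to 1
eps = 1e-3                                 # margin ruling out the trivial constant utility

rows, rhs = [], []
for a in range(A):
    for b in range(A):
        if a == b:
            continue
        row = np.zeros(X * A)              # utility u(x, a) flattened as u[x * A + a]
        for x in range(X):
            row[x * A + a] -= post[a, x]   # -p(x|a) u(x, a)
            row[x * A + b] += post[a, x]   # +p(x|a) u(x, b)
        rows.append(row)                   # encodes sum_x p(x|a)(u(x,b) - u(x,a)) <= -eps
        rhs.append(-eps)

res = linprog(c=np.zeros(X * A), A_ub=np.array(rows), b_ub=np.array(rhs),
              bounds=[(0.0, 1.0)] * (X * A), method="highs")
print("consistent with a rationally inattentive utility maximizer:", res.success)
```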
Surface electromyography-based gesture recognition is widely applied in human-computer interaction, hand rehabilitation, prosthetic control, and other fields. Gesture classification based on electromyography (EMG) signals usually relies on handcrafted feature extraction, which is highly subjective, or on convolutional neural networks with redundant structures to extract features. This paper converts the raw EMG signals into Gramian Angular Difference Field (GADF) and Gramian Angular Summation Field (GASF) images. Four models were used to classify the images: K-Nearest Neighbors (KNN), Generalized Learning Systems, Binary Trees, and a Convolutional Neural Network based on MobileNetV1, and the proposed method was verified on the public dataset NinaproDB2. Experimental results: when the window size is 300 ms, the step size is 10 ms, and KNN is used as the classification model, the average accuracy of EMG signal classification based on the GADF method is 98.17%, and the accuracy on exercises B, C, and D is 96.65%, 95.53%, and 98.02%, respectively. The recognition accuracy is 7.92%, 14.25%, and 4.279% higher than the provided baseline.
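A minimal end-to-end sketch of the GADF-plus-KNN pipeline using pyts and scikit-learn on synthetic windows; the window length, neighbour count, and random signals stand in for the NinaproDB2 segments and are not the paper's settings.

```python
import numpy as np
from pyts.image import GramianAngularField
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
windows = rng.standard_normal((200, 60))          # 200 EMG windows, 60 samples each
labels = rng.integers(0, 4, size=200)             # 4 gesture classes (illustrative)

gadf = GramianAngularField(method='difference')   # Gramian Angular Difference Field
images = gadf.fit_transform(windows)              # (200, 60, 60) GADF images

X = images.reshape(len(images), -1)               # flatten images for KNN
knn = KNeighborsClassifier(n_neighbors=5).fit(X[:150], labels[:150])
print("hold-out accuracy:", knn.score(X[150:], labels[150:]))
```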
Single image super-resolution aims to restore high-resolution images from low-resolution images. Recently, many methods have tackled image super-resolution by leveraging local or global features to boost performance. However, they fail to combine both feature types and often have high parameter counts. We propose a Lightweight Self-Attention Guidance Network (LSAGNet) to address these issues. We design a simple and efficient dynamic local attention (DLA) module to effectively extract local features. Existing Transformer networks often rely on query-key similarities for feature aggregation. However, blindly using these similarities hinders super-resolution reconstruction, because strong correlations are not preserved and weak ones are introduced. To address this issue, we propose a global self-attention (GSA) mechanism based on a soft-thresholding operation, designed to retain strongly correlated information. Experimental results demonstrate that the proposed LSAGNet achieves an excellent balance between performance and parameter efficiency while achieving competitive accuracy compared to state-of-the-art methods.
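One way to read the soft-thresholding GSA is sketched below: attention weights are shrunk by a learnable threshold so weakly correlated token pairs are discarded before aggregation. The exact placement of the threshold in LSAGNet is not specified in the abstract, so this is illustrative only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftThresholdAttention(nn.Module):
    """Sketch of self-attention with soft thresholding of the attention
    weights, so weakly correlated pairs are suppressed."""

    def __init__(self, dim, threshold=0.01):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        self.tau = nn.Parameter(torch.tensor(threshold))  # learnable threshold
        self.scale = dim ** -0.5

    def forward(self, x):                        # x: (N, L, dim) token features
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        sim = (q @ k.transpose(-2, -1)) * self.scale       # (N, L, L) similarities
        attn = torch.softmax(sim, dim=-1)
        # soft thresholding: weights below tau are discarded, the rest renormalized
        attn = F.relu(attn - self.tau)
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-8)
        return self.proj(attn @ v)
```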
Road damage detection is a crucial task in road inspection systems. Although traditional object detection models achieve promising performance, the presence of shadows exacerbates the difficulty of road damage detection in practical scenarios. To tackle these challenges, we introduce a novel shadow-image enhancement network, the global-local enhancement network, and combine it with a YOLOv7-tiny detection network augmented with our proposed components to craft an end-to-end detection framework. We integrate deep neural networks with conventional methods and propose a global statistical texture enhancement module to enhance global statistical texture information. We propose a local enhancement module to enhance road damage edge information in shadow regions. Furthermore, we craft a shadow region loss to optimize the enhancement models and employ dynamic snake convolution to replace certain standard convolutions in the detection network. We evaluate our method on the shadowed linear road damage datasets SRoad and DRoad, which comprise road images captured from different perspectives in Beijing, China. The results demonstrate that our approach surpasses the performance of low-light enhancement models and low-light detection models. The method achieves an mAP of 71.2% at 98.8 FPS on the SRoad dataset and an mAP of 79.7% at 103.2 FPS on the DRoad dataset. The proposed model balances performance and model size, meeting the requirements for real-time processing in industrial applications.
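The shadow region loss is not defined in the abstract; one simple interpretation, shown below, is an enhancement loss whose per-pixel weight is increased inside a shadow mask. The weighting scheme and function names are assumptions, not the paper's formulation.

```python
import torch

def shadow_region_loss(enhanced, target, shadow_mask, shadow_weight=2.0):
    """Illustrative shadow-region loss: an L1 enhancement loss weighted
    more heavily inside the shadow mask, so the enhancement network
    focuses on shadowed road damage.

    enhanced, target : (N, 3, H, W) enhanced and reference images
    shadow_mask      : (N, 1, H, W) binary mask, 1 inside shadow regions
    """
    per_pixel = (enhanced - target).abs()
    weights = 1.0 + (shadow_weight - 1.0) * shadow_mask
    return (weights * per_pixel).mean()
```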
Glioblastoma is the most common subtype of malignant tumor of the central nervous system. Segmentation of brain tumor images is crucial for accelerating the diagnosis and treatment of a patient. In this paper, an advanced neural network ensemble based on a fuzzy ranking approach for tumor segmentation is presented, using a combination of convolutional neural network (CNN) architectures, namely SegResNet, UNETR, and SwinUNETR. The proposed method uses fuzzy rank-based unification of deep learners by considering two nonlinear functions in decision-making, which takes into account the confidence of the predictions of the three base models. The proposed method is evaluated on the BRATS 2023 MRI dataset and outperforms state-of-the-art methods, achieving an average Dice score of 0.885 +/- 0.134. The statistical significance of the differences between the individual models and the ensemble is confirmed by the Wilcoxon signed-rank test (p < 0.005).
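The two nonlinear rank functions are not given in the abstract; the sketch below uses a pair of functions common in fuzzy rank-based ensembles (a tanh-based and an exponential-based re-ranking of each model's confidence) purely for illustration, with a toy three-class example.

```python
import numpy as np

def fuzzy_rank_fusion(probs_per_model):
    """Sketch of fuzzy rank-based fusion of per-class confidences.
    `probs_per_model` is a list of (n_classes,) confidence vectors, one
    per base model (e.g. SegResNet, UNETR, SwinUNETR)."""
    fused = np.zeros_like(probs_per_model[0], dtype=float)
    for p in probs_per_model:
        r1 = 1.0 - np.tanh(((p - 1.0) ** 2) / 2.0)    # rank score 1 (1 at full confidence)
        r2 = 1.0 - np.exp(-((p - 1.0) ** 2) / 2.0)    # rank score 2 (0 at full confidence)
        fused += r1 * r2                              # small product = confident prediction
    return int(np.argmin(fused))                      # class with the best fused rank

# toy per-voxel example with three base models and three classes
print(fuzzy_rank_fusion([np.array([0.7, 0.2, 0.1]),
                         np.array([0.6, 0.3, 0.1]),
                         np.array([0.5, 0.4, 0.1])]))   # -> 0
```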