For many urban studies, it is necessary to obtain remote sensing images with both high spectral and high spatial resolution by fusing hyperspectral and panchromatic remote sensing images. In this article, we propose a deep learning model, an encoder-decoder with a residual network (EDRN), for remote sensing image fusion. First, we combined the hyperspectral and panchromatic remote sensing images to circumvent the independence of the hyperspectral and panchromatic image features. Second, we established an encoder-decoder network for extracting representative encoded and decoded deep features. Finally, we established residual networks between the encoder network and the decoder network to enhance the extracted deep features. We evaluated the proposed method on six groups of real-world hyperspectral and panchromatic image datasets, and the experimental results confirmed the superior performance of the proposed method versus six other methods.
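The encoder-decoder-with-residual-skips idea described above can be sketched in a few lines of numpy. This is a toy illustration only, not the authors' EDRN: the pointwise "convolution", the 2x2 pooling, and the scalar weights are all hypothetical stand-ins.

```python
import numpy as np

def encode(x, weights):
    """Toy encoder: a pointwise map + ReLU, then 2x2 mean pooling per stage."""
    feats = []
    for w in weights:
        x = np.maximum(x * w, 0)                      # stand-in for conv + ReLU
        feats.append(x)                               # kept for the residual skip
        x = x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).mean(axis=(1, 3))
    return x, feats

def decode(x, feats, weights):
    """Toy decoder: 2x nearest-neighbour upsampling, then ADD the matching
    encoder feature -- the residual connection between encoder and decoder."""
    for w, skip in zip(weights, reversed(feats)):
        x = np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
        x = np.maximum(x * w, 0) + skip               # residual skip connection
    return x
```

Because each decoder stage adds back the encoder feature of the same resolution, fine spatial detail lost to pooling can be recovered, which is the motivation for such skips in fusion networks.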
Objective. Automated cell nuclei segmentation is vital for the histopathological diagnosis of cancer. However, nuclei segmentation from hematoxylin and eosin (HE) stained whole slide images (WSIs) remains a challenge due to noise-induced intensity variations and uneven staining. The goal of this paper is to propose a novel deep learning model for accurately segmenting the nuclei in HE-stained WSIs. Approach. We introduce FEEDNet, a novel encoder-decoder network that uses LSTM units and feature enhancement blocks (FE-blocks). Our proposed FE-block avoids the loss of location information incurred by pooling layers by concatenating the downsampled version of the original image to preserve pixel intensities. FEEDNet uses an LSTM unit to capture multi-channel representations compactly. Additionally, for datasets that provide class information, we train a multiclass segmentation model, which generates masks corresponding to each class at the output. Using this information, we generate more accurate binary masks than those generated by conventional binary segmentation models. Main results. We have thoroughly evaluated FEEDNet on the CoNSeP, Kumar, and CPM-17 datasets. FEEDNet achieves the best value of PQ (panoptic quality) on the CoNSeP and CPM-17 datasets and the second-best value of PQ on the Kumar dataset. The 32-bit floating-point version of FEEDNet has a model size of 64.90 MB. With INT8 quantization, the model size reduces to only 16.51 MB, with a negligible loss in predictive performance on the Kumar and CPM-17 datasets and a minor loss on the CoNSeP dataset. Significance. Our proposed idea of generalized class-aware binary segmentation is shown to be accurate on a variety of datasets. FEEDNet has a smaller model size than previous nuclei segmentation networks, which makes it suitable for execution on memory-constrained edge devices. The state-of-the-art predictive performance of FEEDNet makes it the most preferred network. The source code can be obtained from https://git
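The FE-block's core trick, concatenating a downsampled copy of the input image onto the pooled feature map so raw pixel intensities survive the pooling stage, can be illustrated with a minimal numpy sketch. The shapes and the 2x2 average pooling here are assumptions for illustration, not FEEDNet's actual layers.

```python
import numpy as np

def avg_pool2(x):
    """2x2 average pooling on an (H, W, C) array."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def fe_block(features, image):
    """Pool the features, then concatenate a downsampled copy of the original
    image along the channel axis so pixel intensities are not lost."""
    pooled = avg_pool2(features)
    img_ds = avg_pool2(image)
    return np.concatenate([pooled, img_ds], axis=-1)
```

In a real network the concatenated tensor would feed the next convolutional stage; the point is simply that location-bearing intensity information re-enters after pooling.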
Rain streaks in an image appear in different sizes and orientations, resulting in severe blurring and visual quality *** CNN-based algorithms have achieved encouraging deraining results, although there are certain limitations in the description of rain streaks and the restoration of scene structures in different *** this paper, we propose an efficient multi-scale enhancement and aggregation network (MEAN) to solve the single-image deraining *** the importance of large receptive fields and multi-scale features, we introduce a multi-scale enhanced unit (MEU) to capture long-range dependencies and exploit features at different scales to depict ***, an attentive aggregation unit (AAU) is designed to utilize the informative features in the spatial and channel dimensions, thereby aggregating effective information to eliminate redundant features for rich scenario *** improve the deraining performance of the encoder-decoder network, we utilized an AAU to filter the information in the encoder network and concatenated the useful features to the decoder network, which is conducive to predicting high-quality clean *** results on synthetic datasets and real-world samples show that the proposed method achieves significant deraining performance compared to state-of-the-art approaches.
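The multi-scale idea behind the MEU, computing responses at several receptive-field sizes and aggregating them, can be caricatured in numpy. This is a rough conceptual sketch, not the MEAN architecture: the box filter stands in for a large-receptive-field branch, and plain averaging stands in for the learned aggregation.

```python
import numpy as np

def box_blur(x, k):
    """Naive k x k box filter with edge clipping (a stand-in for a branch
    with a large receptive field)."""
    h, w = x.shape
    out = np.empty_like(x, dtype=float)
    r = k // 2
    for i in range(h):
        for j in range(w):
            out[i, j] = x[max(0, i - r):i + r + 1,
                          max(0, j - r):j + r + 1].mean()
    return out

def multi_scale_aggregate(x, scales=(1, 3, 5)):
    """Average the responses from several receptive-field sizes
    (scale 1 is the identity branch)."""
    return np.mean([box_blur(x, k) if k > 1 else x.astype(float)
                    for k in scales], axis=0)
```

A learned network would weight the branches adaptively (the role of the AAU's attention) rather than averaging them uniformly.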
Authors:
Rui Tang, Yan Li, Huikai Xu, Yi Ding
UESTC, Sch Informat & Software Engn, Chengdu, Peoples R China
CNPC Offshore Engn Co Ltd, Tianjin, Peoples R China
BHDC 2 Cementing Co, Tianjin, Peoples R China
Univ Elect Sci & Technol China, Network & Data Secur Key Lab Sichuan Prov, Chengdu 610054, Peoples R China
UESTC Guangdong, Inst Elect & Informat Engn, Dongguan 523808, Peoples R China
ISBN:
(print) 9781665423144
Currently, state-of-the-art semantic segmentation methods often incur a huge computational cost to achieve high performance; it is difficult to balance inference speed and model accuracy, and difficult to deploy such models on equipment with limited hardware resources. In this paper, we propose an asymmetric high-resolution (1024×2048 px) real-time semantic segmentation network that balances accuracy and speed: the Fast Real-time Semantic Segmentation Network (FRSSNet). We combine the existing multi-branch network and encoder-decoder structures and design a new bottleneck. Meanwhile, we use multi-scale features to obtain better segmentation results. By combining spatial details with semantic information, the accuracy can reach 69.6% on Cityscapes and the FPS can reach 200.
Automated crack detection is vital for structural maintenance in areas such as construction, roads, and bridges. Accurate crack detection allows for the timely identification and repair of cracks, reducing safety risks and extending the service life of structures. However, traditional methods struggle with fine cracks, complex backgrounds, and image noise. In recent years, although deep learning techniques have excelled in pixel-level crack segmentation, challenges like inadequate local feature processing, information loss, and class imbalance persist. To address these challenges, we propose an encoder-decoder network based on multiple selective fusion mechanisms. Initially, a star feature enhancement module is designed to resolve the issues of insufficient local feature processing and feature redundancy during the feature extraction process. Then, a multi-scale adaptive fusion module is developed to selectively capture both global and local contextual information, mitigating the information loss. Finally, to tackle class imbalance, a multi-scale monitoring and selective output module is introduced to enhance the model's focus on crack features and suppress interference from background and irrelevant information. Extensive experiments are conducted on three publicly available crack datasets: SCD, CFD, and DeepCrack. The results demonstrate that the proposed segmentation network achieves superior performance in pixel-level crack segmentation, with Dice scores of 66.2%, 54.2%, and 86.8% and mIoU values of 74.4%, 67.5%, and 87.9% on the SCD, CFD, and DeepCrack datasets, respectively. These results outperform those of existing models, such as U-Net, DeepLabv3+, and Attention UNet, particularly in handling complex backgrounds, fine cracks, and low-contrast images. Furthermore, the proposed MSF-CrackNet also significantly reduces computational complexity, with only 2.39 million parameters and 8.58 GFLOPs, making it a practical and efficient solution for real-world crack detection.
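The Dice and IoU figures quoted above are standard overlap metrics; for a single pair of binary masks they reduce to a few lines of numpy (per-image, without the smoothing terms real implementations often add to avoid division by zero; mIoU would average IoU over classes or images).

```python
import numpy as np

def dice(pred, gt):
    """Dice coefficient between two boolean masks: 2|A ∩ B| / (|A| + |B|)."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def iou(pred, gt):
    """Intersection over union between two boolean masks: |A ∩ B| / |A ∪ B|."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union
```

Dice weights the intersection twice, so it is always at least as large as IoU for non-trivial overlap, which is worth remembering when comparing tables that report one or the other.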
Accurate estimation of burden surface depth plays a crucial role in constructing the temperature field and optimizing reaction control in volatile kilns. However, most image-based depth estimation techniques require high-quality input images and achieve limited accuracy, which restricts their application under actual harsh working conditions such as high temperature, heavy dust, and dense smoke. In this study, a deep learning-based monocular depth estimation model is proposed to measure the burden surface depth in the volatile kiln head zone. The proposed model integrates an encoder-decoder network with an attention module. The encoder-decoder network outputs a set of deep semantic features, while the attention module intelligently fuses multi-level features to predict a probability distribution over depth intervals for each pixel. A volatile kiln prototype is designed and constructed to generate image datasets of the kiln head zone that approximate real data collected from industrial production sites. Results demonstrate that the proposed model has a depth prediction error of RMSE = 11.008 mm for the burden surface region, outperforming state-of-the-art neural networks and the traditional depth-from-defocus method. Code and datasets are available at https://***/LLLcong/Attention-MonoDepth.
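Predicting a probability distribution over depth intervals, as the attention module above does, typically turns into a depth value by taking the expectation over the bin centers; the RMSE is then computed against ground truth. A numpy sketch of that final step (bin edges and probabilities here are made-up examples, not the paper's configuration):

```python
import numpy as np

def expected_depth(probs, bin_edges):
    """Per-pixel depth = expectation of the predicted distribution over
    depth bins. probs has shape (..., n_bins) with rows summing to 1."""
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return probs @ centers

def rmse(pred, target):
    """Root mean square error, the metric reported in the abstract."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))
```

The expectation-over-bins formulation keeps the output differentiable with respect to the predicted probabilities, which is why it is a common choice for discretized depth regression.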
Video frame prediction represents a fundamental challenge in computer vision, necessitating precise modeling of both spatial and temporal dynamics within video sequences. This computational task holds substantial implications across diverse domains, including video compression optimization, robust object tracking systems, and advanced motion forecasting applications. In this investigation, we present a novel hybrid architecture that synthesizes the complementary strengths of Convolutional Long Short-Term Memory (ConvLSTM) networks and three-dimensional Convolutional Neural Networks (3D CNN) for enhanced frame prediction capabilities. Our methodological framework incorporates a ConvLSTM component that fundamentally augments the traditional LSTM architecture through the integration of convolutional operations, thereby facilitating sophisticated modeling of sequential dependencies. Concurrently, the 3D CNN component employs volumetric convolutional layers to extract rich spatio-temporal features from the input sequences. Rigorous empirical evaluation demonstrates the superior performance of the ConvLSTM architecture, which consistently yields reduced validation errors and elevated coefficients of determination. Specifically, the ConvLSTM model achieves a validation Mean Squared Error (MSE) of 0.0237 and an $R^2$ value of 0.6951, substantially outperforming the 3D CNN model, which exhibits a validation MSE of 0.0471 and an $R^2$ value of 0.3939. These empiri
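The two figures compared above, MSE and the coefficient of determination $R^2$, are standard regression metrics and take only a few lines of numpy to compute:

```python
import numpy as np

def mse(pred, target):
    """Mean squared error."""
    return float(np.mean((pred - target) ** 2))

def r2_score(pred, target):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((target - pred) ** 2)
    ss_tot = np.sum((target - target.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

An $R^2$ of 0.6951 means the ConvLSTM explains roughly 70% of the variance in the validation targets, versus about 39% for the 3D CNN, which is the sense in which the gap is "substantial".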
Automated polyp segmentation of colonoscopy images is crucial in clinical settings, providing indispensable data for diagnosis and surgical procedures. Deep convolutional neural networks (CNNs) have shown promise in this domain, yet existing methods often fail to adequately address interactions between multilevel features, leading to suboptimal results. To overcome these limitations, we propose the attention-guided asymmetric multiscale polyp segmentation network (AAPCNet), a novel framework designed to effectively capture comprehensive semantic information for accurate polyp segmentation. AAPCNet leverages the Res2Net-50 backbone with a split-aggregation strategy embedded in Bottle2neck blocks to extract rich multilevel features. To enhance contextual understanding, we introduce the deep aggregation and fusion module (DAFM), which employs large-sized dilated and asymmetric convolutions to capture multiscale information, addressing the challenges posed by polyps of varying sizes. Furthermore, the spatial contextual fusion module (SCFM) utilizes spatial and channelwise attention mechanisms to refine features by emphasizing polyp-specific details while suppressing irrelevant background information. The innovation of our lightweight yet effective decoder lies in its unique architecture, which integrates a residual block (RB) between two SCFM modules, enabling feature refinement, enhanced polyp details, precise localization, and accurate boundary delineation while suppressing noise. This architecture achieves superior segmentation performance and outperforms state-of-the-art CNN-based models on both in-domain and out-of-domain datasets. Comprehensive experiments demonstrate that AAPCNet consistently achieves a favorable balance between accuracy and computational efficiency. Our codes and results are publicly available at: https://***/Mkhan143/AAPCNet.
The purpose of image inpainting is to restore and fill missing areas, and how to restore delicate and plausible missing content has always been a key issue. In the past decade, remarkable achievements have been made in image inpainting based on deep learning. However, when faced with large and irregular missing areas, there are still problems such as semantic inconsistency, blurred edges, and artifacts in the inpainted images. To address these problems, this paper proposes a novel image inpainting algorithm, WFIL-NET, based on wavelet downsampling and a frequency integrated learning module. WFIL-NET adopts the generative adversarial network (GAN) structure, with an encoder-decoder network used in the generator. To retain rich information while reducing the image resolution, we propose a wavelet downsampling module in the encoder to enhance the capacity of subsequent operations to learn representative features. Moreover, the wavelet transform extracts image features at different frequency levels: low-frequency information encapsulates the primary content and structure, whereas high-frequency information captures details and texture. The proposed frequency integrated learning module employs an attention mechanism to allocate appropriate weights to high- and low-frequency information, effectively integrating them to ensure a more coherent structure and semantic consistency in the inpainted image. Experimental results on the CelebA-HQ and Places2 datasets demonstrate that the proposed method effectively fills large and irregular missing areas, significantly enhances the visual quality of inpainted images, and mitigates edge blurring and artifacts.
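The wavelet downsampling described above splits an image into one low-frequency band (content and structure) and three high-frequency bands (detail and texture). A minimal numpy sketch of one level of the 2D Haar transform shows the idea; note the normalization here is the plain-average convention, and production code would use a wavelet library rather than this hand-rolled version.

```python
import numpy as np

def haar_downsample(x):
    """One level of the 2D Haar transform on an (H, W) array with even sides:
    returns the low-frequency approximation (LL) and three detail bands."""
    a = x[0::2, 0::2]; b = x[0::2, 1::2]
    c = x[1::2, 0::2]; d = x[1::2, 1::2]
    ll = (a + b + c + d) / 4.0   # low frequency: content / structure
    lh = (a - b + c - d) / 4.0   # horizontal detail
    hl = (a + b - c - d) / 4.0   # vertical detail
    hh = (a - b - c + d) / 4.0   # diagonal detail
    return ll, lh, hl, hh
```

Unlike plain pooling, the four bands together are lossless (e.g. `ll + lh + hl + hh` recovers the top-left pixel of each 2x2 block), which is exactly why wavelet downsampling "retains rich information while reducing the image resolution".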
Objective: Non-invasive fetal electrocardiography has the potential to provide vital information for evaluating the health status of the fetus. However, the low signal-to-noise ratio of the fetal electrocardiogram (ECG) impedes the applicability of the method in clinical practice. Quality improvement of the fetal ECG is of great importance for providing accurate information to enable support in medical decision-making. In this paper, we propose the use of artificial intelligence for one-channel fetal ECG enhancement as a post-processing step after maternal ECG suppression. Approach: We propose a deep fully convolutional encoder-decoder framework, learning end-to-end mappings from noise-contaminated fetal ECGs to clean ones. Symmetric skip-layer connections are used between corresponding convolutional and transposed convolutional layers to help recover the signal details. Main results: Experiments on synthetic data show an average improvement of 7.5 dB in the signal-to-noise ratio (SNR) for input SNRs in the range of -15 to 15 dB. Application of the method to real signals and subsequent ECG interval analysis demonstrates a root mean square error of 9.9 ms and 14 ms for the PR and QT intervals, respectively, when compared with simultaneous scalp measurements. The proposed network can achieve substantial noise removal on both synthetic and real data. In cases of highly noise-contaminated signals, some morphological features might be unreliably reconstructed. Significance: The presented method has the advantage of preserving individual variations in pulse shape and beat-to-beat intervals. Moreover, no prior knowledge of the power spectra of the noise or the pulse locations is required.
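The SNR figures above follow the usual power-ratio definition in decibels; given a clean reference signal, a numpy sketch of the metric (the test signals are made-up examples, not the paper's data) is:

```python
import numpy as np

def snr_db(clean, noisy):
    """SNR of `noisy` relative to the reference `clean` signal, in dB:
    10 * log10(signal power / noise power), with noise = noisy - clean."""
    noise = noisy - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))
```

The reported "average improvement of 7.5 dB" is then simply `snr_db(clean, output) - snr_db(clean, input)` averaged over test records.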