ISBN (print): 9783030394318; 9783030394301
Synthetic Aperture Radar (SAR) image segmentation is an important step in SAR image interpretation. Common patch-based methods treat all the pixels within a patch as a single category and do not take the label consistency between neighboring patches into consideration, which makes the segmentation results less accurate. In this paper, we use an encoder-decoder network to conduct pixel-wise segmentation. Then, in order to make full use of the contextual information between patches, we use a fully connected conditional random field to optimize the combined probability map output by the encoder-decoder network. The test results on our SAR data set show that our method can effectively maintain the contextual information of pixels and achieve better segmentation results.
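As a rough illustration of this pipeline, the sketch below refines a patch-assembled softmax probability map with a fully connected CRF. It assumes the third-party pydensecrf package; the kernel widths and compatibility weights are illustrative defaults, not values from the paper.

# Hedged sketch: refine an encoder-decoder softmax map with a fully connected CRF.
# Assumes the pydensecrf package; CRF weights below are illustrative, not the paper's.
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(sar_rgb, softmax_probs, n_classes, iters=5):
    """sar_rgb: HxWx3 uint8 image; softmax_probs: (n_classes, H, W) float array."""
    h, w = sar_rgb.shape[:2]
    d = dcrf.DenseCRF2D(w, h, n_classes)
    d.setUnaryEnergy(unary_from_softmax(softmax_probs))   # -log probabilities as unary terms
    d.addPairwiseGaussian(sxy=3, compat=3)                # smoothness kernel
    d.addPairwiseBilateral(sxy=60, srgb=10,               # appearance kernel over the image
                           rgbim=np.ascontiguousarray(sar_rgb), compat=5)
    q = d.inference(iters)
    return np.argmax(np.array(q).reshape(n_classes, h, w), axis=0)   # refined label map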
ISBN (print): 9798350334449
With the increasing growth of road infrastructure in recent decades, road surface damage is becoming more prevalent. The rapid advance of neural networks and their intelligent technologies can scale up efforts to help deal with this problem. One of the technologies that can be applied in this context is computer vision with semantic segmentation, which can help automatically identify road surface damage. While a naive implementation of semantic segmentation often sacrifices running time and speed, in this study we propose a lightweight encoder-decoder network model to overcome this issue. Numerical experiments show that this method runs in 110 minutes at 26 fps, nearly a 2x improvement over the baseline model's running time and speed for automated road surface damage identification, and it can be extended to automatically measure the area of road damage and provide more meaningful information for decision-makers.
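To make the "lightweight encoder-decoder" idea concrete, here is a minimal sketch (not the authors' model) of a small segmentation network built from depthwise-separable convolutions, the usual way to trade a little accuracy for a large cut in parameters and inference time. The class name, channel counts, and depth are illustrative.

# Hedged sketch: a lightweight encoder-decoder for road-damage segmentation.
import torch
import torch.nn as nn

def ds_conv(cin, cout, stride=1):
    # depthwise + pointwise convolution: the standard lightweight building block
    return nn.Sequential(
        nn.Conv2d(cin, cin, 3, stride, 1, groups=cin, bias=False),
        nn.BatchNorm2d(cin), nn.ReLU(inplace=True),
        nn.Conv2d(cin, cout, 1, bias=False),
        nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

class LiteSegNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.enc1 = ds_conv(3, 32, stride=2)     # 1/2 resolution
        self.enc2 = ds_conv(32, 64, stride=2)    # 1/4
        self.enc3 = ds_conv(64, 128, stride=2)   # 1/8
        self.dec2 = ds_conv(128 + 64, 64)
        self.dec1 = ds_conv(64 + 32, 32)
        self.head = nn.Conv2d(32, n_classes, 1)
        self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        e3 = self.enc3(e2)
        d2 = self.dec2(torch.cat([self.up(e3), e2], 1))   # skip connection from encoder
        d1 = self.dec1(torch.cat([self.up(d2), e1], 1))
        return self.head(self.up(d1))                     # logits at input resolution

# e.g. LiteSegNet()(torch.randn(1, 3, 256, 256)).shape -> (1, 2, 256, 256)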
In recent years, deep learning models have been employed for speech enhancement. Most of the existing deep learning methods use fully convolutional neural networks (CNNs) to capture the time-frequency information of input features. Compared with CNNs, it is more reasonable to use a Long Short-Term Memory (LSTM) network to capture contextual information along the time axis of the features. However, the computational load of a fully LSTM structure is heavy. To balance model complexity and the capability of capturing time-frequency features, we present an LSTM-Convolutional-BLSTM encoder-decoder (LCLED) network for speech enhancement. The LCLED additionally incorporates transposed convolution and skip connections. The key idea is to use the two LSTM parts and the convolutional layers to model the contextual information and the frequency-dimension features, respectively. Furthermore, in order to achieve higher quality of the enhanced speech, the a priori Signal-to-Noise Ratio (SNR) is used as the learning target of the LCLED. The Minimum Mean-Square Error (MMSE) approach is used for postprocessing. The results indicate that the proposed LCLED not only reduces model complexity and training time but also improves the quality and intelligibility of the enhanced speech compared with the fully LSTM structure. (C) 2020 Elsevier Ltd. All rights reserved.
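A minimal sketch of the LSTM-Conv-BLSTM layout described above, assuming magnitude-spectrogram input of shape (batch, frames, frequency bins); the single convolution/transposed-convolution pair, the hidden sizes, and the even frequency dimension (160) are simplifying assumptions rather than the paper's configuration.

# Hedged sketch of an LSTM -> Conv -> transposed Conv (+ skip) -> BLSTM pipeline.
import torch
import torch.nn as nn

class LCLEDSketch(nn.Module):
    def __init__(self, n_freq=160, hidden=160):
        super().__init__()
        self.lstm = nn.LSTM(n_freq, hidden, batch_first=True)
        self.enc = nn.Conv1d(1, 16, kernel_size=4, stride=2, padding=1)            # frequency downsample
        self.dec = nn.ConvTranspose1d(16, 1, kernel_size=4, stride=2, padding=1)   # frequency upsample
        self.blstm = nn.LSTM(hidden, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_freq)

    def forward(self, spec):                      # spec: (batch, frames, n_freq)
        b, t, _ = spec.shape
        h, _ = self.lstm(spec)                    # temporal context, (b, t, hidden)
        z = self.enc(h.reshape(b * t, 1, -1))     # per-frame frequency features
        d = self.dec(z).reshape(b, t, -1) + h     # skip connection from the LSTM output
        y, _ = self.blstm(d)
        return torch.relu(self.out(y))            # non-negative a-priori-SNR-style target

# est = LCLEDSketch()(torch.randn(2, 100, 160))   # -> (2, 100, 160)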
Purpose: As a portable and radiation-free imaging modality, ultrasound can be easily used to image various types of tissue structures. It is important to develop a method that supports co-segmentation of multi-type ultrasound images. However, state-of-the-art ultrasound segmentation methods commonly focus only on single-type images or ignore type-aware information. Methods: To solve this problem, this work proposes a novel type-aware encoder-decoder network (TypeSeg) for multi-type ultrasound image co-segmentation. First, we develop a type-aware metric learning module to find an optimal latent feature space in which ultrasound images of the same type are close and those of different types are separated by a certain margin. Second, depending on the extracted features, a decision module decides whether the input ultrasound images share a common tissue type, and the encoder-decoder network produces a segmentation mask accordingly. Results: We evaluate the performance of the proposed TypeSeg model on an ultrasound dataset that contains four types of tissue. The proposed TypeSeg model achieves the overall best results, with a mean IoU of 87.51% +/- 3.93% on the multi-type ultrasound images. Conclusion: The experimental results indicate that the proposed method outperforms all the compared state-of-the-art algorithms on the multi-type ultrasound image co-segmentation task. (C) 2021 Elsevier B.V. All rights reserved.
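A hedged sketch of the type-aware part only: a margin-based contrastive loss that clusters embeddings of same-type images and separates different types, plus a distance-threshold stand-in for the decision module. The function names, margin, and threshold are illustrative assumptions, not the paper's formulation.

# Hedged sketch of type-aware metric learning and a simple type decision.
import torch
import torch.nn.functional as F

def type_contrastive_loss(emb_a, emb_b, same_type, margin=1.0):
    """emb_*: (batch, dim) embeddings from a shared encoder; same_type: (batch,) 0/1 labels."""
    dist = F.pairwise_distance(emb_a, emb_b)
    pull = same_type * dist.pow(2)                             # same type: pull embeddings together
    push = (1 - same_type) * F.relu(margin - dist).pow(2)      # different type: push past the margin
    return (pull + push).mean()

def share_type(emb_a, emb_b, threshold=0.5):
    # decision-module stand-in: a small embedding distance is read as "same tissue type"
    return F.pairwise_distance(emb_a, emb_b) < threshold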
Automatic ultrasound image segmentation plays an important role in the early diagnosis of human diseases. This paper introduces a novel and efficient encoder-decoder network, called the Lightweight Attention Encoder-Decoder Network (LAEDNet), for automatic ultrasound image segmentation. In contrast to previous encoder-decoder networks that involve complicated architectures with numerous parameters, our LAEDNet adopts a lightweight version of EfficientNet as the encoder, and a Lightweight Residual Squeeze-and-Excitation (LRSE) block is employed in the decoder. To achieve a trade-off between segmentation accuracy and implementation efficiency, we also present a family of models, from light to heavy (denoted LAEDNet-S, LAEDNet-M, and LAEDNet-L, respectively), with varying lightweight EfficientNet backbones. To evaluate LAEDNet, we have conducted extensive experiments on the Brachial Plexus dataset (BP), the Breast Ultrasound Images dataset (BUSI), and the Head Circumference Ultrasound Images dataset (HCUS), whose ultrasound images suffer from high noise, blurred borders, and low contrast. The experiments show that, compared with U-Net and its variants, e.g., M-Net, U-Net++ and TransUNet, our LAEDNet achieves better results in terms of Dice coefficient (DSC) and running speed. In particular, LAEDNet-M has only 10.75M parameters and runs at 40.7 FPS, yet obtains 73.0%, 73.8% and 91.3% DSC on the BP, BUSI and HCUS datasets, respectively.
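A hedged sketch of what an LRSE-style decoder block could look like: channel attention from global pooling combined with a cheap depthwise convolution on a residual path. This is our reading of the idea, not the authors' implementation; the reduction ratio and layer choices are assumptions.

# Hedged sketch of a lightweight residual squeeze-and-excitation (LRSE) style block.
import torch
import torch.nn as nn

class LRSEBlock(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                         # squeeze: global spatial context
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid())                                    # excitation: per-channel weights
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)   # cheap depthwise conv

    def forward(self, x):
        return x + self.conv(x) * self.se(x)                 # residual path + re-weighted features

# LRSEBlock(64)(torch.randn(1, 64, 32, 32)).shape -> (1, 64, 32, 32)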
Wind energy is a clean energy source that is characterised by significant uncertainty. The electricity generated from wind power also exhibits strong unpredictability, which, once integrated, can have a substantial impact on the security of the power grid. In the context of integrating wind power into the grid, accurate prediction of wind power generation is crucial in order to minimise damage to the grid system. This paper proposes a novel composite model (MLL-MPFLA) that combines a multilayer perceptron (MLP) and an LSTM-based encoder-decoder network for short-term prediction of wind power generation. In this model, the MLP first extracts multidimensional features from the wind power data. Subsequently, an LSTM-based encoder-decoder network explores the temporal characteristics of the data in depth, combining the multidimensional and temporal features for effective prediction. During decoding, an improved focused linear attention mechanism, called multi-point focused linear attention, is employed; it enhances prediction accuracy by weighting predictions from different subspaces. A comparative analysis against the MLP, LSTM, LSTM-Attention-LSTM, LSTM-Self_Attention-LSTM, and CNN-LSTM-Attention models demonstrates that the proposed MLL-MPFLA model outperforms the others in terms of MAE, RMSE, MAPE, and R2, thereby validating its predictive performance.
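A hedged sketch of the overall MLP -> LSTM encoder-decoder pipeline, with a generic kernelised linear-attention step standing in for the paper's multi-point focused linear attention; the feature dimension, hidden size, forecast horizon, and class name are illustrative assumptions.

# Hedged sketch: MLP feature extraction, LSTM encoder-decoder, linear attention in the decoder.
import torch
import torch.nn as nn

class WindForecaster(nn.Module):
    def __init__(self, n_features=8, hidden=64, horizon=12):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))       # per-step feature extraction
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.q = nn.Linear(hidden, hidden)
        self.k = nn.Linear(hidden, hidden)
        self.v = nn.Linear(hidden, hidden)
        self.head = nn.Linear(hidden, 1)
        self.horizon = horizon

    def forward(self, x):                         # x: (batch, history_steps, n_features)
        enc, state = self.encoder(self.mlp(x))
        dec_in = enc[:, -1:, :].repeat(1, self.horizon, 1)
        dec, _ = self.decoder(dec_in, state)
        # linear attention: non-negative feature maps avoid the quadratic softmax
        q, k, v = torch.relu(self.q(dec)), torch.relu(self.k(enc)), self.v(enc)
        ctx = q @ (k.transpose(1, 2) @ v) / (q @ k.sum(dim=1, keepdim=True).transpose(1, 2) + 1e-6)
        return self.head(dec + ctx).squeeze(-1)   # (batch, horizon) power forecast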
The extraction of water streams based on synthetic aperture radar (SAR) is of great significance for surface water monitoring, flood monitoring, and the management of water resources. In recent years, however, research has mainly used the backscattering feature (BF) to extract water bodies. In this paper, a feature-fused encoder-decoder network is proposed to delineate water streams more completely and precisely using both the BF and polarimetric features (PFs) from SAR images. Firstly, the standard BFs were extracted and the PFs were obtained using model-based decomposition. Specifically, a new model-based decomposition, more suitable for dual-pol SAR images, was selected to acquire three different PFs of surface water streams for the first time. Five groups of candidate feature combinations were formed from the two BFs and three PFs. Then, a new feature-fused encoder-decoder network (FFEDN) was developed for mining and fusing both BFs and PFs. Finally, several typical areas were selected to evaluate the performance of the different combinations for water stream extraction. To further verify the effectiveness of the proposed method, two machine learning methods and four state-of-the-art deep learning algorithms were used for comparison. The experimental results showed that the proposed method with the optimal feature combination achieved the highest accuracy, with a precision of 95.21%, recall of 91.79%, intersection over union (IoU) of 87.73%, overall accuracy (OA) of 93.35%, and average accuracy (AA) of 93.41%. The results also showed that performance was higher when BFs and PFs were combined. In short, this study verified the effectiveness of PFs for water stream extraction, and the proposed FFEDN can further improve the accuracy of water stream extraction.
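A hedged sketch of the dual-branch feature-fusion idea (an illustration, not the FFEDN architecture): backscattering-feature and polarimetric-feature channels are encoded separately, concatenated, and decoded into a water/non-water mask. Channel counts and depth are assumptions.

# Hedged sketch: two-branch BF/PF fusion followed by a small decoder.
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                         nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

class FusionSegSketch(nn.Module):
    def __init__(self, n_bf=2, n_pf=3):
        super().__init__()
        self.bf_enc = nn.Sequential(block(n_bf, 32), nn.MaxPool2d(2), block(32, 64))   # backscatter branch
        self.pf_enc = nn.Sequential(block(n_pf, 32), nn.MaxPool2d(2), block(32, 64))   # polarimetric branch
        self.fuse = block(128, 64)                        # channel-wise fusion of the two branches
        self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)
        self.head = nn.Conv2d(64, 1, 1)                   # water-probability logits

    def forward(self, bf, pf):                            # bf: (B, 2, H, W), pf: (B, 3, H, W)
        fused = self.fuse(torch.cat([self.bf_enc(bf), self.pf_enc(pf)], dim=1))
        return self.head(self.up(fused))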
In this work, a new method is proposed that allows the use of a single RGB camera for real-time detection of objects that could be potential collision sources for Unmanned Aerial Vehicles. For this purpose, a new network with an encoder-decoder architecture has been developed that allows rapid distance estimation from a single image by performing RGB-to-depth mapping. In a comparison with other existing RGB-to-depth mapping methods, the proposed network achieved a satisfactory trade-off between complexity and accuracy. With only 6.3 million parameters, it achieved efficiency close to that of models with more than five times as many parameters, which allows the proposed network to operate in real time. A dedicated algorithm makes use of the distance predictions made by the network and compensates for measurement inaccuracies. The entire solution has been implemented and tested in practice in an indoor environment using a micro-drone equipped with a front-facing RGB camera. All data, source code, and pretrained network weights are available for download, so the results can easily be reproduced and the resulting solution can be tested and quickly deployed in practice.
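The abstract does not spell out the compensation algorithm, so the sketch below is only an illustrative stand-in: take a robust low percentile of the predicted depth inside a central region of interest and smooth it over time to damp single-frame errors. All parameter values and the function name are assumptions.

# Hedged stand-in for distance post-processing on a predicted depth map.
import numpy as np

def obstacle_distance(depth_map, prev_estimate=None, roi_frac=0.5, pct=5, alpha=0.3):
    """depth_map: (H, W) metric depth predicted by the RGB-to-depth network."""
    h, w = depth_map.shape
    dh, dw = int(h * roi_frac / 2), int(w * roi_frac / 2)
    roi = depth_map[h // 2 - dh:h // 2 + dh, w // 2 - dw:w // 2 + dw]   # central flight corridor
    measurement = np.percentile(roi, pct)          # robust "nearest obstacle" estimate
    if prev_estimate is None:
        return measurement
    return alpha * measurement + (1 - alpha) * prev_estimate            # exponential smoothing over frames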
To meet the need for multispectral images with high spatial resolution in practical applications, we propose a dense encoder-decoder network with feedback connections for pan-sharpening. Our network consists of four parts. The first part consists of two identical subnetworks that extract features from the PAN and MS images, respectively. The second part is an efficient feature-extraction block. Because we want the network to focus on features at different scales, we propose innovative multiscale feature-extraction blocks that fully extract effective features from networks of various depths and widths by using three multiscale feature-extraction blocks and two long-skip connections. The third part is the feature fusion and recovery network. Inspired by work on U-Net improvements, we propose a brand-new encoder-decoder network structure with dense connections that improves performance through effective connections between encoders and decoders at different scales. The fourth part is a continuous feedback connection operation that refines shallow features, which enables the network to obtain better reconstruction capability earlier. To demonstrate the effectiveness of our method, we performed several experiments. Experiments on various satellite datasets show that the proposed method outperforms existing methods; our results show significant improvements over those of other models in terms of the evaluation indices used to measure the spectral quality and spatial details of the generated images.
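A hedged sketch of the feedback-connection idea alone (a simplified stand-in for the dense encoder-decoder): a small fusion block is unrolled a few times, and the previous pan-sharpened estimate is fed back as extra input so earlier features get refined. Band counts, widths, and the number of unrolled steps are assumptions.

# Hedged sketch: pan-sharpening with an unrolled feedback loop over the fusion block.
import torch
import torch.nn as nn

class FeedbackPanSharpen(nn.Module):
    def __init__(self, ms_bands=4, feats=32, steps=3):
        super().__init__()
        self.steps = steps
        self.body = nn.Sequential(
            nn.Conv2d(ms_bands + 1 + ms_bands, feats, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feats, feats, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feats, ms_bands, 3, padding=1))

    def forward(self, pan, ms):                       # pan: (B, 1, H, W), ms: (B, C, H/4, W/4)
        ms_up = nn.functional.interpolate(ms, size=pan.shape[-2:],
                                          mode='bilinear', align_corners=False)
        est = ms_up                                   # initial high-resolution estimate
        for _ in range(self.steps):                   # feedback: refine using the previous estimate
            est = ms_up + self.body(torch.cat([pan, ms_up, est], dim=1))
        return est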
As an indispensable component of intelligent monitoring systems, crowd counting plays a crucial role in many fields, particularly crowd management and control during the COVID-19 pandemic. Despite the promising achievements of many methods, crowd scale variations and noise interference in congested crowd scenes remain urgent problems to be solved. In this paper, we propose a novel Hierarchical Scale-aware Encoder-Decoder Network (HSED-Net) for single-image crowd counting that handles scale variations and noise interference, thereby generating high-quality density maps. The HSED-Net is designed as an encoder-decoder architecture that contains two core networks: the Scale-Aware Encoding network (SAEnet) and the Multi-path Aggregation Decoding network (MADnet). The SAEnet focuses on extracting rich multi-scale crowd features and employs cascaded scale-aware encoding branches to collaboratively obtain high-resolution feature representations. During the encoding phase, two adaptive weight generators are proposed to filter the crowd features along different dimensions and resist noise interference. Instead of fusing the multi-scale and multi-level features indiscriminately, the MADnet adopts a multi-path adaptive fusion strategy and selectively emphasizes the more appropriate features through spatial and channel guidance modules, further improving the quality of the density maps and the robustness of the network. Extensive experiments on four challenging datasets strongly demonstrate the superiority of our HSED-Net.
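A hedged sketch of scale-aware feature extraction with an adaptive channel-weight generator, illustrating the idea rather than HSED-Net itself: parallel dilated branches capture different crowd scales and a learned channel weighting filters the fused features before a density head. Dilation rates and channel sizes are assumptions.

# Hedged sketch: multi-scale dilated branches with adaptive channel weighting for crowd counting.
import torch
import torch.nn as nn

class ScaleAwareBlock(nn.Module):
    def __init__(self, cin, cout):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(cin, cout, 3, padding=d, dilation=d) for d in (1, 2, 4)])   # small/medium/large scales
        self.weight = nn.Sequential(                        # adaptive channel-weight generator
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(3 * cout, 3 * cout, 1),
            nn.Sigmoid())
        self.fuse = nn.Conv2d(3 * cout, cout, 1)

    def forward(self, x):
        multi = torch.cat([torch.relu(b(x)) for b in self.branches], dim=1)
        return self.fuse(multi * self.weight(multi))        # noise-suppressed, scale-aware features

# a density head would follow, e.g. nn.Conv2d(cout, 1, 1) producing a crowd density map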