Paper classification methods aim to partition paper data correctly according to the similarity of their content. However, classifying papers accurately by the content they express has long been a problem that classification algorithms must face. One existing family of paper classification methods is based on deep learning and implemented with an encoder-decoder structure. Such methods feed the words of a large number of papers into the encoder; after processing by a neural network (NN), the similarity between different papers is compared to achieve classification. However, this type of method considers only the similarity between words: a single NN pass over a large amount of word information cannot discover the regularities needed for classification, and word-level similarity differs from content-level similarity. This paper instead starts from the content: label information is extracted, and the input vector of the encoder-decoder structure is formed from both labels and words, improving the original encoder-decoder-based paper classification method. Firstly, the label information is derived from the content and can therefore reflect the content of the paper. Secondly, combining label information with word information allows the classifier to reflect the content of the paper more comprehensively. Thirdly, the label information is kept independent of the word information and processed by a separate NN, making this part of the content more consistent within the encoder-decoder structure. Finally, the outputs obtained by the different NNs for the label information and the word information are combined to realize content-based classification. The effectiveness of the proposed method is demonstrated by evaluating paper data from Web of Science.
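The core idea above, forming the encoder input from label vectors alongside word vectors, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation; the embedding tables, dimensions, and function name are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, LABELS, DIM = 1000, 20, 64            # illustrative sizes
word_emb = rng.normal(size=(VOCAB, DIM))     # word embedding table
label_emb = rng.normal(size=(LABELS, DIM))   # label embedding table

def build_encoder_input(word_ids, label_ids):
    """Concatenate label vectors with word vectors to form the
    encoder input sequence (labels first, then words)."""
    labels = label_emb[label_ids]            # (n_labels, DIM)
    words = word_emb[word_ids]               # (n_words, DIM)
    return np.concatenate([labels, words], axis=0)

x = build_encoder_input(word_ids=[3, 7], label_ids=[1, 4, 2])
print(x.shape)  # (5, 64): 3 label rows followed by 2 word rows
```

In the paper's setup, the label and word parts would then be processed by separate NNs before their outputs are combined.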
Automatic segmentation of prostate magnetic resonance (MR) images has great significance for the diagnosis and clinical management of prostate diseases. It faces enormous challenges because of the low contrast of tissue boundaries and the small effective area of the prostate in MR images. To address these problems, we propose a novel end-to-end network for prostate segmentation based on deep learning, consisting of an encoder-decoder structure with dense dilated spatial pyramid pooling (DDSPP). First, the DDSPP module extracts multi-scale convolutional features from the prostate MR images; the decoder is then used to capture a clear prostate boundary. Competitive results over the state of the art are produced on 130 MR images, with a Dice similarity coefficient (DSC) of 0.954 and a Hausdorff distance (HD) of 1.752 mm. Experimental results show that our method has high accuracy and robustness.
The prediction of time series data applied to the energy sector (prediction of renewable energy production, forecasting prosumers' consumption/generation, forecasting country-level consumption, etc.) has numerous useful applications. Nevertheless, the complexity and non-linear behaviour of such energy systems hinder the development of accurate algorithms. In this context, this paper investigates the use of a state-of-the-art deep learning architecture to perform precise 24-h-ahead load demand forecasting for the whole of France using RTE data. To this end, the authors propose an encoder-decoder architecture inspired by WaveNet, a deep generative model initially designed by Google DeepMind for raw audio waveforms. WaveNet uses dilated causal convolutions and skip connections to exploit long-term information. This kind of novel ML architecture offers several advantages over other statistical algorithms. On the one hand, the proposed deep learning model's training process can be parallelized on GPUs, an advantage in training time compared to recurrent networks. On the other hand, the residual connections prevent degradation problems (exploding and vanishing gradients). In addition, the model can learn from an input sequence to produce a forecast sequence in a one-shot manner. For comparison purposes, a comparative analysis between the best-performing state-of-the-art deep learning models and traditional statistical approaches is presented: Autoregressive Integrated Moving Average (ARIMA), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Multi-Layer Perceptron (MLP), causal 1D Convolutional Neural Networks (1D-CNN), and ConvLSTM (encoder-decoder). The values of the evaluation indicators reveal that WaveNet exhibits superior performance in both forecasting accuracy and robustness.
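The dilated causal convolution that WaveNet is built on can be sketched in a few lines. This is a minimal numpy illustration of the operation, not the paper's model; the kernel, dilation, and function name are chosen for demonstration.

```python
import numpy as np

def dilated_causal_conv1d(x, w, dilation):
    """1-D causal convolution with dilation: output at time t depends
    only on x[t], x[t-d], x[t-2d], ... (left-padded with zeros, so no
    future samples leak into the output)."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    return np.array([sum(w[j] * xp[t + pad - j * dilation] for j in range(k))
                     for t in range(len(x))])

# Stacking layers with dilations 1, 2, 4, ... doubles the receptive
# field at each layer, which is how WaveNet covers long histories.
x = np.arange(8, dtype=float)
y = dilated_causal_conv1d(x, w=[1.0, 1.0], dilation=2)
print(y)  # y[t] = x[t] + x[t-2], i.e. [0. 1. 2. 4. 6. 8. 10. 12.]
```

Because each output depends only on past inputs, the whole sequence can be computed in parallel at training time, which is the GPU advantage mentioned above.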
To address the challenge of poor intra-category pixel consistency and inter-category pixel similarity, in this paper we propose an encoder-decoder network for image semantic segmentation using a pooling SE-ResNet attention module, called PAEDN. The attention mechanism is an effective way to obtain aggregated information. Following the principle of SE-ResNet, the attention modules are formed from a combination of average, maximum, and stochastic global pooling, which concentrate on contour, detail, and generalized information respectively in a given semantic segmentation. A Channel Pooling Attention Module (CPAM) and a Position Pooling Attention Module (PPAM) are designed and integrated into the encoder to extract discriminative features from input images, and the decoder uses the SE-ResNet attention module to fuse high-resolution feature maps with low-resolution ones. Experimental evaluations on the PASCAL and Cityscapes datasets show that the proposed encoder-decoder with pooling attention modules produces semantic labels with good pixel consistency, achieving a 15.1% improvement over FCN.
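An SE-style channel attention step with pooled squeeze descriptors can be sketched as below. This is a generic illustrative sketch under stated assumptions (average + max pooling combined by addition, a two-layer excitation with reduction ratio `r`), not the exact CPAM/PPAM design.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_pooling_attention(feat, w1, w2):
    """Squeeze-and-Excitation style channel attention.
    feat: (C, H, W) feature map. Average and max global pooling are
    combined in the squeeze step; two FC layers plus a sigmoid produce
    per-channel weights that rescale the input."""
    avg = feat.mean(axis=(1, 2))            # (C,) average-pooled descriptor
    mx = feat.max(axis=(1, 2))              # (C,) max-pooled descriptor
    squeezed = avg + mx                     # combined channel descriptor
    hidden = np.maximum(0, w1 @ squeezed)   # FC + ReLU, shape (C//r,)
    weights = sigmoid(w2 @ hidden)          # FC + sigmoid, shape (C,)
    return feat * weights[:, None, None]    # rescale each channel

rng = np.random.default_rng(0)
C, r = 8, 2
feat = rng.normal(size=(C, 16, 16))
out = se_pooling_attention(feat,
                           rng.normal(size=(C // r, C)),   # squeeze FC
                           rng.normal(size=(C, C // r)))   # excite FC
print(out.shape)  # (8, 16, 16)
```

The paper's modules additionally use stochastic pooling and a positional counterpart (PPAM); the channel-rescaling mechanism shown here is the common core.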
ISBN (Print): 9781450370578
This late-breaking report presents a method for learning the sequential and temporal mapping between music and dance via the Sequence-to-Sequence (Seq2Seq) architecture. In this study, the Seq2Seq model comprises two parts: an encoder that processes the music inputs and a decoder that generates the output motion vectors. The model can accept music features and motion inputs from the user during human-robot interactive learning sessions, and it outputs motion patterns that teach corrective movements for following the moves of an expert dancer. Three different types of Seq2Seq models are compared in the results and applied to a simulation platform. The model will be applied in social interaction scenarios with children with autism spectrum disorder (ASD).
This paper proposes a deep convolutional neural network with a concise and effective encoder-decoder architecture for saliency prediction. Local and global contextual features make a considerable contribution to saliency prediction. To integrate and exploit these features more thoroughly, the proposed architecture deploys a dense and global context connection structure between the encoder and decoder; a multi-scale readout module then processes the information from the preceding portion of the decoder through different parallel mapping relationships to produce full-scale, accurate results. Our model ranks first on multiple metrics on two well-known saliency benchmarks and generalizes well to other datasets. We also evaluate the precision and speed of our model with different backbones: the saliency prediction performance of the VGGNet-, ResNet-, and DenseNet-based models increases in that order, while speed decreases. The experiments further show that our model outperforms other models even when its backbone is replaced with the same backbone as the compared model. We can therefore provide optional versions of our model for different performance and efficiency requirements. (c) 2021 Elsevier B.V. All rights reserved.
Haze removal is an essential requirement in autonomous vehicle applications for identifying different objects on the road. Most available techniques are based on different constraints/priors. The important parameters required for recovering the ground truth from a hazy image are the transmission map and the airlight. In this paper, we propose a learning-based encoder-decoder deep learning architecture for transmission map estimation. Based on the assumption that at least twenty percent of an outdoor image contains sky, the airlight is calculated as the average of the twenty percent brightest pixels of the image. These two parameters, the transmission map and the airlight, are then applied in the atmospheric scattering model to recover the ground truth image. In the encoder-decoder architecture, max-pooling layers are used for feature learning and dropout layers for efficient generalization. The proposed architecture was trained on several datasets, namely NYU Depth, FRIDA, and RESIDE, for better generalization to unseen data. Experimental results show that the proposed method performs better than existing state-of-the-art methods.
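The airlight estimate and the scattering-model inversion described above are simple enough to sketch directly. This is an illustrative numpy sketch of the standard model I = J·t + A·(1 − t), with a hypothetical lower bound `t_min` on the transmission; it is not the paper's trained network, which supplies the transmission map `t`.

```python
import numpy as np

def estimate_airlight(img):
    """Airlight as the mean of the brightest 20% of pixels, following
    the sky-region assumption. img: (H, W, 3) in [0, 1]."""
    flat = img.reshape(-1, img.shape[-1])
    brightness = flat.mean(axis=1)
    k = max(1, int(0.2 * len(flat)))
    idx = np.argsort(brightness)[-k:]        # brightest 20% of pixels
    return flat[idx].mean(axis=0)

def recover_scene(hazy, t, A, t_min=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1-t):
    J = (I - A) / max(t, t_min) + A."""
    t = np.clip(t, t_min, 1.0)[..., None]
    return (hazy - A) / t + A

# Tiny synthetic check: haze a flat scene, then recover it.
J = np.full((4, 4, 3), 0.3)                  # clear scene
A = np.array([0.9, 0.9, 0.9])                # airlight
t = np.full((4, 4), 0.5)                     # transmission map
I = J * t[..., None] + A * (1 - t[..., None])
print(np.allclose(recover_scene(I, t, A), J))  # True
```

In the paper's pipeline, `t` would come from the encoder-decoder network rather than being given.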
Multistep Human Density Prediction (MHDP) is an emerging challenge in urban mobility with many applications across domains such as Smart Cities, Edge Computing, and Epidemiology Modeling. The basic goal is to estimate the density of people gathered in a set of urban Regions of Interest (ROIs) or Points of Interest (POIs) over a forecast horizon at different granularities. Accordingly, this paper aims to contribute beyond the existing literature on human density prediction by proposing an innovative time series Deep Learning (DL) model and a geospatial feature preprocessing technique. Specifically, our research aim is to develop a highly accurate MHDP model that jointly leverages the temporal and spatial components of mobility data. We first compare 29 baseline and state-of-the-art methods grouped into six categories and find that the statistical time series models and the Deep Learning encoder-decoders (ED) that we propose are highly accurate, outperforming the other models on a real and a synthetic mobility dataset. Our model achieves an average Mean Absolute Error (MAE) of 28.88 and Root Mean Squared Error (RMSE) of 87.58 with 200,000 pedestrians per day distributed in multiple regions of interest in a 30-minute time window at different granularities. In addition, the geospatial feature transformation improves the RMSE of the proposed model by a further 4% compared to state-of-the-art solutions. Hence, this work provides an efficient and at the same time generally applicable MHDP model that can benefit the planning and decision-making of many major urban mobility applications.
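The MAE and RMSE figures reported above are computed over all regions and forecast steps; a minimal sketch of these metrics on hypothetical multistep, multi-ROI data (the numbers are invented for illustration) is:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error over all regions and forecast steps."""
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    """Root Mean Squared Error over all regions and forecast steps."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Hypothetical densities: 3 ROIs over a 4-step forecast horizon.
y_true = np.array([[120, 130, 125, 140],
                   [ 80,  85,  90,  95],
                   [200, 210, 205, 215]], dtype=float)
y_pred = y_true + np.array([[5, -5, 5, -5],
                            [5, -5, 5, -5],
                            [5, -5, 5, -5]], dtype=float)
print(mae(y_true, y_pred), rmse(y_true, y_pred))  # 5.0 5.0
```

RMSE penalizes large errors more than MAE, which is why the paper reports both.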
ISBN (Print): 9781713820697
End-to-end speaker diarization for an unknown number of speakers is addressed in this paper. Recently proposed end-to-end speaker diarization outperforms conventional clustering-based speaker diarization, but it has one drawback: it is less flexible in terms of the number of speakers. This paper proposes a method for encoder-decoder based attractor calculation (EDA), which first generates a flexible number of attractors from a speech embedding sequence. The generated attractors are then multiplied by the speech embedding sequence to produce the same number of speaker activities. The speech embedding sequence is extracted using the conventional self-attentive end-to-end neural speaker diarization (SA-EEND) network. In a two-speaker condition, our method achieved a 2.69% diarization error rate (DER) on simulated mixtures and an 8.07% DER on the two-speaker subset of CALLHOME, while vanilla SA-EEND attained 4.56% and 9.54%, respectively. Under conditions with an unknown number of speakers, our method attained a 15.29% DER on CALLHOME, while the x-vector-based clustering method achieved a 19.43% DER.
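The attractor-to-activity step described above (multiplying attractors by the embedding sequence) can be sketched as a frame-wise dot product followed by a sigmoid. This is a minimal numpy illustration under stated assumptions; the attractors themselves would be produced by EDA's encoder-decoder (e.g. an LSTM), which is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def speaker_activities(embeddings, attractors):
    """Dot each frame embedding with every attractor and squash with a
    sigmoid, giving one activity track per attractor (speaker).
    embeddings: (T, D) frame embeddings; attractors: (S, D)."""
    return sigmoid(embeddings @ attractors.T)  # (T, S) in (0, 1)

rng = np.random.default_rng(0)
T, D, S = 100, 32, 3           # frames, embedding dim, attractors
emb = rng.normal(size=(T, D))  # stand-in for SA-EEND embeddings
att = rng.normal(size=(S, D))  # stand-in for EDA-generated attractors
act = speaker_activities(emb, att)
print(act.shape)  # (100, 3): one activity track per attractor
```

Because the number of attractors is not fixed, the same multiplication yields activities for however many speakers EDA decides are present.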