Search Results - Inner Mongolia University Library

23rd International Conference on Applications of Natural Language to Information Systems (NLDB)

Authors: Ou, Wenjie; Chen, Chaotao; Ren, Jiangtao (Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou, Guangdong, Peoples R China)

ISBN: (print) 9783319919478; 9783319919461

Natural language generation (NLG) plays a critical role in various natural language processing (NLP) applications, and topics provide a powerful tool for understanding natural language. We propose a novel topic-based NLG model that can generate topic-coherent sentences given a single topic or a combination of topics. The model extends the recurrent encoder-decoder framework by introducing a global topic embedding matrix. Experimental results show that our encoder can not only transform a source sentence into a representative topic distribution, which gives a better interpretation of the source sentence, but also generate topic-coherent and diversified sentences given different topic distributions without any text-level input.

Keywords: Natural language generation Topic encoder-decoder

Colorectal Segmentation using Multiple Encoder-Decoder Network in Colonoscopy Images

1st IEEE International Conference on Artificial Intelligence and Knowledge Engineering (AIKE)

Authors: Ngoc-Quang Nguyen; Lee, Sang-Woong (Gachon Univ, Pattern Recognit & Machine Learning Lab, Seongnam, South Korea)

ISBN: (print) 9781538695555

Colorectal cancer is the third most common cause of cancer-related deaths. Early diagnosis of polyps by colonoscopy can therefore lead to successful treatment. Diagnosing polyps in colonoscopy videos is a challenging task because polyps vary in size and shape. In this paper, we propose a polyp segmentation method based on the encoder-decoder network. The performance of the method is enhanced by two strategies. First, in the training phase, we apply a novel database augmentation method for colonoscopy images. Second, in the test phase, we make an effective prediction by combining multiple models and comparing the probabilities that the network produces for each image. Evaluation on the ETIS-LariPolypDB [9] database shows that the proposed method outperforms state-of-the-art results.

Keywords: colorectal segmentation CRC encoder-decoder multi-model

DSNet: Multi-resolution Dense Encoder and Stack Decoder Network for Aerial Image Segmentation

Chinese Automation Congress (CAC)

Authors: Chong, Yanwen; Nie, Congchong; Tao, Yulong; Pan, Shaoming (Wuhan Univ, State Key Lab Informat Engn Surveying Mapping & R, Wuhan, Peoples R China)

ISBN: (print) 9781728140940

Semantic segmentation of high-resolution aerial images is challenged by ubiquitous fine-structure objects. The traditional encoder-decoder structure loses some detail information during down-sampling, which harms the localization of fine-structure objects. In this work, we present a multi-resolution dense encoder and stack decoder network to deal with this problem. On the one hand, the dense encoder embeds shallow detailed features into deep semantic features through a proposed information-preserving down-sampling method called CE-Pooling. On the other hand, the stack decoder gradually enhances the detailed features through iterative attention fusion. Extensive experiments on several benchmark datasets show that our method is superior to state-of-the-art approaches.

Keywords: semantic segmentation encoder-decoder fine-structure ISPRS

Feature Fusion Encoder Decoder Network for Automatic Liver Lesion Segmentation

16th IEEE International Symposium on Biomedical Imaging (ISBI)

Authors: Chen, Xueying; Zhang, Rong; Yang, Pingkun (Univ Sci & Technol China, Dept Elect Engn & Informat Sci, Hefei, Anhui, Peoples R China; Rensselaer Polytech Inst, Dept Biomed Engn, Troy, NY 12180, USA)

ISBN: (print) 9781538636411

Liver lesion segmentation is a difficult yet critical task for medical image analysis. Recently, deep learning based image segmentation methods have achieved promising performance. They can be divided into three categories, 2D, 2.5D and 3D, based on the dimensionality of the models. However, 2.5D and 3D methods can have very high complexity, and 2D methods may not perform satisfactorily. To obtain competitive performance with low complexity, in this paper we propose a Feature Fusion Encoder-Decoder Network (FED-Net) based 2D segmentation model to tackle the challenging problem of liver lesion segmentation from CT images. Our feature fusion method is based on the attention mechanism, which fuses high-level features carrying semantic information with low-level features carrying image details. Additionally, to compensate for the information loss during the upsampling process, a dense upsampling convolution and a residual convolutional structure are proposed. We tested our method on the dataset of the MICCAI 2017 Liver Tumor Segmentation (LiTS) Challenge and achieved competitive results compared with other state-of-the-art methods.

Keywords: Liver lesion segmentation, deep learning encoder-decoder attention feature fusion

ECTC-DOCD: An End-to-end Structure with CTC Encoder and OCD Decoder for Speech Recognition

Interspeech Conference

Authors: Yi, Cheng; Wang, Feng; Xu, Bo (Chinese Acad Sci, Inst Automat, Beijing, Peoples R China; Univ Chinese Acad Sci, Beijing, Peoples R China)

Real-time streaming speech recognition is required by most applications for a nice interactive experience. To naturally support online recognition, a common strategy used in recently proposed end-to-end models is to introduce a blank label to the label set and instead output alignments. However, generating the alignment means decoding much longer than the length of the linguistic sequence. Besides, there exist several blank labels between two output units in the alignment, which hinders models from learning the adjacent dependency of units in the target sequence. In this work, we propose an innovative encoder-decoder structure, called ECTC-DOCD, for online speech recognition which directly predicts the linguistic sequence without blank labels. Apart from the encoder and decoder structures, ECTC-DOCD contains an additional shrinking layer to drop the redundant acoustic information. This layer serves as a bridge connecting acoustic representation and linguistic modelling parts. Through experiments, we confirm that ECTC-DOCD can obtain better performance than a strong CTC model in online ASR tasks. We also show that ECTC-DOCD can achieve promising results on both Mandarin and English ASR datasets with first and second pass decoding.

Keywords: end-to-end streaming ASR encoder-decoder OCD CTC
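
The abstract above notes that a blank-augmented alignment is much longer than the linguistic sequence it encodes. A minimal Python sketch of the standard CTC collapsing rule (merge repeated labels, then drop blanks) makes that length gap concrete. The `_` blank symbol, the toy alignment, and the `non_blank_frames` helper are illustrative assumptions; the helper is only a rough stand-in for the idea of discarding redundant frames and is not the paper's shrinking layer.

```python
from itertools import groupby

BLANK = "_"  # hypothetical symbol standing in for the CTC blank label


def ctc_collapse(alignment, blank=BLANK):
    """Map a frame-level alignment to its label sequence (standard CTC rule)."""
    # Step 1: merge each run of identical consecutive labels into one label.
    merged = [label for label, _ in groupby(alignment)]
    # Step 2: drop the blank labels that separated the runs.
    return [label for label in merged if label != blank]


def non_blank_frames(alignment, blank=BLANK):
    """Indices of frames whose label is not blank (toy illustration only)."""
    return [i for i, label in enumerate(alignment) if label != blank]


alignment = list("__hh_e_ll_l__oo_")  # one label per acoustic frame
labels = ctc_collapse(alignment)
print(len(alignment), len(labels), "".join(labels))
```

Here the 16-frame alignment collapses to the 5-label sequence `hello`, which is the kind of gap between decoding length and linguistic length that the abstract describes.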

Encoder-Decoder with Focus-Mechanism for Sequence Labelling Based Spoken Language Understanding

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Zhu, Su; Yu, Kai (Shanghai Jiao Tong Univ, Brain Sci & Technol Res Ctr, Key Lab Shanghai Educ Commiss Intelligent Interac, SpeechLab, Dept Comp Sci & Engn, Shanghai, Peoples R China)

ISBN: (print) 9781509041176

This paper investigates the framework of encoder-decoder with attention for sequence labelling based spoken language understanding. We introduce Bidirectional Long Short Term Memory - Long Short Term Memory networks (BLSTM-LSTM) as the encoder-decoder model to fully utilize the power of deep learning. In the sequence labelling task, the input and output sequences are aligned word by word, while the attention mechanism cannot provide the exact alignment. To address this limitation, we propose a novel focus mechanism for the encoder-decoder framework. Experiments on the standard ATIS dataset show that BLSTM-LSTM with the focus mechanism sets a new state of the art by outperforming the standard BLSTM and the attention-based encoder-decoder. Further experiments also show that the proposed model is more robust to speech recognition errors.

Keywords: Spoken language understanding encoder-decoder focus-mechanism robustness
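
The abstract contrasts soft attention, which cannot guarantee an exact alignment, with a focus mechanism suited to word-by-word aligned sequences. The sketch below assumes the simplest reading of that idea: the context for output step i is just the encoder state at position i, whereas attention mixes all encoder states with softmax weights. The function names, the toy vectors, and this reading of "focus" are assumptions for illustration, not details confirmed by the abstract.

```python
import math


def attention_context(scores, encoder_states):
    """Soft attention: softmax the scores, then take a weighted sum of states."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(encoder_states[0])
    return [
        sum(w * state[d] for w, state in zip(weights, encoder_states))
        for d in range(dim)
    ]


def focus_context(step, encoder_states):
    """Assumed focus rule: use only the encoder state aligned with this step."""
    return list(encoder_states[step])


encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy 2-d states, 3 words
uniform_scores = [0.0, 0.0, 0.0]  # attention that has learned no alignment

print(attention_context(uniform_scores, encoder_states))  # blend of all words
print(focus_context(1, encoder_states))  # exactly the second word's state
```

With uninformative scores, attention returns an average over every word, while the focus context stays tied to the aligned position, which is the property a one-label-per-word task needs.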

ECRU: An Encoder-Decoder Based Convolution Neural Network (CNN) for Road-Scene Understanding

Journal of Imaging, 2018, vol. 4, no. 10, p. 116

Author: Yasrab, Robail (Univ Nottingham, Comp Vis Lab, Sch Comp Sci, Nottingham NG8 1BB, England; Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230000, Anhui, Peoples R China)

This research presents the idea of a novel fully-Convolutional Neural Network (CNN)-based model for probabilistic pixel-wise segmentation, titled encoder-decoder-based CNN for Road-Scene Understanding (ECRU). Lately, scene understanding has become an evolving research area, and semantic segmentation is the most recent method for visual recognition. Among vision-based smart systems, the driving assistance system turns out to be a much preferred research topic. The proposed model is an encoder-decoder that performs pixel-wise class predictions. The encoder network is composed of a VGG-19 layer model, while the decoder network uses 16 upsampling and deconvolution units. The encoder of the network has a very flexible architecture that can be altered and trained for any size and resolution of images. The decoder network upsamples and maps the low-resolution encoder's features. Consequently, there is a substantial reduction in the trainable parameters, as the network recycles the encoder's pooling indices for pixel-wise classification and segmentation. The proposed model is intended to offer a simplified CNN model with less overhead and higher performance. The network is trained and tested on the famous road scenes dataset CamVid and offers outstanding outcomes in comparison to similar early approaches like FCN and VGG16 in terms of performance vs. trainable parameters.

Keywords: convolutional neural network (CNN) ReLU encoder-decoder CamVid pooling semantic segmentation VGG-19 ADAS
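
The abstract attributes the reduction in trainable parameters to the decoder recycling the encoder's pooling indices. A minimal 1-D Python sketch of index-preserving max pooling and the matching unpooling step is given below. It illustrates the general technique only; the function names, the window size, and the toy signal are assumptions, and the abstract does not say that ECRU's layers work exactly this way.

```python
def max_pool_with_indices(values, size=2):
    """Non-overlapping 1-D max pooling that also records each maximum's position."""
    pooled, indices = [], []
    for start in range(0, len(values) - size + 1, size):
        window = values[start:start + size]
        # Offset of the largest element inside this window (first one on ties).
        offset = max(range(size), key=lambda k: window[k])
        pooled.append(window[offset])
        indices.append(start + offset)
    return pooled, indices


def max_unpool(pooled, indices, length):
    """Put each pooled value back at its recorded position; zeros elsewhere."""
    output = [0.0] * length
    for value, index in zip(pooled, indices):
        output[index] = value
    return output


signal = [0.1, 0.9, 0.4, 0.3, 0.7, 0.2]
pooled, indices = max_pool_with_indices(signal)      # encoder side
restored = max_unpool(pooled, indices, len(signal))  # decoder side
print(pooled, indices, restored)
```

The unpooling step has no weights of its own: it only needs the stored indices, which is why reusing them can upsample without adding trainable parameters.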

A Non-negative Symmetric Encoder-Decoder Approach for Community Detection

ACM Conference on Information and Knowledge Management (CIKM)

Authors: Sun, Bing-Jie; Shen, Huawei; Gao, Jinhua; Ouyang, Wentao; Cheng, Xueqi (Chinese Acad Sci, Inst Comp Technol, CAS Key Lab Network Data Sci & Technol, Beijing, Peoples R China)

ISBN: (print) 9781450349185

Community detection, or graph clustering, is crucial to understanding the structure of complex networks and extracting relevant knowledge from networked data. Latent factor models, e.g., non-negative matrix factorization and the mixed membership block model, are among the most successful methods for community detection. They aim to find a distributed and generally low-dimensional representation, or coding, that captures the structural regularity of the network and reflects the community membership of nodes. Existing latent factor models are mainly based on reconstructing a network from the representation of its nodes, namely a network decoder, while constraining the representation to have certain desirable properties. These methods, however, lack an encoder that transforms nodes into their representation. Consequently, they fail to give a clear explanation of the meaning of a community and suffer from undesired computational problems. In this paper, we propose a non-negative symmetric encoder-decoder approach for community detection. By explicitly integrating a decoder and an encoder into a unified loss function, the proposed approach achieves better performance than state-of-the-art latent factor models on the community detection task. Moreover, unlike existing methods that explicitly impose a sparsity constraint on the representation of nodes, the proposed approach implicitly achieves sparse node representations through its symmetric and non-negative properties, making the optimization much easier than in competing methods based on sparse matrix factorization.

Keywords: Community detection Latent factor model encoder-decoder

A GRU-based Encoder-Decoder Approach with Attention for Online Handwritten Mathematical Expression Recognition

14th IAPR International Conference on Document Analysis and Recognition (ICDAR)

Authors: Zhang, Jianshu; Du, Jun; Dai, Lirong (Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Anhui, Peoples R China)

ISBN: (print) 9781538635865

In this study, we present a novel end-to-end approach based on the encoder-decoder framework with the attention mechanism for online handwritten mathematical expression recognition (OHMER). First, the input two-dimensional ink trajectory information of handwritten expression is encoded via the gated recurrent unit based recurrent neural network (GRU-RNN). Then the decoder is also implemented by the GRU-RNN with a coverage-based attention model. The proposed approach can simultaneously accomplish the symbol recognition and structural analysis to output a character sequence in LaTeX format. Validated on the CROHME 2014 competition task, our approach significantly outperforms the state-of-the-art with an expression recognition accuracy of 52.43% by only using the official training dataset. Furthermore, the alignments between the input trajectories of handwritten expressions and the output LaTeX sequences are visualized by the attention mechanism to show the effectiveness of the proposed method.

Keywords: Online Handwritten Mathematical Expression Recognition encoder-decoder Gated Recurrent Unit Attention

Multitask Learning with Low-Level Auxiliary Tasks for Encoder-Decoder Based Speech Recognition

18th Annual Conference of the International-Speech-Communication-Association (INTERSPEECH 2017)

Authors: Toshniwal, Shubham; Tang, Hao; Lu, Liang; Livescu, Karen (Toyota Technol Inst, Chicago, IL 60637, USA)

ISBN: (print) 9781510848764

End-to-end training of deep learning-based models allows for implicit learning of intermediate representations based on the final task loss. However, the end-to-end approach ignores the useful domain knowledge encoded in explicit intermediate-level supervision. We hypothesize that using intermediate representations as auxiliary supervision at lower levels of deep networks may be a good way of combining the advantages of end-to-end training and more traditional pipeline approaches. We present experiments on conversational speech recognition where we use lower-level tasks, such as phoneme recognition, in a multitask training approach with an encoder-decoder model for direct character transcription. We compare multiple types of lower-level tasks and analyze the effects of the auxiliary tasks. Our results on the Switchboard corpus show that this approach improves recognition accuracy over a standard encoder-decoder model on the Eval2000 test set.

Keywords: speech recognition multitask learning encoder-decoder CTC LSTM
