This research presents a novel fully convolutional neural network (CNN) model for probabilistic pixel-wise segmentation, titled encoder-decoder-based CNN for Road-Scene Understanding (ECRU). Scene understanding has lately become an evolving research area, and semantic segmentation is the most recent approach to visual recognition. Among vision-based smart systems, driving assistance has emerged as a particularly popular research topic. The proposed model is an encoder-decoder that performs pixel-wise class predictions. The encoder network is based on the VGG-19 model, while the decoder network uses 16 upsampling and deconvolution units. The encoder has a flexible architecture that can be altered and trained for images of any size and resolution. The decoder network upsamples and maps the encoder's low-resolution features. Consequently, there is a substantial reduction in trainable parameters, as the network reuses the encoder's pooling indices for pixel-wise classification and segmentation. The proposed model is intended to offer a simplified CNN with less overhead and higher performance. The network is trained and tested on the well-known CamVid road-scene dataset and delivers outstanding results compared with similar earlier approaches such as FCN and VGG16 in terms of performance versus trainable parameters.
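The parameter savings come from the decoder reusing the encoder's max-pooling indices instead of learning a full upsampling. The abstract gives no implementation details, so the following is only a minimal PyTorch sketch of that mechanism; the layer counts, channel widths, class count, and all names (EncoderStage, DecoderStage, TinySegmenter) are illustrative assumptions rather than the ECRU configuration.

```python
import torch
import torch.nn as nn

class EncoderStage(nn.Module):
    """One VGG-style conv block that also returns its max-pooling indices."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        # return_indices=True lets the decoder reuse the pooling locations
        self.pool = nn.MaxPool2d(2, stride=2, return_indices=True)

    def forward(self, x):
        x = self.conv(x)
        x, idx = self.pool(x)
        return x, idx

class DecoderStage(nn.Module):
    """Upsamples with the stored indices instead of learning a full upsampling."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.unpool = nn.MaxUnpool2d(2, stride=2)
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, idx):
        x = self.unpool(x, idx)      # sparse upsampling via encoder indices
        return self.conv(x)

class TinySegmenter(nn.Module):
    """Illustrative 2-stage encoder-decoder producing per-pixel class scores."""
    def __init__(self, num_classes=12):   # CamVid is commonly used with 11-12 classes
        super().__init__()
        self.enc1, self.enc2 = EncoderStage(3, 64), EncoderStage(64, 128)
        self.dec2, self.dec1 = DecoderStage(128, 64), DecoderStage(64, 64)
        self.classifier = nn.Conv2d(64, num_classes, 1)

    def forward(self, x):
        x, i1 = self.enc1(x)
        x, i2 = self.enc2(x)
        x = self.dec2(x, i2)
        x = self.dec1(x, i1)
        return self.classifier(x)    # per-pixel logits; apply softmax for probabilities

logits = TinySegmenter()(torch.randn(1, 3, 360, 480))  # CamVid-sized input
```

A full ECRU-style network would stack more such stages (VGG-19 depth on the encoder side), but the index hand-off shown here is the part that keeps the decoder nearly parameter-free.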
ISBN (print): 9781450349185
Community detection, or graph clustering, is crucial to understanding the structure of complex networks and extracting relevant knowledge from networked data. Latent factor models, e.g., non-negative matrix factorization and the mixed membership block model, are among the most successful methods for community detection. Latent factor models for community detection aim to find a distributed and generally low-dimensional representation, or coding, that captures the structural regularity of a network and reflects the community membership of its nodes. Existing latent factor models are mainly based on reconstructing a network from the representation of its nodes, namely a network decoder, while constraining the representation to have certain desirable properties. These methods, however, lack an encoder that transforms nodes into their representation. Consequently, they fail to give a clear explanation of what a community means and suffer from undesirable computational problems. In this paper, we propose a non-negative symmetric encoder-decoder approach for community detection. By explicitly integrating a decoder and an encoder into a unified loss function, the proposed approach achieves better performance than state-of-the-art latent factor models on the community detection task. Moreover, unlike existing methods that explicitly impose a sparsity constraint on the representation of nodes, the proposed approach achieves sparsity of the node representation implicitly through its symmetric and non-negative properties, making the optimization much easier than in competing methods based on sparse matrix factorization.
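The abstract describes coupling a decoder (reconstruct the network from node codes) with an encoder (map the adjacency matrix to node codes) through a shared non-negative factor. The exact objective and optimizer are not given here, so the sketch below is one plausible NumPy rendering under the assumption of squared-error losses for both terms and projected gradient descent for non-negativity; nonneg_encoder_decoder and all dimensions are hypothetical.

```python
import numpy as np

def nonneg_encoder_decoder(A, k, iters=500, lr=1e-3, seed=0):
    """Hedged sketch: minimize ||A - W H||^2 + ||H - W.T A||^2 with W, H >= 0.

    A : (n, n) adjacency matrix, k : number of communities.
    The encoder (H ~ W.T A) and decoder (A ~ W H) share the same factor W,
    which mirrors the symmetric coupling described in the abstract; the
    paper's actual objective and optimizer may differ.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    W = rng.random((n, k)) * 0.1
    H = rng.random((k, n)) * 0.1
    for _ in range(iters):
        R_dec = W @ H - A            # decoder residual, (n, n)
        R_enc = H - W.T @ A          # encoder residual, (k, n)
        grad_W = 2 * (R_dec @ H.T) - 2 * (A @ R_enc.T)
        grad_H = 2 * (W.T @ R_dec) + 2 * R_enc
        # projected gradient step keeps both factors non-negative
        W = np.maximum(W - lr * grad_W, 0.0)
        H = np.maximum(H - lr * grad_H, 0.0)
    return W, H

# toy adjacency matrix with two obvious groups; each node is assigned to its
# largest coding dimension
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
W, H = nonneg_encoder_decoder(A, k=2)
labels = H.argmax(axis=0)
print(labels)
```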
ISBN (print): 9781538635865
In this study, we present a novel end-to-end approach based on the encoder-decoder framework with an attention mechanism for online handwritten mathematical expression recognition (OHMER). First, the input two-dimensional ink trajectory information of a handwritten expression is encoded via a gated recurrent unit based recurrent neural network (GRU-RNN). The decoder is also implemented with a GRU-RNN, equipped with a coverage-based attention model. The proposed approach simultaneously accomplishes symbol recognition and structural analysis to output a character sequence in LaTeX format. Validated on the CROHME 2014 competition task, our approach significantly outperforms the state of the art with an expression recognition accuracy of 52.43% using only the official training dataset. Furthermore, the alignments between the input trajectories of handwritten expressions and the output LaTeX sequences are visualized via the attention mechanism to show the effectiveness of the proposed method.
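Coverage-based attention adds the accumulated past attention weights as an extra input to the attention scorer, so trajectory points already transcribed into LaTeX symbols are down-weighted. The sketch below shows that scoring rule in PyTorch; the GRU sizes, attention dimension, and the CoverageAttention class are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoverageAttention(nn.Module):
    """Additive attention whose scores also see the cumulative past attention
    (the 'coverage' vector), discouraging the decoder from re-attending to
    positions it has already translated."""
    def __init__(self, enc_dim, dec_dim, attn_dim):
        super().__init__()
        self.W_h = nn.Linear(enc_dim, attn_dim, bias=False)   # encoder annotations
        self.W_s = nn.Linear(dec_dim, attn_dim, bias=False)   # decoder state
        self.W_c = nn.Linear(1, attn_dim, bias=False)         # coverage feature
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, enc_out, dec_state, coverage):
        # enc_out: (T, enc_dim), dec_state: (dec_dim,), coverage: (T,)
        e = self.v(torch.tanh(self.W_h(enc_out)
                              + self.W_s(dec_state)
                              + self.W_c(coverage.unsqueeze(-1)))).squeeze(-1)
        alpha = F.softmax(e, dim=0)                 # attention weights over time steps
        context = (alpha.unsqueeze(-1) * enc_out).sum(dim=0)
        return context, alpha, coverage + alpha     # updated coverage

# toy usage with assumed sizes: 50 trajectory points, 8 ink features each
enc = nn.GRU(input_size=8, hidden_size=128)
enc_out, _ = enc(torch.randn(50, 1, 8))
attn = CoverageAttention(enc_dim=128, dec_dim=256, attn_dim=64)
context, alpha, coverage = attn(enc_out.squeeze(1),
                                torch.zeros(256), torch.zeros(50))
```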
ISBN (print): 9781510848764
End-to-end training of deep learning-based models allows for implicit learning of intermediate representations based on the final task loss. However, the end-to-end approach ignores the useful domain knowledge encoded in explicit intermediate-level supervision. We hypothesize that using intermediate representations as auxiliary supervision at lower levels of deep networks may be a good way of combining the advantages of end-to-end training and more traditional pipeline approaches. We present experiments on conversational speech recognition where we use lower-level tasks, such as phoneme recognition, in a multitask training approach with an encoder-decoder model for direct character transcription. We compare multiple types of lower-level tasks and analyze the effects of the auxiliary tasks. Our results on the Switchboard corpus show that this approach improves recognition accuracy over a standard encoder-decoder model on the Eval2000 test set.
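A minimal way to wire such auxiliary supervision is to attach a phoneme classifier to an intermediate encoder layer and add its loss to the main objective. The PyTorch sketch below shows that wiring only; it replaces the attention-based character decoder with a framewise head for brevity, and the layer sizes, label counts, and 0.3 auxiliary weight are assumptions.

```python
import torch
import torch.nn as nn

class MultitaskEncoder(nn.Module):
    """Sketch: a lower-level (phoneme) classifier reads an intermediate encoder
    layer while the full stack feeds the character-level head."""
    def __init__(self, feat_dim=40, hidden=256, n_phones=45, n_chars=30):
        super().__init__()
        self.lower = nn.LSTM(feat_dim, hidden, num_layers=2,
                             batch_first=True, bidirectional=True)
        self.upper = nn.LSTM(2 * hidden, hidden, num_layers=2,
                             batch_first=True, bidirectional=True)
        self.phone_head = nn.Linear(2 * hidden, n_phones)  # auxiliary supervision
        self.char_head = nn.Linear(2 * hidden, n_chars)    # stands in for the decoder

    def forward(self, feats):
        low, _ = self.lower(feats)          # intermediate representation
        high, _ = self.upper(low)           # final encoder states
        return self.phone_head(low), self.char_head(high)

model = MultitaskEncoder()
feats = torch.randn(4, 200, 40)                       # a batch of filterbank frames
phone_tgt = torch.randint(0, 45, (4, 200))
char_tgt = torch.randint(0, 30, (4, 200))
phone_logits, char_logits = model(feats)

ce = nn.CrossEntropyLoss()
aux_weight = 0.3                                      # assumed trade-off weight
loss = ce(char_logits.reshape(-1, 30), char_tgt.reshape(-1)) \
     + aux_weight * ce(phone_logits.reshape(-1, 45), phone_tgt.reshape(-1))
loss.backward()
```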
NASA Technical Reports Server (NTRS) 19850019880: A Software Simulation Study of a (255,223) Reed-Solomon Encoder-Decoder, by NASA Technical Reports Server (NTRS).
ISBN (print): 9781509041183
This paper investigates the encoder-decoder with attention framework for sequence-labelling-based spoken language understanding. We introduce Bidirectional Long Short-Term Memory - Long Short-Term Memory networks (BLSTM-LSTM) as the encoder-decoder model to fully utilize the power of deep learning. In the sequence labelling task, the input and output sequences are aligned word by word, whereas the attention mechanism cannot provide the exact alignment. To address this limitation, we propose a novel focus mechanism for the encoder-decoder framework. Experiments on the standard ATIS dataset show that BLSTM-LSTM with the focus mechanism establishes a new state of the art, outperforming the standard BLSTM and the attention-based encoder-decoder. Further experiments also show that the proposed model is more robust to speech recognition errors.
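The focus mechanism exploits the word-by-word alignment of sequence labelling: at decoding step t the context is simply the encoder hidden state at position t, with no soft attention average. The sketch below illustrates that idea; the FocusDecoder class, its sizes, and the teacher-forcing wiring are assumptions rather than the paper's implementation.

```python
import torch
import torch.nn as nn

class FocusDecoder(nn.Module):
    """Sketch of the 'focus' idea for aligned sequence labelling."""
    def __init__(self, n_words=1000, n_labels=120, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(n_words, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        # decoder input = previous label embedding + focused encoder state
        self.label_embed = nn.Embedding(n_labels, emb)
        self.decoder = nn.LSTMCell(emb + 2 * hidden, hidden)
        self.out = nn.Linear(hidden, n_labels)

    def forward(self, words, labels):
        enc, _ = self.encoder(self.embed(words))          # (B, T, 2*hidden)
        B, T, _ = enc.shape
        h = enc.new_zeros(B, self.decoder.hidden_size)
        c = enc.new_zeros(B, self.decoder.hidden_size)
        prev = labels.new_zeros(B)                        # assumed <bos> label id 0
        logits = []
        for t in range(T):
            ctx = enc[:, t]                               # focus: aligned position only
            h, c = self.decoder(torch.cat([self.label_embed(prev), ctx], dim=-1),
                                (h, c))
            logits.append(self.out(h))
            prev = labels[:, t]                           # teacher forcing
        return torch.stack(logits, dim=1)                 # (B, T, n_labels)

scores = FocusDecoder()(torch.randint(0, 1000, (2, 12)),
                        torch.randint(0, 120, (2, 12)))
```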
ISBN (print): 9783319464480; 9783319464473
We propose a novel recurrent encoder-decoder network model for real-time video-based face alignment. Our proposed model predicts 2D facial point maps regularized by a regression loss, while uniquely exploiting recurrent learning in both the spatial and temporal dimensions. At the spatial level, we add a feedback loop connection between the combined output response map and the input, in order to enable iterative coarse-to-fine face alignment using a single network model. At the temporal level, we first decouple the features in the bottleneck of the network into temporal-variant factors, such as pose and expression, and temporal-invariant factors, such as identity information. Temporal recurrent learning is then applied to the decoupled temporal-variant features, yielding better generalization and significantly more accurate results at test time. We perform a comprehensive experimental analysis, showing the importance of each component of our proposed model, as well as superior results over the state of the art on standard datasets.
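The spatial feedback loop can be pictured as running one shared encoder-decoder several times, each time concatenating the previous landmark response map to the input image. The sketch below shows only that loop; the temporal decoupling of identity and pose/expression factors is omitted, and the FeedbackAligner architecture and sizes are assumptions.

```python
import torch
import torch.nn as nn

class FeedbackAligner(nn.Module):
    """Sketch of iterative coarse-to-fine alignment via a spatial feedback loop."""
    def __init__(self, n_points=68):
        super().__init__()
        self.net = nn.Sequential(                      # shared encoder-decoder
            nn.Conv2d(3 + n_points, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, n_points, 4, stride=2, padding=1),
        )
        self.n_points = n_points

    def forward(self, img, n_iters=3):
        B, _, H, W = img.shape
        resp = img.new_zeros(B, self.n_points, H, W)   # start from an empty map
        for _ in range(n_iters):                       # iterative refinement
            resp = self.net(torch.cat([img, resp], dim=1))
        return resp                                    # per-landmark response maps

maps = FeedbackAligner()(torch.randn(1, 3, 128, 128))
```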
ISBN (print): 9781479999880
Recently, there has been increasing interest in end-to-end speech recognition using neural networks, with no reliance on hidden Markov models (HMMs) for sequence modelling as in the standard hybrid framework. The recurrent neural network (RNN) encoder-decoder is such a model, performing sequence-to-sequence mapping without any predefined alignment. This model first transforms the input sequence into a fixed-length vector representation, from which the decoder recovers the output sequence. In this paper, we extend our previous work on this model for large-vocabulary end-to-end speech recognition. We first present a more effective stochastic gradient descent (SGD) learning rate schedule that can significantly improve recognition accuracy. We then extend the decoder with long memory by introducing another recurrent layer that performs implicit language modelling. Finally, we demonstrate that using multiple recurrent layers in the encoder can reduce the word error rate. Our experiments were carried out on the Switchboard corpus using a training set of around 300 hours of transcribed audio data, and we achieved significantly higher recognition accuracy, thereby reducing the gap to the hybrid baseline.
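The "long memory" extension can be read as one decoder recurrence that sees only the previous output symbols (an implicit language model) feeding a second recurrence that also consumes the fixed-length encoder context. The sketch below renders that reading in PyTorch; the LongMemoryDecoder name, the sizes, and the use of a single fixed context vector are assumptions.

```python
import torch
import torch.nn as nn

class LongMemoryDecoder(nn.Module):
    """Sketch: a language-model-like recurrence over outputs feeds a second
    recurrence that also sees the encoder's fixed-length context."""
    def __init__(self, n_tokens=1000, emb=128, ctx_dim=256, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(n_tokens, emb)
        self.lm_layer = nn.GRUCell(emb, hidden)            # implicit LM over outputs
        self.fusion_layer = nn.GRUCell(hidden + ctx_dim, hidden)
        self.out = nn.Linear(hidden, n_tokens)

    def forward(self, prev_tokens, context):
        # prev_tokens: (B, T) teacher-forced history, context: (B, ctx_dim)
        B, T = prev_tokens.shape
        h_lm = context.new_zeros(B, self.lm_layer.hidden_size)
        h_fu = context.new_zeros(B, self.fusion_layer.hidden_size)
        logits = []
        for t in range(T):
            h_lm = self.lm_layer(self.embed(prev_tokens[:, t]), h_lm)
            h_fu = self.fusion_layer(torch.cat([h_lm, context], dim=-1), h_fu)
            logits.append(self.out(h_fu))
        return torch.stack(logits, dim=1)                  # (B, T, n_tokens)

dec = LongMemoryDecoder()
y = dec(torch.randint(0, 1000, (2, 10)), torch.randn(2, 256))
```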
ISBN (print): 9781450340694
We present Tweet2Vec, a novel method for generating general-purpose vector representations of tweets. The model learns tweet embeddings using a character-level CNN-LSTM encoder-decoder. We trained our model on 3 million randomly selected English-language tweets. The model was evaluated using two methods, tweet semantic similarity and tweet sentiment categorization, outperforming the previous state of the art in both tasks. The evaluations demonstrate the power of the tweet embeddings generated by our model for various tweet categorization tasks. The vector representations generated by our model are generic and can hence be applied to a variety of tasks. Though the model presented in this paper is trained on English-language tweets, the method can be used to learn tweet embeddings for other languages.
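A character-level CNN-LSTM encoder-decoder of this kind can be sketched as character embeddings passed through 1-D convolutions, an LSTM encoder whose final state serves as the tweet embedding, and an LSTM decoder that reconstructs the character sequence. The PyTorch sketch below follows that reading; the vocabulary size, filter widths, and dimensions are assumptions, not Tweet2Vec's configuration.

```python
import torch
import torch.nn as nn

class CharCnnLstmAutoencoder(nn.Module):
    """Sketch of a character-level CNN-LSTM encoder-decoder for tweet embeddings."""
    def __init__(self, n_chars=128, emb=16, conv_ch=64, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb)
        self.conv = nn.Sequential(
            nn.Conv1d(emb, conv_ch, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(conv_ch, conv_ch, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.encoder = nn.LSTM(conv_ch, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_chars)

    def encode(self, chars):
        x = self.conv(self.embed(chars).transpose(1, 2)).transpose(1, 2)
        _, (h, _) = self.encoder(x)
        return h[-1]                       # (B, hidden) tweet embedding

    def forward(self, chars):
        z = self.encode(chars)
        # decoder is initialised with the tweet embedding and teacher-forced
        h0 = z.unsqueeze(0)
        c0 = torch.zeros_like(h0)
        dec_out, _ = self.decoder(self.embed(chars), (h0, c0))
        return self.out(dec_out)           # per-position character logits

model = CharCnnLstmAutoencoder()
logits = model(torch.randint(0, 128, (4, 140)))   # a batch of 140-character tweets
emb = model.encode(torch.randint(0, 128, (1, 140)))
```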
In froth flotation, the tailings grade and the concentrate grade are the two key performance indexes. At present, monitoring models for these two key grades mostly use the froth image or video from a single flotation cell. However, flotation cells are closely related and strongly coupled, so it is difficult for a froth image or video from one flotation cell to represent the concentrate or tailings grade. Therefore, an encoder-decoder and Siamese time series network (ES-net) is proposed. First, an encoder-decoder (ED) model is designed to predict the target grade (i.e., the zinc tailings or concentrate grade) from the video feature sequence of the first rougher and the measured target grade sequence. Meanwhile, a Siamese time series and difference network (STS-D net) is constructed to predict the target grade from the video feature sequences of the target flotation cell (i.e., the last scavenger or cleaner) at the current and previous moments and the previously measured target grade. After that, a multitask learning strategy is proposed to integrate the ED model and the STS-D net. Experiments show that the proposed ES-net can effectively integrate multiple froth visual features from different flotation cells and obtain more accurate concentrate and tailings grades than existing models.
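The multitask combination can be sketched as two regression branches, an encoder-decoder branch over the first rougher's feature sequence and a weight-sharing (Siamese) branch over the target cell's current and previous sequences, trained against the same measured grade. The sketch below shows only that joint-training structure; all dimensions, the fusion of the previous grade, and the equal loss weights are assumptions, not the ES-net design.

```python
import torch
import torch.nn as nn

class ESNetSketch(nn.Module):
    """Hedged sketch of joint training of an encoder-decoder branch and a
    Siamese time-series branch for grade prediction."""
    def __init__(self, feat_dim=64, hidden=128):
        super().__init__()
        self.ed_encoder = nn.GRU(feat_dim, hidden, batch_first=True)
        self.ed_head = nn.Linear(hidden, 1)
        self.siamese = nn.GRU(feat_dim, hidden, batch_first=True)   # shared weights
        self.sts_head = nn.Linear(2 * hidden + 1, 1)

    def forward(self, rougher_seq, target_now, target_prev, prev_grade):
        _, h_ed = self.ed_encoder(rougher_seq)
        grade_ed = self.ed_head(h_ed[-1])
        _, h_now = self.siamese(target_now)       # same GRU applied to both inputs
        _, h_prev = self.siamese(target_prev)
        grade_sts = self.sts_head(
            torch.cat([h_now[-1], h_prev[-1], prev_grade], dim=-1))
        return grade_ed, grade_sts

model = ESNetSketch()
g_ed, g_sts = model(torch.randn(2, 20, 64), torch.randn(2, 20, 64),
                    torch.randn(2, 20, 64), torch.randn(2, 1))
mse = nn.MSELoss()
target = torch.randn(2, 1)
loss = mse(g_ed, target) + mse(g_sts, target)     # multitask loss (assumed weights)
```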