Nonprehensile manipulation is necessary for robots to operate in humans' daily lives. As nonprehensile manipulation should satisfy both kinematics and dynamics requirements simultaneously, it is difficult to manipulate objects along given paths. Previous studies have considered the problems with sequence-to-sequence models, which are neural networks for time-series conversion. However, they did not consider nonlinear contact models, such as friction models. When we train the seq2seq models using end-to-end backpropagation, the training gradients vanish owing to static friction. In this letter, we realize sequence-to-sequence models for trajectory planning of nonprehensile manipulation including contact models between the robots and target objects. This letter proposes a training curriculum that commences training without contact models to bring the seq2seq models outside of the gradient-vanishing zone. This letter discusses sliding manipulation, which includes a friction model between objects and tools, such as frying pans fixed onto the robots. We validated the proposed curriculum through a simulation. In addition, we observed that the trained seq2seq models could handle parameter fluctuations that were not present during training.
ISBN:
(Print) 9783030161484; 9783030161477
Named Entity Recognition (NER) is a basic task in Natural Language Processing (NLP). Recently, the sequence-to-sequence (seq2seq) model has been widely used in NLP tasks. Unlike general NLP tasks, about 60% of the sentences in the NER task do not contain entities. Traditional seq2seq methods cannot address this issue effectively. To solve this problem, we propose a novel seq2seq model, named SC-NER, for the NER task. We construct a classifier between the encoder and decoder; in particular, the classifier's input is the last hidden state of the encoder. Moreover, we present a restricted beam search to improve the performance of the proposed SC-NER. To evaluate the proposed model, we construct a corpus of patent documents in the communications field and conduct experiments on it. Experimental results show that our SC-NER model achieves better performance than other baseline methods.
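The core SC-NER idea, a binary classifier reading the encoder's last hidden state to skip entity-free sentences, can be sketched as follows. This is a minimal numpy toy, not the paper's implementation: the RNN encoder, the weight names, and the untrained random parameters are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(tokens, W_h, W_x):
    """Toy RNN encoder: returns the final hidden state of the sentence.
    In the SC-NER design, the classifier consumes exactly this vector."""
    h = np.zeros(W_h.shape[0])
    for x in tokens:
        h = np.tanh(W_h @ h + W_x @ x)
    return h

def has_entity(h_last, w, b):
    """Binary classifier on the encoder's last hidden state:
    sigmoid(w . h + b) > 0.5 is read as "sentence contains an entity",
    so the decoder only needs to run for those sentences."""
    p = 1.0 / (1.0 + np.exp(-(w @ h_last + b)))
    return p > 0.5

# Hypothetical sizes and untrained random weights, for illustration only.
dim, emb = 8, 4
W_h = rng.normal(size=(dim, dim)) * 0.1
W_x = rng.normal(size=(dim, emb)) * 0.1
w, b = rng.normal(size=dim), 0.0

sentence = [rng.normal(size=emb) for _ in range(5)]
h = encode(sentence, W_h, W_x)
print(has_entity(h, w, b))  # True/False gate; weights here are untrained
```

In a trained system the gate's decision would route entity-free sentences past the decoder entirely, which is the mechanism the abstract credits for handling the 60% entity-free sentences.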
We present an attention-based sequence-to-sequence neural network which can directly translate speech from one language into speech in another language, without relying on an intermediate text representation. The network is trained end-to-end, learning to map speech spectrograms into target spectrograms in another language, corresponding to the translated content (in a different canonical voice). We further demonstrate the ability to synthesize translated speech using the voice of the source speaker. We conduct experiments on two Spanish-to-English speech translation datasets, and find that the proposed model slightly underperforms a baseline cascade of a direct speech-to-text translation model and a text-to-speech synthesis model, demonstrating the feasibility of the approach on this very challenging task.
Ensuring the health and safety of independent-living senior citizens is a growing societal concern. Researchers have developed sensor-based systems to monitor senior citizens' Activities of Daily Living (ADL), a set of daily activities that can indicate their self-caring ability. However, most ADL monitoring systems are designed for one specific sensor modality, resulting in less generalizable models that are not flexible enough to account for variations in real-life monitoring settings. Current classic machine learning and deep learning methods do not provide a generalizable solution for recognizing complex ADLs across different sensor settings. This study proposes a novel sequence-to-sequence deep-learning framework that recognizes complex ADLs by leveraging an activity state representation. The proposed activity state representation integrates motion and environment sensor data without labor-intensive feature engineering. We evaluated our proposed framework against several state-of-the-art machine learning and deep learning benchmarks. Overall, our approach outperformed the baselines on most performance metrics and accurately recognized complex ADLs from different types of sensor input. This framework can generalize to different sensor settings and provides a viable approach to understanding senior citizens' daily activity patterns with smart home health monitoring systems.
ISBN:
(Print) 9781728154145
Load forecasting plays a critical part in grid operation and planning. In particular, multistep load forecasting for individual power customers is increasingly important. Owing to the strong volatility of individual consumers' electricity consumption behavior, traditional machine learning methods that cannot capture time dependence struggle to obtain good prediction results. A recurrent neural network (RNN) can capture the temporal correlations in load data, and the sequence-to-sequence (Seq2Seq) model, which combines two RNNs as encoder and decoder, is well suited to multistep prediction. A temporal pattern attention mechanism can additionally capture the periodic change patterns in historical load data, further improving time-series modeling. We combine these advantages in a new multistep individual load forecasting framework, called the temporal pattern attention based sequence-to-sequence (TPA-Seq2Seq) model, which overcomes the difficulty of multistep prediction and further captures load change patterns. The proposed framework was tested on real residential smart meter data; the results show that the proposed model has good prediction accuracy and is well suited to longer prediction sequences.
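The encoder-decoder multistep scheme this abstract builds on can be sketched in a few lines. This is a generic Seq2Seq rollout with untrained random weights, not TPA-Seq2Seq itself (the temporal pattern attention term is omitted); all parameter names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def rnn_step(h, x, W_h, W_x):
    """One tanh-RNN update, shared shape for encoder and decoder."""
    return np.tanh(W_h @ h + W_x @ x)

def seq2seq_forecast(history, horizon, params):
    """Encode a load history of arbitrary length, then decode `horizon`
    future steps one at a time, feeding each prediction back in
    (the standard Seq2Seq multistep rollout)."""
    W_eh, W_ex, W_dh, W_dx, w_out = params
    h = np.zeros(W_eh.shape[0])
    for x in history:                    # encoder consumes the history
        h = rnn_step(h, np.array([x]), W_eh, W_ex)
    preds, y = [], history[-1]
    for _ in range(horizon):             # decoder rolls out the forecast
        h = rnn_step(h, np.array([y]), W_dh, W_dx)
        y = float(w_out @ h)             # linear readout -> next load value
        preds.append(y)
    return preds

dim = 6
params = (rng.normal(size=(dim, dim)) * 0.2,
          rng.normal(size=(dim, 1)) * 0.2,
          rng.normal(size=(dim, dim)) * 0.2,
          rng.normal(size=(dim, 1)) * 0.2,
          rng.normal(size=dim) * 0.2)

forecast = seq2seq_forecast([0.3, 0.5, 0.4, 0.6], horizon=3, params=params)
print(len(forecast))  # -> 3 predicted steps (weights are untrained)
```

Because the decoder generates one step per iteration, the horizon is a free parameter, which is what makes this family of models natural for multistep forecasting.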
ISBN:
(Print) 9781538646588
Representing various sounds in language, such as sound words, or onomatopoeias, is not only useful as an auxiliary means for automatic speech recognition, but also essential in emerging fields such as natural human-machine communication, searching audio archives for acoustic events, and abnormality detection based on sounds. This paper proposes a novel method for sound word generation from audio signals. The method is based on an end-to-end, sequence-to-sequence framework to solve the audio segmentation problem to find an appropriate segment of audio signals along time that corresponds to a sequence of phonemes, and the ambiguity problem, where multiple words may correspond to the same sound, depending on the situations or listeners. Our tests show that the method worked efficiently and achieved a 2.8% mean phoneme error rate (MPER) and a 7.2% word error rate (WER) in a sound word generation task.
ISBN:
(Print) 9781538643341
A sequence-to-sequence model is a neural network module for mapping two sequences of different lengths. The sequence-to-sequence model has three core modules: encoder, decoder, and attention. Attention is the bridge that connects the encoder and decoder modules and improves model performance in many tasks. In this paper, we propose two ideas to improve sequence-to-sequence model performance by enhancing the attention module. First, we maintain the history of the location and the expected context from several previous time-steps. Second, we apply multiscale convolution from several previous attention vectors to the current decoder state. We utilized our proposed framework for sequence-to-sequence speech recognition and text-to-speech systems. The results reveal that our proposed extension can improve performance significantly compared to a standard attention baseline.
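The second idea above, convolving several previous attention vectors and letting the result influence the current step, can be sketched as a history-aware scoring rule. This numpy toy is an assumption-laden illustration, not the paper's model: the averaging kernels, the additive combination, and all weights are made up for the example.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def history_aware_attention(enc, dec_state, align_history, kernels):
    """Attention sketch: convolve previous alignment vectors with several
    kernel widths (a "multiscale convolution" over the attention history)
    and let the result bias the content-based scores."""
    scores = enc @ dec_state                 # content-based scores per step
    for k in kernels:                        # multiscale history term
        kernel = np.ones(k) / k              # toy averaging kernel of width k
        for a in align_history:
            scores += np.convolve(a, kernel, mode="same")
    align = softmax(scores)
    context = align @ enc                    # weighted sum of encoder states
    return context, align

rng = np.random.default_rng(2)
enc = rng.normal(size=(7, 5))                # 7 encoder states, dim 5
dec_state = rng.normal(size=5)
history = [softmax(rng.normal(size=7)) for _ in range(3)]
context, align = history_aware_attention(enc, dec_state, history, kernels=(3, 5))
print(align.sum())  # ~1.0: the alignment stays a distribution over steps
```

The history term nudges the new alignment toward regions the model attended to recently, which is the kind of location continuity the abstract credits for the improvement.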
ISBN:
(Digital) 9781538646588
ISBN:
(Print) 9781538646595
Attention-based sequence-to-sequence models for automatic speech recognition jointly train an acoustic model, language model, and alignment mechanism. Thus, the language model component is only trained on transcribed audio-text pairs. This leads to the use of shallow fusion with an external language model at inference time. Shallow fusion refers to log-linear interpolation with a separately trained language model at each step of the beam search. In this work, we investigate the behavior of shallow fusion across a range of conditions: different types of language models, different decoding units, and different tasks. On Google Voice Search, we demonstrate that the use of shallow fusion with a neural LM with wordpieces yields a 9.1% relative word error rate reduction (WERR) over our competitive attention-based sequence-to-sequence model, obviating the need for second-pass rescoring.
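The log-linear interpolation that defines shallow fusion is simple enough to show directly. This is a one-step numpy sketch over a four-token toy vocabulary; the logit values and the fusion weight are invented for illustration and have nothing to do with the paper's systems.

```python
import numpy as np

def log_softmax(x):
    """Log-probabilities from raw logits, numerically stable."""
    x = x - x.max()
    return x - np.log(np.exp(x).sum())

def shallow_fusion_step(s2s_logits, lm_logits, lam=0.3):
    """One beam-search expansion with shallow fusion: score each candidate
    token by log p_s2s + lam * log p_lm, i.e. log-linear interpolation
    with an external LM (lam is the tunable fusion weight)."""
    combined = log_softmax(s2s_logits) + lam * log_softmax(lm_logits)
    return int(np.argmax(combined)), combined

# Toy vocabulary of 4 tokens: the seq2seq model slightly prefers token 1,
# but the external LM strongly prefers token 2 and tips the decision.
s2s = np.array([0.0, 1.0, 0.9, -1.0])
lm = np.array([-2.0, -1.0, 3.0, -2.0])
best, scores = shallow_fusion_step(s2s, lm, lam=0.5)
print(best)  # -> 2 (the LM flips the choice away from token 1)
```

In a real decoder this combined score is computed for every hypothesis in the beam at every step; no retraining of either model is needed, which is why shallow fusion is attractive for bolting a text-only LM onto a jointly trained seq2seq recognizer.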
ISBN:
(Print) 9781509059102
Encouraged by recent waves of successful applications of deep learning, some researchers have demonstrated the effectiveness of applying convolutional neural networks (CNNs) to time series classification problems. However, CNNs and other traditional methods require the input data to be of the same dimension, which prevents their direct application to data of various lengths and to multi-channel time series with different sampling rates across channels. Long short-term memory (LSTM), another tool in the deep learning arsenal, is by design more appropriate for problems involving time series, such as speech recognition and language translation. In this paper, we propose a novel model incorporating a sequence-to-sequence architecture that consists of two LSTMs, one encoder and one decoder. The encoder LSTM accepts input time series of arbitrary lengths and extracts information from the raw data, based on which the decoder LSTM constructs fixed-length sequences that can be regarded as discriminatory features. For better utilization of the raw data, we also introduce an attention mechanism into our model so that the feature generation process can peek at the raw data and focus its attention on the part most relevant to the feature under construction. We call our model S2SwA, short for Sequence-to-Sequence with Attention. We test S2SwA on both uni-channel and multi-channel time series datasets and show that our model is competitive with the state of the art on real-world tasks such as human activity recognition.
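The variable-length-in, fixed-length-out pattern described above can be sketched with plain tanh-RNNs standing in for the LSTMs. This is a structural illustration only, with untrained random weights and a simplified dot-product attention; none of it is the S2SwA implementation.

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fixed_length_features(series, n_features, dim, params):
    """S2SwA-style sketch: an encoder turns a time series of ANY length
    into hidden states; a decoder emits a FIXED number of feature
    vectors, each attending back over the encoder states ("peeking at
    the raw data")."""
    W_eh, W_ex, W_dh = params
    h = np.zeros(dim)
    enc_states = []
    for x in series:                        # arbitrary-length input
        h = np.tanh(W_eh @ h + W_ex @ np.array([x]))
        enc_states.append(h)
    enc = np.stack(enc_states)
    feats, d = [], h
    for _ in range(n_features):             # fixed-length output
        align = softmax(enc @ d)            # attention over encoder states
        context = align @ enc
        d = np.tanh(W_dh @ d + context)
        feats.append(d)
    return np.stack(feats)

dim = 5
params = (rng.normal(size=(dim, dim)) * 0.2,
          rng.normal(size=(dim, 1)) * 0.2,
          rng.normal(size=(dim, dim)) * 0.2)

short = fixed_length_features([0.1, 0.4, 0.2], 4, dim, params)
long_seq = fixed_length_features(list(rng.normal(size=50)), 4, dim, params)
print(short.shape, long_seq.shape)  # both (4, 5), regardless of input length
```

Because the decoder always runs a fixed number of steps, a 3-sample and a 50-sample series both yield a (4, 5) feature matrix, which is exactly what makes the downstream classifier insensitive to input length.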