Radar sounders (RSs) are nadir-looking sensors operating in the high-frequency (HF) or very-high-frequency (VHF) bands that profile subsurface targets to retrieve miscellaneous scientific information. Due to the complex electromagnetic interaction between backscattered returns, the interpretation of RS data is challenging. Investigations of ice-sheet subsurface structures require automatic techniques that account for both the sequential spatial distribution of subsurface targets and the relevant statistical properties embedded in RS signals. Existing automatic techniques for characterizing these targets are based either on probabilistic inference models or on convolutional neural network (CNN) deep-learning methods. Unfortunately, CNN-based methods capture the local spatial context but only weakly model the global spatial context. In contrast to CNNs, transformer-based models are reliable architectures for capturing long-range, sequence-to-sequence global spatial contextual priors. Motivated by this, we propose a novel transformer-based semantic segmentation architecture named TransSounder to effectively encode the sequential structures of RS signals. TransSounder is constructed on a hybrid TransUNet-TransFuse architectural framework that systematically augments the modules from the TransUNet and TransFuse architectures. Experimental results obtained using the Multichannel Coherent Radar Depth Sounder (MCoRDS) dataset confirm the robustness and capability of transformers to accurately characterize the different subsurface targets.
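The global-context claim above rests on self-attention, where every output position is a weighted sum over all input positions. The sketch below is a generic single-head scaled dot-product self-attention in NumPy, not the TransSounder architecture itself; the toy input and all names are illustrative.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention (single head, no learned weights).

    Every output row is a weighted sum over ALL input rows, so each
    position sees the full (global) spatial context in one step --
    unlike a convolution, whose receptive field is a local window.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                       # (T, T) pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over positions
    return weights @ x, weights

# A toy sequence of 5 positions (e.g. range bins), 4 features each
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))
out, w = self_attention(x)
print(w.shape, np.allclose(w.sum(axis=-1), 1.0))        # each row is a distribution
```

Because the softmax weights are strictly positive, every position contributes to every output, which is the "global spatial context" the abstract contrasts with a CNN's local window.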
ISBN: (Print) 9783030182403; 9783030182397
A chatbot is a software application that can autonomously communicate with a human being through text and, due to its usefulness, an increasing number of businesses are implementing such tools to provide timely communication to their clients. Whilst past literature has focused on implementing innovative chatbots and evaluating such tools, few studies have critically compared such conversational systems. To address this gap, this study critically compares the Artificial Intelligence Mark-up Language (AIML) and sequence-to-sequence models for building chatbots. In this endeavor, two chatbots were developed, one per model, and evaluated using a mixture of glass-box and black-box evaluation based on three metrics, namely user satisfaction, information retrieval rate, and task completion rate. Results showed that the AIML chatbot achieved better user satisfaction and task completion rate, while the sequence-to-sequence model had a better information retrieval rate.
Neural sequence-to-sequence (seq2seq) models have been widely used in abstractive summarization tasks. One challenge of this task is that redundant content in the input document often confuses the models and leads to poor performance. An efficient way to solve this problem is to select salient information from the input document. In this paper, we propose an approach that incorporates word attention with multilayer convolutional neural networks (CNNs) to extend a standard seq2seq model for abstractive summarization. First, by concentrating on a subset of source words while encoding an input sentence, word attention is able to extract informative keywords from the input, which gives us the ability to interpret generated summaries. Second, these keywords are further distilled by multilayer CNNs to capture the coarse-grained contextual features of the input sentence. Thus, the combined word-attention and multilayer-CNN modules provide a better-learned representation of the input document, which helps the model generate interpretable, coherent, and informative summaries in an abstractive summarization task. We evaluate the effectiveness of our model on the English Gigaword and DUC2004 datasets and the Chinese summarization dataset LCSTS. Experimental results show the effectiveness of our approach.
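The word-attention idea above can be sketched as a softmax over dot-product scores between encoder states and a decoder query; the high-weight source words are the "keywords" that make the summary interpretable. This is a generic sketch, not the paper's exact formulation; the toy embeddings and query are invented for illustration.

```python
import numpy as np

def word_attention(enc_states, query):
    """Dot-product attention over source words.

    Returns the attention distribution (interpretable as keyword
    salience) and the attended context vector.
    """
    scores = enc_states @ query                 # (T,) one score per source word
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                    # softmax over words
    context = weights @ enc_states              # weighted sum of encoder states
    return weights, context

enc = np.array([[0.1, 0.9],   # "the"        (toy stopword embedding)
                [0.8, 0.2],   # "earthquake" (toy content-word embedding)
                [0.7, 0.3]])  # "struck"
q = np.array([1.0, 0.0])      # toy decoder query favouring content words
w, ctx = word_attention(enc, q)
print(w.argmax())             # → 1 ("earthquake" is the top keyword)
```

Inspecting `w` after decoding is exactly how such a model's summaries can be interpreted: the distribution shows which source words drove the generated token.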
In this paper, we propose a framework for environmental sound synthesis from onomatopoeic words. As one way of expressing an environmental sound, we can use an onomatopoeic word, which is a character sequence for phonetically imitating a sound. An onomatopoeic word is effective for describing diverse sound features. Therefore, the use of onomatopoeic words as input for environmental sound synthesis will enable us to generate diverse sounds. To generate diverse sounds, we propose a method based on a sequence-to-sequence framework for synthesizing environmental sounds from onomatopoeic words. We also propose a method of environmental sound synthesis using onomatopoeic words and sound event labels. The use of sound event labels in addition to onomatopoeic words enables us to capture each sound event's feature depending on the input sound event label. Our subjective experiments show that our proposed methods achieve higher diversity and naturalness than conventional methods using sound event labels.
An important aspect of developing dialogue agents involves endowing a conversation system with emotion perception and interaction. Most existing emotion dialogue models lack adaptability and extensibility across different scenes because they require a specified emotion category or rely on a fixed emotional dictionary. To overcome these limitations, we propose a neural conversation generation with auxiliary emotional supervised model (nCG-ESM) comprising a sequence-to-sequence (Seq2Seq) generation model and an emotional classifier used as an auxiliary model. The emotional classifier was trained to predict the emotion distributions of the dialogues, which were then used as emotion-supervised signals to guide the generation model to generate diverse emotional responses. The proposed nCG-ESM is flexible enough to generate responses with emotional diversity, including specified or unspecified emotions, and can be adapted and extended to different scenarios. We conducted extensive experiments on the popular dataset of Weibo post-response pairs. Experimental results showed that the proposed model was capable of producing more diverse, appropriate, and emotionally rich responses, yielding substantial gains in diversity scores and human evaluations.
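The auxiliary supervision described above amounts to adding a weighted emotion-classification term to the generation loss, so responses whose predicted emotion distribution diverges from the classifier's target are penalized. A minimal sketch, assuming a simple cross-entropy form and a hypothetical mixing weight `alpha` (neither is taken from the paper):

```python
import math

def cross_entropy(p, q, eps=1e-12):
    """CE between the target emotion distribution p and the model's prediction q."""
    return -sum(pi * math.log(qi + eps) for pi, qi in zip(p, q))

def total_loss(gen_nll, emo_target, emo_pred, alpha=0.5):
    """Generation loss plus the auxiliary emotion-supervision term.

    alpha is an illustrative mixing weight, not a value from the paper.
    """
    return gen_nll + alpha * cross_entropy(emo_target, emo_pred)

# A response matching the target emotion distribution is penalized less
target = [0.7, 0.2, 0.1]                    # e.g. (happy, neutral, sad)
good = total_loss(2.0, target, [0.7, 0.2, 0.1])
bad = total_loss(2.0, target, [0.1, 0.2, 0.7])
print(good < bad)                           # → True
```

Because the target is a *distribution* rather than a single hard label, the same loss works whether the desired emotion is specified or left unspecified, which is the flexibility the abstract emphasizes.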
Soft sensors attempt to predict key quality variables that are only infrequently available using the sensor and manipulated variables that are readily available. Since only a limited amount of labeled data is available, there is always the concern of whether the underlying physics has been captured so that the model can reasonably be extrapolated. A sequence-to-sequence model in the form of a nonlinear state-observer/encoder and predictor/decoder is proposed. The observer can be trained using a large amount of unlabeled data, but in a supervised manner in which the process dynamics are tracked. The encoder output and manipulated variables are used to train the quality predictor. The model is applied to the product-impurity predictions of an industrial column. Results show that good predictions and excellent consistency in the sign of the estimated gains can be achieved even with a limited amount of data. These findings indicate that the proposed sequence-to-sequence data-driven approach is able to capture the underlying physics of the process.
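The two-stage idea above, an observer trained on abundant unlabeled dynamics and a quality predictor trained on the scarce labels, can be sketched with a linear AR(1) observer; the model form, variable names, and weights below are illustrative assumptions, not the paper's nonlinear architecture.

```python
import numpy as np

def fit_observer(x):
    """Fit a one-step-ahead AR(1) model x[t+1] ≈ a * x[t] on UNLABELED
    sensor data -- supervised by the process dynamics itself, no quality
    labels needed (stand-in for the state-observer/encoder)."""
    past, future = x[:-1], x[1:]
    return float(past @ future) / float(past @ past)   # least-squares slope

def predict_quality(a, x_t, u_t, w_state, w_input):
    """Quality predictor combining the observer state with the manipulated
    variable u_t; w_state / w_input would be fit on the few labeled samples."""
    state = a * x_t                                    # observer's state estimate
    return w_state * state + w_input * u_t

# Unlabeled dynamics: a geometric decay with true rate a = 0.5, no noise
x = np.array([8.0, 4.0, 2.0, 1.0])
a = fit_observer(x)
print(round(a, 3))                                     # → 0.5 (decay rate recovered)
print(predict_quality(a, x[-1], u_t=2.0, w_state=1.0, w_input=0.5))
```

The point of the split mirrors the abstract: the observer's parameter is identified from unlabeled sequences alone, so the scarce labeled data only has to pin down the small quality-prediction map on top of it.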
Neural sequence-to-sequence text-to-speech synthesis (TTS) can produce high-quality speech directly from text or simple linguistic features such as phonemes. Unlike traditional pipeline TTS, neural sequence-to-sequence TTS does not require manually annotated and complicated linguistic features such as part-of-speech tags and syntactic structures for system training. However, it must be carefully designed and well optimized so that it can implicitly extract useful linguistic features from the input features. In this paper, we investigate under what conditions neural sequence-to-sequence TTS can work well in Japanese and English, along with comparisons with deep neural network (DNN) based pipeline TTS systems. Unlike past comparative studies, the pipeline systems here also use neural autoregressive (AR) probabilistic modeling and a neural vocoder, in the same way as the sequence-to-sequence systems do, for a fair and deep analysis. We investigated systems from three aspects: a) model architecture, b) model parameter size, and c) language. For the model-architecture aspect, we adopt modified Tacotron systems that we previously proposed and their variants using an encoder from Tacotron or Tacotron2. For the model-parameter-size aspect, we investigate two model parameter sizes. For the language aspect, we conduct listening tests in both Japanese and English to see if our findings generalize across languages. Our experiments on Japanese demonstrated that the Tacotron TTS systems with increased parameter size and input of phonemes and accentual-type labels outperformed the DNN-based pipeline systems using the complicated linguistic features, and that their encoder could learn to compensate for a lack of rich linguistic features. Our experiments on English demonstrated that, when using a suitable encoder, the Tacotron TTS system with characters as input can disambiguate pronunciations and produce speech as natural as that of the systems using phonemes as input.
ISBN: (Print) 9781509066315
Labanotation is an important notation system for recording dances. Automatically generating Labanotation scores from motion-capture data has attracted increasing interest in recent years. Current methods usually focus on individual movement segments and generate Labanotation symbols one by one, which requires segmenting the captured data sequence in advance. Manual segmentation consumes a lot of time and effort, while automatic segmentation may not be reliable enough. In this paper, we propose a sequence-to-sequence approach that can generate Labanotation scores from unsegmented motion-data sequences. First, we extract effective features from motion-capture data based on body-skeleton analysis. Then, we train a neural network under the encoder-decoder architecture to transform the motion feature sequences into the corresponding Labanotation symbols. As such, the dance score is generated. Experiments show that the proposed method performs favorably against state-of-the-art algorithms in the automatic Labanotation generation task.
ISBN: (Print) 9781713820697
Auto-regressive sequence-to-sequence models with attention mechanisms have achieved state-of-the-art performance in various tasks, including speech synthesis. Training these models can be difficult. The standard approach guides a model with the reference output history during training. However, during synthesis the generated output history must be used. This mismatch can impact performance. Several approaches have been proposed to handle this, normally by selectively using the generated output history. To make training stable, these approaches often require a heuristic schedule or an auxiliary classifier. This paper introduces attention forcing, which guides the model with the generated output history and the reference attention. This approach reduces the training-evaluation mismatch without the need for a schedule or a classifier. Additionally, for standard training approaches, the frame rate is often reduced to prevent models from copying the output history. As attention forcing does not feed the reference output history to the model, it allows using a higher frame rate, which improves the speech quality. Finally, attention forcing allows the model to generate output sequences aligned with the references, which is important for some downstream tasks, such as training neural vocoders. Experiments show that attention forcing allows doubling the frame rate and yields significant gains in speech quality.
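The contrast above between reference and generated output history can be sketched as two decoding regimes. `step_fn` below stands in for one decoder step, and the scalar "attention" values are toy stand-ins; this illustrates the conditioning difference only, not the paper's model.

```python
def run_decoder(step_fn, ref_outputs, ref_attention, mode):
    """Contrast teacher forcing with attention forcing (a sketch).

    mode == "teacher":   condition each step on the REFERENCE history
                         (standard training; mismatches inference).
    mode == "attention": condition on the model's own GENERATED history,
                         while the reference attention guides alignment
                         (matches how the model is run at synthesis time).
    """
    generated = []
    for t, att in enumerate(ref_attention):
        if mode == "teacher":
            prev = ref_outputs[t - 1] if t > 0 else None
        else:  # attention forcing: no train/inference history mismatch
            prev = generated[-1] if generated else None
        generated.append(step_fn(prev, att))
    return generated

# Toy decoder step: doubles the previous output and adds the attention value
step = lambda prev, att: 2 * (prev or 0) + att
refs = [1, 2, 3]   # reference output history
atts = [1, 1, 1]   # reference attention (scalar stand-in per step)

print(run_decoder(step, refs, atts, "teacher"))    # → [1, 3, 5]
print(run_decoder(step, refs, atts, "attention"))  # → [1, 3, 7]
```

The two trajectories diverge as soon as the model's own output differs from the reference, which is exactly the exposure the teacher-forced regime never sees and attention forcing trains through.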
ISBN: (Print) 9781450367684
Machine Learning models from other fields, like Computational Linguistics, have been transplanted to Software Engineering tasks, often quite successfully. Yet a transplanted model's initial success at a given task does not necessarily mean it is well suited for the task. In this work, we examine a common example of this phenomenon: the conceit that "software patching is like language translation". We demonstrate empirically that there are subtle but critical distinctions between sequence-to-sequence models and translation models: while program repair benefits greatly from the former's general modeling architecture, it actually suffers from design decisions built into the latter, both in terms of translation accuracy and diversity. Given these findings, we demonstrate how a more principled approach to model design, based on our empirical findings and general knowledge of software development, can lead to better solutions. Our findings also lend strong support to the recent trend towards synthesizing edits of code conditional on the buggy context to repair bugs. We implement such models ourselves as proof-of-concept tools and empirically confirm that they behave in a fundamentally different, more effective way than the studied translation-based architectures. Overall, our results demonstrate the merit of studying the intricacies of machine-learned models in software engineering: not only can this help elucidate potential issues that may be overshadowed by increases in accuracy; it can also help innovate on these models to raise the state of the art further. We will publicly release our replication data and materials at https://***/ARiSE-Lab/Patch-as-translation.