Efficient Transformer models typically employ local and global attention methods, or utilize hierarchical or recurrent architectures, to process long text inputs in natural language processing tasks. However, these models sacrifice efficiency, accuracy, or compatibility when extended to longer sequences. To retain both the accuracy of global attention and the efficiency of local attention, while remaining compatible enough to be applied easily to an existing pre-trained model, in this paper we propose multi-level local attention (Mulla attention), a hierarchical local attention that acts on the input sequence and on multiple pooled sequences of different granularity simultaneously, thus performing long-range modeling while maintaining linear or log-linear complexity. We apply Mulla attention to LongT5 and implement our LongT5-Mulla sequence-to-sequence model, introducing no new parameters except for positional embeddings. Experiments show that our model surpasses all baseline models, including the two original LongT5 variants, on the 8–16k-input long text summarization task on the Multi-News, arXiv and WCEP-10 datasets, with improvements of at least +0.22, +0.01 and +0.52 percentage points (pp) in averaged Rouge scores respectively, while at the same time effectively processing longer sequences of 16–48k tokens with at least 52.6% lower memory consumption than LongT5-tglobal and averaged Rouge scores +0.56–1.62 pp higher than LongT5-local. These results demonstrate that our proposed LongT5-Mulla model can effectively process long sequences and extend the maximum input length for long text tasks from 16k to 48k while maintaining accuracy and efficiency.
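The following is a minimal, illustrative sketch (not the authors' implementation) of the idea behind Mulla attention: queries attend locally over the raw sequence and additionally over mean-pooled copies of the sequence at coarser granularities, which supplies long-range context at low cost. The window size, pool sizes, single-head setup, and the use of full attention over the (already short) pooled levels are simplifying assumptions.

```python
# Sketch of multi-level local attention: local attention on the raw sequence plus
# attention over mean-pooled sequences of coarser granularity. Illustrative only.
import torch
import torch.nn.functional as F

def local_attention(q, k, v, window):
    """Each query attends only to keys within +/- window positions (O(n * window))."""
    n, d = q.shape
    out = torch.zeros_like(v)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores = q[i] @ k[lo:hi].T / d ** 0.5
        out[i] = F.softmax(scores, dim=-1) @ v[lo:hi]
    return out

def mulla_attention(x, w_qkv, window=64, pool_sizes=(4, 16)):
    """Combine local attention over the input with attention over pooled levels."""
    q, k, v = (x @ w for w in w_qkv)
    out = local_attention(q, k, v, window)
    for p in pool_sizes:
        # mean-pool keys/values to a coarser sequence of length n // p
        kp = F.avg_pool1d(k.T.unsqueeze(0), p).squeeze(0).T
        vp = F.avg_pool1d(v.T.unsqueeze(0), p).squeeze(0).T
        scores = q @ kp.T / q.shape[-1] ** 0.5
        out = out + F.softmax(scores, dim=-1) @ vp
    return out

# toy usage
d = 32
x = torch.randn(256, d)
w_qkv = [torch.randn(d, d) / d ** 0.5 for _ in range(3)]
print(mulla_attention(x, w_qkv).shape)  # torch.Size([256, 32])
```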
Due to the data inefficiency and low speech quality of grapheme-based end-to-end text-to-speech (TTS), having a separate high-performance TTS linguistic frontend is still commonly regarded as necessary. However, a TTS frontend is itself difficult to build and maintain, since it requires abundant linguistic knowledge for its construction. In this article, we start by bootstrapping an integrated sequence-to-sequence (Seq2Seq) TTS frontend using a pre-existing pipeline-based frontend and large amounts of unlabelled normalized text, achieving promising memorization and generalisation abilities. To overcome the performance limitation imposed by the pipeline-based frontend, this work proposes a Forced Alignment (FA) method to decode the pronunciations from transcribed speech audio and then use them to update the Seq2Seq frontend. Our experiments demonstrate the effectiveness of our proposed FA method, which can significantly improve the word token accuracy from 52.6% to 91.2% for out-of-dictionary words. In addition, it can also correct the pronunciation of homographs from transcribed speech audio and potentially improve the homograph disambiguation performance of the Seq2Seq frontend.
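A tiny sketch of the idea behind FA-based pronunciation decoding: each candidate pronunciation of a word is scored by a forced aligner against the transcribed audio, and the best-scoring candidate becomes a training label for the Seq2Seq frontend. The aligner interface below (`align_log_likelihood`) is a hypothetical stand-in, not a real library call.

```python
# Select the candidate pronunciation whose forced alignment against the audio scores
# best, and keep it as a (grapheme, phoneme) training pair for the Seq2Seq frontend.
def pick_pronunciation(word, candidates, audio, align_log_likelihood):
    scored = [(align_log_likelihood(audio, prons), prons) for prons in candidates]
    best_score, best_prons = max(scored)
    return word, best_prons

# hypothetical usage: two candidate pronunciations of "read" (present vs. past tense)
candidates = [["R", "IY1", "D"], ["R", "EH1", "D"]]
```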
Conversational Question Generation (CQG) aims to generate conversational questions given a passage and the conversation history. Previous work on CQG presumes a contiguous span as the answer and generates a question targeting it. However, this limits the application scenarios, because answers in practical conversations are usually abstractive free-form text instead of extractive spans. In addition, most state-of-the-art CQG systems are based on pretrained language models consisting of hundreds of millions of parameters, bringing challenges to real-life applications due to latency and capacity constraints. To address these problems, in this work we introduce the Tiny Answer-Guided Network (TAGNET), a lightweight Bi-LSTM-based model for CQG. We explicitly take the target answers as input, which interact with the passages and conversation history in the encoder and guide question generation through a gated attention mechanism in the decoder. Besides, we distill knowledge from larger pretrained language models into our smaller network to trade off performance against efficiency. Experimental results show that TAGNET achieves performance comparable to large pretrained language models (retaining 95.9% of teacher performance) while using 5.7x fewer parameters and running with 10.4x lower inference latency. TAGNET outperforms the previous best-performing model of similar parameter size by a large margin, and further analysis shows that TAGNET generates more answer-specific conversational questions.
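A rough sketch of how a gated attention mechanism can let an answer representation guide decoding, in the spirit of the description above (an illustration, not TAGNET's code): a sigmoid gate decides, per dimension, how much of the attended encoder context versus the answer representation enters the decoder step.

```python
# Answer-guided gated attention for a single decoding step; shapes are illustrative.
import torch
import torch.nn as nn

class GatedAnswerAttention(nn.Module):
    def __init__(self, hidden):
        super().__init__()
        self.gate = nn.Linear(2 * hidden, hidden)

    def forward(self, dec_state, enc_outs, answer_repr):
        # standard dot-product attention over the encoder outputs
        scores = enc_outs @ dec_state                        # (src_len,)
        context = torch.softmax(scores, dim=-1) @ enc_outs   # (hidden,)
        # gate computed from the decoder state and the answer representation
        g = torch.sigmoid(self.gate(torch.cat([dec_state, answer_repr])))
        return g * context + (1 - g) * answer_repr           # answer-guided context

# toy usage
h = 64
attn = GatedAnswerAttention(h)
ctx = attn(torch.randn(h), torch.randn(10, h), torch.randn(h))
print(ctx.shape)  # torch.Size([64])
```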
Adverse drug reactions (ADRs), which are harmful physical reactions of patients to drug treatments, are inherent to the nature of drugs; the reactions can occur with any drug and are becoming a leading cause of patient morbidity and mortality during medical procedures. ADRs can be hazardous and even fatal to patients. In traditional methods, ADRs are detected through clinical trials. To obtain a comprehensive collection of ADRs, sufficient experimental samples and time are required before a drug comes to market, which is not realistic. Moreover, even if extensive clinical trials are performed, many undetected ADRs might still be discovered after a drug is released to the market. ADRs can lead to disastrous consequences, which creates a dramatically increased need for precise prediction of potential ADRs as early as possible. In this paper, we propose an encoder-decoder framework based on an attention mechanism and the long short-term memory (LSTM) model to predict potential ADRs. We regard the prediction of ADRs as a sequence-to-sequence problem and improve the attention-based encoder-decoder framework to learn the interrelationships between ADRs. Unlike other classical methods that utilize molecular drug structures, our model is based solely on ADRs, making it an independent but parallel approach to traditional methods. We use a masking method to generate the target data and 5-fold cross-validation to verify the performance of our proposed model. Based on the Top-k accuracy test results, our model outperforms the baseline models in predicting potential ADRs.
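A minimal sketch of one way such mask-based target generation could look: part of a drug's known ADR sequence is hidden and used as the decoder target, while the visible reactions form the encoder input. The split ratio and ADR names below are illustrative assumptions, not the paper's exact procedure.

```python
# Hide a fraction of a drug's known ADRs and treat them as the prediction target.
import random

def make_example(adr_sequence, mask_ratio=0.3, seed=None):
    rng = random.Random(seed)
    n_masked = max(1, int(len(adr_sequence) * mask_ratio))
    masked_idx = set(rng.sample(range(len(adr_sequence)), n_masked))
    source = [a for i, a in enumerate(adr_sequence) if i not in masked_idx]
    target = [a for i, a in enumerate(adr_sequence) if i in masked_idx]
    return source, target  # encoder input, decoder target

print(make_example(["nausea", "headache", "rash", "dizziness", "fatigue"], seed=0))
```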
Previous work in slogan generation focused on utilising slogan skeletons mined from existing slogans. While some generated slogans can be catchy, they are often not coherent with the company's focus or style across their marketing communications because the skeletons are mined from other companies' slogans. We propose a sequence-to-sequence (seq2seq) Transformer model to generate slogans from a brief company description. A naive seq2seq model fine-tuned for slogan generation is prone to introducing false information. We use company name delexicalisation and entity masking to alleviate this problem and improve the generated slogans' quality and truthfulness. Furthermore, we apply conditional training based on the first words' part-of-speech tag to generate syntactically diverse slogans. Our best model achieved a ROUGE-1/-2/-L F-1 score of 35.58/18.47/33.32. Besides, automatic and human evaluations indicate that our method generates significantly more factual, diverse and catchy slogans than strong long short-term memory and Transformer seq2seq baselines.
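As an illustration of company-name delexicalisation and entity masking, the sketch below replaces the company name with a placeholder token and masks other named entities that commonly cause factual errors. It assumes spaCy with the en_core_web_sm model installed and uses illustrative placeholder tokens; it is not the paper's exact preprocessing.

```python
# Delexicalise the company name and mask entities before feeding text to the seq2seq model.
import re
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

def delexicalise(description, company_name):
    # replace the company name with a placeholder so the model cannot copy it wrongly
    text = re.sub(re.escape(company_name), "<company>", description, flags=re.IGNORECASE)
    # mask remaining named entities that often lead to hallucinated facts
    doc = nlp(text)
    for ent in reversed(doc.ents):  # reversed so character offsets stay valid
        if ent.label_ in {"GPE", "DATE", "CARDINAL", "ORG"}:
            text = text[:ent.start_char] + f"<{ent.label_.lower()}>" + text[ent.end_char:]
    return text

print(delexicalise("Acme Corp has served Boston since 1998.", "Acme Corp"))
# e.g. "<company> has served <gpe> since <date>."
```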
Named entity recognition (NER) is a fundamental task in natural language processing, which aims to detect mentions of real-world entities in text and classify them into predefined types. Recently, research on overlapped and discontinuous named entity recognition has received increasing attention. However, we note that few studies have considered both overlapped and discontinuous entities. In this paper, we propose a novel sequence-to-sequence model based on machine reading comprehension that is capable of recognizing both overlapped and discontinuous entities. The model uses the machine reading comprehension formulation to encode prior information about the entity category. The input sequence then passes through a question-answering model to predict the mention relevance of the given source sentences to the query. Finally, we incorporate the mention relevance into the BART-based generation model. We conducted experiments on three types of NER datasets to show the generality of our model. The experimental results demonstrate that our model beats almost all current top-performing baselines and achieves a substantial performance boost over current SOTA models on overlapped and discontinuous NER datasets.
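A small sketch of the machine reading comprehension formulation referred to above: each entity category is turned into a natural-language query that is paired with the source sentence, so the encoder sees explicit category information. The query wordings and separator token are illustrative assumptions.

```python
# Build one (query, sentence) input per entity category in the MRC-style formulation.
QUERIES = {
    "PER": "Which words refer to a person?",
    "ORG": "Which words refer to an organization?",
    "LOC": "Which words refer to a location?",
}

def build_mrc_inputs(sentence):
    return [(f"{query} [SEP] {sentence}", label) for label, query in QUERIES.items()]

for text, label in build_mrc_inputs("Barack Obama visited Microsoft in Seattle."):
    print(label, "->", text)
```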
Dialogue data usually consist of pairs of a query and its response, but no previous response generators have exploited the responses explicitly during training, even though a response provides significant information about the meaning of a query. Therefore, this paper proposes a sequence-to-sequence response generator with a response-aware encoder. The proposed generator exploits golden responses by reflecting them in the query representation. For this purpose, the response-aware encoder adds a relevancy scorer layer to the Transformer encoder that calculates the relevancy of query tokens to a response. However, golden responses are available only during training of the response generator and are unavailable at inference time. As a solution to this problem, joint learning of a teacher and a student relevancy scorer is adopted. That is, at training time both the teacher and the student relevancy scorers are optimized, but the decoder generates a response using only the relevancy of the teacher scorer, whereas at inference time the decoder uses that of the student scorer. Since the student scorer is trained to minimize its difference from the teacher scorer, it can be used to compute the relevancy of a prospective response. The proposed model is the first attempt to use a golden response directly to generate a query representation, whereas previous studies reflected the responses only implicitly and indirectly. As a result, it achieves a higher dialogue evaluation score than the current state-of-the-art model on the Reddit, Persona-Chat, and DailyDialog datasets.
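The sketch below illustrates the teacher/student relevancy idea under simplifying assumptions: the teacher scores query tokens against a pooled representation of the golden response, the student predicts relevancy from the query tokens alone, and a distillation loss pulls the student toward the teacher so it can stand in at inference time. Module shapes and losses are illustrative, not the paper's architecture.

```python
# Joint teacher/student relevancy scoring with a simple MSE distillation loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

hidden = 64
teacher = nn.Bilinear(hidden, hidden, 1)   # relevancy of a query token to the response
student = nn.Linear(hidden, 1)             # predicts relevancy from the query token alone

query_tokens = torch.randn(12, hidden)     # encoder states of the query
response_repr = torch.randn(hidden)        # pooled representation of the golden response

teacher_rel = torch.sigmoid(teacher(query_tokens, response_repr.expand(12, hidden)))
student_rel = torch.sigmoid(student(query_tokens))

distill_loss = F.mse_loss(student_rel, teacher_rel.detach())
# total loss = generation loss (which uses teacher_rel) + distill_loss; at inference
# the decoder falls back to student_rel because the golden response is unavailable.
print(distill_loss.item())
```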
ISBN (print): 9798350349122; 9798350349115
Machine translation is the process of using computers to convert one natural language into another, shouldering the important task of building a bridge for language communication. It has long been an active research direction in natural language processing. As the latest paradigm of machine translation, neural machine translation relies entirely on a neural network to perform the translation from the source language to the target language. Thanks to the development of artificial intelligence, rich research results have been achieved in recent years, effectively alleviating the bottleneck problems of statistical machine translation. This paper first compares neural machine translation with other machine translation methods, then introduces the mainstream neural machine translation models, and finally discusses the problems and challenges faced by neural machine translation.
ISBN (print): 9798350359329; 9798350359312
Chinese Spelling Check (CSC) and Chinese Grammatical Error Correction (CGEC) are two important and challenging tasks in the Natural Language Processing (NLP) field. The former aims to detect and correct Chinese misspellings, while the latter focuses on grammatical errors in sentences. Existing methods treat them as two separate tasks, sequence labeling and conditional text generation respectively. As a consequence, a single encoder is typically selected as the backbone network for the CSC task, whereas an encoder-decoder structure becomes a requisite for the CGEC task. However, in real-world applications it is inefficient for a system to first determine whether an input sentence contains spelling or grammatical errors and then select different models according to that decision. In this paper, to address these two tasks effectively, we propose a unified approach, denoted as UCSC-CGEC, based on a standard Transformer encoder-decoder structure. Notably, we choose to use a recent dataset named CSCD-IME instead of SIGHAN to ensure higher data quality for the CSC task. Additionally, to reduce the training difficulty and enhance generation quality, we introduce a copy mechanism. Furthermore, to improve training efficiency and reduce cost, we adopt AdaLoRA, a Parameter-Efficient Fine-Tuning (PEFT) method, rather than fine-tuning the model with the entire parameter set during the training phase. Experiments are conducted on the CSCD-IME and NLPCC2018 datasets, and the results indicate the superiority of our approach when compared to all baseline models.
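The copy mechanism mentioned above can be illustrated with the standard pointer-generator-style mixing of a vocabulary distribution and the attention distribution over source tokens; the sketch below is a generic illustration of that idea, not the UCSC-CGEC implementation, and the sizes and the way p_gen is obtained are illustrative.

```python
# Mix the decoder's vocabulary distribution with copied attention mass over source tokens.
import torch
import torch.nn.functional as F

def copy_distribution(vocab_logits, attn_weights, src_token_ids, p_gen, vocab_size):
    vocab_dist = p_gen * F.softmax(vocab_logits, dim=-1)
    copy_dist = torch.zeros(vocab_size)
    # scatter attention mass back onto the vocabulary ids of the source tokens
    copy_dist.index_add_(0, src_token_ids, (1 - p_gen) * attn_weights)
    return vocab_dist + copy_dist

vocab_size, src_len = 100, 8
dist = copy_distribution(
    vocab_logits=torch.randn(vocab_size),
    attn_weights=F.softmax(torch.randn(src_len), dim=-1),
    src_token_ids=torch.randint(0, vocab_size, (src_len,)),
    p_gen=torch.tensor(0.7),
    vocab_size=vocab_size,
)
print(dist.sum())  # ~1.0
```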
ISBN (print): 9798350354966; 9798350354959
This paper presents a respiratory sound compression and reconstruction method based on a convolutional auto-encoder. By utilizing convolutional and transpose convolutional layers, the model can process variable-length sound waveforms, which is an important feature for data transmission from edge-based medical devices to a cloud server, and can reconstruct the signal with high fidelity. This work shows that utilizing a non-variational latent space for respiratory sound compression yields a smaller reconstruction error compared to other state-of-the-art solutions. Additionally, this work proposes a new composite loss function to guide the network training. Tested on the BioCAS 2024 Grand Challenge dataset, this method achieves a Percent Root Mean Square Difference (PRD) of 0.2230, a Correlation Coefficient (CC) of 0.972, and a Signal-to-Noise Ratio Loss (SNRL) of -0.7129 dB at a compression rate of 222.
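A rough sketch of a fully convolutional 1-D auto-encoder, which accepts variable-length inputs because it contains only (transpose) convolutions, together with an illustrative composite loss that combines a time-domain and a spectral term. Channel counts, strides, and the loss weighting are assumptions, not the paper's settings.

```python
# Fully convolutional auto-encoder for 1-D waveforms with an illustrative composite loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(32, 4, kernel_size=9, stride=4, padding=4),  # compressed code
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose1d(4, 32, kernel_size=9, stride=4, padding=4, output_padding=3), nn.ReLU(),
            nn.ConvTranspose1d(32, 16, kernel_size=9, stride=4, padding=4, output_padding=3), nn.ReLU(),
            nn.ConvTranspose1d(16, 1, kernel_size=9, stride=4, padding=4, output_padding=3),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

def composite_loss(x, x_hat, alpha=0.5):
    # illustrative composite objective: time-domain MSE plus a spectral magnitude term
    mse = F.mse_loss(x_hat, x)
    spec = F.l1_loss(torch.stft(x_hat.squeeze(1), 256, return_complex=True).abs(),
                     torch.stft(x.squeeze(1), 256, return_complex=True).abs())
    return mse + alpha * spec

model = ConvAE()
wave = torch.randn(1, 1, 4096)          # any length divisible by the total stride (64)
print(model(wave).shape)                # torch.Size([1, 1, 4096])
print(composite_loss(wave, model(wave)).item())
```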