Background and Objective: Adverse drug reactions (ADRs) pose a serious threat to patient health, potentially resulting in severe consequences, including mortality. Accurate prediction of ADRs before drug market release is crucial for early prevention. Traditional ADR detection, which relies on clinical trials and voluntary reporting, has inherent limitations: clinical trials struggle to capture rare and long-term reactions due to scale and time constraints, while voluntary reporting tends to neglect mild and common reactions. Consequently, marketed drugs may carry unknown risks, creating a growing demand for more accurate pre-market prediction of ADRs. This study aims to develop a more accurate model for predicting ADRs prior to drug market release. Methods: We frame the ADR prediction task as a sequence-to-sequence problem and propose the Bio-K-Transformer, which integrates the Transformer model with pre-trained models (i.e., Bio_ClinicalBERT and K-BERT) to forecast potential ADRs. We enhance the attention mechanism of the Transformer encoder and adjust the embedding layers to model the diverse relationships among drug adverse reactions. Additionally, we employ a masking technique to handle the target data. Results: Experimental findings demonstrate a notable improvement in predicting potential adverse reactions, achieving a predictive accuracy of 90.08%. The model significantly exceeds current state-of-the-art baselines, including the fine-tuned Llama-3.1-8B and Llama3-Aloe-8B-Alpha models, while being cost-effective, and identifies potential adverse reactions with high precision, sensitivity, and specificity. Conclusion: The Bio-K-Transformer significantly enhances the prediction of ADRs, offering a cost-effective method with strong potential for improving pre-market safety evaluations of pharmaceuticals.
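The abstract describes the masking technique for target data only at a high level. A minimal sketch of a standard causal (look-ahead) mask, one plausible reading of how target positions could be hidden during decoding; the exact mechanism is an assumption, not the paper's published scheme:

```python
def causal_mask(seq_len):
    """Lower-triangular additive mask: position i may attend only to
    positions <= i. Entries are 0.0 (visible) or -inf (hidden) and are
    added to the raw attention scores before the softmax."""
    return [[0.0 if j <= i else float("-inf") for j in range(seq_len)]
            for i in range(seq_len)]

mask = causal_mask(4)
# Row 0 can see only position 0; row 3 can see all four positions.
```

Adding this matrix to the attention scores drives the softmax weight of every future position to zero, so each target token is predicted without peeking ahead.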
Text summarization has evolved over a period of time in various domains and benefits most professionals and researchers. To provide salient summarization in a short span of time, various approaches to text summarizati...
ISBN (print): 9781450391320
Abbreviations, often used in daily communication, play an important role in natural language processing. Most existing studies treat Chinese abbreviation prediction as a sequence labeling problem. However, sequence labeling models usually ignore label dependencies during abbreviation prediction, even though the label predicted for each character should be conditioned on its previous labels. In this paper, we propose to formalize the Chinese abbreviation prediction task as a sequence generation problem and design a novel sequence-to-sequence model. To boost the performance of our deep model, we further propose a multi-level pre-trained model that incorporates character-, word-, and concept-level embeddings. To evaluate our methods, we automatically build a new dataset for Chinese abbreviation prediction containing 81,351 pairs of full forms and abbreviations. Finally, we conduct extensive experiments on a public dataset and the built dataset, and the results on both show that our model outperforms state-of-the-art methods. More importantly, we build a large-scale database for a specific domain, i.e., life services in Meituan Inc., with a high accuracy of about 82.7%; it contains 4,134,142 pairs of full forms and abbreviations. Online A/B testing on the Meituan APP and Dianping APP shows that the Click-Through Rate increases by 0.59% and 0.86%, respectively, when the built database is used in the search system. We have released our API on http://***/ddemos/abbr/ , which has received over 87k API calls in 9 months.
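The contrast between the labeling and generation formulations can be made concrete. A minimal sketch of the labeling view; the example pair and the Keep/Drop label scheme are illustrative only, not the paper's dataset format:

```python
def abbreviate_by_labels(full_form, labels):
    """Sequence-labeling view: each character is independently marked
    Keep ("K") or Drop ("D"); dependencies between labels are not
    modeled, which is the limitation the paper targets."""
    return "".join(ch for ch, lab in zip(full_form, labels) if lab == "K")

# Illustrative pair: "北京大学" (Peking University) -> "北大".
abbr = abbreviate_by_labels("北京大学", ["K", "D", "K", "D"])
```

A sequence-generation model instead emits the abbreviation character by character, so each output step can condition on the characters generated so far rather than on independent per-character labels.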
In this paper, we present a novel lemmatization method based on a sequence-to-sequence neural network architecture and morphosyntactic context representation. In the proposed method, our context-sensitive lemmatizer generates the lemma one character at a time based on the surface form characters and the morphosyntactic features obtained from a morphological tagger. We argue that a sliding-window context representation suffers from sparseness, while in the majority of cases the morphosyntactic features of a word carry enough information to resolve lemma ambiguities, keeping the context representation dense and more practical for machine learning systems. Additionally, we study two different data augmentation methods utilizing autoencoder training and morphological transducers, which are especially beneficial for low-resource languages. We evaluate our lemmatizer on 52 different languages and 76 different treebanks, showing that our system outperforms the latest baseline systems. Compared to the best overall baseline, UDPipe Future, our system outperforms it on 62 out of 76 treebanks, reducing errors by 19% relative on average. The lemmatizer, together with all trained models, is made available as part of the Turku-neural-parsing-pipeline under the Apache 2.0 license.
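The dense context representation the abstract argues for can be sketched: tagger-produced morphosyntactic features replace a sparse sliding window of neighboring characters. The separator token and feature names below follow Universal Dependencies conventions but are assumptions, not the paper's exact input format:

```python
def build_input(surface, morph_feats):
    """Form the encoder input for one word: its surface characters
    followed by the morphosyntactic feature tags from a tagger. The
    decoder would then emit the lemma one character at a time."""
    return list(surface) + ["<sep>"] + morph_feats

# Features such as Number=Plur disambiguate the lemma without any
# window over neighboring words.
tokens = build_input("dogs", ["POS=NOUN", "Number=Plur"])
```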
Satellite image time series in the optical and infrared spectrum suffer from frequent data gaps due to cloud cover, cloud shadows, and temporary sensor outages. How best to reconstruct the missing pixel values and obtain complete, cloud-free image sequences has been a long-standing problem in remote sensing research. We approach that problem from the perspective of representation learning and develop U-TILISE, an efficient neural model that is able to implicitly capture spatio-temporal patterns of the spectral intensities and can therefore be trained to map a cloud-masked input sequence to a cloud-free output sequence. The model consists of a convolutional spatial encoder that maps each individual frame of the input sequence to a latent encoding; an attention-based temporal encoder that captures dependencies between those per-frame encodings and lets them exchange information along the time dimension; and a convolutional spatial decoder that decodes the latent embeddings back into multi-spectral images. We experimentally evaluate the proposed model on EarthNet2021, a dataset of Sentinel-2 time series acquired all over Europe, and demonstrate its superior ability to reconstruct the missing pixels. Compared to a standard interpolation baseline, it increases the PSNR by 1.8 dB at previously seen locations and by 1.3 dB at unseen locations.
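The training setup implied by the abstract, mapping a cloud-masked sequence to a complete one, can be sketched as follows; the fill value and the nested-list image format are assumptions for illustration, not U-TILISE's actual preprocessing:

```python
def apply_cloud_mask(frames, masks, fill=0.0):
    """Build the masked input sequence: replace pixels flagged as
    cloud-contaminated with a fill value. `frames` is a list of 2-D
    images (lists of rows); `masks` has the same shape with True
    marking cloudy pixels. The model is trained to recover the
    original, unmasked sequence from this input."""
    out = []
    for frame, mask in zip(frames, masks):
        out.append([[fill if m else v for v, m in zip(row, mrow)]
                    for row, mrow in zip(frame, mask)])
    return out

# One 1x2-pixel frame; the second pixel is flagged as cloudy.
masked = apply_cloud_mask([[[1.0, 2.0]]], [[[False, True]]])
```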
ISBN (print): 9781728191423
Recently, the use of deep reinforcement learning (DRL) techniques has attracted increasing interest due to their ability to dynamically control traffic signals at multiple intersections. Only a few studies use centralized control with a single agent to intelligently control all the signals, because the major problem, i.e., the curse of dimensionality, has not been successfully solved. We propose a novel centralized control method based on a sequence-to-sequence model and an attention mechanism to deal with this problem. The idea is similar to the divide-and-conquer paradigm: we mitigate the difficulty of searching the huge space by dividing the state and action space into sub-spaces through the sequence-to-sequence model. In addition, we greatly facilitate the communication and cooperation of traffic signals among intersections by introducing the attention mechanism. The DRL agent is trained by Proximal Policy Optimization, an efficient on-policy learning method. To the best of our knowledge, we are the first to use a sequence-to-sequence method to deal with the huge search space in traffic control. Comprehensive experiments demonstrate that our method efficiently mitigates the curse of dimensionality and outperforms both traditional methods and other DRL-based centralized control methods.
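The divide-and-conquer idea, emitting one intersection's action at a time while conditioning on the actions already chosen, can be sketched; the toy policy below is purely illustrative:

```python
def decode_actions(intersections, policy):
    """Seq2seq-style action decoding: instead of one joint action over
    all intersections (an exponentially large space), actions are
    emitted sequentially, each conditioned on the actions chosen so
    far. `policy` maps (intersection, previous_actions) -> phase."""
    actions = []
    for i in intersections:
        actions.append(policy(i, tuple(actions)))
    return actions

# Toy policy: alternate phases based on how many actions precede this one.
phases = decode_actions(["A", "B", "C"], lambda i, prev: len(prev) % 2)
```

For k intersections with n phases each, this turns one n^k joint choice into k sequential n-way choices, which is the dimensionality reduction the abstract describes.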
With the emergence of deep learning, the attention of researchers has increased significantly towards abstractive text summarization approaches. Though extractive text summarization (ETS) is an important approach, the generated summaries are not always coherent. This paper mainly focuses on the abstractive text summarization (ATS) approach for the Telugu language to generate coherent summaries. The majority of ATS research has been conducted in English, while no significant research in Telugu has been documented. An abstractive Telugu text summarization model based on a sequence-to-sequence (seq2seq) encoder-decoder architecture is proposed in this paper. The seq2seq model is implemented with a bidirectional long short-term memory (Bi-LSTM) based encoder and a long short-term memory (LSTM) based decoder. Existing ATS approaches have some drawbacks: they cannot handle out-of-vocabulary words, they suffer from an attention deficiency issue when handling long text sequences, and they exhibit a repetition problem. To overcome these issues, operating mechanisms such as a pointer generator network, a temporal attention mechanism, and a coverage mechanism are integrated into the proposed model. Besides, a diverse beam search decoding algorithm is employed to increase the diversity of the generated summaries. Thus, the proposed seq2seq model combines a Bi-LSTM and LSTM based encoder-decoder, a pointer generator network, a temporal attention mechanism, a coverage mechanism, and a diverse beam search decoding algorithm. The performance of the proposed work is evaluated using the ROUGE toolkit in terms of F-measure, recall, and precision. The experimental results are compared with those of existing methods to show that the proposed ATS model outperforms existing Telugu text summarization models.
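The pointer-generator network mentioned above has a well-known form: a generation probability blends the decoder's vocabulary distribution with the attention distribution over source tokens, letting the model copy out-of-vocabulary words from the input. A minimal sketch; the example words and probabilities are invented:

```python
def final_distribution(p_gen, vocab_dist, attn_dist):
    """Pointer-generator blend: with probability p_gen generate from
    the vocabulary distribution, otherwise copy a source token via the
    attention distribution. Both inputs are word -> probability maps."""
    out = {}
    for w, p in vocab_dist.items():
        out[w] = out.get(w, 0.0) + p_gen * p
    for w, p in attn_dist.items():
        out[w] = out.get(w, 0.0) + (1.0 - p_gen) * p
    return out

# "Telugu" is absent from the vocabulary but present in the source,
# so it can still receive probability mass via the copy path.
dist = final_distribution(0.8, {"the": 0.6, "cat": 0.4}, {"Telugu": 1.0})
```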
A multi-stage attack is a sophisticated intrusion strategy that has been widely used for penetrating well-protected network infrastructures. To detect such attacks, state-of-the-art research advocates the use of the hidden Markov model (HMM). However, although HMMs can model the relationships and dependencies among different alerts and stages for detection, they cannot handle well the stage dependencies buried in longer sequences of alerts. In this paper, we tackle the challenge of the stages' long-term dependency and propose a new detection solution using a sequence-to-sequence (seq2seq) model. The basic idea is to encode a sequence of alerts (i.e., the detector's observations) into a latent feature vector using a long short-term memory (LSTM) network and then decode this vector into a sequence of predicted attack stages with another LSTM. Through this encoder-decoder collaboration, we decouple the local constraint between the observed alerts and the potential attack stages and are thus able to use the full knowledge of all the alerts to detect stages on a sequence basis. With the LSTM, the model can learn to "forget" irrelevant alerts and thereby has more opportunity to "remember" the long-term dependency between different stages for sequence detection. To evaluate our model's effectiveness, we have conducted extensive experiments using four public datasets, all of which include simulated or reconstructed samples of real-world multi-stage attacks in controlled testbeds. Our results confirm the better detection performance of our model compared with previous HMM solutions. (c) 2021 Elsevier Ltd. All rights reserved.
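The data layout for such a seq2seq detector can be sketched, assuming one predicted stage per alert; the alert and stage names below are invented for illustration:

```python
def make_training_pair(alerts, stages):
    """One training example: an alert sequence aligned with its stage
    sequence. Unlike an HMM's local emissions, the seq2seq model's
    prediction at step t may draw on alerts anywhere in the sequence,
    which is how it captures long-term stage dependencies."""
    assert len(alerts) == len(stages), "one stage label per alert assumed"
    return list(alerts), list(stages)

src, tgt = make_training_pair(
    ["port_scan", "brute_force", "shell_upload"],
    ["recon", "gain_access", "exploit"],
)
```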
ISBN (print): 9783030649838; 9783030649845
The choices of neural network model and data representation, a mapping between musical notation and input signals for a neural network, have emerged as a major challenge in creating convincing models for melody generation. Music generation can inspire creativity in artists and the general public, but choosing a proper data representation is complicated because the same musical piece can be presented in a range of expressive ways. In this paper, we compare three different data representations on the task of generating melodies with a sequence-to-sequence model, which generates melodies of flexible length, to explore how they affect the quality of the generated music. The three representations are: a monophonic representation, playing one note at a time; a polyphonic representation, indicating simultaneous notes; and a complex polyphonic representation, extending the polyphonic representation with dynamics. The influence of the three data representations on the generated performances is compared and evaluated through mathematical analysis and human-centered evaluation. The results show that different data representations fed into the same model endow the generated music with different features: the monophonic representation makes the music sound more melodious to human ears, the polyphonic representation provides expressiveness, and the complex polyphonic representation guarantees the complexity of the generated music.
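The three representations can be made concrete with toy encodings; the pitch and velocity formats below are illustrative assumptions, not the paper's exact token scheme:

```python
def monophonic(notes):
    """One note per time step: keep only the highest pitch of each chord."""
    return [max(chord) for chord in notes]

def polyphonic(notes):
    """Simultaneous notes per step, encoded as a sorted pitch tuple."""
    return [tuple(sorted(chord)) for chord in notes]

def complex_polyphonic(notes, velocities):
    """Polyphonic plus dynamics: pair each step's pitches with a velocity."""
    return [(tuple(sorted(c)), v) for c, v in zip(notes, velocities)]

piece = [[60, 64, 67], [62]]              # C major chord, then a single D
mono = monophonic(piece)                  # loses the chord's inner voices
cp = complex_polyphonic(piece, [80, 60])  # keeps notes and loudness
```

The same two time steps thus yield three inputs of increasing richness, which is the trade-off the paper evaluates.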
When robots are used to manipulate objects in various ways, they often have to consider dynamic constraints. Machine learning is a good candidate for such complex trajectory planning problems. However, a learned model sometimes fails to satisfy the task objectives, either because the objective changes or because there is no guarantee that the objective functions will be satisfied. To overcome this issue, we applied a method of trajectory deformation using sequence-to-sequence (seq2seq) models. We propose a method of adjusting the generated trajectories by utilizing the architecture of seq2seq models: the proposed method optimizes the latent variables of the seq2seq models, instead of the trajectories themselves, to minimize the given objective functions. The verification results show that using latent variables yields the desired trajectories faster than direct optimization of the trajectories.
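The core idea, optimizing the latent vector rather than the trajectory itself, can be sketched as follows. A simple coordinate search stands in for whatever optimizer the paper actually uses, and the decoder and objective here are toy stand-ins:

```python
def optimize_latent(decode, objective, z0, step=0.1, iters=200):
    """Adjust the latent vector z so the decoded trajectory minimizes
    the objective. The trajectory is never optimized directly; only z
    changes, and `decode` (the seq2seq decoder) maps it to a trajectory."""
    z = list(z0)
    best = objective(decode(z))
    for _ in range(iters):
        improved = False
        for i in range(len(z)):
            for delta in (step, -step):
                cand = list(z)
                cand[i] += delta
                val = objective(decode(cand))
                if val < best:
                    z, best, improved = cand, val, True
        if not improved:
            step *= 0.5  # refine the search once no move helps
    return z, best

# Toy decoder: the trajectory equals the latent; the objective pulls
# the decoded trajectory toward the point [1, 2].
z, cost = optimize_latent(lambda z: z,
                          lambda t: (t[0] - 1) ** 2 + (t[1] - 2) ** 2,
                          [0.0, 0.0])
```

Because the latent space is typically much lower-dimensional than the trajectory space, each search step is cheaper and the decoder keeps the result on the manifold of feasible trajectories.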