Background and Objective: Adverse drug reactions (ADRs) pose a serious threat to patient health, potentially resulting in severe consequences, including mortality. Accurate prediction of ADRs before drug market release is crucial for early prevention. Traditional ADR detection, which relies on clinical trials and voluntary reporting, has inherent limitations: clinical trials struggle to capture rare and long-term reactions due to scale and time constraints, while voluntary reporting tends to neglect mild and common reactions. Consequently, marketed drugs may carry unknown risks, creating a growing demand for more accurate pre-market prediction of ADRs. This study aims to develop a more accurate model for predicting ADRs prior to drug market release. Methods: We frame the ADR prediction task as a sequence-to-sequence problem and propose the Bio-K-Transformer, which integrates the Transformer model with pre-trained models (i.e., Bio_ClinicalBERT and K-BERT) to forecast potential ADRs. We enhance the attention mechanism of the Transformer encoder and adjust the embedding layers to model the diverse relationships among drug adverse reactions. Additionally, we employ a masking technique to handle the target data. Results: Experimental findings demonstrate a notable improvement in predicting potential adverse reactions, achieving a predictive accuracy of 90.08%. The model significantly exceeds current state-of-the-art baselines, including the fine-tuned Llama-3.1-8B and Llama3-Aloe-8B-Alpha models, while being cost-effective, and identifies potential adverse reactions with high precision, sensitivity, and specificity. Conclusion: The Bio-K-Transformer significantly enhances the prediction of ADRs, offering a cost-effective method with strong potential for improving pre-market safety evaluations of pharmaceuticals.
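The abstract describes the masking technique for target data only at a high level. A minimal sketch of a standard causal (look-ahead) mask, one plausible reading of how target positions could be hidden during decoding; the exact mechanism is an assumption, not the paper's published scheme:

```python
def causal_mask(seq_len):
    """Lower-triangular additive mask: position i may attend only to
    positions <= i. Entries are 0.0 (visible) or -inf (hidden) and are
    added to the raw attention scores before the softmax."""
    return [[0.0 if j <= i else float("-inf") for j in range(seq_len)]
            for i in range(seq_len)]

mask = causal_mask(4)
# Row 0 can see only position 0; row 3 can see all four positions.
```

Adding this matrix to the attention scores drives the softmax weight of every future position to zero, so each target token is predicted without peeking ahead.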
Text summarization has evolved over a period of time in various domains and benefits most professionals and researchers. To provide salient summarization in a short span of time, various approaches to text summarizati...
ISBN (print): 9781450391320
Abbreviations, often used in daily communication, play an important role in natural language processing. Most existing studies treat Chinese abbreviation prediction as a sequence labeling problem. However, sequence labeling models usually ignore label dependencies during abbreviation prediction, even though the label predicted for each character should be conditioned on its previous labels. In this paper, we propose to formalize the Chinese abbreviation prediction task as a sequence generation problem and design a novel sequence-to-sequence model. To boost the performance of our deep model, we further propose a multi-level pre-trained model that incorporates character-, word-, and concept-level embeddings. To evaluate our methods, we automatically build a new dataset for Chinese abbreviation prediction containing 81,351 pairs of full forms and abbreviations. Finally, we conduct extensive experiments on a public dataset and the built dataset, and the results on both show that our model outperforms state-of-the-art methods. More importantly, we build a large-scale database for a specific domain, i.e., life services in Meituan Inc., with a high accuracy of about 82.7%; it contains 4,134,142 pairs of full forms and abbreviations. Online A/B testing on the Meituan APP and Dianping APP shows that the Click-Through Rate increases by 0.59% and 0.86%, respectively, when the built database is used in the search system. We have released our API on http://***/ddemos/abbr/ , which has received over 87k API calls in 9 months.
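The contrast between the labeling and generation formulations can be made concrete. A minimal sketch of the labeling view; the example pair and the Keep/Drop label scheme are illustrative only, not the paper's dataset format:

```python
def abbreviate_by_labels(full_form, labels):
    """Sequence-labeling view: each character is independently marked
    Keep ("K") or Drop ("D"); dependencies between labels are not
    modeled, which is the limitation the paper targets."""
    return "".join(ch for ch, lab in zip(full_form, labels) if lab == "K")

# Illustrative pair: "北京大学" (Peking University) -> "北大".
abbr = abbreviate_by_labels("北京大学", ["K", "D", "K", "D"])
```

A sequence-generation model instead emits the abbreviation character by character, so each output step can condition on the characters generated so far rather than on independent per-character labels.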
In this paper, we present a novel lemmatization method based on a sequence-to-sequence neural network architecture and morphosyntactic context representation. In the proposed method, our context-sensitive lemmatizer generates the lemma one character at a time based on the surface form characters and the morphosyntactic features obtained from a morphological tagger. We argue that a sliding-window context representation suffers from sparseness, while in the majority of cases the morphosyntactic features of a word carry enough information to resolve lemma ambiguities, keeping the context representation dense and more practical for machine learning systems. Additionally, we study two different data augmentation methods utilizing autoencoder training and morphological transducers, which are especially beneficial for low-resource languages. We evaluate our lemmatizer on 52 different languages and 76 different treebanks, showing that our system outperforms the latest baseline systems. Compared to the best overall baseline, UDPipe Future, our system outperforms it on 62 out of 76 treebanks, reducing errors by 19% relative on average. The lemmatizer, together with all trained models, is made available as part of the Turku-neural-parsing-pipeline under the Apache 2.0 license.
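The dense context representation the abstract argues for can be sketched: tagger-produced morphosyntactic features replace a sparse sliding window of neighboring characters. The separator token and feature names below follow Universal Dependencies conventions but are assumptions, not the paper's exact input format:

```python
def build_input(surface, morph_feats):
    """Form the encoder input for one word: its surface characters
    followed by the morphosyntactic feature tags from a tagger. The
    decoder would then emit the lemma one character at a time."""
    return list(surface) + ["<sep>"] + morph_feats

# Features such as Number=Plur disambiguate the lemma without any
# window over neighboring words.
tokens = build_input("dogs", ["POS=NOUN", "Number=Plur"])
```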
Satellite image time series in the optical and infrared spectrum suffer from frequent data gaps due to cloud cover, cloud shadows, and temporary sensor outages. How best to reconstruct the missing pixel values and obtain complete, cloud-free image sequences has been a long-standing problem in remote sensing research. We approach that problem from the perspective of representation learning and develop U-TILISE, an efficient neural model that is able to implicitly capture spatio-temporal patterns of the spectral intensities and can therefore be trained to map a cloud-masked input sequence to a cloud-free output sequence. The model consists of a convolutional spatial encoder that maps each individual frame of the input sequence to a latent encoding; an attention-based temporal encoder that captures dependencies between those per-frame encodings and lets them exchange information along the time dimension; and a convolutional spatial decoder that decodes the latent embeddings back into multi-spectral images. We experimentally evaluate the proposed model on EarthNet2021, a dataset of Sentinel-2 time series acquired all over Europe, and demonstrate its superior ability to reconstruct the missing pixels. Compared to a standard interpolation baseline, it increases the PSNR by 1.8 dB at previously seen locations and by 1.3 dB at unseen locations.
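The training setup implied by the abstract, mapping a cloud-masked sequence to a complete one, can be sketched as follows; the fill value and the nested-list image format are assumptions for illustration, not U-TILISE's actual preprocessing:

```python
def apply_cloud_mask(frames, masks, fill=0.0):
    """Build the masked input sequence: replace pixels flagged as
    cloud-contaminated with a fill value. `frames` is a list of 2-D
    images (lists of rows); `masks` has the same shape with True
    marking cloudy pixels. The model is trained to recover the
    original, unmasked sequence from this input."""
    out = []
    for frame, mask in zip(frames, masks):
        out.append([[fill if m else v for v, m in zip(row, mrow)]
                    for row, mrow in zip(frame, mask)])
    return out

# One 1x2-pixel frame; the second pixel is flagged as cloudy.
masked = apply_cloud_mask([[[1.0, 2.0]]], [[[False, True]]])
```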
ISBN (print): 9781728191423
Recently, the use of deep reinforcement learning (DRL) techniques has attracted increasing interest due to their ability to dynamically control traffic signals at multiple intersections. Only a few studies use centralized control with a single agent to intelligently control all the signals, because the major problem, i.e., the curse of dimensionality, has not been successfully solved. We propose a novel centralized control method based on a sequence-to-sequence model and an attention mechanism to deal with this problem. The idea is similar to the divide-and-conquer paradigm: we mitigate the difficulty of searching the huge space by dividing the state and action space into sub-spaces through the sequence-to-sequence model. In addition, we greatly facilitate the communication and cooperation of traffic signals among intersections by introducing the attention mechanism. The DRL agent is trained by Proximal Policy Optimization, an efficient on-policy learning method. To the best of our knowledge, we are the first to use a sequence-to-sequence method to deal with the huge search space in traffic control. Comprehensive experiments demonstrate that our method efficiently mitigates the curse of dimensionality and outperforms both traditional methods and other DRL-based centralized control methods.
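The divide-and-conquer idea, emitting one intersection's action at a time while conditioning on the actions already chosen, can be sketched; the toy policy below is purely illustrative:

```python
def decode_actions(intersections, policy):
    """Seq2seq-style action decoding: instead of one joint action over
    all intersections (an exponentially large space), actions are
    emitted sequentially, each conditioned on the actions chosen so
    far. `policy` maps (intersection, previous_actions) -> phase."""
    actions = []
    for i in intersections:
        actions.append(policy(i, tuple(actions)))
    return actions

# Toy policy: alternate phases based on how many actions precede this one.
phases = decode_actions(["A", "B", "C"], lambda i, prev: len(prev) % 2)
```

For k intersections with n phases each, this turns one n^k joint choice into k sequential n-way choices, which is the dimensionality reduction the abstract describes.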
With the emergence of deep learning, the attention of researchers has increased significantly towards abstractive text summarization approaches. Though extractive text summarization (ETS) is an important approach, the generated summaries are not always coherent. This paper mainly focuses on the abstractive text summarization (ATS) approach for the Telugu language to generate coherent summaries. The majority of ATS research has been conducted in English, while no significant research in Telugu has been documented. An abstractive Telugu text summarization model based on a sequence-to-sequence (seq2seq) encoder-decoder architecture is proposed in this paper. The seq2seq model is implemented with a bidirectional long short-term memory (Bi-LSTM) based encoder and a long short-term memory (LSTM) based decoder. Existing ATS approaches have some drawbacks: they cannot handle out-of-vocabulary words, they suffer from an attention deficiency issue when handling long text sequences, and they exhibit a repetition problem. To overcome these issues, operating mechanisms such as a pointer generator network, a temporal attention mechanism, and a coverage mechanism are integrated into the proposed model. Besides, a diverse beam search decoding algorithm is employed to increase the diversity of the generated summaries. Thus, the proposed seq2seq model combines a Bi-LSTM and LSTM based encoder-decoder, a pointer generator network, a temporal attention mechanism, a coverage mechanism, and a diverse beam search decoding algorithm. The performance of the proposed work is evaluated using the ROUGE toolkit in terms of F-measure, recall, and precision. The experimental results are compared with those of existing methods to show that the proposed ATS model outperforms existing Telugu text summarization models.
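The pointer-generator network mentioned above has a well-known form: a generation probability blends the decoder's vocabulary distribution with the attention distribution over source tokens, letting the model copy out-of-vocabulary words from the input. A minimal sketch; the example words and probabilities are invented:

```python
def final_distribution(p_gen, vocab_dist, attn_dist):
    """Pointer-generator blend: with probability p_gen generate from
    the vocabulary distribution, otherwise copy a source token via the
    attention distribution. Both inputs are word -> probability maps."""
    out = {}
    for w, p in vocab_dist.items():
        out[w] = out.get(w, 0.0) + p_gen * p
    for w, p in attn_dist.items():
        out[w] = out.get(w, 0.0) + (1.0 - p_gen) * p
    return out

# "Telugu" is absent from the vocabulary but present in the source,
# so it can still receive probability mass via the copy path.
dist = final_distribution(0.8, {"the": 0.6, "cat": 0.4}, {"Telugu": 1.0})
```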
A multi-stage attack is a sophisticated intrusion strategy that has been widely used for penetrating well-protected network infrastructures. To detect such attacks, state-of-the-art research advocates the use of the hidden Markov model (HMM). However, although HMMs can model the relationships and dependencies among different alerts and stages for detection, they cannot handle well the stage dependencies buried in longer sequences of alerts. In this paper, we tackle the challenge of the stages' long-term dependency and propose a new detection solution using a sequence-to-sequence (seq2seq) model. The basic idea is to encode a sequence of alerts (i.e., the detector's observations) into a latent feature vector using a long short-term memory (LSTM) network and then decode this vector into a sequence of predicted attack stages with another LSTM. Through this encoder-decoder collaboration, we decouple the local constraint between the observed alerts and the potential attack stages and are thus able to use the full knowledge of all the alerts to detect stages on a sequence basis. With the LSTM, the model can learn to "forget" irrelevant alerts and thereby has more opportunity to "remember" the long-term dependency between different stages for sequence detection. To evaluate our model's effectiveness, we have conducted extensive experiments using four public datasets, all of which include simulated or reconstructed samples of real-world multi-stage attacks in controlled testbeds. Our results confirm the better detection performance of our model compared with previous HMM solutions. (c) 2021 Elsevier Ltd. All rights reserved.
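The data layout for such a seq2seq detector can be sketched, assuming one predicted stage per alert; the alert and stage names below are invented for illustration:

```python
def make_training_pair(alerts, stages):
    """One training example: an alert sequence aligned with its stage
    sequence. Unlike an HMM's local emissions, the seq2seq model's
    prediction at step t may draw on alerts anywhere in the sequence,
    which is how it captures long-term stage dependencies."""
    assert len(alerts) == len(stages), "one stage label per alert assumed"
    return list(alerts), list(stages)

src, tgt = make_training_pair(
    ["port_scan", "brute_force", "shell_upload"],
    ["recon", "gain_access", "exploit"],
)
```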
ISBN (print): 9783030649838; 9783030649845
The choices of neural network model and data representation, a mapping between musical notation and input signals for a neural network, have emerged as a major challenge in creating convincing models for melody generation. Music generation can inspire creativity in artists and the general public, but choosing a proper data representation is complicated because the same musical piece can be presented in a range of expressive ways. In this paper, we compare three different data representations on the task of generating melodies with a sequence-to-sequence model, which generates melodies of flexible length, to explore how they affect the quality of the generated music. The three representations are: a monophonic representation, playing one note at a time; a polyphonic representation, indicating simultaneous notes; and a complex polyphonic representation, extending the polyphonic representation with dynamics. The influence of the three data representations on the generated performances is compared and evaluated through mathematical analysis and human-centered evaluation. The results show that different data representations fed into the same model endow the generated music with different features: the monophonic representation makes the music sound more melodious to human ears, the polyphonic representation provides expressiveness, and the complex polyphonic representation guarantees the complexity of the generated music.
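The three representations can be made concrete with toy encodings; the pitch and velocity formats below are illustrative assumptions, not the paper's exact token scheme:

```python
def monophonic(notes):
    """One note per time step: keep only the highest pitch of each chord."""
    return [max(chord) for chord in notes]

def polyphonic(notes):
    """Simultaneous notes per step, encoded as a sorted pitch tuple."""
    return [tuple(sorted(chord)) for chord in notes]

def complex_polyphonic(notes, velocities):
    """Polyphonic plus dynamics: pair each step's pitches with a velocity."""
    return [(tuple(sorted(c)), v) for c, v in zip(notes, velocities)]

piece = [[60, 64, 67], [62]]              # C major chord, then a single D
mono = monophonic(piece)                  # loses the chord's inner voices
cp = complex_polyphonic(piece, [80, 60])  # keeps notes and loudness
```

The same two time steps thus yield three inputs of increasing richness, which is the trade-off the paper evaluates.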
When robots are used to manipulate objects in various ways, they often have to consider dynamic constraints. Machine learning is a good candidate for such complex trajectory planning problems. However, a learned model sometimes fails to satisfy the task objectives, either because the objective changes or because there is no guarantee that the objective functions will be satisfied. To overcome this issue, we applied a method of trajectory deformation using sequence-to-sequence (seq2seq) models. We propose a method of adjusting the generated trajectories by utilizing the architecture of seq2seq models: the proposed method optimizes the latent variables of the seq2seq models, instead of the trajectories themselves, to minimize the given objective functions. The verification results show that using latent variables yields the desired trajectories faster than direct optimization of the trajectories.
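The core idea, optimizing the latent vector rather than the trajectory itself, can be sketched as follows. A simple coordinate search stands in for whatever optimizer the paper actually uses, and the decoder and objective here are toy stand-ins:

```python
def optimize_latent(decode, objective, z0, step=0.1, iters=200):
    """Adjust the latent vector z so the decoded trajectory minimizes
    the objective. The trajectory is never optimized directly; only z
    changes, and `decode` (the seq2seq decoder) maps it to a trajectory."""
    z = list(z0)
    best = objective(decode(z))
    for _ in range(iters):
        improved = False
        for i in range(len(z)):
            for delta in (step, -step):
                cand = list(z)
                cand[i] += delta
                val = objective(decode(cand))
                if val < best:
                    z, best, improved = cand, val, True
        if not improved:
            step *= 0.5  # refine the search once no move helps
    return z, best

# Toy decoder: the trajectory equals the latent; the objective pulls
# the decoded trajectory toward the point [1, 2].
z, cost = optimize_latent(lambda z: z,
                          lambda t: (t[0] - 1) ** 2 + (t[1] - 2) ** 2,
                          [0.0, 0.0])
```

Because the latent space is typically much lower-dimensional than the trajectory space, each search step is cheaper and the decoder keeps the result on the manifold of feasible trajectories.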