Transformers have been widely studied in many natural language processing (NLP) tasks, which can capture the dependency from the whole sentence with a high parallelizability thanks to the multi-head attention and the ...
详细信息
Transformers have been widely studied in many natural language processing (NLP) tasks, which can capture the dependency from the whole sentence with a high parallelizability thanks to the multi-head attention and the position-wise feed-forward network. However, the above two components of transformers are position-independent, which causes transformers to be weak in modeling sentence structures. Existing studies commonly utilized positional encoding or mask strategies for capturing the structural information of sentences. In this paper, we aim at strengthening the ability of transformers on modeling the linear structure of sentences from three aspects, containing the absolute position of tokens, the relative distance, and the direction between tokens. We propose a novel bidirectional Transformer with absolute-position aware relative position encoding (BiAR-Transformer) that combines the positional encoding and the mask strategy together. We model the relative distance between tokens along with the absolute position of tokens by a novel absolute-position aware relative position encoding. Meanwhile, we apply a bidirectional mask strategy for modeling the direction between tokens. Experimental results on the natural language inference, paraphrase identification, sentiment classification and machine translation tasks show that BiAR-Transformer achieves superior performance than other strong baselines.
Slot filling,to extract entities for specific types of information(slot),is a vitally important modular of dialogue systems for automatic *** responses can be regarded as the weak supervision of patient *** this way,a...
详细信息
Slot filling,to extract entities for specific types of information(slot),is a vitally important modular of dialogue systems for automatic *** responses can be regarded as the weak supervision of patient *** this way,a large amount of weakly labeled data can be obtained from unlabeled diagnosis dialogue,alleviating the problem of costly and time-consuming data ***,weakly labeled data suffers from extremely noisy *** alleviate the problem,we propose a simple and effective Co-WeakTeaching *** method trains two slot filling models *** two models learn from two different weakly labeled data,ensuring learning from two ***,one model utilizes selected weakly labeled data generated by the other,*** model,obtained by the Co-WeakTeaching on weakly labeled data,can be directly tested on testing data or sequentially fine-tuned on a small amount of human-annotated *** results on these two settings illustrate the effectiveness of the method with an increase of 8.03%and 14.74%in micro and macro f1 scores,respectively.
In the realm of deep learning, Generative Adversarial Networks (GANs) have emerged as a topic of significant interest for their potential to enhance model performance and enable effective data augmentation. This paper...
详细信息
Predicting RNA binding protein(RBP) binding sites on circular RNAs(circ RNAs) is a fundamental step to understand their interaction mechanism. Numerous computational methods are developed to solve this problem, but th...
详细信息
Predicting RNA binding protein(RBP) binding sites on circular RNAs(circ RNAs) is a fundamental step to understand their interaction mechanism. Numerous computational methods are developed to solve this problem, but they cannot fully learn the features. Therefore, we propose circ-CNNED, a convolutional neural network(CNN)-based encoding and decoding framework. We first adopt two encoding methods to obtain two original matrices. We preprocess them using CNN before fusion. To capture the feature dependencies, we utilize temporal convolutional network(TCN) and CNN to construct encoding and decoding blocks, respectively. Then we introduce global expectation pooling to learn latent information and enhance the robustness of circ-CNNED. We perform circ-CNNED across 37 datasets to evaluate its effect. The comparison and ablation experiments demonstrate that our method is superior. In addition, motif enrichment analysis on four datasets helps us to explore the reason for performance improvement of circ-CNNED.
Multiarmed bandit(MAB) models are widely used for sequential decision-making in uncertain environments, such as resource allocation in computer communication systems.A critical challenge in interactive multiagent syst...
Multiarmed bandit(MAB) models are widely used for sequential decision-making in uncertain environments, such as resource allocation in computer communication systems.A critical challenge in interactive multiagent systems with bandit feedback is to explore and understand the equilibrium state to ensure stable and tractable system performance.
Voronoi diagrams on triangulated surfaces based on the geodesic metric play a key role in many applications of computer *** methods of constructing such Voronoi diagrams generally depended on having an exact geodesic ...
详细信息
Voronoi diagrams on triangulated surfaces based on the geodesic metric play a key role in many applications of computer *** methods of constructing such Voronoi diagrams generally depended on having an exact geodesic ***,exact geodesic computation is time-consuming and has high memory usage,limiting wider application of geodesic Voronoi diagrams(GVDs).In order to overcome this issue,instead of using exact methods,we reformulate a graph method based on Steiner point insertion,as an effective way to obtain geodesic ***,since a bisector comprises hyperbolic and line segments,we utilize Apollonius diagrams to encode complicated structures,enabling Voronoi diagrams to encode a medial-axis surface for a dense set of boundary *** on these strategies,we present an approximation algorithm for efficient Voronoi diagram construction on triangulated *** also suggest a measure for evaluating similarity of our results to the exact *** our GVD results are constructed using approximate geodesic distances,we can get GVD results similar to exact results by inserting Steiner points on triangle *** results on many 3D models indicate the improved speed and memory requirements compared to previous leading methods.
A multi-secret image sharing (MSIS) scheme facilitates the secure distribution of multiple images among a group of participants. Several MSIS schemes have been proposed with a (n, n) structure that encodes secret...
详细信息
Real-time systems are widely implemented in the Internet of Things(IoT) and safety-critical systems, both of which have generated enormous social value. Aiming at the classic schedulability analysis problem in real-ti...
详细信息
Real-time systems are widely implemented in the Internet of Things(IoT) and safety-critical systems, both of which have generated enormous social value. Aiming at the classic schedulability analysis problem in real-time systems, we proposed an exact Boolean analysis based on interference(EBAI) for schedulability analysis in real-time systems. EBAI is based on worst-case interference time(WCIT), which considers both the release jitter and blocking time of the task. We improved the efficiency of the three existing tests and provided a comprehensive summary of related research results in the field. Abundant experiments were conducted to compare EBAI with other related results. Our evaluation showed that in certain cases, the runtime gain achieved using our analysis method may exceed 73% compared to the stateof-the-art schedulability test. Furthermore, the benefits obtained from our tests grew with the number of tasks, reaching a level suitable for practical application. EBAI is oriented to the five-tuple real-time task model with stronger expression ability and possesses a low runtime overhead. These characteristics make it applicable in various real-time systems such as spacecraft, autonomous vehicles, industrial robots, and traffic command systems.
Knowledge distillation(KD) enhances student network generalization by transferring dark knowledge from a complex teacher network. To optimize computational expenditure and memory utilization, self-knowledge distillati...
详细信息
Knowledge distillation(KD) enhances student network generalization by transferring dark knowledge from a complex teacher network. To optimize computational expenditure and memory utilization, self-knowledge distillation(SKD) extracts dark knowledge from the model itself rather than an external teacher network. However, previous SKD methods performed distillation indiscriminately on full datasets, overlooking the analysis of representative samples. In this work, we present a novel two-stage approach to providing targeted knowledge on specific samples, named two-stage approach self-knowledge distillation(TOAST). We first soften the hard targets using class medoids generated based on logit vectors per class. Then, we iteratively distill the under-trained data with past predictions of half the batch size. The two-stage knowledge is linearly combined, efficiently enhancing model performance. Extensive experiments conducted on five backbone architectures show our method is model-agnostic and achieves the best generalization ***, TOAST is strongly compatible with existing augmentation-based regularization methods. Our method also obtains a speedup of up to 2.95x compared with a recent state-of-the-art method.
Thyroid nodules,a common disorder in the endocrine system,require accurate segmentation in ultrasound images for effective diagnosis and ***,achieving precise segmentation remains a challenge due to various factors,in...
详细信息
Thyroid nodules,a common disorder in the endocrine system,require accurate segmentation in ultrasound images for effective diagnosis and ***,achieving precise segmentation remains a challenge due to various factors,including scattering noise,low contrast,and limited resolution in ultrasound *** existing segmentation models have made progress,they still suffer from several limitations,such as high error rates,low generalizability,overfitting,limited feature learning capability,*** address these challenges,this paper proposes a Multi-level Relation Transformer-based U-Net(MLRT-UNet)to improve thyroid nodule *** MLRTUNet leverages a novel Relation Transformer,which processes images at multiple scales,overcoming the limitations of traditional encoding *** transformer integrates both local and global features effectively through selfattention and cross-attention units,capturing intricate relationships within the *** approach also introduces a Co-operative Transformer Fusion(CTF)module to combine multi-scale features from different encoding layers,enhancing the model’s ability to capture complex patterns in the ***,the Relation Transformer block enhances long-distance dependencies during the decoding process,improving segmentation *** results showthat the MLRT-UNet achieves high segmentation accuracy,reaching 98.2% on the Digital Database Thyroid Image(DDT)dataset,97.8% on the Thyroid Nodule 3493(TG3K)dataset,and 98.2% on the Thyroid Nodule3K(TN3K)*** findings demonstrate that the proposed method significantly enhances the accuracy of thyroid nodule segmentation,addressing the limitations of existing models.
暂无评论