ISBN (digital): 9798350349610
ISBN (print): 9798350349627
In recent years, deep reinforcement learning (DRL) approaches have generated highly successful controllers for a myriad of complex domains. However, the opaque nature of these models limits their applicability in aerospace systems and safety-critical domains, in which a single mistake can have dire consequences. In this paper, we present novel advancements in both the training and verification of DRL controllers, which can help ensure their safe behavior. We showcase a design-for-verification approach utilizing k-induction and demonstrate its use in verifying liveness properties. In addition, we give a brief overview of neural Lyapunov Barrier certificates and summarize their capabilities on a case study. Finally, we describe several other novel reachability-based approaches which, despite failing to provide the guarantees of interest, could be effective for the verification of other DRL systems and could be of further interest to the community.
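To make the design-for-verification idea concrete, the sketch below applies the k-induction schema to a toy closed-loop system with a hand-written stand-in controller, checking a simple bounded-state invariant (the paper itself targets liveness properties of actual DRL controllers). The dynamics, initial set, property, and induction depth are all illustrative assumptions, not the paper's benchmark or tooling.

```python
# Minimal k-induction sketch with z3 (illustrative only, not the paper's tool).
from z3 import Real, Solver, And, Not, If, unsat

K = 3  # induction depth (an arbitrary illustrative choice)

def controller(x):
    # Stand-in for a learned policy: brake when the state is high, otherwise push up.
    return If(x > 5, -1.0, 0.5)

def step(x):
    # Closed-loop dynamics: x' = x + u(x).
    return x + controller(x)

def prop(x):
    # Property P(x): the state stays within [0, 10].
    return And(x >= 0, x <= 10)

def k_induction(k):
    xs = [Real(f"x_{i}") for i in range(k + 1)]
    trans = And([xs[i + 1] == step(xs[i]) for i in range(k)])

    # Base case: from the initial set, P holds on the first k states.
    base = Solver()
    base.add(xs[0] >= 0, xs[0] <= 1)  # initial set
    base.add(trans)
    base.add(Not(And([prop(xs[i]) for i in range(k)])))
    if base.check() != unsat:
        return False

    # Inductive step: k consecutive P-states are always followed by a P-state.
    ind = Solver()
    ind.add(trans)
    ind.add(And([prop(xs[i]) for i in range(k)]))
    ind.add(Not(prop(xs[k])))
    return ind.check() == unsat

print(f"property proved by {K}-induction: {k_induction(K)}")
```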
Knowledge representation is becoming an effective way of performing information extraction. However, many studies have ignored its application in the zero-shot setting. In this paper, we propose a novel framework for prompting language models based on external ontology knowledge, called Knowledge-Based Prompt Tuning for Zero-shot Relation Triplet Extraction (KBPT), which encourages further investigation of low-resource regimes to address the data scarcity problem in Relation Triplet Extraction (RTE). The core task of zero-shot relation triplet extraction is to extract, from an input sentence, multiple triplets consisting of head entities, tail entities, and relation labels, where the extracted relation labels do not appear in the training set. The fundamental idea of prompt tuning is to construct a prompt template and append it to the input text as the input to a pre-trained language model (PLM), thus transforming the classification task into masked language model prediction. Our proposed model, however, does not rely on masked language model prediction; instead, it uses a well-designed prompt template in a structured text format to generate synthetic training data containing the unseen relation categories. Concretely, we utilize the relation labels and incorporate virtual tokens sourced from the relation semantics to construct a structured prompt template for generating synthetic training instances. Moreover, to further enrich and supplement prior knowledge, we draw on an ontology schema based on external knowledge bases to enhance the capability of semantic representation in the prompt template. To address the problem of knowledge heterogeneity, we synergistically optimize these embedding representations by way of collective training. In addition, we carefully design a Multiple Triplets Decoding (MTD) algorithm to overcome the limitation of extracting multiple relation triplets from a sentence, and our proposed model is model-agnostic and can be orthogon…
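As a rough illustration of the structured-prompt idea, the sketch below builds a prompt from an unseen relation label plus ontology-style type hints and asks an off-the-shelf generator to synthesize a training sentence. The template wording, entity types, relation label, and the use of GPT-2 via Hugging Face are assumptions for illustration, not KBPT's exact format or backbone.

```python
# Illustrative structured-prompt sketch for synthesizing zero-shot RTE training data.
from transformers import pipeline

def build_prompt(relation_label: str, head_type: str, tail_type: str) -> str:
    # Virtual tokens derived from the relation semantics plus ontology-schema hints
    # (entity types) are spliced into a fixed, structured text template.
    return (f"Relation: {relation_label}. "
            f"Head entity is a {head_type}; tail entity is a {tail_type}. "
            f"Example sentence expressing this relation:")

generator = pipeline("text-generation", model="gpt2")  # any generative PLM would do

# An unseen relation label, i.e. one never present in the training set.
prompt = build_prompt("headquartered in", "organization", "city")
outputs = generator(prompt, max_new_tokens=30, num_return_sequences=1)
print(outputs[0]["generated_text"])  # synthetic sentence usable as RTE training data
```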
In this paper, we investigate several classes of permutation pentanomials over $${{\mathbb {F}}}_{2^{2m}}$$ of the form $$f(x)=x^t+x^{r_1(q-1)+t}+x^{r_2(q-1)+t}+x^{r_3(q-1)+t}+x^{r_4(q-1)+t}$$, where $$q=2^m$$ and $$ 1\le r_i\le t$$ for $$i\in [1,4]$$. A new technique is presented to describe a sufficient condition for f(x) to be a permutation by investigating two kinds of irreducible factors, called polynomials of nonzero trace and of zero trace, of certain polynomials over $${{\mathbb {F}}}_{2}$$. We resolve the open problem left by the authors in Zhang et al. (Finite Fields Appl 98:102468, 2024). Numerical results suggest that the results in this paper contain all permutation pentanomials of this form with $$\textrm{gcd}(x^{r_4}+x^{r_3}+x^{r_2}+x^{r_1}+1,x^t+x^{t-r_1}+x^{t-r_2}+x^{t-r_3}+x^{t-r_4})=1$$ for $$t>23$$, and that the conditions presented in Theorems 3.1, 3.7, 3.10 and 3.14 of this paper are also necessary.
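For readers who want to experiment, the following self-contained sketch brute-force checks whether a pentanomial of the above form permutes a small field, here GF(2^4) (m = 2) built on the irreducible polynomial x^4 + x + 1. The exponents t and r_i used below are placeholders chosen only to satisfy 1 <= r_i <= t, not values established by the paper's theorems.

```python
# Brute-force permutation check over GF(2^(2m)) for small m (illustrative only).
M = 2                          # so the field is GF(2^(2m)) = GF(16)
N = 2 * M
Q = 2 ** M                     # q = 2^m
IRRED = 0b10011                # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    # Carry-less multiplication modulo the irreducible polynomial.
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if a & (1 << N):
            a ^= IRRED
    return res

def gf_pow(a, e):
    # Exponentiation by repeated squaring in GF(2^N).
    result, base = 1, a
    while e:
        if e & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        e >>= 1
    return result

def f(x, t, rs):
    # f(x) = x^t + sum_i x^{r_i (q-1) + t}; addition is XOR in characteristic 2.
    val = gf_pow(x, t)
    for r in rs:
        val ^= gf_pow(x, r * (Q - 1) + t)
    return val

def is_permutation(t, rs):
    # f permutes the field iff its image hits all 2^N elements.
    return len({f(x, t, rs) for x in range(2 ** N)}) == 2 ** N

# Placeholder exponents satisfying 1 <= r_i <= t; the output simply reports whether
# this particular pentanomial permutes GF(16).
print(is_permutation(t=5, rs=(1, 2, 3, 4)))
```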
Gone are the days when agriculture was said to be a profession of uneducated people who used only basic mechanical tools to survive. Use of technology in agriculture has transformed it into a high-tech, smart and effi...
The field of clinical natural language processing (NLP) can extract useful information from clinical text. Since 2017, the NLP field has shifted towards using pre-trained language models (PLMs), improving performance ...
The issue of building evacuation in the event of a fire is a significant concern in urban planning and architecture. In the absence of appropriate measures, an emergency situation can potentially result in disastrous ...
Dynamic graphs (DG) describe dynamic interactions between entities in many practical scenarios. Most existing DG representation learning models combine graph convolutional network and sequence neural network, which mo...
ISBN (digital): 9798350375107
ISBN (print): 9798350375114
Multi-hop Knowledge Reasoning is a task that involves generating an answer given a query and a knowledge graph. Existing sequence-to-sequence reasoning models use the Transformer to encode and decode sequences, but these models have some flaws, such as the inability to effectively handle long-sequence reasoning and susceptibility to exposure bias. To address these issues, we propose a sequence-to-sequence reasoning model named STSR, which is based on the Retentive Network. Aiming to improve training efficiency through parallel training, the model leverages the advantages of the Retentive Network, reduces time overhead, and enhances reasoning efficiency through iterative reasoning. It effectively mitigates the problems of high spatial overhead and low efficiency in long-sequence reasoning faced by the Transformer. Moreover, it retains and updates historical information during the encoding and decoding process, enhancing the model's memory and generalization capabilities. Additionally, the Scheduled Sampling method is adopted to alleviate the exposure bias introduced during reasoning: during training, the model's own output is used as the input for the next step with a certain probability, instead of the true label. We conduct experiments on six public datasets, and the results show that the proposed model outperforms existing baseline models in terms of precision and generation quality. This paper provides a new solution for the task of sequence-to-sequence multi-hop knowledge reasoning and also demonstrates the potential of the Retentive Network in natural language processing.
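The sketch below illustrates the Scheduled Sampling mechanism in an autoregressive decoding loop: with some probability the decoder consumes its own previous prediction rather than the gold token. The GRU cell stands in for the Retentive decoder, and the vocabulary size, dimensions, and sampling probability are assumptions for illustration only, not STSR's actual configuration.

```python
# Scheduled Sampling sketch in PyTorch (illustrative stand-in decoder).
import random
import torch
import torch.nn as nn

vocab_size, hidden = 100, 32
embed = nn.Embedding(vocab_size, hidden)
decoder_cell = nn.GRUCell(hidden, hidden)       # stand-in for the Retentive decoder
out_proj = nn.Linear(hidden, vocab_size)

def decode_with_scheduled_sampling(gold_tokens, p_model):
    """gold_tokens: (seq_len,) LongTensor of target ids; p_model: probability of
    feeding the model's own prediction instead of the gold token."""
    h = torch.zeros(1, hidden)
    inp = gold_tokens[0].view(1)                # start from the first gold token
    step_logits = []
    for step in range(1, gold_tokens.size(0)):
        h = decoder_cell(embed(inp), h)
        logits = out_proj(h)
        step_logits.append(logits)
        pred = logits.argmax(dim=-1)
        # Scheduled Sampling: mix gold inputs and the model's own predictions.
        inp = pred if random.random() < p_model else gold_tokens[step].view(1)
    return torch.cat(step_logits, dim=0)

gold = torch.randint(0, vocab_size, (10,))
logits = decode_with_scheduled_sampling(gold, p_model=0.25)
loss = nn.functional.cross_entropy(logits, gold[1:])   # targets are shifted by one
loss.backward()
```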
Graph Transformers, which incorporate self-attention and positional encoding, have recently emerged as a powerful architecture for various graph learning tasks. Despite their impressive performance, the complex non-convex interactions across layers and the recursive graph structure have made it challenging to establish a theoretical foundation for learning and generalization. This study introduces the first theoretical investigation of a shallow Graph Transformer for semi-supervised node classification, comprising a self-attention layer with relative positional encoding and a two-layer perceptron. Focusing on a graph data model with discriminative nodes that determine node labels and non-discriminative nodes that are class-irrelevant, we characterize the sample complexity required to achieve a desirable generalization error by training with stochastic gradient descent (SGD). This paper provides a quantitative characterization of the sample complexity and the number of iterations required for convergence as functions of the fraction of discriminative nodes, the dominant patterns, and the initial model errors. Furthermore, we demonstrate that self-attention and positional encoding enhance generalization by making the attention map sparse and by promoting the core neighborhood during training, which explains the superior feature representation of Graph Transformers. Our theoretical results are supported by empirical experiments on synthetic and real-world benchmarks.
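The following compact sketch mirrors the analyzed architecture: one self-attention layer whose scores are biased by a relative positional encoding derived from the adjacency matrix, followed by a two-layer perceptron producing per-node logits. The particular encoding choice, dimensions, and toy data are illustrative assumptions rather than the paper's exact graph data model.

```python
# Shallow Graph Transformer sketch for semi-supervised node classification.
import torch
import torch.nn as nn

class ShallowGraphTransformer(nn.Module):
    def __init__(self, in_dim, hid_dim, num_classes):
        super().__init__()
        self.q = nn.Linear(in_dim, hid_dim)
        self.k = nn.Linear(in_dim, hid_dim)
        self.v = nn.Linear(in_dim, hid_dim)
        self.hid_dim = hid_dim
        self.pos_scale = nn.Parameter(torch.tensor(1.0))   # weight on the positional bias
        self.mlp = nn.Sequential(nn.Linear(hid_dim, hid_dim), nn.ReLU(),
                                 nn.Linear(hid_dim, num_classes))

    def forward(self, x, adj):
        # Self-attention scores biased by a relative positional encoding, taken here
        # to be the adjacency matrix, so neighboring nodes attend more strongly.
        scores = (self.q(x) @ self.k(x).T) / self.hid_dim ** 0.5
        scores = scores + self.pos_scale * adj
        attn = torch.softmax(scores, dim=-1)
        h = attn @ self.v(x)
        return self.mlp(h)                                  # per-node class logits

# Toy semi-supervised usage: only a few nodes are labeled, trained with SGD as in the analysis.
n, d = 8, 16
x = torch.randn(n, d)
adj = (torch.rand(n, n) > 0.7).float()
labels = torch.randint(0, 2, (n,))
labeled = torch.tensor([0, 1, 2])                           # indices of labeled nodes

model = ShallowGraphTransformer(d, 32, num_classes=2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
logits = model(x, adj)
loss = nn.functional.cross_entropy(logits[labeled], labels[labeled])
loss.backward()
opt.step()
```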
High-dimensional and incomplete (HDI) matrix contains many complex interactions between numerous nodes. A stochastic gradient descent (SGD)-based latent factor analysis (LFA) model is remarkably effective in extractin...