Semantic analysis has many potential applications across science and the national economy. Much of the world's information is unstructured, which raises the problem of processing and extracting us...
This paper presents a knowledge graph construction method for legal case documents and related laws, aiming to organize legal information efficiently and enhance various downstream tasks. Our approach consists of thre...
ISBN:
(Print) 9781954085527
Vision-language pre-training (VLP) on large-scale image-text pairs has achieved huge success on cross-modal downstream tasks. Most existing pre-training methods adopt a two-step training procedure: first, a pre-trained object detector extracts region-based visual features; then the image representation and text embedding are concatenated as the input to a Transformer for training. However, these methods suffer from two problems: the task-specific visual representations of a particular object detector are reused for generic cross-modal understanding, and the two-stage pipeline is computationally inefficient. In this paper, we propose the first end-to-end vision-language pre-trained model for both V+L understanding and generation, namely E2E-VLP, in which we build a unified Transformer framework to jointly learn visual representations and semantic alignments between image and text. We incorporate the tasks of object detection and image captioning into pre-training with a unified Transformer encoder-decoder architecture to enhance visual learning. An extensive set of experiments on well-established vision-language downstream tasks demonstrates the effectiveness of this novel VLP paradigm.
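The single-stream idea described in the abstract — feeding image-patch embeddings and text-token embeddings through one shared attention mechanism instead of a detector-plus-Transformer pipeline — can be sketched minimally. The toy embeddings and the bare scaled dot-product attention below are illustrative assumptions only, not the paper's actual E2E-VLP architecture:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product self-attention over a joint token sequence.
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

# Hypothetical toy embeddings: 2 image-patch vectors + 2 text-token vectors (dim 4).
img = [[0.1, 0.2, 0.0, 0.3], [0.4, 0.1, 0.2, 0.0]]
txt = [[0.0, 0.3, 0.1, 0.2], [0.2, 0.2, 0.2, 0.2]]
seq = img + txt            # single stream: one joint sequence, no region detector
fused = attention(seq, seq, seq)
```

Every output vector in `fused` mixes information from both modalities, which is the core of the joint-learning setup the abstract describes.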
Graph-based semi-supervised learning is appealing when labels are scarce but large amounts of unlabeled data are available. These methods typically use a heuristic strategy to construct the graph based on some fixed d...
Despite all the advantages social networks have brought to the world, they are also a very favourable environment for the growth of so-called electronic crimes. Textual exchanges between users may include clues to cri...
In task-oriented dialogue systems, recent dialogue state tracking methods tend to perform one-pass generation of the dialogue state based on the previous dialogue state. The mistakes these models make at the curren...
Large language models (LLMs), such as GPT-4 and LLaMA, are creating significant advancements in natural language processing, due to their strong text encoding/decoding ability and newly found emergent capability (e.g.,...
ISBN:
(Print) 9783030861599; 9783030861582
Wikification (entity annotation) is a challenging task in natural language processing (NLP). It automatically enriches a text with links to Wikipedia as a knowledge base. Wikification starts by detecting ambiguous mentions in a document and then tries to disambiguate those mentions. At the core of the Wikification task lies another important NLP task: word representation. This paper proposes a new representation for the senses of a mention, built with a graph convolutional network architecture. Senses are the possible meanings of a mention according to the knowledge base. In our representation model, we use the context document and the first paragraph of each Wikipedia page to enhance the contextual representation. By disambiguating mentions with a nearest-neighbor algorithm over our sense representations, we show the efficiency of our representations; comparisons with recent state-of-the-art methods confirm the effectiveness of our solution.
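The disambiguation step described above — picking the sense whose embedding is nearest to the mention's contextual embedding — can be sketched as follows. The sense names and vectors are hypothetical, and this sketch does not reproduce the paper's graph-convolutional sense embeddings; it only illustrates the nearest-neighbor selection:

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def disambiguate(mention_vec, sense_vecs):
    """Return the sense whose embedding is nearest (by cosine) to the mention context."""
    return max(sense_vecs, key=lambda s: cosine(mention_vec, sense_vecs[s]))

# Hypothetical embeddings for the mention "Java" seen in a programming context.
senses = {
    "Java_(programming_language)": [0.9, 0.1, 0.0],
    "Java_(island)": [0.1, 0.8, 0.2],
}
context = [0.85, 0.15, 0.05]
best = disambiguate(context, senses)
```

Here `best` resolves to the programming-language sense because its embedding is most similar to the context vector.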
ISBN:
(Print) 9781954085541
Learning low-dimensional representations of networked documents is a crucial task for documents linked in network structures. Relational Topic Models (RTMs) have shown their strength in modeling both document contents and relations to discover latent topic representations. However, these methods largely ignore higher-order correlation structure among documents. We therefore propose a novel Graph Relational Topic Model (GRTM) for document networks that fully explores and mixes neighborhood information of documents at each order, based on a Higher-order Graph Attention Network (HGAT) with a log-normal prior in the graph attention. By propagating information between documents through the HGAT probabilistic encoder, the model learns efficient networked document representations in the latent topic space that reflect both document contents and document connections. Experiments on several real-world document network datasets show that, by fully exploiting information in documents and document networks, our model achieves better performance on unsupervised representation learning and outperforms existing competitive methods on various downstream tasks.
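The neighborhood-mixing idea can be illustrated with a single attention-weighted aggregation over a toy citation graph. This is a minimal sketch of graph attention in general, not the paper's HGAT encoder or its log-normal prior, and the graph and feature vectors are made-up assumptions:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def attention_aggregate(node, neighbors, feats):
    """One attention step: score each neighbor by dot-product similarity to the
    node, normalize with softmax, then average neighbor features by weight."""
    scores = [sum(a * b for a, b in zip(feats[node], feats[n])) for n in neighbors]
    w = softmax(scores)
    dim = len(feats[node])
    return [sum(wi * feats[n][j] for wi, n in zip(w, neighbors)) for j in range(dim)]

# Hypothetical 3-document citation graph; document 0 links to documents 1 and 2.
feats = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.0, 1.0]}
h0 = attention_aggregate(0, [1, 2], feats)
```

Because document 1 is more similar to document 0 than document 2 is, it receives a larger attention weight, so the aggregated representation `h0` leans toward document 1's features.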
We propose a simple and efficient framework to learn syntactic embeddings based on information derived from constituency parse trees. Using biased random walk methods, our embeddings not only encode syntactic informat...