ISBN (print): 9789819794362; 9789819794379
Semantic Dependency Graph is a framework for representing deep semantic knowledge through flexible graph structures. While recent work indicates that large language models (LLMs) have impressive language and knowledge understanding abilities, it remains unclear whether they can understand this deep semantic knowledge. To explore this problem, we design four prompt-style probing tasks covering semantic structure and semantic relations, tailored to the inherent abilities of LLMs. To ensure thorough evaluation, we conduct extensive experiments in both in-context learning (ICL) and supervised fine-tuning (SFT) scenarios. Our findings indicate that understanding deep semantic knowledge requires a larger parameter scale, especially for high-order semantic structure knowledge and semantic relation knowledge. Furthermore, our experiments reveal that while LLMs perform well on the in-domain (ID) test set via SFT, their generalization to the out-of-domain (OOD) test set remains inadequate.
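As a concrete illustration of what such a prompt-style probe might look like, here is a minimal sketch for the semantic-relation aspect; the prompt wording, relation labels, and the `query_llm` helper are hypothetical and do not reproduce the paper's templates.

```python
# Minimal sketch of a prompt-style probing task for semantic dependency
# relations (illustrative only; the paper's templates and label set differ).

def build_relation_probe(sentence: str, head: str, dependent: str) -> str:
    """Ask the model to name the semantic relation between two words."""
    return (
        "Given the sentence below, identify the semantic dependency relation "
        f"that links the head word '{head}' to the dependent word '{dependent}'.\n"
        f"Sentence: {sentence}\n"
        "Answer with a single relation label (e.g., AGT, PAT, LOC):"
    )

def probe_accuracy(query_llm, examples) -> float:
    """Score a model on relation probes; `query_llm` is a placeholder client."""
    correct = 0
    for sentence, head, dependent, gold in examples:
        prediction = query_llm(build_relation_probe(sentence, head, dependent))
        correct += int(prediction.strip().upper() == gold.upper())
    return correct / len(examples)
```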
ISBN (print): 9783031789519; 9783031789526
A backbone of knowledge graphs is their class membership relations, which assign entities to a given class. As part of the knowledge engineering process, we propose a new method for evaluating the quality of these relations by processing descriptions of a given entity and class using a zero-shot chain-of-thought classifier that uses a natural-language intensional definition of a class. We evaluate the method using two publicly available knowledge graphs, Wikidata and CaLiGraph, and seven large language models. Using the gpt-4-0125-preview large language model, the method achieves a macro-averaged F1-score of 0.830 on data from Wikidata and 0.893 on data from CaLiGraph. Moreover, a manual analysis of the classification errors shows that 40.9% of errors were due to the knowledge graphs, with 16.0% due to missing relations and 24.9% due to incorrectly asserted relations. These results show how large language models can assist knowledge engineers in the process of knowledge graph refinement. The code and data are available on GitHub (https://***/bradleypallen/evaluating-kg-class-memberships-using-llms).
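The core of the method, as described, is a zero-shot chain-of-thought prompt built from an intensional class definition, scored with macro-averaged F1. A minimal sketch under those assumptions follows; the prompt text and the `call_llm` client are placeholders, not the paper's exact implementation.

```python
# Sketch of a zero-shot chain-of-thought membership check and its
# macro-averaged F1 evaluation. `call_llm` is a placeholder for an LLM client
# (e.g., one wrapping gpt-4-0125-preview); the prompt wording is illustrative.
from sklearn.metrics import f1_score

def membership_prompt(entity_desc: str, class_name: str, intensional_def: str) -> str:
    return (
        f"Class: {class_name}\n"
        f"Definition: {intensional_def}\n"
        f"Entity description: {entity_desc}\n"
        "Question: Is the entity a member of the class?\n"
        "Let's think step by step, then answer with 'yes' or 'no'."
    )

def classify(call_llm, entity_desc, class_name, intensional_def) -> bool:
    reply = call_llm(membership_prompt(entity_desc, class_name, intensional_def))
    # Assume the reply ends with the final yes/no verdict after its reasoning.
    return reply.strip().lower().rstrip(".").endswith("yes")

def macro_f1(gold: list, predicted: list) -> float:
    # Macro-averaged F1 over the member / non-member classes, as reported
    # for Wikidata and CaLiGraph in the abstract.
    return f1_score(gold, predicted, average="macro")
```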
With massive amounts of news emerging every day and causing information overload, it is difficult for people to select the content they are really interested in from numerous news articles. The language knowledge and stro...
ISBN (print): 9789819794393; 9789819794409
Utterance Domain Classification (UDC) is essential for Spoken Language Understanding (SLU) and is a task analogous to short text classification. Short texts are often challenging to understand due to their lack of context, necessitating the enrichment of their semantic representation with supplementary information such as concepts from external knowledge bases. However, the inclusion of concepts introduces noise, making the selection of valuable concepts challenging. This paper proposes a UDC method employing keyword-guided signals to enhance the purity of external knowledge. We use two keyword extraction strategies to construct two types of keywords. A keyword-assisted concept denoising module addresses the concept noise problem, and a knowledge injection module is designed to better integrate concepts into the model. Experimental results on two Chinese SLU datasets demonstrate that our model achieves state-of-the-art performance.
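A rough sketch of the keyword-guided denoising idea, under the assumption that candidate concepts are filtered by embedding similarity to extracted keywords; the scoring rule, threshold, and `embed` encoder are illustrative choices, not the paper's actual module.

```python
# Illustrative concept denoising: keep a concept from the external knowledge
# base only if it is close to at least one utterance keyword in embedding
# space. `embed` stands in for any sentence/word encoder.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def denoise_concepts(embed, keywords, candidate_concepts, threshold=0.35):
    keyword_vecs = [embed(k) for k in keywords]
    kept = []
    for concept in candidate_concepts:
        c_vec = embed(concept)
        # A concept survives if it is similar enough to some keyword.
        if max(cosine(c_vec, k_vec) for k_vec in keyword_vecs) >= threshold:
            kept.append(concept)
    return kept
```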
ISBN (print): 9789819794362; 9789819794379
Knowledge-based visual question answering (VQA) requires external knowledge in addition to the image content to answer questions. Recent studies convert images to text descriptions and then generate answers or acquire implicit knowledge using a large language model (LLM). These methods achieve encouraging results thanks to the strong knowledge retrieval and reasoning capabilities of LLMs. However, methods that incorporate LLMs are limited by the discrepancies between images and the text descriptions presented to the LLMs. To address this challenge, we present RAVL, a retrieval-augmented visual language model (VLM) framework for knowledge-based VQA. Specifically, we first fine-tune a VLM on the knowledge-based VQA task with inputs consisting of retrieved knowledge and image-question pairs, adapting the VLM to inputs with retrieved knowledge. After that, we adapt the retrieval module to the fine-tuned VLM using supervision signals provided by the VLM, enabling the retrieved knowledge to improve the VLM's perplexity. RAVL overcomes the limitation of visual information loss and improves the effectiveness of VLMs with external knowledge. We conduct experiments on the OK-VQA dataset, and our method achieves 65.73% accuracy, surpassing the previous state-of-the-art method (+3.63%).
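The abstract does not spell out how the VLM's supervision signals train the retriever; one plausible reading is a REPLUG-style distillation objective, sketched below purely as an assumption.

```python
# A minimal sketch of one way "supervision signals provided by the VLM" could
# become a retriever training objective: distill the distribution implied by
# how much each retrieved passage improves the VLM's answer likelihood into
# the retriever's scoring distribution. This is an assumption, not RAVL's
# stated formulation.
import torch
import torch.nn.functional as F

def retriever_adaptation_loss(retriever_scores: torch.Tensor,
                              answer_log_likelihoods: torch.Tensor,
                              temperature: float = 1.0) -> torch.Tensor:
    """retriever_scores: (k,) unnormalized relevance scores for k passages.
    answer_log_likelihoods: (k,) log-likelihood of the gold answer when the
    VLM is conditioned on each passage (higher = lower perplexity)."""
    log_p_retriever = F.log_softmax(retriever_scores / temperature, dim=-1)
    p_vlm = F.softmax(answer_log_likelihoods / temperature, dim=-1)
    # KL(p_vlm || p_retriever): push the retriever toward passages the VLM finds useful.
    return F.kl_div(log_p_retriever, p_vlm, reduction="sum")
```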
Paraphrasing, the art of rephrasing text while retaining its original meaning, lies at the core of natural language understanding and generation. With the rise in demand for more domain-specialized models, high-quality data is more valued than ever; this includes paraphrasing. ParaFusion-Extend (PFE) is a large-scale dataset driven by Large Language Models that incorporates lexical and phrasal knowledge. The dataset is curated to contain high-quality, diverse paraphrase pairs as well as separate knowledge bases that can be used for research and for data augmentation models. We show that PFE offers around a 30% increase in syntactic and lexical diversity compared to the commonly used original data sources. We demonstrate the effectiveness of PFE on several downstream tasks such as few-shot learning and training sentence embeddings. We use a gold-standard evaluation scheme, further strengthened by a human evaluation that shows the potential of PFE in advancing paraphrase generation.
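As a toy illustration of how lexical diversity between paraphrase pairs can be quantified (the abstract does not specify PFE's exact metrics), a simple token-overlap measure might look like this:

```python
# Toy lexical-diversity measure for paraphrase pairs: 1 minus the Jaccard
# overlap of their token sets. Purely illustrative; PFE's actual diversity
# metrics are not given in the abstract.
def lexical_diversity(source: str, paraphrase: str) -> float:
    src_tokens = set(source.lower().split())
    par_tokens = set(paraphrase.lower().split())
    if not src_tokens and not par_tokens:
        return 0.0
    overlap = len(src_tokens & par_tokens)
    union = len(src_tokens | par_tokens)
    return 1.0 - overlap / union

def corpus_diversity(pairs) -> float:
    """Average diversity over a list of (source, paraphrase) pairs."""
    return sum(lexical_diversity(s, p) for s, p in pairs) / len(pairs)
```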
ISBN (print): 9789819794331; 9789819794348
Research indicates that incorporating external knowledge into pre-trained language models (PLMs) can enhance their performance on knowledge-driven downstream tasks. However, most approaches either require retraining the model or fail to preserve the complete information of knowledge graphs (KGs). In this paper, we introduce a simple but effective ProSide module for PLMs, which includes two components: a Knowledge Projector and a Knowledge Sideway. The Knowledge Projector transforms knowledge representations from entity embeddings in KG space to the semantic space, while the Knowledge Sideway retains complete KG information in sideway modules through pre-training. ProSide can accommodate different frozen language models without the need to retrain them. Results on three common knowledge-driven tasks demonstrate that ProSide enhances model performance and reaches the state-of-the-art level. Additionally, we provide further analysis and case studies to illustrate the mechanism from the perspective of representation space. Code and models are publicly available on GitHub (https://***/hcffffff/ProSide).
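A minimal sketch of the Knowledge Projector idea, assuming a two-layer mapping from KG entity-embedding space into the frozen PLM's hidden space; the dimensions and layer choices are illustrative, not the paper's exact architecture.

```python
# Sketch of a projector that maps frozen KG entity embeddings into the PLM's
# semantic (hidden-state) space, so they can be consumed by sideway modules
# alongside a frozen language model. Shapes are illustrative assumptions.
import torch
import torch.nn as nn

class KnowledgeProjector(nn.Module):
    def __init__(self, kg_dim: int = 200, plm_hidden_dim: int = 768):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(kg_dim, plm_hidden_dim),
            nn.GELU(),
            nn.Linear(plm_hidden_dim, plm_hidden_dim),
        )

    def forward(self, entity_embeddings: torch.Tensor) -> torch.Tensor:
        # entity_embeddings: (batch, num_entities, kg_dim) from a pretrained
        # KG embedding model (e.g., TransE); output lives in PLM space.
        return self.proj(entity_embeddings)
```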
ISBN (print): 9789819794423; 9789819794430
Metaphor Components Identification (MCI) contributes to enhancing machine understanding of metaphors, thereby advancing downstream natural language processing tasks. However, the complexity and diversity of metaphors, along with their dependency on context and background knowledge, pose significant challenges for MCI. Large language models (LLMs) offer new avenues for accurate comprehension of complex natural language texts due to their strong semantic analysis and extensive commonsense knowledge. In this research, a new LLM-based framework is proposed, named Linguistics-aware In-context Learning with Data Augmentation (LaiDA). Specifically, ChatGPT and supervised fine-tuning are utilized to tailor a high-quality dataset. LaiDA incorporates a simile dataset for pre-training. A graph attention network encoder generates linguistically rich feature representations to retrieve similar examples. Subsequently, the LLM is fine-tuned with prompts that integrate linguistically similar examples. LaiDA ranked 2nd in Subtask 2 of the NLPCC 2024 Shared Task 9, demonstrating its effectiveness. Code and data are available at https://***/WXLJZ/LaiDA.
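The example-retrieval step can be pictured as nearest-neighbor search over the encoder's feature vectors; the sketch below assumes precomputed features and cosine similarity, which may differ from LaiDA's actual retrieval.

```python
# Sketch of retrieving linguistically similar examples for prompt
# construction: rank training examples by cosine similarity of their
# (assumed precomputed) graph-attention encoder features to the query.
import numpy as np

def retrieve_similar_examples(query_feature: np.ndarray,
                              train_features: np.ndarray,
                              k: int = 3) -> np.ndarray:
    """Return indices of the k nearest training examples by cosine similarity."""
    q = query_feature / (np.linalg.norm(query_feature) + 1e-8)
    t = train_features / (np.linalg.norm(train_features, axis=1, keepdims=True) + 1e-8)
    scores = t @ q
    return np.argsort(-scores)[:k]
```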
ISBN (print): 9789819794362; 9789819794379
Knowledge distillation is an effective method for reducing the computational overhead of large language models. However, recent efforts to optimize the distillation of large language models have primarily focused on loss functions and training methodologies, with limited attention given to structural improvements of student models. This is largely due to the challenges posed by cross-architecture distillation and the substantial computational resources required for modifying model structures. To address these issues, we introduce a novel method that integrates a sparse mixture of experts (MoE) architecture with low-rank adaptation (LoRA). This combination not only bolsters the capabilities of the student model but also enables knowledge distillation with MoE without the need for continued pretraining. Experimental results indicate that our approach enhances the model's capabilities compared to dense model distillation, achieving superior performance across a multitude of tasks. We will release our code at https://***/sprogxhy/***.
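One way MoE and LoRA can be combined on a frozen student backbone is to make each expert a low-rank adapter selected by a token-level router; the sketch below is an assumption about this general pattern, not the paper's exact configuration.

```python
# Illustrative LoRA-MoE layer: the base linear weight stays frozen, and a
# router picks one low-rank expert per token whose delta is added to the
# base output. Rank, expert count, and top-1 routing are example choices.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRAMoELinear(nn.Module):
    def __init__(self, base: nn.Linear, num_experts: int = 4,
                 rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():        # keep the student backbone frozen
            p.requires_grad_(False)
        self.scaling = alpha / rank
        self.router = nn.Linear(base.in_features, num_experts)
        self.lora_A = nn.Parameter(torch.randn(num_experts, base.in_features, rank) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(num_experts, rank, base.out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., in_features); route each token to its best-scoring expert.
        gate = F.softmax(self.router(x), dim=-1)          # (..., num_experts)
        weight, expert = gate.max(dim=-1, keepdim=True)   # top-1 routing
        a = self.lora_A[expert.squeeze(-1)]               # (..., in_features, rank)
        b = self.lora_B[expert.squeeze(-1)]               # (..., rank, out_features)
        delta = torch.einsum("...i,...ir,...ro->...o", x, a, b) * self.scaling
        return self.base(x) + weight * delta
```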
Although large language models (LLMs) have achieved significant success in various tasks, they often struggle with hallucination issues in scenarios requiring deep reasoning. Incorporating external knowledge into LLM ...