ISBN:
(Print) 9789819794393; 9789819794409
Social media has become the primary source of information for individuals, yet much of this information remains unverified. The rise of generative artificial intelligence has further accelerated the creation of unverified content. Adaptive rumor resolution systems are therefore imperative for maintaining information integrity and public trust. Traditional methods have relied on encoder-based frameworks to enhance rumor representations and propagation characteristics. However, these models are often small in scale and generalize poorly to unforeseen events. Recent advances in Large Language Models (LLMs) show promise, but LLMs remain unreliable in discerning truth from falsehood. Our work leverages LLMs by creating a testbed for predicting unprecedented rumors and designing a retrieval-augmented framework that integrates historical knowledge and collective intelligence. Experiments on two real-world datasets demonstrate the effectiveness of our proposed framework.
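The retrieval-augmented idea described above can be sketched minimally: embed a new claim, retrieve the most similar historical rumor cases, and prepend them to the LLM prompt. This is an illustrative sketch only; the embedding vectors, case texts, and prompt format below are placeholders, not the authors' actual retriever or data.

```python
# Hedged sketch of retrieval-augmented rumor resolution: rank historical
# resolved cases by cosine similarity to the new claim's embedding and
# include the top-k as context. Vectors here are toy stand-ins for a
# real text encoder's output.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_top_k(query_vec, history, k=2):
    """history: list of (embedding, case_text) pairs from past resolved rumors."""
    ranked = sorted(history, key=lambda h: cosine(query_vec, h[0]), reverse=True)
    return [text for _, text in ranked[:k]]

history = [
    ([1.0, 0.0], "Case A: debunked by official statement"),
    ([0.9, 0.1], "Case B: confirmed true after investigation"),
    ([0.0, 1.0], "Case C: unrelated topic"),
]
context = retrieve_top_k([1.0, 0.0], history, k=2)
prompt = "Historical cases:\n" + "\n".join(context) + "\nNew claim: ..."
```

In practice the "collective intelligence" signal the abstract mentions (e.g. crowd responses) would be retrieved and formatted the same way alongside the historical cases.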
ISBN:
(Print) 9789819794300; 9789819794317
In the medical field, unstructured medical text holds rich medical knowledge. Accurately identifying medical entities in this text is crucial for structured medical databases, knowledge graphs, and intelligent diagnostic systems. Medical text has unique features that make it hard for traditional NER methods to identify complex medical entities. In particular, the recognition of nested entities poses a significant challenge, as it requires systems to recognize and understand the complex hierarchical relationships between entities, placing higher demands on traditional entity recognition systems. To overcome these challenges, we propose a method that combines semantic knowledge enhancement and global pointer optimization. Initially, we incorporate semantic prior knowledge of entity categories, capturing the interplay between labels and text by integrating label relationships. This allows us to obtain candidate entity information enriched with integrated label details. Following this, we establish a classification module to evaluate and score these candidate entities along with their labels, enabling entity prediction. To address nested entities, we introduce an Efficient GlobalPointer module that computes the likelihood of each text span being a specific entity type, thus bolstering nested entity recognition. By merging the outputs from both modules, we arrive at the final predicted entities. Experimental results indicate that our method excels on two flat entity datasets, CMedQANER and CCKS2017, as well as on the nested entity dataset CMeEE. Compared to baseline models, our approach demonstrates notable performance improvements.
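The core of GlobalPointer-style span scoring is that every (start, end) token pair receives a score per entity type, so overlapping and nested spans can be predicted independently. The sketch below illustrates that idea with toy projection matrices; it is not the paper's Efficient GlobalPointer implementation (which adds parameter sharing and rotary position encoding, among other details).

```python
# Illustrative sketch of span scoring for nested NER: project each token's
# hidden state into a start role and an end role, then score every span
# (i, j) as the dot product of the start projection of token i and the
# end projection of token j. Weights and hidden states are random toys.
import numpy as np

def span_scores(hidden, w_start, w_end):
    """hidden: (seq_len, dim). Returns (seq_len, seq_len) scores where
    scores[i, j] rates the span from token i to token j (inclusive)."""
    q = hidden @ w_start          # start-role projections
    k = hidden @ w_end            # end-role projections
    scores = q @ k.T
    # mask invalid spans where the end precedes the start
    seq_len = hidden.shape[0]
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool))
    return np.where(mask, scores, -np.inf)

rng = np.random.default_rng(0)
hidden = rng.normal(size=(5, 4))
w_s, w_e = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
s = span_scores(hidden, w_s, w_e)
```

One such score matrix per entity type is enough to emit nested predictions: any span whose score clears a threshold is output, regardless of whether it overlaps another predicted span.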
ISBN:
(Print) 9789819794362; 9789819794379
The rapid development of large-scale language models has garnered widespread interest from both academia and industry. Efficiently applying these models across various domains now poses a challenge to researchers. High training costs and the relative scarcity of domain-specific data have made continual learning on general pretrained language models a preferable approach. In this paper, we provide a comprehensive analysis and modification of continual learning strategies for large language models, as these strategies were initially designed for encoder-only architectures. We then propose a probing algorithm for token representation shift to better alleviate forgetting, and modify the corresponding evaluation metrics for quantitative analysis of our methods. Through experiments across three different domains, we verify the effectiveness of continual learning and probing algorithms on recent models. Results show that knowledge distillation outperforms other methods in cross-domain continual learning. Moreover, the introduction of probing can further enhance accuracy with a relatively small computation budget.
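Knowledge distillation, the strategy this abstract reports as strongest for cross-domain continual learning, typically has the student match the softened output distribution of a frozen copy of the earlier model while training on new-domain data. The minimal sketch below shows that loss term only; temperature value and logits are illustrative, not the paper's settings.

```python
# Hedged sketch of a distillation loss for continual learning: KL divergence
# between the old (teacher) model's temperature-softened distribution and the
# student's, which penalizes drift away from previously learned behavior.
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

loss = kd_loss([2.0, 0.5, -1.0], [1.8, 0.7, -0.9])
```

During continual training this term would be added to the usual task loss on the new domain, so the model trades off new-domain fit against fidelity to its earlier predictions.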
ISBN:
(Print) 9789819794393; 9789819794409
Mathematical reasoning is challenging for large language models (LLMs), and the scaling relationship with respect to LLM capacity is under-explored. Existing works have tried to leverage the rationales of LLMs to train small language models (SLMs) for enhanced reasoning abilities, a process referred to as distillation. However, most existing distillation methods do not guide the small models to solve problems progressively from simple to complex, which can be a more effective strategy. This study proposes a multi-step self-questioning and answering (M-SQA) method that guides SLMs to solve complex problems by starting from simple ones. Initially, multi-step self-questioning and answering rationales are extracted from LLMs based on complexity-based prompting. Subsequently, these rationales are employed to distill SLMs in a multi-task learning framework, during which the model learns to reason over multiple steps in a self-questioning and answering manner, answering each sub-question in a single step iteratively. Experiments on current mathematical reasoning tasks demonstrate the effectiveness of the proposed approach.
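One way to picture the multi-task setup above is as a transformation from one extracted rationale into several training examples: a sub-question-generation example plus one single-step answering example per sub-question. The field names and example format below are hypothetical illustrations, not the paper's exact data schema.

```python
# Hedged sketch: turn a multi-step self-questioning rationale into
# multi-task training examples for the student model. One example teaches
# the model to pose the sub-questions; the rest teach it to answer each
# sub-question in a single step given the accumulated context.

def build_examples(problem, sub_qa_pairs, final_answer):
    examples = [{
        "input": problem,
        "target": " ".join(q for q, _ in sub_qa_pairs),  # question-generation task
    }]
    context = problem
    for q, a in sub_qa_pairs:                            # single-step answering task
        examples.append({"input": context + " Q: " + q, "target": a})
        context += f" Q: {q} A: {a}"
    examples.append({"input": context, "target": final_answer})
    return examples

ex = build_examples(
    "Tom has 3 bags of 4 apples and eats 2. How many remain?",
    [("How many apples in total?", "3 * 4 = 12"),
     ("How many after eating 2?", "12 - 2 = 10")],
    "10",
)
```

At inference time the student would run the same loop generatively: pose a sub-question, answer it, append both to the context, and repeat until it emits the final answer.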
ISBN:
(Print) 9789819794331; 9789819794348
The development of general-purpose artificial intelligence that can understand and reason with common sense is a significant challenge. However, there has been a lack of research fully considering personality traits in commonsense understanding and reasoning tasks. To address this, we create a personalized commonsense knowledge comprehension and reasoning dataset. This dataset organizes reasoning knowledge with typed if-then relations and variables, while also introducing personality traits as constraints. We adopt a two-stage training framework based on curriculum learning to gradually improve the model's personalized commonsense knowledge comprehension and reasoning ability. Additionally, we compare pre-trained language models with different structures, such as BERT, GPT2, and BART. The experimental results show that models trained with the curriculum-learning framework generate more diversified and personality-trait-compliant commonsense reasoning results.
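The curriculum idea behind the two-stage framework can be sketched as ordering samples by a difficulty measure and training on cumulatively harder subsets. The difficulty heuristic below (number of if-then relations) is purely an illustrative stand-in; the paper's actual staging criterion may differ.

```python
# Hedged sketch of curriculum-learning staging: sort the dataset by an
# assumed difficulty proxy, then emit cumulative training stages so later
# stages include everything seen earlier plus harder samples.

def difficulty(sample):
    # toy proxy: samples with more typed if-then relations are "harder"
    return len(sample["relations"])

def curriculum_stages(dataset, n_stages=2):
    ordered = sorted(dataset, key=difficulty)
    stage_size = len(ordered) // n_stages
    stages = []
    for i in range(n_stages):
        # the final stage always covers the full ordered dataset
        end = len(ordered) if i == n_stages - 1 else (i + 1) * stage_size
        stages.append(ordered[:end])
    return stages

data = [{"relations": ["xWant", "xEffect"]}, {"relations": ["xWant"]},
        {"relations": ["xWant", "xEffect", "xReact"]}]
stages = curriculum_stages(data, n_stages=2)
```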
ISBN:
(Print) 9789819794423; 9789819794430
Stance detection is an active task in natural language processing (NLP) that aims to identify the author's stance towards a particular target within a text. Given the remarkable language understanding capabilities and encyclopedic prior knowledge of large language models (LLMs), how to explore the potential of LLMs in stance detection has received significant attention. Unlike existing LLM-based approaches that focus solely on fine-tuning with large-scale datasets, we propose a new prompting method, called Chain of Stance (CoS). In particular, it positions LLMs as expert stance detectors by decomposing the stance detection process into a series of intermediate, stance-related assertions that culminate in the final judgment. This approach leads to significant improvements in classification performance. We conducted extensive experiments using four SOTA LLMs on the SemEval 2016 dataset, covering both zero-shot and few-shot learning setups. The results indicate that the proposed method achieves state-of-the-art results with an F1 score of 79.84 in the few-shot setting.
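A Chain-of-Stance style prompt makes the decomposition concrete: the model is walked through intermediate stance-related assertions before committing to a label. The assertion steps and label set below are our paraphrase of the general idea, not the paper's exact prompt text.

```python
# Hedged sketch of a Chain-of-Stance style prompt builder. The intermediate
# steps are hypothetical; a real setup would tune their wording and order.

def chain_of_stance_prompt(text, target):
    steps = [
        "1. Identify the key claims the author makes about the target.",
        "2. Assess the attitude each claim expresses toward the target.",
        "3. Note any background knowledge that contextualizes the claims.",
        "4. Combine the assertions above into an overall judgment.",
    ]
    return (
        f"Text: {text}\nTarget: {target}\n"
        "Work through the following stance-related assertions:\n"
        + "\n".join(steps)
        + "\nFinal stance (FAVOR / AGAINST / NONE):"
    )

prompt = chain_of_stance_prompt("We must act on climate now.", "climate policy")
```

The returned string would be sent to an LLM as-is (zero-shot) or preceded by a few worked examples (few-shot).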
ISBN:
(Print) 9789819794300; 9789819794317
Information Retrieval (IR) pre-trained language models are trained on large-scale retrieval-based corpora to promote task-specific knowledge capacity. Previous works focus on general retrieval pre-training datasets, which cover inter-document and intra-document data, paying less attention to the important asset of clicked data, which is commonly adopted in the recommendation domain. However, utilizing easily accessible clicked data is a non-trivial operation due to its large volume and insufficient refinement, which hamper model learning efficiency and risk distorting learning directions. In this paper, we propose MCFC, a Momentum-Driven Clicked Feature Compressed Pre-trained Language Model for Information Retrieval. Specifically, to maintain an effective learning pace on large amounts of data, we generalize multiple similar feature instances and compress the dispersed knowledge together at query granularity, a step named Multi-Instance Information Integration. Meanwhile, because coarse clicked data demands finer relevance detection between queries and documents, we leverage a momentum-driven adjusting mechanism to refine the text representations, a step named Continuous Debiasing Calibration. Extensive experiments on downstream datasets validate the superiority of our work over other recent strong baselines.
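The query-granularity compression step can be pictured as grouping clicked feature instances by query and aggregating them into one representation each. Mean pooling below is an assumption for illustration; the paper's actual integration operator is not specified in this abstract.

```python
# Hedged sketch of Multi-Instance Information Integration: clicked feature
# vectors scattered across many instances of the same query are grouped and
# compressed (here: per-dimension mean, an assumed aggregator) so the model
# sees one consolidated instance per query instead of a flood of raw clicks.
from collections import defaultdict

def compress_by_query(clicked_instances):
    """clicked_instances: list of (query, feature_vector) pairs."""
    grouped = defaultdict(list)
    for query, vec in clicked_instances:
        grouped[query].append(vec)
    return {
        q: [sum(col) / len(vecs) for col in zip(*vecs)]  # per-dimension mean
        for q, vecs in grouped.items()
    }

compressed = compress_by_query([
    ("best laptop", [1.0, 0.0]),
    ("best laptop", [0.0, 1.0]),
    ("cheap flights", [0.5, 0.5]),
])
```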
ISBN:
(Print) 9789819794362
The proceedings contain 194 papers. The special focus in this conference is on Natural Language Processing and Chinese Computing. The topics include: Hierarchical Knowledge Aggregation for Personalized Response Generation in Dialogue Systems; Multi-hop Reading Comprehension Model Based on Abstract Meaning Representation and Multi-task Joint Learning; Leveraging Large Language Models for QA Dialogue Dataset Construction and Analysis in Public Services; MCFC: A Momentum-Driven Clicked Feature Compressed Pre-trained Language Model for Information Retrieval; Integrating Syntax Tree and Graph Neural Network for Conversational Question Answering over Heterogeneous Sources; pqE: Zero-Shot Document Expansion for Dense Retrieval with Large Language Models; CKF: Conditional Knowledge Fusion Method for CommonSense Question Answering; MPPQA: Structure-Aware Extractive Multi-span Question Answering for Procedural Documents; GraphLLM: A General Framework for Multi-hop Question Answering over Knowledge Graphs Using Large Language Models; Local or Global Optimization for Dialogue Discourse Parsing; Structure and Behavior Dual-Graph Reasoning with Integrated Key-Clue Parsing for Multi-party Dialogue Reading Comprehension; Enhancing Emotional Support Conversation with Cognitive Chain-of-Thought Reasoning; A Simple and Effective Span Interaction Modeling Method for Enhancing Multiple Span Question Answering; FacGPT: An Effective and Efficient Method for Evaluating Knowledge-Based Visual Question Answering; PAPER: A Persona-Aware Chain-of-Thought Learning Framework for Personalized Dialogue Response Generation; Towards Building a Robust Knowledge Intensive Question Answering Model with Large Language Models; Model-Agnostic Knowledge Distillation Between Heterogeneous Models; Exploring Multimodal Information Fusion in Spoken Off-Topic Degree Assessment; Integrating Hierarchical Key Information and Semantic Difference Features for Long Text Matching; CausalAPM: Generalizable Literal Disentanglement for NLU
ISBN:
(Print) 9789819794423
The proceedings contain 194 papers. The special focus in this conference is on Natural Language Processing and Chinese Computing. The topics include: Hierarchical Knowledge Aggregation for Personalized Response Generation in Dialogue Systems; Multi-hop Reading Comprehension Model Based on Abstract Meaning Representation and Multi-task Joint Learning; Leveraging Large Language Models for QA Dialogue Dataset Construction and Analysis in Public Services; MCFC: A Momentum-Driven Clicked Feature Compressed Pre-trained Language Model for Information Retrieval; Integrating Syntax Tree and Graph Neural Network for Conversational Question Answering over Heterogeneous Sources; pqE: Zero-Shot Document Expansion for Dense Retrieval with Large Language Models; CKF: Conditional Knowledge Fusion Method for CommonSense Question Answering; MPPQA: Structure-Aware Extractive Multi-span Question Answering for Procedural Documents; GraphLLM: A General Framework for Multi-hop Question Answering over Knowledge Graphs Using Large Language Models; Local or Global Optimization for Dialogue Discourse Parsing; Structure and Behavior Dual-Graph Reasoning with Integrated Key-Clue Parsing for Multi-party Dialogue Reading Comprehension; Enhancing Emotional Support Conversation with Cognitive Chain-of-Thought Reasoning; A Simple and Effective Span Interaction Modeling Method for Enhancing Multiple Span Question Answering; FacGPT: An Effective and Efficient Method for Evaluating Knowledge-Based Visual Question Answering; PAPER: A Persona-Aware Chain-of-Thought Learning Framework for Personalized Dialogue Response Generation; Towards Building a Robust Knowledge Intensive Question Answering Model with Large Language Models; Model-Agnostic Knowledge Distillation Between Heterogeneous Models; Exploring Multimodal Information Fusion in Spoken Off-Topic Degree Assessment; Integrating Hierarchical Key Information and Semantic Difference Features for Long Text Matching; CausalAPM: Generalizable Literal Disentanglement for NLU
ISBN:
(Print) 9789819794331
The proceedings contain 194 papers. The special focus in this conference is on Natural Language Processing and Chinese Computing. The topics include: Hierarchical Knowledge Aggregation for Personalized Response Generation in Dialogue Systems; Multi-hop Reading Comprehension Model Based on Abstract Meaning Representation and Multi-task Joint Learning; Leveraging Large Language Models for QA Dialogue Dataset Construction and Analysis in Public Services; MCFC: A Momentum-Driven Clicked Feature Compressed Pre-trained Language Model for Information Retrieval; Integrating Syntax Tree and Graph Neural Network for Conversational Question Answering over Heterogeneous Sources; pqE: Zero-Shot Document Expansion for Dense Retrieval with Large Language Models; CKF: Conditional Knowledge Fusion Method for CommonSense Question Answering; MPPQA: Structure-Aware Extractive Multi-span Question Answering for Procedural Documents; GraphLLM: A General Framework for Multi-hop Question Answering over Knowledge Graphs Using Large Language Models; Local or Global Optimization for Dialogue Discourse Parsing; Structure and Behavior Dual-Graph Reasoning with Integrated Key-Clue Parsing for Multi-party Dialogue Reading Comprehension; Enhancing Emotional Support Conversation with Cognitive Chain-of-Thought Reasoning; A Simple and Effective Span Interaction Modeling Method for Enhancing Multiple Span Question Answering; FacGPT: An Effective and Efficient Method for Evaluating Knowledge-Based Visual Question Answering; PAPER: A Persona-Aware Chain-of-Thought Learning Framework for Personalized Dialogue Response Generation; Towards Building a Robust Knowledge Intensive Question Answering Model with Large Language Models; Model-Agnostic Knowledge Distillation Between Heterogeneous Models; Exploring Multimodal Information Fusion in Spoken Off-Topic Degree Assessment; Integrating Hierarchical Key Information and Semantic Difference Features for Long Text Matching; CausalAPM: Generalizable Literal Disentanglement for NLU