ISBN (print): 9789819794393; 9789819794409
Utterance Domain Classification (UDC) is essential for Spoken Language Understanding (SLU) and is analogous to short text classification. Short texts are often challenging to understand because they lack context, so their semantic representation must be enriched with supplementary information such as concepts from external knowledge bases. However, the inclusion of concepts introduces noise, making the selection of valuable concepts challenging. This paper proposes a UDC method that employs keyword-guided signals to improve the purity of external knowledge. We use two keyword extraction strategies to construct two types of keywords. A keyword-assisted concept denoising module addresses the concept noise problem, and a knowledge injection module is designed to better integrate concepts into the model. Experimental results on two Chinese SLU datasets demonstrate that our model achieves state-of-the-art performance.
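As a rough illustration of the keyword-assisted concept denoising idea described above, the sketch below scores retrieved concepts by their similarity to utterance keywords and keeps only the best-matching ones; the function name, embedding inputs, and top-k cutoff are assumptions for illustration, not the paper's actual module.

```python
# Hypothetical keyword-guided concept denoising sketch (not the paper's code).
import torch
import torch.nn.functional as F

def denoise_concepts(keyword_emb: torch.Tensor,   # (num_keywords, dim)
                     concept_emb: torch.Tensor,   # (num_concepts, dim)
                     top_k: int = 5) -> torch.Tensor:
    """Score each retrieved concept against the utterance keywords and pool
    only the top-k concepts into a denoised knowledge vector."""
    # Cosine similarity between every concept and every keyword: (C, K).
    sim = F.cosine_similarity(concept_emb.unsqueeze(1), keyword_emb.unsqueeze(0), dim=-1)
    # A concept is as relevant as its best-matching keyword.
    scores = sim.max(dim=1).values                      # (num_concepts,)
    keep = scores.topk(min(top_k, concept_emb.size(0))).indices
    # Re-weight the surviving concepts by a softmax over their scores.
    weights = F.softmax(scores[keep], dim=0).unsqueeze(-1)
    return (weights * concept_emb[keep]).sum(dim=0)     # pooled concept vector
```

The pooled vector could then be injected into the utterance encoder, which is roughly the role the abstract assigns to its knowledge injection module.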
ISBN (print): 9789819794423; 9789819794430
Visual Chinese Character Checking (C3) aims to detect and correct errors in handwritten Chinese text images, including faked characters and misspelled characters. The task benefits downstream applications by improving the efficiency of identifying errors in handwritten text. Recent methods are mainly based on Optical Character Recognition (OCR) and Pre-trained Language Models (PLMs). Visual Chinese Character Checking is an emerging task, and relevant research has made progress. However, we believe that existing work has not fully leveraged the inherent knowledge of pre-trained models and has not addressed the semantic bias between pre-trained models and the character checking task. These gaps lead to deficiencies in recognizing misspelled Chinese characters and correcting misused ones. Therefore, we propose several multimodal contrastive learning methods based on image-to-image and image-to-text comparisons, applied throughout character recognition, error detection, and correction. By aligning the semantic feature representations of the different models, our approach makes them better suited to the Visual Chinese Character Checking task, thereby enhancing their capabilities.
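The abstract above mentions image-to-text contrastive objectives; a generic CLIP-style contrastive loss of the kind commonly used for such alignment is sketched below. The temperature value and feature shapes are assumptions, not the authors' exact configuration.

```python
# Generic symmetric image-text contrastive (InfoNCE) loss sketch.
import torch
import torch.nn.functional as F

def image_text_contrastive_loss(image_feats: torch.Tensor,  # (batch, dim)
                                text_feats: torch.Tensor,   # (batch, dim)
                                temperature: float = 0.07) -> torch.Tensor:
    """Pull each character-image feature toward the text feature of the
    character it should depict; other pairs in the batch act as negatives."""
    image_feats = F.normalize(image_feats, dim=-1)
    text_feats = F.normalize(text_feats, dim=-1)
    logits = image_feats @ text_feats.t() / temperature   # (batch, batch)
    targets = torch.arange(logits.size(0), device=logits.device)
    # Symmetric objective: image-to-text and text-to-image directions.
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2
```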
With the development and application of the Smart Court System (SCS) in China, the reliability and accuracy of legal artificial intelligence have become focal points in recent years. Meanwhile, criminal sentencing prediction, a significant component of the SCS, has also garnered widespread attention. According to Chinese criminal law, actual sentencing data exhibits a saturated property due to statutory penalty ranges, but this mechanism has been ignored by most existing studies. In view of this, the authors propose a sentencing prediction model that combines judicial sentencing mechanisms, including saturated outputs and floating boundaries, with neural networks. Based on the saturated structure of the model, a more effective adaptive prediction algorithm is constructed from the fusion of several key ideas and techniques: the use of the L1 loss together with a corresponding gradient update strategy, a data pre-processing method based on a large language model that extracts semantically complex sentencing elements using prior legal knowledge, the choice of appropriate initial conditions for the learning algorithm, and the construction of a double-hidden-layer network structure. An empirical study on the crime of disguising or concealing proceeds of crime demonstrates that the method achieves superior sentencing prediction accuracy and significantly outperforms common baseline methods.
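A minimal sketch of the saturated-output idea follows, assuming the statutory penalty range is supplied per case as lower and upper bounds; the layer sizes, feature inputs, and units are illustrative, and the paper's adaptive gradient strategy and floating boundaries are not reproduced here.

```python
# Illustrative saturated-output regressor with L1 loss (not the authors' code).
import torch
import torch.nn as nn

class SaturatedSentencingNet(nn.Module):
    def __init__(self, num_features: int, hidden: int = 64):
        super().__init__()
        # Double-hidden-layer network over sentencing-element features.
        self.backbone = nn.Sequential(
            nn.Linear(num_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, lower, upper):
        """x: case features; lower/upper: per-case statutory bounds (tensors, e.g. months)."""
        raw = self.backbone(x).squeeze(-1)
        # Saturated output: the prediction cannot leave the statutory penalty range.
        return torch.maximum(torch.minimum(raw, upper), lower)

# The L1 loss pairs naturally with the clipped (saturated) targets described above.
loss_fn = nn.L1Loss()
```

Note that plain clipping gives zero gradients outside the statutory range, which is presumably why the paper pairs the saturated structure with a dedicated gradient update strategy.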
Creativity relates to the ability to generate novel and effective ideas in areas of interest. How are such creative ideas generated? One possible mechanism that supports creative ideation, and is gaining increas...
ISBN (print): 9789819794423; 9789819794430
Stance detection is a subproblem of sentiment analysis, commonly defined as classifying the stance of a text toward a given target as one of {Favor, Against, Neither}. As an important research problem, it relies heavily on high-quality annotated data, which poses a significant challenge. In the real world, with the rapid development of social media, it is impossible to annotate the massive amount of text on diverse topics, so a universal framework for stance detection is needed. Consequently, zero-shot stance detection (ZSSD) methods that do not require annotated data have attracted researchers' attention. In this survey, we review the early research on and definitions of stance detection, summarize the current state of ZSSD research, and discuss datasets and the most advanced models. Finally, based on this review, we explore possible future directions.
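For readers unfamiliar with the setting, one widely used zero-shot baseline casts stance detection as natural language inference over label hypotheses; the snippet below shows that baseline with an off-the-shelf NLI model. The model choice, example text, and hypothesis wording are assumptions, and this is a generic baseline rather than a method proposed in the survey.

```python
# NLI-based zero-shot stance detection baseline using Hugging Face transformers.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

text = "Wind farms ruin the landscape and barely cut emissions."
target = "renewable energy"
labels = ["favor", "against", "neither"]

result = classifier(
    text,
    candidate_labels=labels,
    # Each label is scored as an entailment hypothesis about the target.
    hypothesis_template=f"The stance of this text toward {target} is {{}}.",
)
print(result["labels"][0], result["scores"][0])  # predicted stance and its score
```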
This study emphasizes the significance of identifying counterfeit news and the importance of effective pre-processing approaches for cleaning text corpora. This work effectively integrated two Indian datasets and emplo...
ISBN (print): 9789819794423; 9789819794430
With the relentless growth in the volume of academic publications and the accelerating pace of scholarly communication, the time researchers dedicate to literature surveys has become increasingly substantial. Automatic literature survey generation offers a valuable solution, freeing researchers from the time-intensive task of manually surveying the literature. We organized NLPCC 2024 Shared Task 6 on scientific literature survey generation. This paper summarizes the task setup, the dataset, the methods used by participants, and the final results. Furthermore, we discuss key findings and challenges for literature survey generation in the scientific domain.
ISBN (print): 9789819794423; 9789819794430
Currently, the application of Large Language Models (LLMs) faces significant security threats. Harmful questions and adversarial attack prompts can induce LLMs to generate toxic responses. Detoxifying LLMs is therefore a critical research topic for ensuring their safe and widespread application. In this paper, we propose an alignment-based detoxification method for LLMs. We utilize Kahneman-Tversky Optimization (KTO) to align LLMs. During the construction of the training dataset, we take into account both detoxification performance and potential side effects on the LLMs. For detoxification, we make the LLM preferentially generate safe responses rather than toxic content when asked harmful questions or attack prompts. To mitigate potential side effects on the conversational capabilities of LLMs, we incorporate normal questions into the training data and ensure that the LLM generates normal answers rather than safety refusals or unsafe responses. Experimental results show that our method achieves the best detoxification performance among all baseline methods while exerting little negative impact on the LLMs. Moreover, our method even enhances the LLMs' general abilities such as question answering and language understanding. Our proposed method achieved first place in NLPCC 2024 Shared Task 10 Track 2 with an average score of 52.31.
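The dataset construction described above can be sketched as follows; the (prompt, completion, label) layout mirrors what common KTO implementations expect, but the exact schema and the placeholder inputs are assumptions rather than the authors' data.

```python
# Illustrative construction of a KTO-style alignment dataset (placeholder data).
def build_kto_examples(harmful, normal):
    """harmful: [(attack_prompt, safe_reply, toxic_reply)]; normal: [(question, answer)]."""
    examples = []
    for prompt, safe_reply, toxic_reply in harmful:
        # Safe responses to harmful prompts are marked desirable...
        examples.append({"prompt": prompt, "completion": safe_reply, "label": True})
        # ...while toxic completions to the same prompts are marked undesirable.
        examples.append({"prompt": prompt, "completion": toxic_reply, "label": False})
    for question, answer in normal:
        # Normal QA pairs keep the model from over-refusing ordinary requests.
        examples.append({"prompt": question, "completion": answer, "label": True})
    return examples
```

Mixing the normal QA pairs into the alignment data is what the abstract credits with preserving, and even improving, the model's general conversational abilities.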
ISBN (print): 9789819794331; 9789819794348
Large Language Models (LLMs) have exhibited notable general-purpose task-solving abilities in language understanding and generation, including the ability to process recommendation tasks. The majority of existing research relies on training-free recommendation models that treat LLMs as reasoning engines and directly produce the response to the recommendation task. This approach depends heavily on pre-trained knowledge and may incur excessive costs. We therefore propose a two-stage fine-tuning framework that leverages LLaMA2 and GPT-4 knowledge enhancement for recommendation. In particular, we use GPT-4 instruction-following data to tune the LLM in a first-stage instruction tuning process, achieving lower training costs and better inference performance. In the second stage, using an elaborately designed prompt template, we fine-tune the first-stage LLM in a few-shot setting on interaction sequences derived from user ratings. To validate the effectiveness of our framework, we compare it against state-of-the-art baseline methods on benchmark datasets. The results demonstrate that our framework has promising recommendation capabilities. Our experiments are executed on a single RTX 4090 with LLaMA2-7B.
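As a hedged illustration of the second-stage setup, the snippet below builds a few-shot prompt from a user's rating history for a candidate item; the field names, wording, and rating scale are hypothetical and not the paper's actual template.

```python
# Hypothetical prompt template for rating-history-based recommendation tuning.
def build_recommendation_prompt(history, candidate):
    """history: [(item_title, rating)]; candidate: item title to score."""
    lines = ["A user has rated the following items (1-5 stars):"]
    lines += [f"- {title}: {rating} stars" for title, rating in history]
    lines.append(f'Would the user enjoy "{candidate}"? Answer Yes or No.')
    return "\n".join(lines)

prompt = build_recommendation_prompt(
    history=[("The Matrix", 5), ("Titanic", 2), ("Blade Runner", 5)],
    candidate="Ghost in the Shell",
)
# Prompt/response pairs built this way would then be used to fine-tune the
# first-stage LLaMA2 model in a few-shot setting.
```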
Federated Learning (FL) is a new pivotal paradigm for decentralized training on heterogeneous data. Recently, fine-tuning of Vision-Language Models (VLMs) has been extended to the federated setting to improve overall p...