ISBN (Print): 9789819794331; 9789819794348
Large language models (LLMs) are showing dramatic progress in language generation and reasoning tasks. Existing work on fake news detection mostly focuses on fine-tuning small language models such as BERT. One downside of fine-tuning is that it requires a lot of data, which might not always be available. With the prevalent spread of fake news and misinformation, alternative approaches are needed, especially where training data is scarce. In this paper, we propose using multi-agent debate strategies to enhance fake news detection by leveraging the capabilities of LLMs. We introduce two approaches: a uniform-prompt multi-agent debate and a diverse-prompt multi-agent debate in which each LLM agent adopts a distinct role such as fact-checker, journalist, or data scientist. These methods are benchmarked against single-LLM evaluations to assess the impact of collaborative reasoning. Our experiments on the PolitiFact and GossipCop datasets reveal that the multi-agent debate methods outperform single-LLM assessments. Notably, the diverse-persona debate approach achieves the highest performance, demonstrating the value of incorporating different perspectives into reasoning. These results suggest that multi-agent debates can effectively harness the strengths of single LLMs to improve the reliability of fake news detection systems.
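To make the diverse-persona debate protocol concrete, here is a minimal sketch. The `llm` callable (prompt in, text out), the persona list, the two-round schedule, and the majority-vote aggregation are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of a diverse-persona multi-agent debate for fake news
# detection. `llm` is a hypothetical completion function standing in for
# any chat-completion API.

PERSONAS = ["fact-checker", "journalist", "data scientist"]

def debate(article: str, llm, rounds: int = 2) -> str:
    opinions = {p: "" for p in PERSONAS}
    for _ in range(rounds):
        for persona in PERSONAS:
            others = "\n".join(
                f"{p}: {o}" for p, o in opinions.items() if p != persona and o
            )
            prompt = (
                f"You are a {persona}. Decide whether the article is REAL or FAKE.\n"
                f"Article: {article}\n"
                f"Other agents' current views:\n{others or '(none yet)'}\n"
                "Give a one-sentence justification and a final REAL/FAKE verdict."
            )
            opinions[persona] = llm(prompt)
    # Aggregate the final round by majority vote over the agents' verdicts.
    votes = ["FAKE" in o.upper() for o in opinions.values()]
    return "FAKE" if sum(votes) > len(votes) / 2 else "REAL"
```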
ISBN (Print): 9789819794362; 9789819794379
The rise of voice interface applications has renewed interest in improving the robustness of spoken language understanding (SLU). Many advances have come from end-to-end speech-language joint training, such as inferring semantics directly from speech signals and post-editing automatic speech recognition (ASR) output. Despite their performance achievements, these methods either depend on large numbers of paired error-prone ASR transcriptions and ground-truth annotations, which may not be available, or are computationally costly. To mitigate these issues, we propose an ASR-robust pre-trained language model (ASRLM), which couples a generator that produces simulated ASR transcriptions from ground-truth annotations with a sample-efficient discriminator that distinguishes reasonable ASR errors from unrealistic ones. Experimental results demonstrate that ASRLM improves performance on a wide range of SLU tasks in the presence of ASR errors while saving 27% of the computation cost compared to baselines. Analysis also shows that our proposed generator is better than other simulation methods, including both BERT- and GPT-4-based ones, at simulating real-world ASR error situations.
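The generator/discriminator pairing is reminiscent of replaced-token detection. Below is a toy sketch of the idea under stated assumptions: `phonetic_confusions` is a hypothetical confusion table (a real system would derive substitution candidates from ASR lattices or phoneme alignments), and the discriminator is only described in a comment.

```python
# Illustrative sketch: a generator injects plausible ASR-style corruptions
# into clean text; a token-level discriminator would then learn to flag
# which tokens were corrupted (analogous to ELECTRA's replaced-token
# detection objective).

import random

phonetic_confusions = {"their": ["there"], "to": ["two", "too"], "flight": ["fright"]}

def generate_asr_errors(tokens, p=0.15):
    """Generator: substitute tokens with phonetically confusable ones."""
    corrupted, labels = [], []
    for tok in tokens:
        if tok in phonetic_confusions and random.random() < p:
            corrupted.append(random.choice(phonetic_confusions[tok]))
            labels.append(1)  # token was corrupted
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels

# A small transformer trained on (corrupted, labels) pairs would serve as
# the sample-efficient discriminator described in the abstract.
```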
Multi-purpose large language models (LLMs), a subset of generative artificial intelligence (AI), have recently made significant progress. While expectations for LLMs to assist systems engineering (SE) tasks run high, the interdisciplinary and complex nature of systems, along with the need to synthesize deep domain knowledge and operational context, raises questions about the efficacy of LLMs in generating SE artifacts, particularly given that they are trained on data broadly available on the internet. To that end, we present results from an empirical exploration in which a human expert-generated SE artifact was taken as a benchmark, parsed, and fed into various LLMs through prompt engineering to generate segments of typical SE artifacts. This procedure was applied without any fine-tuning or calibration to document baseline LLM performance. We then adopted a two-fold mixed-methods approach to compare AI-generated artifacts against the benchmark. First, we quantitatively compare the artifacts using natural language processing algorithms and find that, when prompted carefully, state-of-the-art algorithms cannot differentiate AI-generated artifacts from the human-expert benchmark. Second, we conduct a qualitative deep dive to investigate how they differ in quality. We document that while the two sets of material appear very similar, AI-generated artifacts exhibit serious failure modes that could be difficult to detect. We characterize these as premature requirements definition, unsubstantiated numerical estimates, and a propensity to overspecify. We contend that this study tells a cautionary tale about why the SE community must be more cautious in adopting AI-suggested feedback, at least when it is generated by multi-purpose LLMs.
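The abstract does not name the NLP algorithms used for the quantitative comparison; one plausible instance is embedding-based cosine similarity between requirement statements. The sketch below assumes the `sentence-transformers` library and an arbitrary model checkpoint, and the two requirement strings are invented examples.

```python
# Sketch of one plausible quantitative comparison: cosine similarity of
# sentence embeddings for a human-written vs. an LLM-generated requirement.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed checkpoint

human = "The system shall maintain attitude knowledge within 0.1 degrees (3-sigma)."
generated = "The spacecraft shall determine its attitude to within 0.1 deg, 3-sigma."

emb = model.encode([human, generated], convert_to_tensor=True)
print(f"cosine similarity: {util.cos_sim(emb[0], emb[1]).item():.3f}")
```

A high similarity score would illustrate the paper's point: surface-level metrics can miss the qualitative failure modes it documents.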
ISBN (Print): 9789819794300; 9789819794317
Dialogue discourse parsing aims to identify the discourse links and relations between utterances, and has attracted increasing interest in recent years. Previous studies either adopt local optimization to independently select one parent for each utterance or use global optimization to directly obtain the tree representing the dialogue structure. However, the influence of these two optimization methods remains under-explored. In this paper, we systematically inspect their performance. Specifically, for local optimization, we use a local loss during training and a greedy strategy during inference. For global optimization, we optimize unlabeled and labeled trees with structured losses, including Max-Margin and TreeCRF, and apply the Chu-Liu-Edmonds algorithm during inference. Experiments show that the performance of these two optimization methods is closely related to the characteristics of the dataset, and that global optimization can reduce the burden of identifying long-range dependency relations.
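As a concrete illustration of the global inference step, the Chu-Liu-Edmonds algorithm recovers the highest-scoring dependency tree from pairwise arc scores; networkx exposes it as a maximum spanning arborescence. The scores below are toy values, not model outputs.

```python
# Global decoding sketch: given arc scores between utterances, find the
# maximum spanning arborescence (Chu-Liu-Edmonds) as the dialogue tree.

import networkx as nx

# scores[(head, dependent)] = model score for attaching `dependent` to
# `head`; node 0 is a dummy root preceding the first utterance.
scores = {(0, 1): 5.0, (1, 2): 4.0, (0, 2): 1.5, (2, 3): 3.0, (1, 3): 2.5}

G = nx.DiGraph()
for (h, d), s in scores.items():
    G.add_edge(h, d, weight=s)

tree = nx.maximum_spanning_arborescence(G, attr="weight")
print(sorted(tree.edges()))  # [(0, 1), (1, 2), (2, 3)]
```

By contrast, local (greedy) decoding would simply take the argmax head per utterance, which can produce graphs that are not valid trees.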
ISBN (Print): 9789819794393; 9789819794409
Mathematical reasoning is challenging for large language models (LLMs), and the scaling relationship with respect to LLM capacity is under-explored. Existing works have tried to leverage the rationales of LLMs to train small language models (SLMs) for enhanced reasoning abilities, a process referred to as distillation. However, most existing distillation methods do not guide the small models to solve problems progressively from simple to complex, which can be a more effective strategy. This study proposes a multi-step self-questioning and answering (M-SQA) method that guides SLMs to solve complex problems by starting from simple ones. Initially, multi-step self-questioning and answering rationales are extracted from LLMs based on complexity-based prompting. Subsequently, these rationales are used to distill SLMs in a multi-task learning framework, during which the model learns to reason in multiple steps in a self-questioning-and-answering manner, answering each sub-question in a single step iteratively. Experiments on current mathematical reasoning tasks demonstrate the effectiveness of the proposed approach.
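Below is a hypothetical sketch of what one multi-step self-Q&A training instance might look like; the exact rationale format is an assumption inferred from the abstract, and the arithmetic problem is invented.

```python
# Hypothetical M-SQA-style training instance: the distilled SLM learns to
# pose and answer sub-questions from simple to complex before the final
# answer.

example = {
    "problem": "A shop sells pens at $2 each. Tom buys 3 pens and pays "
               "with $10. How much change does he get?",
    "rationale": [
        {"sub_question": "How much do 3 pens cost?",
         "sub_answer": "3 * 2 = 6 dollars."},
        {"sub_question": "How much change from $10 after paying $6?",
         "sub_answer": "10 - 6 = 4 dollars."},
    ],
    "final_answer": "4",
}

def to_training_text(ex):
    """Serialize an instance into the iterative self-Q&A target string."""
    steps = "\n".join(
        f"Q{i+1}: {s['sub_question']}\nA{i+1}: {s['sub_answer']}"
        for i, s in enumerate(ex["rationale"])
    )
    return f"Problem: {ex['problem']}\n{steps}\nFinal answer: {ex['final_answer']}"

print(to_training_text(example))
```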
ISBN (Print): 9789819794393; 9789819794409
Social media has become the primary source of information for individuals, yet much of this information remains unverified. The rise of generative artificial intelligence has further accelerated the creation of unverified content. Adaptive rumor resolution systems are imperative for maintaining information integrity and public trust. Traditional methods have relied on encoder-based frameworks to enhance rumor representation and propagation characteristics. However, these models are often small in scale and lack generalizability to unforeseen events. Recent advances in large language models (LLMs) show promise, but LLMs remain unreliable at discerning truth from falsehood. Our work leverages LLMs by creating a testbed for predicting unprecedented rumors and designing a retrieval-augmented framework that integrates historical knowledge and collective intelligence. Experiments on two real-world datasets demonstrate the effectiveness of our proposed framework.
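A minimal sketch of the retrieval-augmented idea follows, assuming TF-IDF retrieval over resolved historical rumors and a prompt that also carries user comments as collective intelligence. The historical rumors, retrieval method, and prompt wording are all placeholders, not the paper's implementation.

```python
# Sketch: retrieve similar resolved rumors and inject them, with user
# comments, into an LLM verification prompt.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

history = [  # (resolved rumor, verdict) pairs -- toy examples
    ("Celebrity X died in a car crash last night.", "FALSE"),
    ("City Y will shut schools due to flooding.", "TRUE"),
]

def build_prompt(claim, comments, k=1):
    docs = [h[0] for h in history]
    vec = TfidfVectorizer().fit(docs + [claim])
    sims = cosine_similarity(vec.transform([claim]), vec.transform(docs))[0]
    top = sims.argsort()[::-1][:k]
    evidence = "\n".join(f"- {history[i][0]} (verdict: {history[i][1]})" for i in top)
    return (
        f"Claim: {claim}\n"
        f"Similar resolved rumors:\n{evidence}\n"
        "User comments:\n" + "\n".join(f"- {c}" for c in comments) +
        "\nIs the claim TRUE or FALSE? Answer with a verdict and a reason."
    )
```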
ISBN (Print): 9789819794362; 9789819794379
Knowledge-based visual question answering (VQA) requires external knowledge in addition to the image content to answer questions. Recent studies convert images to text descriptions and then generate answers or acquire implicit knowledge using a large language model (LLM). These methods achieve encouraging results thanks to the strong knowledge retrieval and reasoning capabilities of LLMs. However, methods that incorporate LLMs are limited by the discrepancies between images and the text descriptions presented to LLMs. To address this challenge, we present RAVL, a retrieval-augmented visual language model (VLM) framework for knowledge-based VQA. Specifically, we first fine-tune a VLM on the knowledge-based VQA task with inputs consisting of retrieved knowledge and image-question pairs, adapting the VLM to inputs with retrieved knowledge. After that, we adapt the retrieval module to the fine-tuned VLM using supervision signals provided by the VLM, so that the retrieved knowledge improves the VLM's perplexity. RAVL overcomes the limitation of visual information loss and improves the effectiveness of VLMs with external knowledge. We conduct experiments on the OK-VQA dataset, and our method achieves 65.73% accuracy, surpassing the previous state-of-the-art method (+3.63%).
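One way to read the second stage is that the VLM's loss on the gold answer scores how useful each retrieved passage is, and the retriever is trained toward that signal. The sketch below is an interpretation under that assumption; `vlm_answer_loss` is a hypothetical function, not RAVL's actual interface.

```python
# Sketch: turn VLM feedback into soft targets a retriever can imitate.
# Lower loss on the gold answer (lower perplexity) => more useful passage.

import math

def retriever_targets(image, question, answer, passages, vlm_answer_loss, tau=1.0):
    """Softmax over negative VLM losses, one utility per candidate passage."""
    utilities = [-vlm_answer_loss(image, question, answer, p) / tau
                 for p in passages]
    z = sum(math.exp(u) for u in utilities)
    return [math.exp(u) / z for u in utilities]
```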
ISBN (Print): 9789819794362; 9789819794379
General large language models enhanced with supervised fine-tuning and reinforcement learning from human feedback are increasingly popular in academia and industry, as they generalize foundation models to various practical tasks through prompting. To assist users in selecting the best model for practical application scenarios, i.e., choosing the model that meets the application requirements while minimizing cost, we introduce A-Eval, an application-driven evaluation benchmark for general large language models. First, we categorize evaluation tasks into five main categories and 27 sub-categories from a practical application perspective. Next, we construct a dataset comprising 678 question-and-answer pairs through a process of collecting, annotating, and reviewing. Then, we design an objective and effective evaluation method and evaluate a series of LLMs of different scales on A-Eval. Finally, we reveal interesting laws regarding model scale and task difficulty level and propose a feasible method for selecting the best model. Through A-Eval, we provide clear empirical and engineering guidance for selecting the best model, reducing the barriers to selecting and using LLMs and promoting their application and development. Our benchmark is publicly available at https://***/UnicomAI/UnicomBenchmark/tree/main/A-Eval.
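An application-driven evaluation loop in this spirit might iterate over categorized Q&A pairs and aggregate scores per (category, difficulty). The dataset row, the `llm` callables, and the `grade` function below are placeholder assumptions, not A-Eval's actual scoring method.

```python
# Illustrative evaluation loop: query each model on categorized Q&A pairs
# and average graded scores per (model, category, difficulty) bucket.

from collections import defaultdict

dataset = [
    {"category": "text generation", "difficulty": "easy",
     "question": "Write a one-line slogan for a fitness app.",
     "reference": "placeholder reference answer"},
]

def evaluate(models, dataset, grade):
    scores = defaultdict(list)  # (model, category, difficulty) -> scores
    for name, llm in models.items():
        for item in dataset:
            answer = llm(item["question"])
            scores[(name, item["category"], item["difficulty"])].append(
                grade(answer, item["reference"])
            )
    return {k: sum(v) / len(v) for k, v in scores.items()}
```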
ISBN (Print): 9783031791635; 9783031791642
In recent years, machine learning and the web have developed rapidly. This has resulted in continual and explosive growth in the sharing of ideas and views on products and services over the worldwide web across an array of sectors. As a result, an enormous flow of internet data is available for analytical research. Sentiment analysis (SA) is a part of natural language processing (NLP) that requires processing enormous amounts of data in order to identify people's opinions and sentiments. Several studies have been conducted to deal with the negative effects of social networks. This field of research is growing in popularity in both the public and private sectors, leading to the creation of several challenges. However, the majority of the available datasets are in English, whereas Arabic Moroccan dialect (Darija) datasets are not. Accordingly, we created models combining NLP and machine learning techniques to detect and classify sentiments. We evaluated the models using the most common metrics: accuracy, loss, F1-score, precision, and recall. The experiments revealed modest scores between 87% and 89%. These findings imply that the models need to be improved, owing to the lack of accessible datasets and pre-processing techniques for handling the Moroccan dialect of Arabic (Darija).
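A minimal sketch of such a classify-and-evaluate pipeline follows, using scikit-learn and the metrics named above. The two Darija strings are invented toy examples, and the specific classifier is an assumption; the paper's own models are not specified here.

```python
# Sketch: TF-IDF + logistic regression sentiment classifier, reporting
# accuracy, precision, recall, and F1 as in the evaluation described above.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.pipeline import make_pipeline

X_train = ["had lproduit zwin bzf", "khayb hadchi makayswach"]  # toy examples
y_train = ["positive", "negative"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(X_train, y_train)

y_pred = clf.predict(X_train)  # evaluate on held-out data in practice
p, r, f1, _ = precision_recall_fscore_support(y_train, y_pred, average="macro")
print(f"acc={accuracy_score(y_train, y_pred):.2f} P={p:.2f} R={r:.2f} F1={f1:.2f}")
```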
ISBN (Print): 9783031782541; 9783031782558
Recent advancements in natural language processing (NLP) have been driven by large language models (LLMs) that excel at understanding the complexities of natural language. These models have transformed NLP tasks through transfer learning, where pre-trained LLMs are fine-tuned on domain-specific datasets. Financial sentiment analysis is particularly challenging due to the complexity of financial language, requiring more advanced methods than traditional sentiment analysis approaches. Fine-tuning LLMs can enhance performance in the financial domain, but the high computational cost of standard full fine-tuning is a barrier. This study explores the effectiveness of four parameter-efficient fine-tuning (PEFT) methods, namely Low-Rank Adaptation (LoRA), prompt tuning, prefix tuning, and adapters, for financial sentiment analysis. The findings show that PEFT methods can match or exceed the performance of full fine-tuning while significantly reducing computational requirements. Specifically, adapting the Open Pre-trained Transformers (OPT) model with LoRA achieved the highest accuracy of 89% using only 0.19% of the model's parameters. PEFT methods also yielded substantial graphics processing unit (GPU) memory savings of up to 80%. Small-scale fine-tuned LLMs outperformed cutting-edge large-scale general-purpose models such as ChatGPT, highlighting the value of domain-specific fine-tuning. LLMs demonstrated superiority over conventional long short-term memory (LSTM) models by achieving an 18% increase in accuracy, thereby validating their higher implementation costs.
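For a concrete picture of LoRA-based PEFT, here is a sketch using Hugging Face's `peft` library. The OPT checkpoint, target modules, label count, and hyperparameters are illustrative choices, not the paper's exact configuration.

```python
# Sketch: wrap an OPT classifier with LoRA adapters so only a small
# fraction of parameters is trained.

from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model

model = AutoModelForSequenceClassification.from_pretrained(
    "facebook/opt-350m", num_labels=3  # positive / neutral / negative
)

lora_cfg = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in OPT
    task_type="SEQ_CLS",
)

model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

The wrapped model then trains with a standard fine-tuning loop (e.g., the `transformers` Trainer), which is where the reported GPU memory savings come from: gradients and optimizer states exist only for the adapter weights.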