Author affiliations: Macao Polytechnic University, Faculty of Applied Sciences, Macau, Peoples R China; Macao Polytechnic University, Engineering Research Center of Applied Technology on Machine Translation and Artificial Intelligence, Ministry of Education, Macau, Peoples R China
Publication: IEEE Access
Year/Volume: 2025, Vol. 13
Pages: 49007-49017
Core indexing:
Funding: Macao Polytechnic University [RP/FCA-04/2022 fca.1adb.4a97.8]
Keywords: Sentiment analysis; Analytical models; Reviews; Load modeling; Large language models; Data privacy; Accuracy; Performance evaluation; Data models; Computational modeling; Small large language model; aspect-based sentiment analysis; sentiment analysis; natural language processing; resource-constrained environments; data privacy
Abstract: Sentiment analysis using Large Language Models (LLMs) has gained significant attention in recent research due to its outstanding performance and ability to understand complex texts. However, popular LLMs, such as ChatGPT, are typically closed-source and come with substantial API costs, posing challenges for resource-limited scenarios and raising concerns about privacy. To address this, our study evaluates the feasibility of using small LLMs (sLLMs) as alternatives to GPT for aspect-based sentiment analysis in Chinese healthcare reviews. We compared several Chinese sLLMs of varying sizes with GPT-3.5, using GPT-4o's results as the benchmark, and assessed their classification accuracy by computing F1 scores for each individual aspect as well as an overall F1 score. Additionally, we examined the sLLMs' instruction-following capabilities, VRAM requirements, generation times, and the impact of temperature settings on the performance of top-performing sLLMs. The results demonstrate that several sLLMs can effectively follow instructions and even surpass GPT-3.5 in accuracy. For instance, InternLM2.5 achieved an F1 score of 0.85 with zero-shot prompting, while the smaller Qwen2.5-3B model performed well despite its minimal size. Prompt strategies significantly influenced smaller and older models like Qwen2.5-1.5B and ChatGLM3.5 but had limited impact on newer models. Temperature settings showed minimal effect, while older models generated responses faster, and newer models offered higher accuracy. This study underscores the potential of sLLMs as resource-efficient, privacy-preserving alternatives to closed-source LLMs in specialized domains. Our work demonstrates versatility, with potential applications across domains such as finance and education, and tasks like sentiment analysis, credit risk assessment, and learning behavior analysis, offering valuable insights for real-world use cases.
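The abstract describes scoring each model by a per-aspect F1 and an overall F1 against GPT-4o's labels. A minimal sketch of that evaluation scheme is shown below, assuming a macro-averaged one-vs-rest F1 over three sentiment classes per aspect (the paper's exact averaging scheme and label set are not specified in this record; the aspect names and labels here are illustrative only).

```python
from collections import defaultdict

def macro_f1(gold, pred, labels):
    """Macro F1: one-vs-rest F1 per sentiment label, averaged."""
    f1s = []
    for lab in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == lab and p == lab)
        fp = sum(1 for g, p in zip(gold, pred) if g != lab and p == lab)
        fn = sum(1 for g, p in zip(gold, pred) if g == lab and p != lab)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

def f1_by_aspect(rows, labels=("positive", "neutral", "negative")):
    """rows: (aspect, benchmark_label, model_label) triples, where the
    benchmark labels stand in for GPT-4o's results. Returns one F1 per
    aspect plus an overall F1 pooled across all aspects."""
    by_aspect = defaultdict(lambda: ([], []))
    for aspect, g, p in rows:
        by_aspect[aspect][0].append(g)
        by_aspect[aspect][1].append(p)
    per_aspect = {a: macro_f1(g, p, labels) for a, (g, p) in by_aspect.items()}
    overall = macro_f1([r[1] for r in rows], [r[2] for r in rows], labels)
    return per_aspect, overall

# Illustrative usage with hypothetical healthcare-review aspects:
rows = [
    ("staff attitude", "positive", "positive"),
    ("staff attitude", "negative", "positive"),
    ("waiting time", "neutral", "neutral"),
]
per_aspect, overall = f1_by_aspect(rows)
```

Pooling predictions before computing the overall score (rather than averaging the per-aspect F1s) keeps aspects with more review mentions proportionally weighted.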