Author affiliations: Macao Polytechnic University, Faculty of Applied Sciences, Macau, Peoples R China; Macao Polytechnic University, Engineering Research Center of Applied Technology on Machine Translation and Artificial Intelligence, Ministry of Education, Macau, Peoples R China
Publication: IEEE Access
Year/Volume: 2025, Vol. 13
Pages: 49007-49017
Core indexing:
Funding: Macao Polytechnic University [RP/FCA-04/2022 fca.1adb.4a97.8]
Keywords: Sentiment analysis; Analytical models; Reviews; Load modeling; Large language models; Data privacy; Accuracy; Performance evaluation; Data models; Computational modeling; Small large language model; aspect-based sentiment analysis; sentiment analysis; natural language processing; resource-constrained environments; data privacy
Abstract: Sentiment analysis using Large Language Models (LLMs) has gained significant attention in recent research due to its outstanding performance and ability to understand complex texts. However, popular LLMs, such as ChatGPT, are typically closed-source and come with substantial API costs, posing challenges for resource-limited scenarios and raising concerns about privacy. To address this, our study evaluates the feasibility of using small LLMs (sLLMs) as alternatives to GPT for aspect-based sentiment analysis in Chinese healthcare reviews. We compared several Chinese sLLMs of varying sizes with GPT-3.5, using GPT-4o's results as the benchmark, and assessed their classification accuracy by computing F1 scores for each individual aspect as well as an overall F1 score. Additionally, we examined the sLLMs' instruction-following capabilities, VRAM requirements, generation times, and the impact of temperature settings on the performance of top-performing sLLMs. The results demonstrate that several sLLMs can effectively follow instructions and even surpass GPT-3.5 in accuracy. For instance, InternLM2.5 achieved an F1 score of 0.85 with zero-shot prompting, while the smaller Qwen2.5-3B model performed well despite its minimal size. Prompt strategies significantly influenced smaller and older models like Qwen2.5-1.5B and ChatGLM3.5 but had limited impact on newer models. Temperature settings showed minimal effect, while older models generated responses faster, and newer models offered higher accuracy. This study underscores the potential of sLLMs as resource-efficient, privacy-preserving alternatives to closed-source LLMs in specialized domains. Our work demonstrates versatility, with potential applications across domains such as finance and education, and tasks like sentiment analysis, credit risk assessment, and learning behavior analysis, offering valuable insights for real-world use cases.
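The abstract describes scoring each model by a per-aspect F1 and an overall F1 against GPT-4o's labels. A minimal sketch of that evaluation scheme is shown below, assuming a macro-averaged one-vs-rest F1 over three sentiment classes per aspect (the paper's exact averaging scheme and label set are not specified in this record; the aspect names and labels here are illustrative only).

```python
from collections import defaultdict

def macro_f1(gold, pred, labels):
    """Macro F1: one-vs-rest F1 per sentiment label, averaged."""
    f1s = []
    for lab in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == lab and p == lab)
        fp = sum(1 for g, p in zip(gold, pred) if g != lab and p == lab)
        fn = sum(1 for g, p in zip(gold, pred) if g == lab and p != lab)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

def f1_by_aspect(rows, labels=("positive", "neutral", "negative")):
    """rows: (aspect, benchmark_label, model_label) triples, where the
    benchmark labels stand in for GPT-4o's results. Returns one F1 per
    aspect plus an overall F1 pooled across all aspects."""
    by_aspect = defaultdict(lambda: ([], []))
    for aspect, g, p in rows:
        by_aspect[aspect][0].append(g)
        by_aspect[aspect][1].append(p)
    per_aspect = {a: macro_f1(g, p, labels) for a, (g, p) in by_aspect.items()}
    overall = macro_f1([r[1] for r in rows], [r[2] for r in rows], labels)
    return per_aspect, overall

# Illustrative usage with hypothetical healthcare-review aspects:
rows = [
    ("staff attitude", "positive", "positive"),
    ("staff attitude", "negative", "positive"),
    ("waiting time", "neutral", "neutral"),
]
per_aspect, overall = f1_by_aspect(rows)
```

Pooling predictions before computing the overall score (rather than averaging the per-aspect F1s) keeps aspects with more review mentions proportionally weighted.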