
Enhancing User Experience in AI-Powered Human-Computer Communication with Vocal Emotions Identification Using a Novel Deep Learning Method

Authors: Ahmed Alhussen, Arshiya Sajid Ansari, Mohammad Sajid Mohammadi

Author affiliations: Department of Computer Engineering, College of Computer and Information Sciences, Majmaah University, Al-Majmaah 11952, Saudi Arabia; Department of Information Technology, College of Computer and Information Sciences, Majmaah University, Al-Majmaah 11952, Saudi Arabia; Department of Computer Science, College of Engineering and Information Technology, Onaizah Colleges, Qassim 51911, Saudi Arabia

Published in: Computers, Materials & Continua

Year/Volume/Issue: 2025, Vol. 82, No. 2

Pages: 2909-2929


Subject classification: 08 [Engineering]; 0812 [Engineering - Computer Science and Technology (Engineering or Science degree conferrable)]

Funding: The author Dr. Arshiya S. Ansari extends her appreciation to the Deanship of Postgraduate Studies and Scientific Research at Majmaah University for funding this research work through project number R-2025-1538.

Keywords: human-computer communication (HCC); vocal emotions; live vocal; artificial intelligence (AI); deep learning (DL); selfish herd optimization-tuned long short-term memory (SHO-LSTM)

Abstract: Voice, motion, and mimicry are naturalistic control modalities that have replaced text- or display-driven control in human-computer communication (HCC). The voice in particular carries a great deal of information, revealing details about the speaker's goals and desires as well as their internal state. Certain vocal characteristics reveal the speaker's mood, intention, and motivation, while analysis of the words themselves helps the speaker's demands to be understood. Vocal emotion recognition has therefore become an essential component of modern HCC systems. Integrating findings from the various disciplines involved in identifying vocal emotions remains challenging. Many sound analysis techniques have been developed in the past; with the development of artificial intelligence (AI), and especially deep learning (DL) technology, research incorporating real data has become increasingly common. Thus, this research presents a novel selfish herd optimization-tuned long short-term memory (SHO-LSTM) strategy to identify vocal emotions in human communication. The public RAVDESS dataset is used to train the proposed SHO-LSTM technique. Wiener filter (WF) and Mel-frequency cepstral coefficient (MFCC) techniques are used, respectively, to remove noise from and extract features from the data. LSTM and SHO are then applied to the extracted features, with SHO optimizing the LSTM network's parameters for effective emotion recognition. The proposed framework was implemented in Python. In the evaluation phase, numerous metrics are used to assess the proposed model's detection capability, such as F1-score (95%), precision (95%), recall (96%), and accuracy (97%). The proposed approach is tested on a Python platform, and the SHO-LSTM's outcomes are compared with those of previously conducted research. Based on these comparative assessments, the proposed approach outperforms current approaches in vocal emotion recognition.
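The abstract's core idea, using a herd-inspired metaheuristic to tune LSTM hyperparameters, can be illustrated with a greatly simplified sketch. Everything below is a stand-in: the search space, the herd update rules, and especially the `fitness` function (which in the paper would be the emotion-recognition accuracy of an LSTM trained on MFCC features from RAVDESS, not a synthetic distance measure). It is not the authors' actual SHO algorithm.

```python
import random

# Hypothetical LSTM hyperparameter search space (illustrative, not from the paper).
SPACE = {
    "hidden_units": (32, 256),
    "learning_rate": (1e-4, 1e-2),
    "dropout": (0.0, 0.5),
}

def random_candidate():
    """Sample one hyperparameter configuration uniformly from SPACE."""
    return {k: random.uniform(lo, hi) for k, (lo, hi) in SPACE.items()}

def fitness(params):
    """Stand-in objective: higher is better. In the real method this would be
    the validation accuracy of an LSTM trained with `params` on RAVDESS."""
    target = {"hidden_units": 128, "learning_rate": 1e-3, "dropout": 0.2}
    return -sum((params[k] - target[k]) ** 2 for k in SPACE)

def sho_like_search(pop_size=20, iters=50, seed=0):
    """Greatly simplified herd-style search: the fittest member (the 'leader')
    attracts the rest of the herd, and a 'predation' step replaces the two
    worst members with fresh random candidates to preserve diversity."""
    random.seed(seed)
    herd = [random_candidate() for _ in range(pop_size)]
    for _ in range(iters):
        herd.sort(key=fitness, reverse=True)
        leader = herd[0]
        for member in herd[1:]:
            for k, (lo, hi) in SPACE.items():
                # Move each coordinate a random fraction toward the leader,
                # clamped to the search-space bounds.
                step = random.uniform(0.0, 1.0) * (leader[k] - member[k])
                member[k] = min(hi, max(lo, member[k] + step))
        herd[-2:] = [random_candidate(), random_candidate()]  # predation
    return max(herd, key=fitness)

best = sho_like_search()
print(best)
```

In the paper's pipeline, the configuration returned by such a search would parameterize the LSTM that classifies the Wiener-filtered, MFCC-encoded audio into emotion categories.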
