
An Investigation of Data Requirements for the Detection of Depression from Social Media Posts

Authors: Dalal, Sumit; Jain, Sarika; Dave, Mayank

Affiliations: Department of Computer Applications, National Institute of Technology, Kurukshetra, Haryana 136119, India; Department of Computer Engineering, National Institute of Technology, Kurukshetra, India

Publication: Recent Patents on Engineering (Recent Pat. Eng.)

Year, volume, issue: 2023, Vol. 17, No. 3

Pages: 89-101

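Subject classification: 0710 [Science - Biology]; 12 [Management]; 1201 [Management - Management Science and Engineering (degrees may be awarded in Management or Engineering)]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0836 [Engineering - Bioengineering]; 0812 [Engineering - Computer Science and Technology (degrees may be awarded in Engineering or Science)]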

Subject: Deep learning

Abstract: Background: Only a fraction of the social media data produced is usable for mental health assessment, which raises the problem of obtaining sufficient training data for deep learning approaches. Data sufficiency can be expressed in terms of the number of users or the number of posts per user. Objective: We examine the data requirements of machine learning and deep learning models for a practical system, so that researchers can choose the best-fitting model for the type of dataset available to them. We perform distinct experiments to determine how these factors affect depression classification by various approaches. Methods: We explored various machine learning and deep learning techniques on several dataset versions, taken from Twitter and Reddit, with varying numbers of users and posts per user. Diagnosed and control users are taken in different ratios to assess the impact of an imbalanced dataset. Results: SVM achieved 68% accuracy in depression classification with 70 users each from the diagnosed and control groups. Its accuracy decreases for 150 users from each group, but recovers for 350 and 550 users from each group, whereas Naive Bayes achieved 64% on the same dataset fragment (1). We observed that accuracy decreases for 150 diagnosed users, but recovers for 350 and 550 users. Among the deep learning algorithms, however, HAN and BiLSTM perform better than the others as the imbalance ratio increases. Conclusion: We found, mainly, that classification accuracy increases with the number of users, the number of posts per user, and the imbalance in the number of diagnosed versus control users. We also found that posts from Reddit yield better accuracy than tweets. © 2023 Bentham Science Publishers.
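
Illustrative note: the abstract describes building dataset versions that differ in the number of users, the number of posts per user, and the diagnosed-to-control ratio. The sketch below shows one way such versions might be assembled. It is not the authors' code: the function name `make_dataset_version`, its parameters, the toy data, and the choice of ratios are all hypothetical; only the user counts 70, 150, 350 and 550 come from the abstract.

```python
import random


def make_dataset_version(diagnosed, control, n_diagnosed, control_ratio=1,
                         max_posts=None, seed=0):
    """Build one dataset version from two {user_id: [posts]} mappings.

    n_diagnosed   -- number of diagnosed users to sample
    control_ratio -- control users sampled per diagnosed user (1 = balanced)
    max_posts     -- keep at most this many posts per user (None = keep all)

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    n_control = n_diagnosed * control_ratio
    if n_diagnosed > len(diagnosed) or n_control > len(control):
        raise ValueError("not enough users for the requested version")

    version = []
    for label, group, n in ((1, diagnosed, n_diagnosed), (0, control, n_control)):
        # Sort the user ids first so sampling is reproducible for a given seed.
        for user in rng.sample(sorted(group), n):
            posts = group[user][:max_posts]  # a slice ending at None keeps every post
            version.append((user, posts, label))
    rng.shuffle(version)
    return version


# Toy data (assumed sizes): 600 diagnosed and 2000 control users, 30 placeholder posts each.
diagnosed = {f"d{i}": [f"post {j}" for j in range(30)] for i in range(600)}
control = {f"c{i}": [f"post {j}" for j in range(30)] for i in range(2000)}

# Balanced versions at the user counts mentioned in the abstract.
for n in (70, 150, 350, 550):
    version = make_dataset_version(diagnosed, control, n, control_ratio=1, max_posts=10)
    print(n, "diagnosed users ->", len(version), "users in total")
```

Each version could then be passed to any classifier to compare accuracy across sizes; raising `control_ratio` above 1 gives the imbalanced variants.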
