Spam email is a serious concern: with the growing number of internet users, it can be used to steal personal information and cause huge financial losses. Therefore, the demand for accurate spam filtering in email spam detection has grown. In existing techniques, it is difficult to capture the intricate relationships between words in an email using standard word-embedding techniques, and learning-rate tuning is one of the greatest challenges of stochastic optimization. To overcome these difficulties, the proposed framework performs diverse ensemble-based email spam classification by incorporating multiple word embeddings with the Continuous Coin Betting (COCOB) optimizer. Word2Vec produces the first set of 200-dimensional word embeddings, GloVe produces a second 200-dimensional set, and bidirectional encoder representations from transformers (BERT) produces 768-dimensional embeddings. The generated embeddings are then classified by a diverse ensemble whose base-level classifiers are Long Short-Term Memory (LSTM) networks, Gated Recurrent Units (GRU), and bidirectional GRUs (Bi-GRU), with an LSTM meta-classifier trained using the COCOB optimizer. Experiments conducted on three benchmark email datasets show that the proposed system outperforms existing approaches with a low false positive rate.
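As a hedged illustration of the learning-rate-free idea behind COCOB, the sketch below implements the per-coordinate COCOB-Backprop update (as described by Orabona and Tommasi) on a one-dimensional toy objective; the toy loss f(w) = (w - 3)^2 and all variable names are our own illustrative choices, not the paper's spam-classification setup.

```python
# Sketch of the COCOB-Backprop update on a 1-D toy problem.
# Loss f(w) = (w - 3)**2 is an illustrative stand-in for the real objective.

def cocob_minimize(grad, w0=0.0, steps=2000, alpha=100.0):
    """Minimize a 1-D function given its gradient, with no learning rate."""
    w = w0
    L = 1e-8      # largest gradient magnitude seen so far
    G = 0.0       # sum of gradient magnitudes
    R = 0.0       # accumulated "reward" of the coin-betting strategy
    theta = 0.0   # sum of negative gradients
    for _ in range(steps):
        g = grad(w)
        L = max(L, abs(g))
        G += abs(g)
        R = max(R - g * (w - w0), 0.0)           # reward of betting against g
        theta -= g
        beta = theta / (L * max(G + L, alpha * L))
        w = w0 + beta * (L + R)                  # bet a fraction of the wealth
    return w

grad = lambda w: 2.0 * (w - 3.0)                 # gradient of (w - 3)^2
w = cocob_minimize(grad)
print(w)                                         # approaches the minimizer 3
```

No step size is ever specified: the per-coordinate "wealth" bookkeeping replaces learning-rate tuning, which is the property the abstract motivates.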
Background: Drug-target interactions (DTIs) are critical for drug repurposing and for elucidating drug mechanisms, and are manually curated by large databases such as ChEMBL, BindingDB, DrugBank and DrugTargetCommons. However, the curated articles likely constitute only a fraction of all the articles that contain experimentally determined DTIs. Finding such articles and extracting the experimental information is a challenging task, and there is a pressing need for systematic approaches to assist the curation of DTIs. To this end, we applied bidirectional encoder representations from transformers (BERT) to identify such articles. Because DTI data depend intimately on the type of assay used to generate them, we also aimed to incorporate functions to predict the assay format. Results: Our novel method identified 0.6 million articles (along with drug and protein information) that were not previously included in public DTI databases. Using 10-fold cross-validation, we obtained ~99% accuracy for identifying articles containing quantitative drug-target profiles. The micro-averaged F1 for assay-format prediction is 88%, which leaves room for improvement in future studies. Conclusion: The BERT model in this study is robust, and the proposed pipeline can be used to identify previously overlooked articles containing quantitative DTIs. Overall, our method provides a significant advance in machine-assisted DTI extraction and curation. We expect it to be a useful addition to drug mechanism discovery and repurposing.
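For context on the reported metrics, the snippet below shows one standard way micro-averaged F1 can be computed from pooled true/false positive counts; note that for a single-label multi-class task such as assay-format prediction, micro-F1 coincides with plain accuracy. The labels are invented for illustration.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over a single-label multi-class task."""
    classes = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for c in classes:
        tp += sum(1 for t, p in zip(y_true, y_pred) if t == p == c)
        fp += sum(1 for t, p in zip(y_true, y_pred) if p == c and t != c)
        fn += sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy assay-format labels (made up for illustration):
y_true = ["binding", "functional", "ADMET", "binding"]
y_pred = ["binding", "functional", "binding", "binding"]
print(micro_f1(y_true, y_pred))  # 0.75, equal to accuracy here
```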
Finding a suitable hotel based on a user's needs and budget is a complex decision-making process. Nowadays, the availability of an ample amount of online customer reviews helps us in this regard. This fact opens a promising research direction in the field of tourism, the hotel recommendation system, which also helps improve consumers' information processing. Real-world reviews may showcase different sentiments of customers towards a hotel, and each review can be categorized by aspects such as cleanliness, value, service, etc. Keeping these facts in mind, in the present work we propose a hotel recommendation system that combines sentiment analysis of hotel reviews with aspect-based review categorization, operating on queries given by a user. Furthermore, we provide a new, rich and diverse dataset of online hotel reviews crawled from ***. We follow a systematic approach that first uses an ensemble of binary bidirectional encoder representations from transformers (BERT) classifiers, with three phases for positive-negative, neutral-negative, and neutral-positive sentiments, merged using a weight-assigning protocol. We then feed the pre-trained word embeddings generated by the BERT models, along with other textual features such as word vectors generated by Word2vec, TF-IDF of frequent words, subjectivity scores, etc., to a Random Forest classifier. After that, we group the reviews into different categories using an approach that combines fuzzy logic and cosine similarity. Finally, we build a recommender system from the aforementioned frameworks. Our model achieves a macro F1-score of 84% and a test accuracy of 92.36% in the classification of sentiment polarities. The categorized reviews also form compact clusters. The results are quite promising and much better than those of state-of-the-art models. The relevant codes a
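The three-phase binary ensemble could be merged along the following lines. This is only a sketch under our own assumptions: stand-in probability outputs replace the real BERT models, and a simple normalized weighted vote stands in for the paper's weight-assigning protocol, whose exact form is not detailed in the abstract.

```python
def merge_pairwise(scores, weights):
    """Combine three pairwise binary classifiers into one 3-class decision.

    scores maps a class pair to P(first class wins); weights reflect how
    much each pairwise model is trusted (e.g., its validation accuracy).
    """
    classes = ["positive", "negative", "neutral"]
    votes = {c: 0.0 for c in classes}
    for (a, b), p in scores.items():
        w = weights[(a, b)]
        votes[a] += w * p
        votes[b] += w * (1.0 - p)
    total = sum(votes.values())
    return {c: v / total for c, v in votes.items()}

# Stand-in outputs of the three binary phases (illustrative numbers):
scores = {("positive", "negative"): 0.9,
          ("neutral", "negative"): 0.6,
          ("neutral", "positive"): 0.3}
weights = {("positive", "negative"): 1.0,
           ("neutral", "negative"): 0.8,
           ("neutral", "positive"): 0.9}
merged = merge_pairwise(scores, weights)
print(max(merged, key=merged.get))  # "positive" for these numbers
```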
In the era of prevalent online commerce, online reviews significantly influence purchasing decisions. Unfortunately, this has also led to the emergence of fake reviews, which can deceive consumers and undermine trust in online platforms. Our study addresses this issue by developing DenyBERT, deep-learning-based software that enhances the bidirectional encoder representations from transformers (BERT) framework with Deep and Light Transformation (DeLighT) and Knowledge Distillation (KD) techniques. These innovations not only reduce computational demands but also improve the model's accuracy in identifying fake reviews, making it well suited for real-world applications. Notably, DenyBERT requires only 16.01M parameters, significantly fewer than predecessors such as BERT and TinyBERT, yet achieves a robust accuracy of 96.12% and an F1-score of 96.47%. This efficiency makes it particularly suited for deployment on devices with limited processing capabilities. The software, developed in Python, features a flexible input mechanism that allows reviews to be analyzed directly from websites via URL or via manual input of review paragraphs. Our findings indicate that DenyBERT outperforms existing models in both speed and accuracy, making it a powerful tool for combating fake reviews in real-time scenarios. This advancement not only enhances user trust in online review systems but also supports e-commerce platforms in maintaining a fair and transparent market environment.
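As a sketch of the knowledge-distillation component, the snippet below computes the classic soft-target loss T^2 * KL(teacher || student) on raw logits using only the standard library; the temperature and logit values are illustrative, and DenyBERT's actual training recipe may combine this with other terms.

```python
import math

def softmax(logits, T=1.0):
    exps = [math.exp(x / T) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distill_loss(teacher_logits, student_logits, T=2.0):
    """Soft-target KD loss: T^2 * KL(teacher || student) at temperature T."""
    p = softmax(teacher_logits, T)   # softened teacher distribution
    q = softmax(student_logits, T)   # softened student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return T * T * kl

teacher = [2.0, 0.5]   # logits for (genuine, fake); numbers are invented
student = [1.0, 0.8]
print(distill_loss(teacher, student))  # > 0; shrinks as the student matches
```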
Continuous sign language recognition (CSLR) is a very challenging task in intelligent systems, since it must produce real-time responses while performing computationally intensive video analytics and language modeling. Previous studies mainly adopt hidden Markov models or recurrent neural networks with a limited capability to model specific sign languages, and their accuracy can drop significantly when recognizing signs performed by different signers with non-standard gestures or non-uniform speeds. In this work, we develop a deep learning framework named SignBERT, integrating bidirectional encoder representations from transformers (BERT) with the residual neural network (ResNet), to model the underlying sign languages and extract spatial features for CSLR. We further propose a multimodal version of SignBERT, which combines the input of hand images with an intelligent feature alignment, to minimize the distance between the probability distributions of the recognition results generated by the BERT model and by the hand images. Experimental results indicate that, compared to alternative approaches for CSLR, our method achieves better accuracy with a significantly lower word error rate on three challenging continuous sign language datasets.
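Since the evaluation reports word error rate (WER), the helper below shows the standard edit-distance definition of WER used in CSLR evaluation; it is a generic utility sketch, not code from SignBERT.

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table of edit distances between prefixes.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("I WANT COFFEE NOW", "I WANT TEA NOW"))  # 0.25 (one substitution)
```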
An automated question-answering system allows students to learn as an integral part of digitized learning. The system responds to queries with text. We also include a knowledge graph, which makes the model considerably more engaging and improves learners' understanding. The features of knowledge entity extraction, information point evaluation and analysis, knowledge graph construction from unstructured text, and knowledge entity integration are all explored. The question-answering paradigm we propose in this study uses knowledge graphs and BERT (bidirectional encoder representations from transformers) to provide diverse learners with quick feedback on the subject. To facilitate non-native learners' understanding, we also include English-to-Hindi translation. As a result, the system can be highly beneficial to educators in providing access to learning and supporting continued learning.
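A minimal sketch of the knowledge-graph lookup behind such a question-answering system might look as follows; the triples and the matching rule are invented for illustration and are far simpler than entity extraction from unstructured text.

```python
# Toy knowledge graph stored as (subject, relation, object) triples.
TRIPLES = [
    ("photosynthesis", "occurs_in", "chloroplast"),
    ("photosynthesis", "produces", "oxygen"),
    ("mitochondrion", "produces", "ATP"),
]

def answer(entity, relation):
    """Return all objects linked to an entity by a relation."""
    return [o for s, r, o in TRIPLES if s == entity and r == relation]

print(answer("photosynthesis", "produces"))  # ['oxygen']
```

In a real pipeline, BERT would map a natural-language question to the (entity, relation) pair before this lookup runs.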
The dissemination of extremist ideas and causes online has intensified over the last decade. Extremist organizations use social media to gain publicity and new recruits, often with little interference from network providers. New techniques are being developed to identify extremist content, ensuring it can be promptly removed and its authors blocked from network access. However, most techniques are only compatible with the English language, despite the fact that extremist propaganda is frequently shared in other languages, including Arabic. Since the most effective methods for automated linguistic analysis use deep learning and require large, high-quality datasets, creating specialised data samples containing examples of extremist communication is an essential step toward a practical solution. In this paper, we present a dataset compiled for this purpose and discuss the classification methods that can be used for extremism detection. The manually annotated Arabic Twitter dataset consists of 89,816 tweets published between 2011 and 2021. Following annotation guidelines, three expert annotators labelled each tweet as extremist or non-extremist. Exploratory data analysis was performed to understand the dataset's features. Classification algorithms were applied to the dataset, including logistic regression, support vector machine, multinomial naive Bayes, random forest, and BERT. Among the traditional machine learning models, a support vector machine with term frequency-inverse document frequency features achieved the highest accuracy (0.9729). However, BERT outperformed the traditional models with an accuracy of 0.9749. This dataset is expected to enhance the accuracy of Arabic online extremism classification in future research, and so we have made it publicly available.
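The strongest traditional baseline pairs an SVM with TF-IDF features. The snippet below sketches one common TF-IDF weighting (raw term count times log(N/df)) on a toy two-document corpus using only the standard library; note that library implementations such as sklearn's use slightly different smoothing by default.

```python
import math

def tfidf(corpus):
    """Map each document to {term: tf * log(N / df)} weights."""
    N = len(corpus)
    docs = [doc.split() for doc in corpus]
    df = {}
    for words in docs:
        for term in set(words):
            df[term] = df.get(term, 0) + 1
    return [{t: words.count(t) * math.log(N / df[t]) for t in set(words)}
            for words in docs]

corpus = ["extremist propaganda tweet", "harmless everyday tweet"]
weights = tfidf(corpus)
# "tweet" appears in both documents, so its weight is log(2/2) = 0,
# while document-specific terms get positive weight.
print(weights[0]["tweet"], weights[0]["extremist"])
```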
Purpose: Visual acuity (VA) is a critical component of the eye examination but is often documented in electronic health records (EHRs) only as unstructured free-text notes, making it challenging to use in research. This study aimed to improve on existing rule-based algorithms by developing and evaluating deep learning models that perform named entity recognition of different types of VA measurements and their lateralities from free-text ophthalmology notes: VA for each of the right and left eyes, with and without glasses correction, and with and without pinhole. Design: Cross-sectional study. Subjects: A total of 319,756 clinical notes with documented VA measurements from approximately 90,000 patients were included. Methods: The notes were split into train, validation, and test sets. Bidirectional encoder representations from transformers (BERT) models were fine-tuned to identify VA measurements from the progress notes; they included BERT models pretrained on biomedical literature (BioBERT), critical care EHR notes (ClinicalBERT), both (BlueBERT), and a lighter version of BERT with 40% fewer parameters (DistilBERT). A baseline rule-based algorithm was created to recognize the same VA entities for comparison against the BERT models. Main Outcome Measures: Model performance was evaluated on a held-out test set using micro-averaged precision, recall, and F1 score for all entities. Results: On the human-annotated subset, BlueBERT achieved the best micro-averaged F1 score (F1 = 0.92), followed by ClinicalBERT (F1 = 0.91), DistilBERT (F1 = 0.90), BioBERT (F1 = 0.84), and the baseline model (F1 = 0.83). Common errors included labeling VA in sections outside the examination portion of the note, difficulty labeling the current VA alongside a series of past VAs, and missing non-numeric VAs. Conclusions: This study demonstrates that deep learning models are capable of identifying VA measurements from free-text ophthalmology notes with high precision and recall, achieving significant perfor
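The rule-based baseline the BERT models are compared against can be approximated with regular expressions. Below is a hedged toy version that handles only Snellen-style fractions with OD/OS laterality markers and an optional condition flag; real notes contain many more formats, which is exactly why the rule-based approach trails the learned models.

```python
import re

# Toy patterns: laterality marker (OD = right eye, OS = left eye),
# an optional condition flag ("cc" = with correction, "sc" = without,
# "ph" = pinhole), then a Snellen fraction.
VA_PATTERN = re.compile(
    r"\b(?P<eye>OD|OS)\s*(?P<cond>cc|sc|ph)?\s*:?\s*(?P<va>20/\d{1,3})",
    re.IGNORECASE,
)

def extract_va(note):
    """Return (eye, condition, acuity) tuples found in a free-text note."""
    return [(m.group("eye").upper(),
             (m.group("cond") or "unspecified").lower(),
             m.group("va"))
            for m in VA_PATTERN.finditer(note)]

note = "Exam: OD sc 20/40, OS sc 20/25; pinhole OD ph 20/30"
print(extract_va(note))
```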
Upholding a secure and inclusive digital environment is severely hindered by hate speech and inappropriate content on the internet. A novel approach that combines a Convolutional Neural Network (CNN) with a GRU and with bidirectional encoder representations from transformers (BERT) is proposed for enhancing the identification of offensive content, particularly hate speech. The method exploits the strengths of both the CNN-GRU and BERT models to capture the complex linguistic patterns and contextual information present in hate speech. The proposed model first uses the CNN-GRU to extract local and sequential features from textual data, allowing effective representation learning of offensive language. Subsequently, BERT, an advanced transformer-based model, is employed to capture contextualized representations of the text, enhancing the understanding of fine-grained linguistic nuances and the cultural contexts associated with hate speech. The BERT model is fine-tuned using the Hugging Face Transformers library. Tests are executed on publicly available hate speech identification datasets to show how well the method identifies inappropriate content. By assisting the ongoing efforts to prevent the dissemination of hate speech and undesirable language online, the proposed framework promotes a more inclusive and secure digital environment. The method is implemented in Python and achieves a competitive performance of 98% compared to existing approaches (LSTM and RNN, CNN, LSTM, and GBAT), showcasing its potential for real-world applications in combating online hate speech. Furthermore, it provides insights into the interpretability of the model's predictions, highlighting the key linguistic and contextual factors influencing offensive-language detection. The study contributes to advancing hate speech detection by integrating CNN-GRU and BERT models, giving a robust solution for enhancing offensive-content identification on online platforms.
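For readers unfamiliar with the GRU half of the hybrid, the snippet below implements a single scalar GRU cell step from the standard update-gate/reset-gate equations; the weights are illustrative numbers, not trained parameters, and biases are omitted for brevity.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, W):
    """One GRU step for scalar input x and hidden state h.

    W holds six scalar weights (input and recurrent) for the update
    gate z, the reset gate r, and the candidate state.
    """
    z = sigmoid(W["wz"] * x + W["uz"] * h)                # update gate
    r = sigmoid(W["wr"] * x + W["ur"] * h)                # reset gate
    h_tilde = math.tanh(W["wh"] * x + W["uh"] * (r * h))  # candidate state
    return (1.0 - z) * h + z * h_tilde                    # blend old and new

W = {"wz": 0.5, "uz": 0.1, "wr": 0.4, "ur": 0.2, "wh": 0.9, "uh": 0.3}
h = 0.0
for x in [1.0, -0.5, 2.0]:   # a toy input sequence
    h = gru_step(x, h, W)
print(h)                     # hidden state stays within (-1, 1) by construction
```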
As the amount of content created on social media constantly increases, more and more opinions and sentiments are expressed by people on various subjects. In this respect, sentiment analysis and opinion mining techniques can be valuable for the automatic analysis of huge textual corpora (comments, reviews, tweets, etc.). Despite the advances in text mining algorithms, deep learning techniques, and text representation models, the results in such tasks are very good only for a few high-density languages (e.g., English) that possess large training corpora and rich linguistic resources; there is still considerable room for improvement in lower-density languages. In this direction, the current work employs various language models for representing social media texts, together with text classifiers, for detecting the polarity of opinions expressed on social media in the Greek language. The experimental results on a related dataset collected by the authors are promising, since various classifiers based on the language models (naive Bayes, random forests, support vector machines, logistic regression, deep feed-forward neural networks) outperform those based on word or sentence embeddings (word2vec, GloVe), achieving a classification accuracy of more than 80%. Additionally, a new language model for Greek social media has been trained on the aforementioned dataset, showing that language models based on domain-specific corpora can improve the performance of generic language models by a margin of 2%. Finally, the resulting models are made freely available to the research community.
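The word-embedding baselines (word2vec, GloVe) typically represent a text by averaging its word vectors before classification. The sketch below shows that mean-pooling step with a nearest-centroid polarity decision on made-up 2-D vectors; real embeddings have hundreds of dimensions and the classifiers in the study are far more capable.

```python
def mean_pool(words, vectors):
    """Average the vectors of known words into one text embedding."""
    vecs = [vectors[w] for w in words if w in vectors]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def nearest_centroid(embedding, centroids):
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(embedding, centroids[label]))

# Made-up 2-D "word vectors" standing in for word2vec/GloVe output:
vectors = {"great": [1.0, 0.9], "awful": [-1.0, -0.8],
           "service": [0.1, 0.0], "food": [0.0, 0.1]}
centroids = {"positive": [0.8, 0.8], "negative": [-0.8, -0.8]}

emb = mean_pool("great food".split(), vectors)
print(nearest_centroid(emb, centroids))  # "positive"
```

Mean-pooling discards word order and context, which is one reason contextual language models outperform these embeddings in the reported experiments.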