File labeling techniques have a long history in analyzing the anthological trends in computational *** situation becomes worse in the case of files downloaded into systems from the ***, most users either have to change file names manually or leave the files with meaningless names, which increases the time needed to find required files and results in redundancy and duplication of user ***, no significant work is done on automated file labeling during the organization of heterogeneous user files. A few attempts have been made in topic ***, one major drawback of current topic modeling approaches is better *** rely on specific language types and domain similarity of the *** this research, machine learning approaches have been employed to analyze and extract information from a heterogeneous corpus. A different file labeling technique has also been used to get the meaningful and cohesive topic of the *** results show that the proposed methodology can generate relevant and context-sensitive names for heterogeneous data files and provide additional insight into automated file labeling in operating systems.
Cardiac diseases are one of the greatest global health *** to the high annual mortality rates, cardiac diseases have attracted the attention of numerous researchers in recent *** article proposes a hybrid fuzzy fusion classification model for cardiac arrhythmia *** fusion model is utilized to optimally select the highest-ranked features generated by a variety of well-known feature-selection *** ensemble of classifiers is then applied to the fusion’s *** proposed model classifies the arrhythmia dataset from the University of California, Irvine into normal/abnormal classes as well as 16 classes of ***, at the preprocessing steps, for the attributes with missing values, we used the average value within the same class for linear attributes and the most frequent value for nominal ***, in order to ensure the model optimality, we eliminated all attributes which have zero or constant values that might bias the results of utilized *** preprocessing step led to 161 out of 279 attributes (features). Thereafter, a fuzzy-based feature-selection fusion method is applied to fuse high-ranked features obtained from different heuristic feature-selection *** short, our study comprises three main blocks: (1) sensing data and preprocessing; (2) feature queuing, selection, and extraction; and (3) the predictive *** proposed method improves classification performance in terms of accuracy, F1-measure, recall, and precision when compared to state-of-the-art *** achieves 98.5% accuracy for binary class mode and 98.9% accuracy for categorized class mode.
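The preprocessing described in the abstract above (class-wise averages for linear attributes, most frequent value for nominal ones, then removal of zero or constant attributes) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `impute_and_prune`, the use of `None` to mark a missing value, the choice to take the nominal mode over the whole column, and the fallback to a column-wide mean are all assumptions made here.

```python
from collections import Counter, defaultdict
from statistics import mean


def impute_and_prune(rows, labels, nominal_cols=frozenset()):
    """Fill missing values (None), then drop constant attributes.

    rows: list of equal-length lists; labels: one class label per row.
    Assumes every column has at least one observed value.
    Returns (cleaned_rows, kept_column_indices).
    """
    n_cols = len(rows[0])
    filled = [list(r) for r in rows]
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        if j in nominal_cols:
            # Nominal attribute: most frequent observed value
            # (assumption: computed over the whole column).
            fill = Counter(observed).most_common(1)[0][0]
            for r in filled:
                if r[j] is None:
                    r[j] = fill
        else:
            # Linear attribute: mean of the observed values in the same class.
            by_class = defaultdict(list)
            for r, y in zip(rows, labels):
                if r[j] is not None:
                    by_class[y].append(r[j])
            for r, y in zip(filled, labels):
                if r[j] is None:
                    # Assumed fallback: column-wide mean if the class has no data.
                    r[j] = mean(by_class[y] or observed)
    # Remove attributes with a single distinct value (includes all-zero columns).
    keep = [j for j in range(n_cols) if len({r[j] for r in filled}) > 1]
    return [[r[j] for j in keep] for r in filled], keep
```

On a real dataset the same steps would more likely be written with a dataframe library; the plain-Python form is used here only to keep the sketch self-contained.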
With the ever-increasing popularity of pretrained Video-Language Models (VidLMs), there is a pressing need to develop robust evaluation methodologies that delve deeper into their visio-linguistic capabilities. To addr...
The advent of the Internet has significantly streamlined daily tasks through the rapid increase of online services. Everyday activities, such as purchasing goods and scheduling appointments with healthcare professionals, have become faster, more efficient and more user-friendly with the integration of the Internet. The continuous improvement of online services has led many people to move towards digital activities. As a result, more personal and payment transaction data is being recorded across various storage mediums, including databases and log files. The protection and regulation of this sensitive data are imperative, in line with the guidelines outlined in GDPR and PCI-DSS. Recognizing exposed personal data poses a considerable challenge. This research introduces a novel approach to identifying payment card industry data (PCI) and personally identifiable information (PII). The research project proposes a machine learning-based text classification model utilizing the Convolutional Neural Network (CNN) model to discern PII and PCI data within a given text. The CNN model has been constructed and compared against Naive Bayes, Gradient Boost, Random Forest, and Support Vector Machine (SVM) models. The CNN model achieved the highest accuracy at 0.96 (96%). Additionally, the F1 scores for each class were high, with PII scoring 0.94, PCI scoring 0.95, and Normal scoring 0.99. Following the model's construction and training, it was employed with the saved tokenizer's word indexes and label encoders in the developed classification tool. This tool successfully delivered the promised results, identifying exposed PII and PCI data.
Web services have significantly expanded and become a key enabling technology for online data, application and resource sharing. Designing new methods for efficient and reliable web service recommendation has been of ...
Graph Transformers, which incorporate self-attention and positional encoding, have recently emerged as a powerful architecture for various graph learning tasks. Despite their impressive performance, the complex non-convex interactions across layers and the recursive graph structure have made it challenging to establish a theoretical foundation for learning and generalization. This study introduces the first theoretical investigation of a shallow Graph Transformer for semi-supervised node classification, comprising a self-attention layer with relative positional encoding and a two-layer perceptron. Focusing on a graph data model with discriminative nodes that determine node labels and non-discriminative nodes that are class-irrelevant, we characterize the sample complexity required to achieve a desirable generalization error by training with stochastic gradient descent (SGD). This paper provides the quantitative characterization of the sample complexity and number of iterations for convergence dependent on the fraction of discriminative nodes, the dominant patterns, and the initial model errors. Furthermore, we demonstrate that self-attention and positional encoding enhance generalization by making the attention map sparse and promoting the core neighborhood during training, which explains the superior feature representation of Graph Transformers. Our theoretical results are supported by empirical experiments on synthetic and real-world benchmarks. Copyright 2024 by the author(s)
This work investigates the performance of simultaneous wireless information and power transfer (SWIPT) in a reconfigurable intelligent surface (RIS)-aided Internet of Things (IoT) communications under imperfect channe...
As natural and man-made disasters have grown in number, the problem of how to respond to disasters quickly and effectively has become pressing. This is precisely the purpose of this research: Using IoT technologies a...
The work reported in this article addresses the challenge of building models for non-trivial aerobatic aircraft maneuvers in an automated fashion. It is built using a Behavioural Cloning approach where human pilots pr...
Sophisticated cyber threats are seen on Online Social Networks (OSNs): social media accounts automated to imitate human behaviour have a substantial effect on distorting public thoughts and opinions. OSNs are weaponized to spread deception, misinformation, and malicious activities, which forms a serious threat to society. Because these accounts deceptively imitate human behaviour, detecting automated accounts (socialbots) has become a challenging and crucial task. This research proposes a hybrid metaheuristic optimisation algorithm for socialbot detection. Specifically, a hybrid B-Hill Climbing (B-HC) optimisation algorithm works in tandem with a k-nearest neighbour (k-NN) classifier to select a relevant feature subset. It is tested on fake-follower account data from Twitter. Experimental results showed that the proposed method outperforms traditional and recent feature selection techniques as well as rule-set methods. B-HC together with k-NN achieved promising results using only the relevant feature subset.
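The general idea of pairing a hill-climbing search with a k-NN classifier for feature selection can be sketched as below. This is a simplified wrapper-style illustration and not the hybrid B-HC algorithm from the abstract, whose details are not given here. The function names, the leave-one-out scoring, the `beta` random-reset step, and the tie-breaking rule that prefers smaller subsets are all assumptions made for the example.

```python
import random
from collections import Counter


def loo_knn_accuracy(X, y, mask, k=3):
    """Leave-one-out accuracy of k-NN using only features where mask is True."""
    cols = [j for j, on in enumerate(mask) if on]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample on the chosen features.
        dists = [
            (sum((xi[j] - xo[j]) ** 2 for j in cols), yo)
            for o, (xo, yo) in enumerate(zip(X, y))
            if o != i
        ]
        dists.sort(key=lambda t: t[0])
        votes = Counter(label for _, label in dists[:k])
        correct += votes.most_common(1)[0][0] == y[i]
    return correct / len(X)


def hill_climb_select(X, y, iters=200, beta=0.05, k=3, seed=0):
    """Search for a feature mask that maximises leave-one-out k-NN accuracy."""
    rng = random.Random(seed)
    n = len(X[0])
    best = [True] * n  # start from the full feature set
    best_score = loo_knn_accuracy(X, y, best, k)
    for _ in range(iters):
        cand = best[:]
        flip = rng.randrange(n)
        cand[flip] = not cand[flip]  # neighbourhood move: toggle one feature
        for j in range(n):
            # Assumed beta-style move: occasionally reset a feature at random.
            if rng.random() < beta:
                cand[j] = rng.random() < 0.5
        score = loo_knn_accuracy(X, y, cand, k)
        # Accept strict improvements, or equal accuracy with fewer features.
        if score > best_score or (score == best_score and sum(cand) < sum(best)):
            best, best_score = cand, score
    return best, best_score
```

Calling `hill_climb_select(X, y)` on a numeric feature matrix (a list of rows) and a label list returns a boolean mask over the features together with its leave-one-out accuracy. The random-reset step is what lets the search escape local optima that a pure one-flip hill climber would get stuck in.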