The challenge of bankruptcy prediction, critical for averting financial sector losses, is amplified by the prevalence of imbalanced datasets, which often skew prediction models. Addressing this, our study introduces t...
Graph convolutional networks (GCNs) have become prevalent in recommender systems (RS) due to their superiority in modeling collaborative *** improving the overall accuracy, GCNs unfortunately amplify popularity bias: tail items are less likely to be *** effect prevents the GCN-based RS from making precise and fair recommendations, decreasing the effectiveness of recommender systems in the long *** this paper, we investigate how graph convolutions amplify the popularity bias in *** theoretical analyses, we identify two fundamental factors: (1) with graph convolution (i.e., neighborhood aggregation), popular items exert larger influence than tail items on neighbor users, making the users move towards popular items in the representation space; (2) after multiple rounds of graph convolution, popular items affect more high-order neighbors and become more *** two points make popular items get closer to almost all users and thus be recommended more *** rectify this, we propose to estimate the amplified effect of popular nodes on each node's representation, and intervene on the effect after each graph ***, we adopt clustering to discover highly influential nodes and estimate the amplification effect of each node, then remove the effect from the node embeddings at each graph convolution *** method is simple and generic: it can be used in the inference stage to correct existing models rather than training a new model from scratch, and can be applied to various GCN *** demonstrate our method on two representative GCN backbones, LightGCN and UltraGCN, verifying its ability to improve the recommendations of tail items without sacrificing the performance of popular *** are open-sourced.
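The second factor in the abstract above (popular items reach more high-order neighbors as convolution layers stack up) can be illustrated with a minimal sketch. This is not the authors' code: the toy interaction list, the item names, and the helper names `build_graph` and `reach_after_layers` are all hypothetical, and the sketch only counts which nodes can receive signal from an item, not how strongly.

```python
from collections import defaultdict


def build_graph(interactions):
    """Build an undirected user-item bipartite graph as an adjacency dict."""
    adj = defaultdict(set)
    for user, item in interactions:
        adj[user].add(item)
        adj[item].add(user)
    return adj


def reach_after_layers(adj, source, num_layers):
    """Return the nodes within `num_layers` hops of `source`.

    With one hop of neighborhood aggregation per layer, these are the
    nodes whose representation can receive signal from `source`.
    """
    frontier, seen = {source}, {source}
    for _ in range(num_layers):
        frontier = {n for node in frontier for n in adj[node]} - seen
        seen |= frontier
    return seen - {source}


# Hypothetical toy data: i_pop has three interactions, i_tail has one.
interactions = [
    ("u1", "i_pop"), ("u2", "i_pop"), ("u3", "i_pop"),
    ("u3", "i_mid"), ("u4", "i_mid"), ("u4", "i_tail"),
]
adj = build_graph(interactions)

for item in ("i_pop", "i_tail"):
    for layers in (1, 2, 3):
        reached = reach_after_layers(adj, item, layers)
        print(f"{item}: {layers} layer(s) -> {len(reached)} node(s) reached")
```

In this toy graph the popular item reaches more nodes than the tail item at every depth, which is the qualitative behavior the abstract describes.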
Internet of Things (IoT)-enabled Wireless Sensor Networks (WSNs) not only constitute an encouraging research domain but also represent a promising industrial trend that permits the development of various IoT-based ...
Predicting RNA-binding protein (RBP) binding sites on circular RNAs (circRNAs) is a fundamental step toward understanding their interaction mechanism. Numerous computational methods have been developed to solve this problem, but they cannot fully learn the features. Therefore, we propose circ-CNNED, a convolutional neural network (CNN)-based encoding and decoding framework. We first adopt two encoding methods to obtain two original matrices and preprocess them with a CNN before fusion. To capture feature dependencies, we use a temporal convolutional network (TCN) and a CNN to construct the encoding and decoding blocks, respectively. We then introduce global expectation pooling to learn latent information and enhance the robustness of circ-CNNED. We evaluate circ-CNNED on 37 datasets. The comparison and ablation experiments demonstrate that our method is superior. In addition, motif enrichment analysis on four datasets helps explain the performance improvement of circ-CNNED.
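The abstract does not spell out how global expectation pooling is computed. One common formulation, assumed here, is a softmax-weighted average over the positions of a feature channel. The sketch below follows that assumption; the function name, the `sharpness` parameter, and the sample values are hypothetical and are not taken from circ-CNNED.

```python
import math


def global_expectation_pooling(values, sharpness=1.0):
    """Softmax-weighted average of one feature channel.

    Assumed formulation: weight_i = softmax(sharpness * value_i).
    sharpness = 0 reduces to mean pooling; a large sharpness
    approaches max pooling.
    """
    peak = max(values)
    # Subtract the peak before exponentiating for numerical stability.
    weights = [math.exp(sharpness * (v - peak)) for v in values]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


# Hypothetical activations of one channel along a sequence.
channel = [0.1, 0.2, 3.0, 0.1]
print(global_expectation_pooling(channel, sharpness=0.0))
print(global_expectation_pooling(channel, sharpness=1.0))
print(global_expectation_pooling(channel, sharpness=50.0))
```

Under this assumption the pooled value always lies between the channel mean and the channel maximum, so the pooling interpolates between average and max pooling.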
Ransomware is one of the most advanced forms of malware: once it infects a system, it uses substantial computing resources and services to encrypt system data, causing large financial data losses to organizations and individuals...
Detecting plagiarism in documents is a well-established task in natural language processing (NLP). Broadly, plagiarism detection is categorized into two types: (1) intrinsic, which checks whether the whole document or all of its passages were written by a single author; and (2) extrinsic, where a suspicious document is compared with a given set of source documents to find sentences or phrases that appear in both. This study addresses the critical challenge of intrinsic plagiarism detection in Urdu texts, a language with limited resources for comprehensive language models. Acknowledging the absence of sophisticated large language models (LLMs) tailored for the Urdu language, this study explores the application of various machine learning, deep learning, and language models in a novel framework. A set of 43 stylometry features at six granularity levels was meticulously curated, capturing linguistic patterns indicative of plagiarism. The selected models include traditional machine learning approaches (logistic regression, decision trees, SVM, KNN, Naive Bayes, gradient boosting, and a voting classifier), deep learning approaches (GRU, BiLSTM, CNN, LSTM, and MLP), and large language models (BERT and GPT-2). This research systematically categorizes these features and evaluates their effectiveness, addressing the inherent challenges posed by the limited availability of Urdu-specific language models. Two distinct experiments were conducted to evaluate the impact of the proposed features on classification accuracy. In experiment one, the entire dataset was used for classification into intrinsically plagiarized and non-plagiarized documents. Experiment two categorized the dataset into three types based on topic: moral lessons, national celebrities, and national events. Both experiments are thoroughly evaluated through a fivefold cross-validation analysis. The results show that the random forest classifier achieved an ex
Iris biometrics allow contactless authentication, which has made them a widely deployed human recognition mechanism over the past couple of years. Susceptibility of iris identification systems remains a challenging task due to ...
Author Profiling (AP) is a subsection of digital forensics that focuses on the detection of the author’s personal information, such as age, gender, occupation, and education, based on various linguistic features, e.g., stylistic, semantic, and syntactic. The importance of AP lies in various fields, including forensics, security, medicine, and marketing. In previous studies, many works have been done using different languages, e.g., English, Arabic, French, ***, the research on Roman Urdu is not up to the ***, this study focuses on detecting the author’s age and gender based on Roman Urdu text messages. The dataset used in this study is Fire’18-MaponSMS. This study proposes an ensemble model based on AdaBoostM1 and Random Forest (ABMRF) for AP using multiple linguistic features that are stylistic, character-based, word-based, and sentence-based. The proposed model is contrasted with several well-known models from the literature, including J48-Decision Tree (J48), Naïve Bayes (NB), K Nearest Neighbor (KNN), Composite Hypercube on Random Projection (CHIRP), NB-Updatable, RF, and AdaBoostM1. The overall outcome shows the better performance of the proposed AdaBoostM1 with Random Forest (ABMRF), with an accuracy of 54.2857% for age prediction and 71.1429% for gender prediction calculated on stylistic features. With word-based features, accuracy was 50.5714% for age and 60% for gender. On the other hand, KNN and CHIRP show the weakest performance using all the linguistic features for age and gender prediction.
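As a rough illustration of what character-based, word-based, and sentence-based features might look like for a short text message, here is a minimal sketch. The specific features, their names, and the sample message are assumptions for illustration only; the abstract does not list the features actually used.

```python
import re


def linguistic_features(message):
    """Compute a few simple character-, word-, and sentence-level features."""
    # Sentences are approximated by splitting on ., !, and ? (an assumption).
    sentences = [s for s in re.split(r"[.!?]+", message) if s.strip()]
    words = message.split()
    n_chars = len(message)
    return {
        # Character-based
        "char_count": n_chars,
        "digit_ratio": sum(c.isdigit() for c in message) / n_chars if n_chars else 0.0,
        # Word-based
        "word_count": len(words),
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        # Sentence-based
        "sentence_count": len(sentences),
        "avg_words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
    }


# Hypothetical Roman Urdu message, not taken from the dataset.
sample = "Kya haal hai? Main theek hoon."
features = linguistic_features(sample)
for name, value in features.items():
    print(f"{name}: {value}")
```

A feature dictionary like this could then be turned into a fixed-length vector and passed to any of the classifiers named in the abstract.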
Accidents caused by drivers who exhibit unusual behavior are putting road safety at ever-greater risk. When one or more vehicle nodes behave in this way, it can put other nodes in danger and result in potentially cata...
Higher-order patterns reveal sequential multistep state transitions, which are usually superior to origin-destination analyses that depict only first-order geospatial movement *** methods for higher-order movement modeling first construct a directed acyclic graph (DAG) of movements and then extract higher-order patterns from the ***, DAG-based methods rely heavily on identifying movement keypoints, which are challenging for sparse movements and fail to consider the temporal variants critical for movements in urban *** overcome these limitations, we propose HoLens, a novel approach for modeling and visualizing higher-order movement patterns in the context of an urban *** mainly makes twofold contributions: First, we designed an auto-adaptive movement aggregation algorithm that self-organizes movements hierarchically by considering spatial proximity, contextual information, and temporal ***, we developed an interactive visual analytics interface comprising well-established visualization techniques, including the H-Flow for visualizing the higher-order patterns on the map and the higher-order state sequence chart for representing the higher-order state *** real-world case studies demonstrate that the method can adaptively aggregate data and exhibit the process of exploring higher-order patterns using *** also demonstrate the feasibility, usability, and effectiveness of our approach through expert interviews with three domain experts.
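To make the difference between first-order and higher-order transitions concrete, the following minimal sketch counts transitions conditioned on the last `order` states of each trajectory. It is a generic illustration, not the HoLens aggregation algorithm; the function name and the toy trips are hypothetical.

```python
from collections import Counter


def higher_order_transitions(trajectories, order):
    """Count transitions from the last `order` states to the next state."""
    counts = Counter()
    for path in trajectories:
        for i in range(len(path) - order):
            history = tuple(path[i:i + order])
            counts[(history, path[i + order])] += 1
    return counts


# Hypothetical trips through named places.
trips = [
    ["home", "station", "office"],
    ["mall", "station", "park"],
    ["home", "station", "office"],
]

first_order = higher_order_transitions(trips, order=1)
second_order = higher_order_transitions(trips, order=2)

# First order: where people go after "station", regardless of origin.
print({k: v for k, v in first_order.items() if k[0] == ("station",)})
# Second order: the destination depends on where the trip started.
print(dict(second_order))
```

In the toy data, the first-order view mixes the two destinations after "station", while the second-order view separates them by the preceding state, which is the kind of multistep dependency an origin-destination analysis cannot show.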