Encrypted network traffic classification is an essential task in modern communications, which is used in a wide range of applications, such as network resource allocation, QoS (Quality of Service), malicious detection...
In this paper, a deep BiLSTM ensemble method is proposed to detect anomalies in drinking water quality. First, a convolutional neural network (CNN) is used as a feature extractor to process the raw data of...
Crowdsourcing has become an efficient paradigm to utilize human intelligence to perform tasks that are challenging for machines. Many incentive mechanisms for crowdsourcing systems have been proposed. However, most of...
In recent years, attribute-based access control (ABAC) models have been widely used in big data and cloud computing. However, with the growing importance of data content, using data content to assist authorization for...
Understanding contents in social networks by inferring high-quality latent topics from short texts is a significant task in social analysis, which is challenging because social network contents are usually extremely s...
With the advent of the era of big data, how to process massive image, video and other multimedia data timely and accurately has become a new challenge in related fields. Aiming at the computational bottleneck and inef...
Software defect prediction (SDP) is used to perform the statistical analysis of historical defect data to find out the distribution rule of historical defects, so as to effectively predict defects in the new ***, there are redundant and irrelevant features in the software defect datasets affecting the performance of defect *** order to identify and remove the redundant and irrelevant features in software defect datasets, we propose ReliefF-based clustering (RFC), a cluster-based feature selection ***, the correlation between features is calculated based on the symmetric *** to the correlation degree, RFC partitions features into k clusters based on the k-medoids algorithm, and finally selects the representative features from each cluster to form the final feature *** the experiments, we compare the proposed RFC with classical feature selection algorithms on nine National Aeronautics and Space Administration (NASA) software defect prediction datasets in terms of area under curve (AUC) and *** experimental results show that RFC can effectively improve the performance of SDP.
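The clustering step described above (partition features into k clusters with k-medoids, then keep one representative per cluster) might be sketched as follows. This is only an illustration, not the authors' RFC implementation: the distance matrix is a toy stand-in for a correlation-derived dissimilarity, the function names `assign` and `k_medoids` are hypothetical, the initial medoids are simply the first k items, and using each cluster's medoid as its representative is an assumption, since the abstract does not say how representatives are chosen.

```python
def assign(dist, medoids):
    """Label each item with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
        for i in range(len(dist))
    ]


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed, symmetric distance matrix.

    Returns (medoids, labels), where medoids are item indices.
    Initialisation with the first k items is an assumption made for
    reproducibility; real implementations usually use a smarter start.
    """
    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance
            # to all other members of its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Toy example: six "features" whose pairwise dissimilarities form two groups.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in positions] for a in positions]
medoids, labels = k_medoids(dist, k=2)
print(medoids, labels)
```

In a feature-selection setting, the items would be feature columns and the selected subset would be one feature per cluster; here the medoid indices play that role.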
For imbalanced datasets, the focus of classification is to identify samples of the minority class. The performance of current data mining algorithms is not good enough for processing imbalanced datasets. The synthetic minority over-sampling technique (SMOTE) is specifically designed for learning from imbalanced datasets, generating synthetic minority class examples by interpolating between nearby minority class examples. However, SMOTE suffers from the overgeneralization problem. The density-based spatial clustering of applications with noise (DBSCAN) is not rigorous when dealing with the samples near the *** optimize the DBSCAN algorithm for this problem to make clustering more reasonable. This paper integrates the optimized DBSCAN and SMOTE, and proposes a density-based synthetic minority over-sampling technique (DSMOTE). First, the optimized DBSCAN is used to divide the samples of the minority class into three groups: core samples, borderline samples and noise samples. The noise samples of the minority class are then removed so that more effective samples can be synthesized. In order to make full use of the information in core samples and borderline samples, different strategies are used to over-sample each group. Experiments show that DSMOTE achieves better results than SMOTE and Borderline-SMOTE in terms of precision, recall and F-value.
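The interpolation idea behind SMOTE mentioned above can be illustrated with a minimal sketch. It covers only the basic SMOTE step (pick a minority sample, pick one of its k nearest minority neighbours, interpolate between them), not the DBSCAN-based grouping or the separate strategies of DSMOTE. The function name `smote_sketch`, its parameters, and the choice to draw base samples at random are assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours.

    `minority` is a list of equal-length numeric tuples with at least two
    entries. Simplified relative to the original SMOTE, which loops over
    every minority sample rather than sampling base points at random.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x, excluding x itself.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Toy minority class: the four corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority, n_new=10, k=2)
```

Because each synthetic point lies on the segment between two existing minority samples, noisy minority samples can pull new points into majority regions, which is one reason a method might filter noise samples before over-sampling.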
This paper addresses maximum likelihood (ML) estimation-based model fitting in the context of extrasolar planet detection. This problem is characterized by the following properties: (1) the candidate models under considera...
Novel coronavirus disease 2019 (COVID-19) is an ongoing health *** studies are related to ***, its molecular mechanism remains *** rapid publication of COVID-19 provides a new way to elucidate its mechanism through computational *** paper proposes a prediction method for mining genotype information related to COVID-19 from the perspective of molecular mechanisms based on machine *** method obtains seed genes based on prior *** genes are mined from biomedical *** candidate genes are scored by machine learning based on the similarities measured between the seed and candidate ***, the results of the scores are used to perform functional enrichment analyses, including KEGG, interaction network, and Gene Ontology, for exploring the molecular mechanism of *** results show that the method is promising for mining genotype information to explore the molecular mechanism related to COVID-19.