Recent research has shown that collaborative recommender systems are significantly vulnerable to profile injection attacks, in which malicious users insert fake profiles into the rating database in order to bias the system's output. To reduce this risk, a number of approaches have been proposed to detect such attacks. Although existing detection approaches detect the standard attack types effectively, they perform poorly on recently proposed obfuscated attacks such as the average over popular items (AoP) attack. With this problem in mind, the author proposes a supervised approach to detect the AoP attack. First, he uses term frequency-inverse document frequency (TF-IDF) to extract features of the AoP attack. Second, he trains a support vector machine (SVM) on the training set to obtain an SVM-based classifier. Finally, he uses the resulting classifier to detect the AoP attack. Experimental results on the MovieLens dataset show that the proposed approach detects the AoP attack with high recall and precision.
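The abstract does not say how TF-IDF is mapped onto rating data, so the following is only a minimal sketch under one possible assumption: each user profile is treated as a "document", each rated item as a "term", and the rating as the term count. The function name `tfidf_profile_features` and the toy profiles are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def tfidf_profile_features(profiles):
    """Compute TF-IDF-style weights for rating profiles.

    profiles: dict mapping user -> dict mapping item -> rating.
    Assumption (not from the paper): profile = document, item = term,
    rating = term count.
    """
    n_users = len(profiles)
    # Document frequency: number of profiles that rated each item.
    df = Counter(item for ratings in profiles.values() for item in ratings)
    features = {}
    for user, ratings in profiles.items():
        total = sum(ratings.values())
        if not total:
            features[user] = {}
            continue
        features[user] = {
            # Term frequency (share of the profile's rating mass) times
            # inverse document frequency (rarity of the item across profiles).
            item: (r / total) * math.log(n_users / df[item])
            for item, r in ratings.items()
        }
    return features

# Toy data: item "A" is rated by every profile, so its weight is zero.
profiles = {
    "u1": {"A": 5, "B": 3},
    "u2": {"A": 4, "C": 2},
    "u3": {"A": 5, "B": 5, "C": 5},
}
features = tfidf_profile_features(profiles)
```

Under this assumption, items that almost every profile rates (such as the popular items an AoP attacker targets) receive low weights, and the resulting vectors could then be passed to any SVM implementation.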
ISBN: 9781467365963 (print)
Maintenance data accumulated from the operation of vehicle on-board equipment (VOBE) plays an important role in fault diagnosis and prognosis. However, the natural language in maintenance data poses a major challenge for fault diagnosis because of its irregular features and uncertain semantics. Some researchers have introduced text mining methods to deal with this problem, but they overlook the real meaning of the topics and the prior knowledge related to them, both of which are important for efficient feature extraction. In this paper, we propose fault diagnosis based on prior Latent Dirichlet Allocation (prior LDA) and Support Vector Machines (SVM). First, a Term Frequency & Inverse Topic Frequency (TFITF) method is proposed to extract prior knowledge about fault symptoms, which is then integrated into basic Latent Dirichlet Allocation (LDA) to build the prior LDA model. Next, we use the prior LDA model to extract classifier features from maintenance records. Third, we present a hierarchical classification model based on SVM, together with a feature fusion method, for fault diagnosis. Finally, the F-measure is used to evaluate the performance of the proposed model on real data from the high-speed railway system of Guangzhou Railway Corporation. Experiments show that the proposed method outperforms both a text mining method that ignores prior knowledge and other common fault diagnosis methods.
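The F-measure used for evaluation combines precision and recall into a single score. The sketch below shows the standard per-class definition; the function name `f_measure` and the fault labels in the example are hypothetical, and the abstract does not state which averaging scheme or value of beta the authors used.

```python
def f_measure(y_true, y_pred, positive, beta=1.0):
    """F-beta score for one class, treating `positive` as the target label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    # beta = 1 gives the harmonic mean of precision and recall (F1).
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical fault labels for five maintenance records.
y_true = ["brake", "brake", "signal", "signal", "brake"]
y_pred = ["brake", "signal", "signal", "brake", "brake"]
score = f_measure(y_true, y_pred, positive="brake")
```

For a multi-class or hierarchical classifier, one common choice is to compute this score per class and average the results.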
Purpose: The purpose of the study is to explore the potential use of natural language processing (NLP) and machine learning (ML) techniques, with the aim of finding a feasible strategy and an effective approach to the NER task for Web-oriented person-specific information [...] Design/methodology/approach: An SVM-based multi-classification approach, combined with a rich set of features derived from state-of-the-art NLP techniques, is proposed for the NER task. A group of experiments was designed to investigate the influence of various NLP-based features, especially semantic features, on the performance of the system. Optimal parameter settings for the SVM models, including the kernel function, the margin parameter and the context window size, were explored through experiments as well. Findings: The SVM-based multi-classification approach proved effective for the NER task. This work shows that NLP-based features, particularly semantic features, are of great importance in data-driven NE recognition. The study indicates that a higher-order kernel function may not be desirable for this specific classification problem in practical applications; the simple linear-kernel SVM model performed better in this case. Moreover, the modified SVM models with an uneven margin parameter are more common and flexible, and were shown to solve the imbalanced data problem [...] Research limitations/implications: The SVM-based approach to the NER problem has only been shown to be effective on limited experimental data. Further research needs to be conducted on large batches of real Web data. In addition, the performance of the NER system needs to be tested when it is incorporated into a complete IE [...] Originality/value: The specially designed experiments make it feasible to fully explore the characteristics of the data and to obtain optimal parameter settings for the NER task, leading to favorable recall, precision and F1 measures. The overall syste
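The context window size mentioned among the tuned parameters controls how many neighbouring tokens contribute features to each classification decision. The following is a minimal sketch of that idea; the feature names, the padding symbol and the capitalization feature are illustrative assumptions and do not reproduce the paper's actual feature set.

```python
def window_features(tokens, index, window=2, pad="<PAD>"):
    """Build a feature dict for tokens[index] from its surrounding tokens.

    window: number of tokens taken on each side of the target token.
    """
    feats = {}
    for offset in range(-window, window + 1):
        pos = index + offset
        # Positions outside the sentence are filled with a padding symbol.
        word = tokens[pos].lower() if 0 <= pos < len(tokens) else pad
        feats[f"w[{offset:+d}]"] = word
    # A simple orthographic cue that is often useful for person names.
    feats["is_capitalized"] = tokens[index][:1].isupper()
    return feats

tokens = ["Dr.", "Jane", "Smith", "joined", "Acme"]
feats = window_features(tokens, index=1, window=1)
```

Each feature dict would then be vectorized (for example by one-hot encoding) before being passed to a linear-kernel SVM; a larger window adds context at the cost of a sparser feature space.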
This paper presents a novel scheme for feature extraction, namely the generalized two-dimensional Fisher's linear discriminant (G-2DFLD) method, and its use for face recognition with multi-class support vector machines (SVM) as the classifier. The G-2DFLD method is an extension of the 2DFLD method for feature extraction. Like the 2DFLD method, the G-2DFLD method is based on the original 2D image matrix. However, unlike the 2DFLD method, which maximizes class separability in either the row or the column direction, the G-2DFLD method maximizes class separability in both the row and column directions simultaneously. To realize this, two alternative Fisher's criteria are defined, corresponding to the row-wise and column-wise projection directions. Unlike in the 2DFLD method, the principal components extracted from an image matrix in the G-2DFLD method are scalars, yielding a much smaller image feature matrix. The proposed G-2DFLD method was evaluated on two popular face recognition databases, the AT&T (formerly ORL) and UMIST face databases. Experimental results under different experimental strategies show that the new G-2DFLD scheme outperforms the PCA, 2DPCA, FLD and 2DFLD schemes, not only in computation time but also in face recognition performance with multi-class SVM as the classifier. The proposed method also outperforms some of the neural network and other SVM-based face recognition methods reported in the literature. (C) 2010 Elsevier B.V. All rights reserved.
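For orientation, the Fisher criterion on image matrices that 2DFLD-style methods build on is commonly written as below. This is the usual textbook formulation rather than something quoted from the paper, and all symbols are assumptions: $A_i$ are training image matrices, $\bar{A}_c$ is the mean image of class $c$, $\bar{A}$ is the global mean image, $N_c$ is the number of images in class $c$, and $w$ is a projection vector.

```latex
G_b = \sum_{c=1}^{C} N_c \,(\bar{A}_c - \bar{A})^{\top} (\bar{A}_c - \bar{A}),
\qquad
G_w = \sum_{c=1}^{C} \sum_{A_i \in \text{class } c} (A_i - \bar{A}_c)^{\top} (A_i - \bar{A}_c)

J(w) = \frac{w^{\top} G_b \, w}{w^{\top} G_w \, w}
```

Maximizing $J(w)$ yields projection directions for one side of the image matrix; the counterpart for the other direction is obtained by swapping the transposes, that is, using $(\cdot)(\cdot)^{\top}$ in place of $(\cdot)^{\top}(\cdot)$. Defining one such criterion per direction is consistent with the two criteria the abstract mentions, although the exact form used in the paper is not given here.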
In this paper, we extend LELC (PU Learning by Extracting Likely Positive and Negative Micro-Clusters) method to cope with positive and unlabeled data streams. Our developed approach, which is called vote-based LELC, w...