ISBN: 9781467385374 (print)
Security on mobile devices is becoming increasingly important. HTML5 is widely used to develop mobile applications because of its portability across multiple platforms. However, web technology allows data and code to be mixed together, so HTML5-based applications are prone to code injection attacks similar to XSS. In this paper, we first introduce a more hidden type of code injection attack, the coding-based attack, in which JavaScript code is encoded in a human-unreadable form. We then use machine learning classification algorithms to determine whether an app suffers from the code injection attack. The experimental results show that the precision of our detection method reaches 95.3%. Compared with the other method, our approach greatly improves detection speed while leaving precision nearly unchanged. Furthermore, an improved access control model is proposed to mitigate the damage of an attack. In addition, filters are adopted to remove JavaScript code from data to prevent the attacks. The effectiveness and rationality of the approach have been validated through extensive simulations.
The k-Nearest Neighbor (k-NN) classification technique is one of the most elementary and straightforward classification methods. Although distance learning is at the core of k-NN classification, similarity can be preferred over distance in several practical applications. This paper proposes a novel algorithm for k-NN classification that learns the class of an instance from a similarity measure that does not calculate distance. (C) 2015 The Authors. Published by Elsevier B.V.
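To make the idea of similarity-based k-NN concrete, here is a minimal sketch. The abstract does not name its similarity measure, so cosine similarity is used purely as a stand-in assumption; the function names and the toy training set are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def knn_predict(train, query, k=3):
    """Return the majority label among the k training items most similar to query.

    train is a list of (vector, label) pairs. Neighbours are ranked by
    similarity (higher is closer), not by a distance.
    """
    ranked = sorted(train, key=lambda item: cosine_similarity(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy data, invented for illustration only.
train = [
    ([1.0, 0.1], "a"),
    ([0.9, 0.2], "a"),
    ([0.1, 1.0], "b"),
    ([0.2, 0.8], "b"),
    ([0.5, 0.5], "b"),
]
print(knn_predict(train, [1.0, 0.0], k=3))
```

The only change relative to textbook k-NN is the ranking key: neighbours are sorted by descending similarity instead of ascending distance, so any other similarity function with the same signature could be swapped in.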
ISBN: 9781479971718 (print)
Online Social Networks (OSNs) are deemed to be the most sought-after societal tool used by the masses the world over to communicate and transmit information. Our dependence on these platforms for seeking opinions, news, updates, etc. is increasing. While it is true that OSNs have become a new medium for the dissemination of information, they are also fast becoming a playground for the spread of misinformation, propaganda, fake news, rumors, unsolicited messages, etc. Consequently, we can say that an OSN platform comprises two kinds of users, namely Spammers and Non-Spammers. Spammers, out of malicious intent, post unwanted (or irrelevant) information or spread misinformation on OSN platforms. As part of our work, we propose mechanisms to detect such users (Spammers) in the Twitter social network (a popular OSN). Our work is based on a number of tweet-level and user-level features such as Followers/Followees, URLs, Spam Words, Replies and HashTags. We have applied three learning algorithms, namely Naive Bayes, Clustering and Decision Trees. Furthermore, to improve the detection of Spammers, a novel integrated approach is proposed which "combines" the advantages of the three learning algorithms mentioned above. Improvement in spam detection is measured on the basis of Total Accuracy, Spammers Detection Accuracy and Non-Spammers Detection Accuracy. The results show that our novel integrated approach, which combines all three algorithms, outperforms the other classical approaches in terms of overall accuracy and detects Non-Spammers with 99% accuracy, with an overall accuracy of 87.9%.
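The abstract does not say how the three classifiers are combined or exactly how the three accuracy figures are defined. The sketch below assumes the simplest reading: a per-user majority vote, with Spammers and Non-Spammers Detection Accuracy taken as the fraction of each true class that is labelled correctly. All names and the four-user prediction lists are hypothetical and carry no data from the paper.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most frequent label among the per-classifier predictions for one user."""
    return Counter(predictions).most_common(1)[0][0]

def detection_accuracies(true_labels, predicted_labels):
    """Return (total, spammer, non-spammer) accuracy as fractions in [0, 1].

    Assumes both classes occur at least once in true_labels.
    """
    def fraction_correct(target=None):
        pairs = [(t, p) for t, p in zip(true_labels, predicted_labels)
                 if target is None or t == target]
        return sum(t == p for t, p in pairs) / len(pairs)
    return (fraction_correct(), fraction_correct("spammer"), fraction_correct("non-spammer"))

# Invented outputs of three already-trained classifiers for four users.
truth       = ["spammer", "spammer", "non-spammer", "non-spammer"]
naive_bayes = ["spammer", "non-spammer", "non-spammer", "non-spammer"]
clustering  = ["spammer", "spammer", "spammer", "non-spammer"]
tree        = ["non-spammer", "spammer", "non-spammer", "non-spammer"]

# Combine the three predictions user by user.
combined = [majority_vote(votes) for votes in zip(naive_bayes, clustering, tree)]
print(detection_accuracies(truth, naive_bayes))
print(detection_accuracies(truth, combined))
```

In this toy case each classifier errs on a different user, so the vote corrects every individual mistake. That is the general motivation for combining classifiers, not a reproduction of the reported results.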
The paper gives lower bounds for the minimum number m of weighings that are necessary for identification of up to t non-standard objects out of the total number of n objects being tested. For the problem with fixed deviation of weights of non-standard objects, we construct a perfect algorithm with parameters n = 11, m = 5, t = 2 corresponding to the parameters of the ternary Virtakallio-Golay code. The nonexistence of a perfect weighing code with such parameters is proved.
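The abstract does not state its bounds, but the parameters it quotes can be checked against the usual counting argument. The sketch below assumes that each weighing has three outcomes and that every non-standard object is either heavier or lighter by the same fixed amount; this model is an assumption, and the function names are hypothetical. Under it, m weighings can separate at most 3^m cases, so 3^m must be at least the number of possible configurations.

```python
from math import comb

def hypothesis_count(n, t):
    """Number of ways to pick at most t non-standard objects out of n,
    each marked as heavier or lighter (two choices per chosen object)."""
    return sum(comb(n, i) * 2 ** i for i in range(t + 1))

def min_weighings_lower_bound(n, t):
    """Smallest m with 3**m >= hypothesis_count(n, t) (counting bound only)."""
    target = hypothesis_count(n, t)
    m = 0
    while 3 ** m < target:
        m += 1
    return m

print(hypothesis_count(11, 2))           # 1 + 22 + 220 = 243
print(min_weighings_lower_bound(11, 2))  # 5, since 3**5 == 243
```

For n = 11 and t = 2 the bound holds with equality, 243 = 3^5. This is the same identity that makes the ternary Golay code of length 11 perfect, and it is why an algorithm meeting the bound is called perfect.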
ISBN: 9781479983544 (print)
A novel parallelized gender recognition method with MapReduce is presented, which comprises several machine learning algorithms employed for gender recognition. A large set of face sample images is gathered and separated into a training dataset and a test dataset, and Local Binary Pattern (LBP) features are extracted once those sample sets have been pre-processed and made ready for the following operations. Principal Component Analysis (PCA) is then applied to the training dataset to extract the most distinguishing features. Three classification algorithms, Support Vector Machine (SVM), k-Nearest Neighbor (k-NN) and AdaBoost, are implemented and compared to determine the most suitable and successful algorithm for gender parallelized machine learning (GPML). To achieve the shortest execution time, we propose to apply GPML with MapReduce to avoid parallelizing the above three algorithms while also improving their scalability to big datasets. The results show that this method significantly reduces the training computational complexity as the number of computing nodes increases, while gaining better speedup rates and extended performance compared with parallelized AdaBoost.
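The LBP feature extraction step mentioned in this abstract can be sketched as follows. This is the basic 3x3 LBP operator; the abstract does not specify which variant, neighbour ordering, or block layout was used, so those choices and the function names are assumptions made for illustration.

```python
def lbp_code(image, row, col):
    """Basic 3x3 Local Binary Pattern code of an interior pixel.

    Each of the 8 neighbours contributes one bit: 1 if the neighbour is at
    least as bright as the centre pixel. Neighbours are visited clockwise
    from the top-left; this bit ordering is a convention chosen here.
    """
    center = image[row][col]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels of a grayscale image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

# Tiny grayscale patches, invented for illustration.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
corner = [[9, 1, 1], [1, 5, 1], [1, 1, 1]]
print(lbp_code(flat, 1, 1), lbp_code(corner, 1, 1))
```

In a pipeline like the one described, a histogram of this kind (often computed per image block and concatenated) would be the feature vector passed on to PCA and the classifiers.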
Microblogs have become an important part of social media; a large number of users send and thus spread information on these platforms. Nowadays, the microblog network environment is seriously affected by the presence of anomalous users, so research on identifying the types of microblog users is of great significance. Taking microblogs as an example, this paper selects some microblog users as research objects and analyzes and extracts their features. It also uses statistical methods and data mining classification methods to analyze the user data. Taking the C4.5 decision tree classification method as its breakthrough point, the paper trains a classifier on historical data to predict the class of new samples, and achieves high accuracy.
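C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by the entropy of the split itself. The following minimal sketch computes that criterion for categorical attributes. The attribute names, values, and labels are invented for illustration and are not the features used in the paper.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())

def gain_ratio(rows, labels, attribute):
    """Gain ratio of splitting on one categorical attribute, as used by C4.5.

    rows is a list of dicts mapping attribute name to value.
    Returns 0.0 when the attribute has a single value (split info is zero).
    """
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Expected label entropy after the split, weighted by group size.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical user records.
rows = [
    {"links": "many", "verified": "no"},
    {"links": "many", "verified": "yes"},
    {"links": "few", "verified": "no"},
    {"links": "few", "verified": "yes"},
]
labels = ["abnormal", "abnormal", "normal", "normal"]
print(gain_ratio(rows, labels, "links"), gain_ratio(rows, labels, "verified"))
```

Here "links" separates the two classes perfectly and "verified" carries no information, so a C4.5-style learner would split on "links" first.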
The work proposes a new method for vehicle classification, which allows treating vehicles uniformly at the stage of defining the vehicle classes, as well as during the classification itself and the assessment of its correctness. The sole source of information about a vehicle is its magnetic signature normalised with respect to the amplitude and duration. The proposed method allows defining a large number (even several thousand) of classes comprising vehicles whose magnetic signatures are similar according to the assumed criterion with precisely determined degree of similarity. The decision about the degree of similarity and, consequently, about the number of classes, is taken by a user depending on the classification purpose. An additional advantage of the proposed solution is the automated defining of vehicle classes for the given degree of similarity between signatures determined by a user. Thus the human factor, which plays a significant role in currently used methods, has been removed from the classification process at the stage of defining vehicle classes. The efficiency of the proposed approach to the vehicle classification problem was demonstrated on the basis of a large set of experimental data.
We present an empirical comparison of classification algorithms when training data contains attribute noise levels not representative of field data. To study algorithm sensitivity, we develop an innovative experimental design using noise situation, algorithm, noise level, and training set size as factors. Our results contradict conventional wisdom, indicating that investments to achieve representative noise levels may not be worthwhile. In general, over-representative training noise should be avoided, while under-representative training noise is less of a concern. However, interactions among algorithm, noise level, and training set size indicate that these general results may not apply to particular practice situations. (c) 2008 Elsevier B.V. All rights reserved.
Wheat is one of the most important cereals worldwide for human nutrition. Tetraploid wheat (Triticum turgidum L. ssp. durum, 2n = 28, genomes AABB) is mainly used to produce pasta. The main objective of durum wheat breeding programs is to develop varieties with good quality and high yields. Yield is a very complex trait that depends on different yield components, which are genetically controlled and affected by environmental constraints. In this context, machine learning constitutes an excellent alternative for analyzing a large number of traits in order to extract the most relevant ones as reliable predictors of the performance of this crop, allowing better agricultural planning. Thus, we propose the use of machine learning algorithms for the classification of yield components and for the search for new rules to infer high yields at harvest of durum wheat. The main objective of this work was to obtain rules for predicting durum wheat yield through different machine learning algorithms, and to compare them to detect the one that best fits the model. To achieve this goal, the One-R, J48, IBk and Apriori algorithms were run with data collected by our research group from a RIL (recombinant inbred lines) population grown in six different environments in the Province of Buenos Aires, Argentina. The results indicate that the Apriori method obtains the best performance for all locations, and that the classifiers generated using the different algorithms share a common set of selected traits. Moreover, when these results are compared with previous ones obtained using different techniques, mainly QTL mapping, the traits indicated as the most significant were the same. The analysis of the resulting rules shows the soundness and agronomic relevance of the extracted knowledge. (C) 2013 Elsevier B.V. All rights reserved.
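Of the algorithms named in this abstract, One-R is the simplest to illustrate: it builds one rule per attribute (each attribute value predicts its majority class) and keeps the attribute whose rule makes the fewest training errors. The sketch below is a generic version of that idea, not the implementation the authors ran. The trait names and the four discretized records are invented and carry no data from the study.

```python
from collections import Counter, defaultdict

def one_r(rows, labels):
    """Return (attribute, rule, errors) for the single-attribute rule with fewest errors.

    rows is a list of dicts mapping attribute name to a categorical value;
    rule maps each value of the chosen attribute to its majority class.
    Ties between attributes are resolved in favour of the first one seen.
    """
    best = None
    for attribute in rows[0]:
        by_value = defaultdict(Counter)
        for row, label in zip(rows, labels):
            by_value[row[attribute]][label] += 1
        rule = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
        # Training errors: every record that is not in its value's majority class.
        errors = sum(sum(counts.values()) - counts[rule[value]]
                     for value, counts in by_value.items())
        if best is None or errors < best[2]:
            best = (attribute, rule, errors)
    return best

# Hypothetical discretized traits and yield classes.
rows = [
    {"grains_per_spike": "high", "plant_height": "tall"},
    {"grains_per_spike": "high", "plant_height": "short"},
    {"grains_per_spike": "low", "plant_height": "tall"},
    {"grains_per_spike": "low", "plant_height": "short"},
]
labels = ["high", "high", "low", "low"]
print(one_r(rows, labels))
```

The output is a human-readable rule, which is the property the abstract emphasizes when it speaks of analyzing the resulting rules for agronomic relevance.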