ISBN: 9781424455379 (print)
Network security has become an increasingly important issue with the rapid development of the Internet. Network intrusion detection systems (IDS), as a main security defense technique, are widely used against malicious attacks. Data mining and machine learning technology has been extensively applied in network intrusion detection and prevention systems to discover user behavior patterns from network traffic data. Association rules and sequence rules are the main data mining techniques for intrusion detection. Because frequent itemset mining is the bottleneck of the classical Apriori algorithm, we propose a Length-Decreasing Support method for data-mining-based intrusion detection, which is an improved Apriori algorithm. Experimental results indicate that the proposed method is efficient.
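The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of a length-decreasing support constraint on top of Apriori-style level-wise mining, not the authors' method. The names `min_count` and `mine_itemsets`, and the threshold values (6, 1, 3), are assumptions made for illustration. Because the threshold drops with itemset length, an itemset can fail its own threshold while a longer superset passes, so the sketch prunes with the weakest threshold and filters by length when reporting.

```python
from itertools import combinations


def min_count(length, start=6, step=1, floor=3):
    """Hypothetical length-decreasing threshold: the minimum number of
    supporting transactions shrinks as the itemset gets longer."""
    return max(floor, start - step * (length - 1))


def mine_itemsets(transactions, threshold=min_count):
    """Return {itemset: count} for itemsets meeting the threshold for their length."""
    transactions = [frozenset(t) for t in transactions]
    max_len = max((len(t) for t in transactions), default=0)
    if max_len == 0:
        return {}
    # A non-increasing threshold breaks Apriori's usual pruning: an itemset can
    # miss the bar for its own length while a superset clears a lower bar.
    # Pruning with the weakest threshold keeps the search complete.
    weakest = min(threshold(k) for k in range(1, max_len + 1))

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = {item for t in transactions for item in t}
    level = {frozenset([i]): count(frozenset([i])) for i in items}
    level = {s: c for s, c in level.items() if c >= weakest}

    result, k = {}, 1
    while level:
        # Report only the itemsets that meet the threshold for length k.
        result.update({s: c for s, c in level.items() if c >= threshold(k)})
        # Apriori join and prune: keep (k+1)-candidates whose k-subsets all survived.
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in level for sub in combinations(union, k)
            ):
                candidates.add(union)
        counted = {c: count(c) for c in candidates}
        level = {s: c for s, c in counted.items() if c >= weakest}
        k += 1
    return result
```

With this threshold, an item seen in 5 of 10 transactions is not reported on its own (length 1 needs 6), but a pair or triple containing it is reported if it also occurs 5 times (length 2 needs 5, length 3 needs 4).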
To build mining research universities, we have to enhance not only the teachers' abilities of teaching and research, but also the students' abilities of learning and research. According to this request, we...
ISBN: 9780769539232 (print)
This paper presents a study on using a concept feature to detect web phishing. Following the features introduced in the Carnegie Mellon Anti-phishing and Network Analysis Tool (CANTINA), we added a domain top-page similarity feature to a machine learning based phishing detection system. We ran a preliminary experiment on a small data set of 200 web pages, consisting of 100 phishing pages and 100 non-phishing pages. The evaluation yielded an F-measure of up to 0.9250, with an error rate of 7.50%.
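For readers unfamiliar with the two metrics reported above, here is a small sketch of how F-measure and error rate are commonly computed from a confusion matrix. The function names and the example counts are assumptions for illustration only; they are not taken from the paper and do not reconstruct its results.

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (the balanced F1 form)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def error_rate(tp, fp, fn, tn):
    """Fraction of all samples that were misclassified."""
    return (fp + fn) / (tp + fp + fn + tn)


# Made-up counts for a balanced set of 200 samples (100 positive, 100 negative).
print(f_measure(tp=80, fp=10, fn=20))
print(error_rate(tp=80, fp=10, fn=20, tn=90))
```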
ISBN: 9780615375298 (print)
This research investigates the detection of student meta-cognitive planning processes in real time using log tracing techniques. We use fine- and coarse-grained data distillation, in combination with coarse-grained text replay coding, to develop detectors for students' planning of experiments in Science Assistments, an assessment and tutoring system for scientific inquiry. The goal is to recognize student inquiry planning behavior in real time as the student conducts inquiry in a micro-world; the eventual goal is to provide real-time scaffolding of scientific inquiry.
ISBN: 9781605588896 (print)
Some commercial web search engines rely on sophisticated machine learning systems for ranking web documents. Due to very large collection sizes and tight constraints on query response times, the online efficiency of these learning systems forms a bottleneck. An important problem in such systems is to speed up the ranking process without sacrificing much of the quality of results. In this paper, we propose optimization strategies that allow short-circuiting score computations in additive learning systems. The strategies are evaluated over a state-of-the-art machine learning system and a large, real-life query log obtained from Yahoo. With the proposed strategies, we are able to speed up the score computations by more than four times with almost no loss in result quality. Copyright 2010 ACM.
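The abstract does not describe the individual strategies, so the sketch below only illustrates the general idea of short-circuiting an additive scorer: the final score is a sum of partial scores, and evaluation stops early when the running total falls below a bound at a checkpoint. The function name `score_with_early_exit`, the `checkpoints` mapping, and the bound values are assumptions for illustration, not the paper's design.

```python
def score_with_early_exit(features, scorers, checkpoints):
    """Sum the outputs of `scorers`, stopping early at a checkpoint.

    `checkpoints` maps a 1-based scorer index to the minimum running total
    a document needs at that point to keep being scored (hypothetical scheme).
    Returns (score_so_far, number_of_scorers_evaluated).
    """
    total = 0.0
    for i, scorer in enumerate(scorers, start=1):
        total += scorer(features)
        bound = checkpoints.get(i)
        if bound is not None and total < bound:
            # Short-circuit: the document is unlikely to rank well,
            # so the remaining scorers are skipped.
            return total, i
    return total, len(scorers)


# Three toy scorers, each reading one feature; one checkpoint after the first.
scorers = [lambda f: f[0], lambda f: f[1], lambda f: f[2]]
checkpoints = {1: 0.5}
print(score_with_early_exit([0.125, 1.0, 1.0], scorers, checkpoints))
print(score_with_early_exit([0.75, 0.5, 0.25], scorers, checkpoints))
```

The trade-off is that a document exited early may have recovered with later scorers, which is why the choice of checkpoints and bounds governs how much result quality is lost for a given speedup.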
ISBN: 9781618390370 (print)
Data mining is a new and very dynamic discipline oriented toward finding new knowledge in databases. The MIT Technology Review considers data mining to be one of the ten new technologies that will change the current world. Data mining is a set of methods originating from practically attained knowledge; they nevertheless rest on a thorough foundation in statistics and machine learning. A new monograph entitled Data Mining Experimental Data Sets is under preparation; it should explain the basic statistical concepts of data mining, also known as knowledge discovery (process) in databases. The monograph should comprise the analytical steps that are usually used in data mining. Starting with problem definition, task allocation, process description, and variable assignment, up to learning and evaluating the ensemble models, all phases should be explained in detail. The monograph shall explain the theory and usage of different data mining methods, such as clustering, classification, association rules, the naïve Bayesian classifier, genetic algorithms, decision trees, artificial neural networks, Kohonen's maps, discriminant analysis, regression, etc.
Nowadays, the use of neural network strategies in pattern recognition is a widely considered solution. In this paper we propose three different strategies to select more efficiently the patterns for a fast learning ...
The proceedings contain 5 papers. The topics discussed include: do not feel the trolls; end-user programming and the advent of sharable, social machines; computing FOAF co-reference relations with rules and machine learning; mapping tweets to conference talks: a goldmine for semantics; and extracting semantic relations for mining of social data.
ISBN: 9780769539232 (print)
Data clustering aims at finding the hidden patterns in a large collection of data, and a large body of effective algorithms has been proposed to partition data over the past three decades. However, most of these algorithms fail to handle data that exhibit a manifold structure, which is common in many data-driven applications such as the interpretation and recognition of video, handwritten character, and image data. In this paper, we study the problem of clustering on manifolds, which aims to partition a set of input data into several clusters, each of which contains data points from a simple low-dimensional manifold. We apply the basic assumption of local and global consistency on the manifold. A novel algorithm named CMLGC is proposed to find the proper clusters on the manifold. Our research can also be seen as an instance of manifold learning. Encouraging results on several synthetic and real-world data sets validate the proposed algorithm.
ISBN: 9781605588896 (print)
We describe a machine learning approach for predicting sponsored search ad relevance. Our baseline model incorporates basic features of text overlap, and we then extend the model to learn from past user clicks on advertisements. We present a novel approach using translation models to learn user click propensity from sparse click logs. Our relevance predictions are then applied to multiple sponsored search applications in both offline editorial evaluations and live online user tests. The predicted relevance score is used to improve the quality of the search page in three areas: filtering low-quality ads, ranking ads more accurately, and optimizing page placement of ads to reduce prominent placement of low-relevance ads. We show significant gains across all three tasks. Copyright 2010 ACM.