ISBN: 9781424409723 (print)
The recent evolution of communication devices makes both network monitoring and configuration difficult. Although these network management tasks require a deep understanding of network status, new network devices make that understanding harder by increasing the complexity of the whole network system. This paper proposes a technique for visualizing network status that uses a frequent itemset mining algorithm to find important phenomena in the network. We also show that a simple interface built on the proposed technique can visualize not only the ordinary network status but also various security incidents. Understanding the ordinary network status and rapidly finding security incidents helps inexperienced users monitor and configure network equipment. As a typical use of the proposed visualization technique, we describe security incidents found in real Internet traffic data.
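To make the idea of frequent itemsets in traffic data concrete, here is a minimal sketch. It assumes each flow record is a set of (field, value) items and simply counts every combination up to a small size; the abstract does not say which mining algorithm or which traffic fields the authors use, so the brute-force counting, the function name `frequent_itemsets`, and the sample fields are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(records, min_support, max_size=2):
    """Return attribute-value combinations found in at least min_support records.

    Brute-force counting for illustration only; a real miner (Apriori,
    FP-growth, ...) would prune the search space instead.
    """
    counts = Counter()
    for record in records:
        # Each (field, value) pair is one item; sorting gives a canonical order.
        items = sorted(record.items())
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}


# Hypothetical flow records; the field names are assumptions, not from the paper.
flows = [
    {"dst_port": 22, "proto": "tcp", "src": "10.0.0.5"},
    {"dst_port": 22, "proto": "tcp", "src": "10.0.0.5"},
    {"dst_port": 22, "proto": "tcp", "src": "10.0.0.5"},
    {"dst_port": 80, "proto": "tcp", "src": "10.0.0.7"},
]
frequent = frequent_itemsets(flows, min_support=3)
```

In this toy data, the combination of one source address with destination port 22 stands out as frequent, which is the kind of pattern a visualization could then highlight.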
Recently, most studies on mining frequent patterns have focused on improving the efficiency of frequent itemset generation, but the I/O cost of database scanning remains a bottleneck in data mining. Many recently proposed algorithms are based on Apriori and the FP_tree; the FP_growth algorithm, which is based on the FP_tree, is more efficient than Apriori because it does not generate candidates. However, constructing the FP_tree can take considerable time. The goal of our research is therefore to propose a faster algorithm. In this paper we propose the Level FP_tree (abbreviated LFP_tree), which is constructed level by level. The algorithm has two main parts. The first scans the database only once to generate the equivalence class of each item. The second deletes the non-frequent items, rewrites the equivalence classes of the frequent items, and then constructs the LFP_tree. Experimental results show that the LFP_tree is more efficient and scalable than the FP_tree.
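The two parts described above can be sketched as follows, under one loudly stated assumption: the "equivalence class" of an item is taken to be the set of transaction ids that contain it, which the abstract does not define. The sketch stops before the level-by-level tree construction, whose details are not given; the function names are hypothetical.

```python
from collections import defaultdict


def build_equivalence_classes(transactions):
    """Single database scan: map each item to the ids of transactions containing it.

    Assumption: "equivalence class" means this transaction-id set.
    """
    classes = defaultdict(set)
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            classes[item].add(tid)
    return dict(classes)


def prune_infrequent(classes, min_support):
    """Drop non-frequent items and order the rest by descending support."""
    frequent = {item: tids for item, tids in classes.items()
                if len(tids) >= min_support}
    # Ties are broken by item name so the order is deterministic.
    return dict(sorted(frequent.items(), key=lambda kv: (-len(kv[1]), kv[0])))


transactions = [["a", "b", "c"], ["a", "c"], ["a", "d"], ["b", "c"]]
classes = build_equivalence_classes(transactions)
frequent = prune_infrequent(classes, min_support=2)
```

After the single scan, the support of any item is just the size of its id set, so no second pass over the database is needed to filter out infrequent items.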
Clustering analysis is an important function of data mining. Various clustering methods are needed for different domains and applications. A clustering algorithm for data mining based on swarm intelligence called Ant-Cl...
A perfect hash function for processing strings is constructed by applying the Chinese Remainder Theorem, and a fast string matching algorithm suited to processing successive sequences, such as network traffic data, is presented. Theoretical analysis shows that the algorithm not only produces deterministic match results but also has linear time complexity in the worst case. Experimental results for matching against a sequence database in network intrusion detection systems also show that the algorithm is efficient.
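As background on how the Chinese Remainder Theorem can yield a perfect hash, the sketch below uses a classic textbook-style construction, which is not necessarily the one in this paper: given pairwise-coprime integer keys, it solves for a single constant C with C mod key_i = i, so each key hashes to its own slot without collisions. How strings would be mapped to such keys is not specified in the abstract and is left out; the function names are hypothetical.

```python
from math import prod


def crt(residues, moduli):
    """Smallest non-negative C with C % m == r for each (r, m) pair.

    Requires pairwise-coprime moduli.
    """
    total = prod(moduli)
    c = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        # pow(x, -1, m) is the modular inverse (Python 3.8+).
        c += r * partial * pow(partial, -1, m)
    return c % total


def build_perfect_hash(keys):
    """Return h with h(keys[i]) == i.

    Assumption: keys are pairwise-coprime integers, each larger than its index.
    """
    c = crt(range(len(keys)), keys)
    return lambda key: c % key


keys = [5, 7, 11, 13]  # illustrative keys only
h = build_perfect_hash(keys)
slots = [h(k) for k in keys]
```

The hash is collision-free by construction, which is what makes match results deterministic; the cost is that the constant C grows with the product of all keys.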
The traditional forecasting of revenue growth rate (RGR) is based on the normal distribution. With the emergence of information technology, data mining has become an important research trend. This paper therefore forecasts the revenue growth rate of firms in stock trading systems using classification techniques. Correctly predicting which firms will grow from the fundamental-analysis data in trading systems is an important instrument for investors, because accurate prediction of RGR can bring them large profits. This paper proposes a process for predicting the RGR of firms that employs the C4.5 decision tree, Bayes net, multilayer perceptron, and rough set techniques. The paper uses an actual RGR dataset from the Taiwan stock market to illustrate the proposed process. Based on the results, we recommend rough sets as the analysis tool because their performance is superior to that of the other listed methods and they produce understandable rules.
Leave-one-out cross-validation in nested sets of data models is traditionally considered in machine learning as the basic instrument for finding the most appropriate subset of features or regressors in pattern recognition and regression estimation. We extend the notion of a nested set of models to the problem of time-varying regression estimation, which involves, in addition to the generic challenge of choosing the subset of regressors, the need to choose the appropriate level of model volatility, ranging from full stationarity of the instantaneous models in time to their complete independence of each other. There are thus at least two axes of model nesting in the problem of nonstationary regression estimation: first, the size of the set of regressors and, second, the level of model volatility in time. We use the leave-one-out measure of model fit as the quality indicator along both nesting axes. We apply the proposed technique to the analysis of a hedge fund's returns and to reverse-engineering its strategies.
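The traditional, stationary use of leave-one-out cross-validation along the regressor axis can be sketched as below. This covers only the baseline the abstract starts from, with two nested models (intercept only, then intercept plus one regressor); the volatility axis of the time-varying extension is not shown because the abstract gives no details. The data and function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_error(xs, ys, use_regressor):
    """Mean squared error when each point is predicted from all the others."""
    total = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if use_regressor:
            intercept, slope = fit_line(train_x, train_y)
            prediction = intercept + slope * xs[i]
        else:
            prediction = sum(train_y) / len(train_y)
        total += (ys[i] - prediction) ** 2
    return total / len(xs)


# Illustrative data, roughly y = 2x + 1 with small deviations.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0]
errors = {
    "intercept only": loo_error(xs, ys, use_regressor=False),
    "intercept + x": loo_error(xs, ys, use_regressor=True),
}
best = min(errors, key=errors.get)
```

The model with the smallest leave-one-out error is selected; the same indicator could in principle be evaluated over a second grid of volatility levels, as the abstract describes.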
A semantic pattern is a kind of template with semantic information that can be used to structure a sentence. In this paper, semantic-pattern-based questions and their corresponding answers are used as the data resource for a knowledge base. The exact answer is extracted from the plain-text answer by matching the question against the answer. Two matching methods and the experimental results are briefly presented. The extracted exact answer and the semantic-pattern-based question together form a structured information unit, which can be inserted into the knowledge base with the assistance of an ontology. A disease ontology is studied for illustration. Our approach to knowledge base construction is concise and effective.
This paper proposes a framework for agent-based distributed machine learning and data mining based on (i) the exchange of meta-level descriptions of individual learning processes among agents and (ii) online reasoning...
Tandem repeats occur frequently in the human genome. Their functions are still largely unclear, but some have been shown to cause human disease and to be related to regulatory functions. Detecting tandem repeats is therefore of considerable significance. Because the length of the repeat pattern is not known in advance and because indels and substitutions occur within a tandem repeat, identifying a tandem repeat in genomic sequence data is a difficult task. In this paper, an efficient algorithm based on the autoregressive (AR) model is proposed. We analyze the residual errors of AR models of different orders for a DNA sequence. From the changes in the residual errors, we can determine whether a sequence contains a tandem repeat and what the pattern size is. Examples show that the algorithm can detect not only exact tandem repeats but also approximate ones.
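The idea of reading a repeat length from AR residual errors can be illustrated with a small sketch. Two things here are assumptions rather than the paper's method: the DNA sequence is encoded as a 0/1 indicator of a single base, and the residual error for each order is obtained with the standard Levinson-Durbin recursion on biased autocorrelation estimates. The function names are hypothetical.

```python
def autocorrelation(signal, max_lag):
    """Biased autocorrelation estimates for lags 0..max_lag."""
    n = len(signal)
    return [sum(signal[t] * signal[t + lag] for t in range(n - lag)) / n
            for lag in range(max_lag + 1)]


def ar_residual_errors(signal, max_order):
    """Prediction-error power of AR models of order 0..max_order (Levinson-Durbin)."""
    r = autocorrelation(signal, max_order)
    if r[0] == 0:
        raise ValueError("signal is identically zero")
    coeffs = [1.0]
    error = r[0]
    errors = [error]
    for order in range(1, max_order + 1):
        acc = sum(coeffs[j] * r[order - j] for j in range(order))
        k = -acc / error  # reflection coefficient
        extended = coeffs + [0.0]
        coeffs = [extended[j] + k * extended[order - j] for j in range(order + 1)]
        error *= 1.0 - k * k
        errors.append(error)
    return errors


def sharpest_drop(errors):
    """AR order at which the error falls most relative to the previous order."""
    ratios = [errors[m] / errors[m - 1] for m in range(1, len(errors))]
    return 1 + ratios.index(min(ratios))


sequence = "ACG" * 20  # toy exact tandem repeat with pattern size 3
signal = [1.0 if base == "A" else 0.0 for base in sequence]  # assumed encoding
errors = ar_residual_errors(signal, max_order=6)
period = sharpest_drop(errors)
```

For this toy repeat the error stays flat at orders 1 and 2 and collapses at order 3, matching the pattern size; an approximate repeat would be expected to give a smaller but still visible drop.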
During graduation design, the quality of the final dissertation depends on appropriate reference reading. Tutors therefore have to spend a great deal of time choosing references for the students' research. This paper implements, on a trial basis, an automatic reference recommendation model within an intelligent tutoring system for graduation design, using a personalized recommendation technique based on user characteristics; the model effectively assists teachers in supplying references. In trial use, it completed the weekly automatic recommendation of student references well, and satisfaction among both teachers and students was high.