ISBN: 9780898716115 (print)
Given a large spatio-temporal database of events, where each event consists of the fields event-ID, time, location, and event-type, the task of mining spatio-temporal sequential patterns is to identify significant event-type sequences. Such patterns are crucial for investigating the spatial and temporal evolution of phenomena in many application domains. In this paper, we propose a sequence index as the significance measure for spatio-temporal sequential patterns; the measure is meaningful because it can be interpreted using spatial statistics. We propose two algorithms, STS-Miner and Slicing-STS-Miner, to tackle the algorithmic design challenges posed by the sequence index, which does not preserve the downward closure property. We evaluate the algorithms experimentally on both synthetic and real-world datasets.
ISBN: 0769526624 (print)
The proceedings contain 68 papers. The topics discussed include: a comparative analysis of data distribution methods in an agent-based neural system for classification tasks; stochastic differential portfolio games with regime switching model; extracting symbolic rules from clustering of gene expression data; a novel microarray gene selection method based on consistency; combining greedy method and genetic algorithm to identify transcription factor binding sites; investigation of a new artificial immune system model applied to pattern recognition; RLM: a new method of encoding weights in DNA strands; shape representation and distance measure based on relational graph; fast modeling of curved object from two images; research on an improved gray gradient orientation algorithm in anisotropic high-pass filtering; and image color reduction based on self-organizing maps and growing self-organizing neural networks.
Choosing an appropriate kernel is one of the key problems in kernel-based methods. Most existing kernel selection methods require that the class labels of the training examples are known. In this paper, we propose an ...
ISBN: 3540465359 (print)
Clustering is a branch of unsupervised machine learning theory with a wide variety of applications in pattern recognition, image processing, economics, document categorization, web mining, and other areas. Today we constantly face the problem of handling large numbers of similar data items, which has drawn many researchers to this field. Support vector machines provide a new approach to clustering; however, they perform poorly on massive data. As an emerging theory, artificial immune systems can effectively recognize antigens and produce memory antibodies. This mechanism is often used to extract representative or feature data from raw data. A combined clustering method based on an artificial immune system and a support vector machine is proposed in this paper. Its functionality and performance are evaluated experimentally in detail. Finally, a more challenging application in the elevator industry is presented. The results strongly indicate that the combined clustering method is feasible and practical.
ISBN: 1845641787 (print)
A conceptual clustering program, CLUSTER3, is described that, given a set of objects represented by attribute-value tuples, groups them into clusters described by generalized conjunctive descriptions in attributional calculus. The descriptions are optimized according to a user-designed multi-criterion clustering quality measure. The clustering process in CLUSTER3 depends on a viewpoint underlying the clustering goal, and employs the view-relevant attribute subsetting method (VAS), which selects for clustering only the attributes relevant to this viewpoint. The program is illustrated by a simple designed problem and by its application to clustering US Congressional voting records. Ongoing research concerns the application of CLUSTER3 to large and complex datasets such as collections of web pages.
ISBN: 3540473319 (print)
This paper summarises the current literature on immune system function and behaviour, including pattern recognition receptors, danger theory, central and peripheral tolerance, and memory cells. An artificial immune system framework is then presented, based on analogies with these natural system components and on a rule- and feature-based problem representation. A data set for intrusion detection is used to highlight the principles of the framework.
ISBN: 9780898716115 (print)
Document clustering has traditionally been studied as a centralized process. There are scenarios in which centralized clustering does not serve the required purpose; e.g., documents spanning multiple digital libraries need not be clustered in one location, but can instead be clustered at each location and then enriched with information received from other locations. A distributed collaborative approach to document clustering is proposed in this paper. The main objective is to allow peers in a network to form independent opinions of local document grouping, followed by an exchange of cluster summaries in the form of keyphrase vectors. The nodes then expand and enrich their local solutions by receiving recommended documents from their peers, based on each peer's judgement of the similarity of its local documents to the exchanged cluster summaries. Results show an improvement in the final clustering after merging peer recommendations. The approach allows independent nodes to achieve better local clustering by accessing distributed data without the cost of centralized clustering, while maintaining the initial local clustering structure and coherency.
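The recommendation step described in this abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that keyphrase vectors are sparse dictionaries mapping a keyphrase to a weight and that similarity is measured by cosine similarity against a fixed threshold; the abstract does not specify the similarity measure or any threshold, and the names `cosine`, `recommend`, and `threshold` are illustrative only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse keyphrase vectors (dicts)."""
    dot = sum(weight * b.get(phrase, 0.0) for phrase, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(local_docs, peer_summary, threshold=0.5):
    """Return IDs of local documents similar enough to a peer's cluster summary.

    The threshold is an assumed parameter, not one taken from the paper.
    """
    return [
        doc_id
        for doc_id, vector in local_docs.items()
        if cosine(vector, peer_summary) >= threshold
    ]


# Hypothetical data: one cluster summary received from a peer,
# and two documents held locally.
peer_summary = {"neural network": 1.0, "training": 0.5}
local_docs = {
    "d1": {"neural network": 1.0},
    "d2": {"stock market": 1.0},
}

recommended = recommend(local_docs, peer_summary)
```

Here only the document sharing a keyphrase with the summary is recommended back to the peer; the document with no overlap scores zero and is withheld.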
ISBN: 9781424403660 (print)
Automatic segmentation and classification of recorded meetings provides a basis for effective browsing and querying in a meeting archive. Yet today's approaches are often not robust enough. We therefore strive to improve on this task by introducing a hybrid approach that combines the discriminative abilities of artificial neural nets with the warping capabilities of hidden Markov models. Dividing the task into two layers and defining a proper set of individual actions helps to cope with the problem of lack of data and outperforms conventional single-layered approaches. Extensive test runs on the public M4 Scripted Meeting Corpus demonstrate a large performance gain for our proposed approach compared with other similar methods.
Classification of data with an imbalanced class distribution significantly limits the performance attainable by most standard classifier learning algorithms, which assume a relatively balanced class distri...
ISBN: 9780898716115 (print)
Recent events have made it clear that some kinds of technical texts, generated by machine and essentially meaningless, can be confused with authentic technical texts written by humans. We identify this as a potential problem, since no existing systems for, say, the web can or do discriminate on this basis. We believe that there are subtle short- and long-range word or even string repetitions present in human texts, but not in many classes of computer-generated texts, that can be used to discriminate based on meaning. In this paper we employ universal lossless source coding to generate features in a high-dimensional space and then apply support vector machines to discriminate between the classes of authentic and inauthentic expository texts. Compression profiles for the two kinds of text are distinct, with the authentic texts bounded by various classes of computer-generated texts that are either more or less compressible. This in turn led to the high prediction accuracy of our models, which support a conjecture that there exists a relationship between meaning and compressibility. Our results show that the learning algorithm based upon the compression profile outperformed standard term-frequency text categorization on several non-trivial classes of inauthentic texts. Availability: http://***/predrag/***.
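As a rough illustration of compression-derived features, the sketch below computes a small "compression profile" for a text. It uses Python's standard `zlib` (DEFLATE) at several compression levels as a stand-in; this is an assumption, since the abstract does not name the source coder used or define the profile, and the function name `compression_profile` and the sample texts are hypothetical. The resulting vector could then be passed to a support vector machine, a step omitted here.

```python
import random
import zlib


def compression_profile(text, levels=(1, 6, 9)):
    """Return compressed-size / original-size ratios at several zlib levels."""
    raw = text.encode("utf-8")
    if not raw:
        return [0.0 for _ in levels]
    return [len(zlib.compress(raw, level)) / len(raw) for level in levels]


# Hypothetical inputs: a highly repetitive text and a pseudo-random
# character string of the same length (seeded for reproducibility).
repetitive = "the model is trained on the data and the model is tested on the data. " * 20
rng = random.Random(0)
alphabet = "abcdefghijklmnopqrstuvwxyz "
noisy = "".join(rng.choice(alphabet) for _ in range(len(repetitive)))

profile_repetitive = compression_profile(repetitive)
profile_noisy = compression_profile(noisy)
```

Lower ratios indicate more compressible text. The repetitive sample compresses far better than the pseudo-random one, which is the kind of separation a classifier could exploit.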