Many clustering algorithms have been proposed to partition a set of static data points into groups. In this paper, we consider an evolutionary clustering problem where the input data points may move, disappeare, and e...
详细信息
ISBN:
(纸本)9780898716306
Many clustering algorithms have been proposed to partition a set of static data points into groups. In this paper, we consider an evolutionary clustering problem where the input data points may move, disappeare, and emerge. Generally, these changes should result in a smooth evolution of the clusters. Mining this naturally smooth evolution is valuable for providing an aggregated view of the numerous individual behaviors. We solve this novel and generalized form of clustering problem by converting it into a Bayesian learning problem. Analogous to that the EM clustering algorithm clusters static data points by learning a Gaussian mixture model, our method mines the evolution of clusters from dynamic data points by learning a hidden semi-Markov model (HSMM). By utilizing characteristics of the evolutionary clustering problem, we derive a new unsupervised learning algorithm which is much more efficient than the algorithms used to learn traditional variable-duration HSMMs. Because the HSMM models the probabilistic relationship between the dynamic data set and corresponding evolving clusters, we can interpret the learned parameters as the evolving clusters intuitively using the Viterbi filtering technique. Because learning an HSMM is in fact learning an optimal Viterbi filter, the learned cluster evolutions are smooth and fit well with the data. We demonstrate the effectiveness of this method by experiments on both synthetic data and real data.
Comparing directed acyclic graphs (DAGs) is essential in various fields such as healthcare, social media, finance, biology, and marketing. DAGs often result from contagion processes over networks, including informatio...
详细信息
Comparing directed acyclic graphs (DAGs) is essential in various fields such as healthcare, social media, finance, biology, and marketing. DAGs often result from contagion processes over networks, including information spreading, retweet activity, disease transmission, financial crisis propagation, malware spread, and gene mutations. For instance, in disease spreading, an infected patient can transmit the disease to contacts, making it crucial to analyze and predict scenarios. Similarly, in finance, understanding the effects of saving or not saving specific banks during a crisis is vital. Experts often need to identify small differences between DAGs, such as changes in a few nodes or edges. Even the presence or absence of a single edge can be significant. visualization plays a crucial role in facilitating these comparisons. However, standard hierarchical layout algorithms struggle to visualize subtle changes effectively. The typical hierarchical layout, with the root on top, is preferred due to its performance in comparison to other layouts. Nevertheless, these standard algorithms prioritize single-graph aesthetics over comparison suitability, making it challenging for users to spot changes. To address this issue, we propose a layout that enhances shape changes in DAGs while minimizing the impact on aesthetics. Our approach involves outwardly swapping changes, altering the DAG’s shape. We introduce new drawing criteria: 1. Criteria for maximizing outward swaps of graph changes. 2. Criteria for reshaping the DAG by repositioning swapped changes. 3. Criteria for handling changes that cannot be outwardly swapped. Our layout builds upon a Sugiyama-like hierarchical layout and implements these criteria through two extensions. We designed it this way to maintain interchangeability and accommodate future optimizations, such as pseudo-nodes for edge crossing minimization. In our evaluations, our layout achieves excellent results, with edge crossing aesthetics averaging arou
暂无评论