Version control for models is not yet supported in an adequate way. In this paper, we address three-way merging of model versions. Based on a common base version b, two alternative versions a(1) and a(2) were develope...
详细信息
ISBN:
(纸本)9789897580659
Version control for models is not yet supported in an adequate way. In this paper, we address three-way merging of model versions. Based on a common base version b, two alternative versions a(1) and a(2) were developed by copying and modifying the base version. To reconcile these changes, a merged version m is to be created as a common successor of a(1) and a(2). We present a graph algorithm to solve an important subproblem which occurs in three-way model merging: merging of (linearly) ordered collections. To create the merged version, a generalized topological sort is performed. Conflicts occur if the order of elements cannot be deduced automatically;these conflicts are resolved either interactively or by default rules. We have implemented the merge algorithm in our tool BTMerge, which performs a consistency-preserving merge of versions of EMF models being instances of arbitrary Ecore models. By taking arbitrary move operations into account, the algorithm considerably goes beyond the functionality of contemporary merge tools which are based on common subsequences and thus cannot adequately handle move operations.
First-order logic captures a vast number of computational problems on graphs. We study the time complexity of deciding graph properties definable by first-order sentences in prenex normal form with k variables. The tr...
详细信息
ISBN:
(纸本)9781450328869
First-order logic captures a vast number of computational problems on graphs. We study the time complexity of deciding graph properties definable by first-order sentences in prenex normal form with k variables. The trivial algorithm for this problem runs in O(n(k)) time on n-node graphs (the big-O hides the dependence on k). Answering a question of Miklos Ajtai, we give the first algorithms running faster than the trivial algorithm, in the general case of arbitrary first-order sentences and arbitrary graphs. One algorithm runs in O(n(k-3+omega)) <= O(n(k-0.627)) time for all k >= 3, where omega < 2.373 is the n x n matrix multiplication exponent. By applying fast rectangular matrix multiplication, the algorithm can be improved further to run in n(k-1+o(1)) time, for all k >= 9. Finally, we observe that the exponent of k - 1 is optimal, under the popular hypothesis that CNF satisfiability with n variables and m clauses cannot be solved in (2 - epsilon)(n) . poly(m) time for some epsilon > 0.
The world of Big Data is changing dramatically right before our eyes-from the amount of data being produced to the way in which it is structured and used. The trend of "big data growth" presents enormous cha...
详细信息
ISBN:
(纸本)9780769552071
The world of Big Data is changing dramatically right before our eyes-from the amount of data being produced to the way in which it is structured and used. The trend of "big data growth" presents enormous challenges, but it also presents incredible scientific and business opportunities. Together with the data explosion, we are also witnessing a dramatic increase in data processing capabilities, thanks to new powerful parallel computer architectures and more sophisticated algorithms. In this paper we describe the algorithmic design and the optimization techniques that led to the unprecedented processing rate of 15.3 trillion edges per second on 64 thousand BlueGene/Q nodes, that allowed the in-memory exploration of a petabyte-scale graph in just a few seconds. This paper provides insight into our parallelization and optimization techniques. We believe that these techniques can be successfully applied to a broader class of graph algorithms.
Computer-assisted chemical structure elucidation has been intensively studied since the first use of computers in chemistry in the 1960s. Most of the existing elucidators use a structure-spectrum database to obtain cl...
详细信息
Computer-assisted chemical structure elucidation has been intensively studied since the first use of computers in chemistry in the 1960s. Most of the existing elucidators use a structure-spectrum database to obtain clues about the correct structure. Such a structure-spectrum database is expected to grow on a daily basis. Hence, the necessity to develop an efficient structure elucidation system that can adapt to the growth of a database has been also growing. Therefore, we have developed a new elucidator using practically efficient graph algorithms, including the convex bipartite matching, weighted bipartite matching, and Bron-Kerbosch maximal clique algorithms. The utilization of the two matching algorithms especially is a novel point of our elucidator. Because of these sophisticated algorithms, the elucidator exactly produces a correct structure if all of the fragments are included in the database. Even if not all of the fragments are in the database, the elucidator proposes relevant substructures that can help chemists to identify the actual chemical structures. The elucidator, called the CAST/CNMR Structure Elucidator, plays a complementary role to the CAST/CNMR Chemical Shift Predictor, and together these two functions can be used to analyze the structures of organic compounds.
In recent years, real-time data mining for large-scale time evolving graphs is becoming a hot research topic. Most of the prior arts target relatively static graphs and also process them in store-and-process batch pro...
详细信息
ISBN:
(纸本)9781450327459
In recent years, real-time data mining for large-scale time evolving graphs is becoming a hot research topic. Most of the prior arts target relatively static graphs and also process them in store-and-process batch processing model. In this paper we propose a method of applying on-the-fly and incremental graph stream computing model to such dynamic graph analysis. To process large-scale graph streams on a cluster of nodes dynamically in a scalable fashion, we propose an incremental large-scale graph processing model called "Incremental GIM-V (Generalized Iterative Matrix-Vector Multiplication)". We also design and implement UNICORN, a system that adopts the proposed incremental processing model on top of IBM InfoSphere Streams. Our performance evaluation demonstrates that our method achieves up to 48% speedup on PageRank with Scale 16 Log-normal graph (vertexes=65,536, edges=8,364,525) with 4 nodes, 3023% speedup on Random walk with Restart with Kronecker graph with Scale 18 (vertexes=262,144, edges=8,388,608) with 4 nodes against original GIM-V.
Evaluating the performance of researchers and measuring the impact of papers written by scientists is the main objective of citation analysis. Various indices and metrics have been proposed for this. In this paper, we...
详细信息
ISBN:
(纸本)9781467364300
Evaluating the performance of researchers and measuring the impact of papers written by scientists is the main objective of citation analysis. Various indices and metrics have been proposed for this. In this paper, we propose a new citation index CITEX, which gives normalized scores to authors and papers to determine their rankings. To the best of our knowledge, this is the first citation index which simultaneously assigns scores to both authors and papers. Using these scores, we can get an objective measure of the reputation of an author and the impact of a paper. We model this problem as an iterative computation on a publication graph, whose vertices are authors and papers, and whose edges indicate which author has written which paper. We prove that this iterative computation converges in the limit, by using a powerful theorem from linear algebra. We run this algorithm on several examples, and find that the author and paper scores match closely with what is suggested by our intuition. The algorithm is theoretically sound and runs very fast in practice. We compare this index with several existing metrics and find that CITEX gives far more accurate scores compared to the traditional metrics.
Sweep coverage is one of the important and recent issue for network monitoring in wireless sensor network (WSN). Sweep coverage is efficiently applicable where periodic monitoring is sufficient than continuous monitor...
详细信息
ISBN:
(纸本)9781509004881
Sweep coverage is one of the important and recent issue for network monitoring in wireless sensor network (WSN). Sweep coverage is efficiently applicable where periodic monitoring is sufficient than continuous monitoring. Interference minimization is another main objective of topology control problem in wireless sensor network. Reducing interference in turn reduces the number of collision and packet retransmission, which decreases the energy consumption and increases lifetime of the network. In this paper, we introduce the Interference minimization global t-Gsweep coverage problem (IMGtSCP) where the objective is to minimize the interference number induced by set of mobile and static sensors with the constraint that the given set of point of interests (PoIs) guarantee t-Gsweep coverage. We prove that the IMGtSCP problem is NP-hard and cannot be approximated with a factor 2. We propose a new heuristic called Interference Minimization Sweep Coverage with Static and Mobile Sensors. We have shown that the time complexity of this heuristics is O(n~3 log n), We also conduct simulation to show the performance of our proposed algorithm.
The increasing energy consumption of high performance computing has resulted in rising operational and environmental costs. Therefore, reducing the energy consumption of computation is an emerging area of interest. We...
详细信息
ISBN:
(纸本)9783642552243
The increasing energy consumption of high performance computing has resulted in rising operational and environmental costs. Therefore, reducing the energy consumption of computation is an emerging area of interest. We study the approach of data sampling to reduce the energy costs of sparse graph algorithms. The resulting error levels for several graph metrics are measured to analyze the trade-off between energy consumption reduction and error. The three types of graphs studied, real graphs, synthetic random graphs, and synthetic small-world graphs, each show distinct behavior. Across all graphs, the error cost is initially relatively low. For example, four of the five real graphs studied needed less than a third of total energy to retain a degree centrality rank correlation coefficient of 0.85 when random vertices were removed. However, the error incurred for further energy reduction grows at an increasing rate, providing diminishing returns.
The present paper addresses the class of two-stage robust optimization problems which can be formulated as mathematical programs with uncertainty on the right-hand side coefficients (RHS uncertainty). The wide variety...
详细信息
The present paper addresses the class of two-stage robust optimization problems which can be formulated as mathematical programs with uncertainty on the right-hand side coefficients (RHS uncertainty). The wide variety of applications and the fact that many problems in the class have been shown to be NP-hard, motivates the search for efficiently solvable special cases. Accordingly, the first objective of the paper is to provide an overview of the most important applications and of various polynomial or pseudo-polynomial special cases identified so far. The second objective is to introduce a new subclass of polynomially solvable robust optimization problems with RHS uncertainty based on the concept of state-space representable uncertainty sets. A typical application to a multi period energy production problem under uncertain customer load requirements is described into details, and computational results including a comparison between optimal two-stage solutions and exact optimal multistage strategies are discussed.
Techniques for network security analysis have historically focused on the actions of the network hosts. Outside of forensic analysis, little has been done to detect or predict malicious or infected nodes strictly base...
详细信息
Techniques for network security analysis have historically focused on the actions of the network hosts. Outside of forensic analysis, little has been done to detect or predict malicious or infected nodes strictly based on their association with other known malicious nodes. This methodology is highly prevalent in the graph analytics world, however, and is referred to as community detection. In this paper, we present a method for detecting malicious and infected nodes on both monitored networks and the external Internet. We leverage prior community detection and graphical modeling work by propagating threat probabilities across network nodes, given an initial set of known malicious nodes. We enhance prior work by employing constraints that remove the adverse effect of cyclic propagation that is a byproduct of current methods. We demonstrate the effectiveness of probabilistic threat propagation on the tasks of detecting botnets and malicious web destinations.
暂无评论