Let P be a set of points in R-d. We propose GEOFILTERKRUSKAL, an algorithm that computes the minimum spanning tree of P using well separated pair decomposition in combination with a simple modification of Kruskal'...
详细信息
ISBN:
(纸本)9783642131929
Let P be a set of points in R-d. We propose GEOFILTERKRUSKAL, an algorithm that computes the minimum spanning tree of P using well separated pair decomposition in combination with a simple modification of Kruskal's algorithm. When P is sampled from uniform random distribution, we show that our algorithm takes one parallel sort plus a linear number of additional steps, with high probability, to compute the minimum spanning tree. Experiments show that our algorithm works better in practice for most data distributions compared to the current state of the art [31]. Our algorithm is easy to parallelize and to our knowledge, is currently the best practical algorithm on multi-core machines for d > 2.
Analyzing data from large experimental suites is a daily task for anyone doing experimental algorithmics. In this paper we report on several approaches we tried for this seemingly mundane task in a similarity search s...
详细信息
ISBN:
(数字)9783030609368
ISBN:
(纸本)9783030609368
Analyzing data from large experimental suites is a daily task for anyone doing experimental algorithmics. In this paper we report on several approaches we tried for this seemingly mundane task in a similarity search setting, reflecting on the challenges it poses. We conclude by proposing a workflow, which can be implemented using several tools, that allows to analyze experimental data with confidence.
We address the problem of implementing data structures resilient to memory faults, which may arbitrarily corrupt memory locations. In this framework, we focus on the implementation of dictionaries and perform a thorou...
详细信息
The purpose of this special issue of Algorithms was to attract papers presenting original research in the area of algorithm engineering. In particular, submissions concerning the design, analysis, implementation, tuni...
详细信息
The purpose of this special issue of Algorithms was to attract papers presenting original research in the area of algorithm engineering. In particular, submissions concerning the design, analysis, implementation, tuning, and experimental evaluation of discrete algorithms and data structures, and/or addressing methodological issues and standards in algorithmic experimentation were encouraged. Papers dealing with advanced models of computing, including memory hierarchies, cloud architectures, and parallel processing were also welcome. In this regard, we solicited contributions from all most prominent areas of applied algorithmic research, which include but are not limited to graphs, databases, computational geometry, big data, networking, combinatorial aspects of scientific computing, and computational problems in the natural sciences or engineering.
Computing shortest-path distances is a fundamental primitive in the context of graph data mining, since this kind of information is essential in a broad range of prominent applications, which include social network an...
详细信息
Computing shortest-path distances is a fundamental primitive in the context of graph data mining, since this kind of information is essential in a broad range of prominent applications, which include social network analysis, data routing, web search optimization, database design and route planning. Standard algorithms for shortest paths (e.g., Dijkstra's) do not scale well with the graph size, as they take more than a second or huge memory overheads to answer a singlequery on the distancefor large-scale graph datasets. Hence, they are not suited to mine distances from big graphs, which are becoming the norm in most modern application contexts. Therefore, to achieve faster query answering, smarter and more scalable methods have been designed, the most effective of them based on precomputing and querying a compact representation of the transitive closure of the input graph, called the 2-hop-coverlabeling. To use such approaches in realistictime-evolvingscenarios, when the managed graph undergoes topological modifications over time, specificdynamic algorithms, carefully updating the labeling as the graph evolves, have been introduced. In fact, recomputing from scratch the 2-hop-coverstructure every time the graph changes is not an option, as it induces unsustainable time overheads. While the state-of-the-art dynamic algorithm to update a 2-hop-coverlabeling againstincrementalmodifications (insertions of arcs/vertices, arc weights decreases) offers very fast update times, the only known solution fordecrementalmodifications (deletions of arcs/vertices, arc weights increases) is still far from being considered practical, as it requires up to tens of seconds of processing per update in several prominent classes of real-world inputs, as experimentation shows. In this paper, we introduce a new dynamic algorithm to update 2-hop-coverlabelings against decremental changes. We prove its correctness, formally analyze its worst-case performance, and assess its effectiveness throug
This paper studies the journey planning problem in the context of transit networks. Given the timetable of a schedule-based transportation system (consisting, e.g., of trains, buses, etc.), the problem seeks journeys ...
详细信息
This paper studies the journey planning problem in the context of transit networks. Given the timetable of a schedule-based transportation system (consisting, e.g., of trains, buses, etc.), the problem seeks journeys optimizing some criteria. Specifically, it seeks to answer natural queries such as, for example, "find a journey starting from a source stop and arriving at a target stop as early as possible". The fastest approach for answering to these queries, yielding the smallest average query time even on very large networks, is the Public Transit Labeling framework, proposed for the first time in Delling et al., SEA 2015. This method combines three main ingredients: (i) a graph-based representation of the schedule of the transit network;(ii) a labeling of such graph encoding its transitive closure (computed via a time-consuming pre-processing);(iii) an efficient query algorithm exploiting both (i) and (ii) to answer quickly to queries of interest at runtime. Unfortunately, while transit networks' timetables are inherently dynamic (they are often subject to delays or disruptions), ptl is not natively designed to handle updates in the schedule-even after a single change, precomputed data may become outdated and queries can return incorrect results. This is a major limitation, especially when dealing with massively sized inputs (e.g., metropolitan or continental sized networks), as recomputing the labeling from scratch, after each change, yields unsustainable time overheads that are not compatible with interactive applications. In this work, we introduce a new framework that extends ptl to function in delay-prone transit networks. In particular, we provide a new set of algorithms able to update both the graph and the precomputed labeling whenever a delay affects the network, without performing any recomputation from scratch. We demonstrate the effectiveness of our solution through an extensive experimental evaluation conducted on real-world networks. Our experiments
Motivation: The biologically meaningful algorithmic study of genome rearrangement should take into account the distribution of sizes of the rearranged genomic fragments. In particular, it is important to know the prev...
详细信息
Motivation: The biologically meaningful algorithmic study of genome rearrangement should take into account the distribution of sizes of the rearranged genomic fragments. In particular, it is important to know the prevalence of short inversions in order to understand the patterns of gene order disruption observed in comparative genomics. Results: We find a large excess of short inversions, especially those involving a single gene, in comparison with a random inversion model. This is demonstrated through comparison of four pairs of bacterial genomes, using a specially-designed implementation of the Hannenhalli-Pevzner theory, and validated through experimentation on pairs of random genomes matched to the real pairs.
Sorting is a fundamental algorithmic task. Many general-purpose sorting algorithms have been developed, but efficiency gains can be achieved by designing algorithms for specific kinds of data, such as strings. In prev...
详细信息
ISBN:
(纸本)9780909925949
Sorting is a fundamental algorithmic task. Many general-purpose sorting algorithms have been developed, but efficiency gains can be achieved by designing algorithms for specific kinds of data, such as strings. In previous work we have shown that our burstsort, a trie-based algorithm for sorting strings, is for large data sets more efficient than all previous algorithms for this task. In this paper we re-evaluate some of the implementation details of burstsort, in particular the method for managing buckets held at leaves. We show that better choice of data structures further improves the efficiency, at a small additional cost in memory. For sets of around 30,000,000 strings, our improved burstsort is nearly twice as fast as the previous best sorting algorithm.
We propose two variants of K-d trees where fingers are used to improve the performance of orthogonal range search and nearest neighbor queries when they exhibit locality of reference. The experiments show that the sec...
详细信息
暂无评论