Most bioinformatics algorithms are developed in a serial form due to a fast pace of changes in the subject domain and the fact that many bioinformatics tasks can be parallelized as collections of serial jobs communica...
详细信息
Most bioinformatics algorithms are developed in a serial form due to a fast pace of changes in the subject domain and the fact that many bioinformatics tasks can be parallelized as collections of serial jobs communicating at the file system level (High-Throughput Computing, HTC). Recently, a MapReduce-MPI library was made available by Sandia Lab to ease porting of a large class of serial applications to the High Performance Computing (HPC) architectures dominating large federated resources such as NSF Tera Grid. Using this library, we have created two open-source bioinformatics applications. The first one addresses a problem of adapting existing complex and highly optimized serial bioinformatics algorithm to HPC architecture in a minimally invasive way. We built a parallel BLAST implementation that calls the high-level methods of unmodified NCBI C++ Toolkit. We demonstrated scaling for up to 1000 cores on TACC Ranger cluster when processing the sufficiently large input datasets. Using unmodified NCBI Toolkit ensures that the results are compatible across the multitude of settings in the original serial algorithm, and that future versions of the upstream code can be easily integrated. The second application is a Self-Organizing Map (SOM) machine-learning algorithm, popular in bioinformatics applications such as metagenomic binning. The nature of the SOM requires a global synchronization step with a frequency that necessitates the use of an HPC environment. Our implementation of the "batch SOM" uses a mix of MapReduce-MPI and direct MPI calls and scales to 1000 cores as well. This allows easy processing of datasets with a size that is out of range of the serial SOM implementations. Both implementations are available in the open source at http://***/mgtaxa/.
Pattern databases (PDBs) store heuristic estimates that are used to improve the performance of heuristic search algorithms. They are key to the success of heuristic search in many application domains. While it is know...
详细信息
ISBN:
(纸本)9781605589428
Pattern databases (PDBs) store heuristic estimates that are used to improve the performance of heuristic search algorithms. They are key to the success of heuristic search in many application domains. While it is known [12] that the efiectiveness of PDBs critically depends on their size, current implementations use only small PDBs because they require random access to main memory. We present two MapReduce implementations that do not require random memory access and therefore enable larger PDBs than were previously possible. The first one, named MR-BFFS, is a parallel breadth-first frontier search. It is used for generating arbitrarily large PDBs out-of-core. The second one, MR-IDA*, uses out-of-core PDBs to perform a breadth-first iterative-deepening A* search. Both scale perfectly on massively parallelsystems and they make use of all available resources like CPUs, distributed memories, and disks. We demonstrate the performance of our algorithms and provide, as a byproduct of this research, the first complete evaluation of dual additive PDBs for the 8-puzzle. We also provide data on larger problem spaces and discuss the efiectiveness of PDBs for improving the search. Copyright 2010 ACM.
A wide range of real-time applications process stream-based data. To process this stream-based data in an application-independent manner, many stream processing systems have been built. However, none of them reached a...
详细信息
This paper investigates scheduling loosely coupled task-bundles in highly heterogeneous distributedsystems. Two allocation quality metrics are used in pay-per-service distributed applications: efficiency in terms of ...
详细信息
As the most widely used parallel job scheduling strategy in production schedulers, EASY has achieved great success, not only because it can balance fairness and performance, but also because it is universally applicab...
详细信息
ISBN:
(纸本)9781605589428
As the most widely used parallel job scheduling strategy in production schedulers, EASY has achieved great success, not only because it can balance fairness and performance, but also because it is universally applicable to most HPC systems. However, unfairness still exists in EASY. For real workloads used in this work, our simulation shows that a blocked job can be delayed by later jobs for more than 90 hours. In addition, EASY cannot directly employ parallel job runtime prediction techniques, because this would lead to a serious situation called reservation violation. In this paper, we aim at guaranteeing strict fairness (no job is delayed by any jobs of lower priority) while achieving attractive performance, and employing prediction without causing reservation violation in parallel job scheduling. We propose two novel strategies, shadow load preemption (SLP) and venture backfilling (VB), which are together integrated into EASY to construct a preemptive venture EASY backfilling (PV-EASY) strategy. Experimental results on three workloads of real HPC systems demonstrate that: First, PV-EASY guarantees strict fairness, in addition to avoiding reservation violation when employing job runtime prediction techniques in scheduling;second, PV-EASY achieves the same performance as EASY, and outperforms prediction employed EASY;Third, the preemption in PV-EASY is not resource costly and simple enough to be implemented in all HPC systems where EASY works. These advantages make PV-EASY more attractive than EASY in parallel job scheduling, both from academic and industry perspectives. Copyright 2010 ACM.
We investigate the problem of maintaining a topology with small degree as well as small diameter in a dynamic distributed system such that the system always stays connected and processes that wish to leave the system ...
详细信息
The federated database architecture has been introduced to maintain the autonomy of individual data sources yet accomplish federated task for diverse applications from traditional enterprises to computational sciences...
详细信息
ISBN:
(纸本)9781605589008
The federated database architecture has been introduced to maintain the autonomy of individual data sources yet accomplish federated task for diverse applications from traditional enterprises to computational sciences. We identify two challenging problems of query optimization in large-scale database federation systems. First, run-time conditions of data sources have a profound effect on the performance of database federations, yet the distributed environment of database federations makes it prohibitively expensive for the optimizer to gather rapidly fluctuating run-time conditions from remote data sources. second, large-scale database federation systems are often widely distributed and built on heterogeneous networks, thus efficiently utilizing network resources is of ever increasing importance for query scheduling. In this paper, we propose to exploit the clustered hierarchical structure of database federations to solve these two problems. Our Cluster-and-Conquer strategy coordinates hierarchical clusters of data sources to optimize and process queries cooperatively. Within each cluster we employ an I/O-bound cost model with run-time conditions being accessible with relatively little delay. While among clusters a network-bound cost model is instead utilized to capture the network heterogeneity and optimize the query plans for efficient network utilization. The experimental study on the prototype database federation system with real-world network settings shows the effectiveness of our Cluster-and Conquer strategy for scheduling data-intensive queries, as well as demonstrates the performance benefits of our proposed strategies over existing state-of-art solutions.
Exploiting parallelism in route planning algorithms is a challenging algorithmic problem with obvious applications in mobile navigation and timetable information systems. In this work, we present a novel algorithm for...
详细信息
The pervasive use of Peer-to-Peer (P2P) systems and the growing demand on personalization for consumers has made future business focus on the niche market instead of the mass market. The recommender system, which is a...
详细信息
The pervasive use of Peer-to-Peer (P2P) systems and the growing demand on personalization for consumers has made future business focus on the niche market instead of the mass market. The recommender system, which is able to timely select data of the interest to each individual user, has become the key to any successful business. However, currently most recommendation systems are based on a centralized architecture; nonetheless, they are not suitable for P2P environments. In this article, we propose a distributed semantic P2P overlay, which can provide music search and recommendation services by considering both of user preference and diversity of interests. We then propose a 3C hybrid recommendation procedure that adapts three traditional filtering techniques to fitting the requirements in a distributed semantic overlay. First, we choose a set of proper meta-data to represent a music object and use them to construct the characteristic-vector-based content filter. second, a dominant attribute, which is one of the attributes in the characteristic vector of a music object, is used to build the profile of a peer. With the idea of the social network, a P2P profile-based collaborative filter is proposed. Finally, we explore the item-to-item relationship to construct a history-based cooperative filter. We use simulations and a real database called Audio Scrobbler, which tracks users' listening habits, to evaluate the performance of the recommendation system. The results demonstrate the effectiveness of the proposed approach compared with existing recommendation systems.
Network emulation is an efficient method for evaluating distributed applications and communication protocols by combining the benefits of real world experiments and network simulation. The process of network emulation...
详细信息
Network emulation is an efficient method for evaluating distributed applications and communication protocols by combining the benefits of real world experiments and network simulation. The process of network emulation involves execution of thousands of connected virtual nodes running the software under test in a controlled environment. Along with the quality of the experiment results, the runtime of network experiments strongly influences the convenience of users and operators of emulation testbeds. The goal of this paper is, therefore, to minimize the experiment runtime of network emulations. In order to achieve this goal, we make the following contributions in this paper: First, we present a highly scalable emulation architecture to efficiently support network emulation testbeds with multicore CPUs. second, we propose a detailed and generic cost model for the communication costs of emulation testbeds. Third, we present an efficient placement strategy (NETplace) to assign virtual nodes to physical nodes of the testbed while minimizing the runtime of network experiments. Therefore, we combine graph partitioning and greedy approaches. Our evaluations show that our placement strategy outperforms existing methods by reducing the experiment runtime up to 64%.
暂无评论