ISBN: 9781424422487 (print)
The development of descriptive languages for resource characterization is one of the most active research fields in distributed computer science. It is mainly considered in the so-called Semantic Web scenario, where the availability of self-describing resources is seen as leveraging integration, sharing and reuse. The Semantic Web vision can also be applied to the Knowledge Discovery in Databases (KDD) field when collaboration among distributed resources has to be taken into account. The present paper takes this perspective, proposing a descriptive language for the characterization of KDD services, tools and algorithms. To this end, a systematization of the bulk of information about services, tools and algorithms is given first. Finally, the paper discusses how the language relates to existing standards and how it can be exploited to support users of a collaborative distributed environment.
ISBN: 0769506348 (print)
Exploratory Visualization is an approach for helping users learn about distributed computations without requiring them to examine source code. Instead, visualizations provide intuition about the program's behavior and serve as an interface through which the programs are controlled. We have developed an exploratory visualization system with the goals of providing an intuitive and user-friendly interface and developing an infrastructure that minimizes perturbation. We present a case study to describe how a naive user can interact with the system to learn about and experiment with the running computation.
ISBN: 0818681187 (print)
"Parallel I/O" is the support of a single parallel application run on many nodes; application data is distributed among the nodes, and is read from or written to a single logical file, itself spread across nodes and disks. Parallel I/O is a mapping problem from the data layout in node memory to the file layout on disks. Since the mapping can be quite complicated and involve significant data movement, optimizing the mapping is critical for performance. We discuss our general model of the problem, describe four Collective Buffering algorithms we designed, and report experiments testing their performance on an Intel Paragon and an IBM SP2, both housed at NASA Ames Research Center. Our experiments show improvements of up to two orders of magnitude over standard techniques and the potential to deliver peak performance with minimal hardware support.
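The mapping from node memory layout to file layout can be illustrated with a small single-process simulation. This is only a sketch of the general two-phase idea behind collective buffering, not one of the four algorithms from the paper; the cyclic distribution, the aggregator assignment, and all function names are assumptions made for illustration.

```python
def cyclic_layout(num_nodes, num_elems):
    """Node r holds file elements r, r + num_nodes, ... (cyclic distribution)."""
    return [list(range(r, num_elems, num_nodes)) for r in range(num_nodes)]


def naive_write(layout, data):
    """Each node writes its own elements one by one: one request per element.

    For simplicity, `data` stands in for the values held in node memory.
    """
    file = [None] * len(data)
    requests = 0
    for owned in layout:
        for idx in owned:
            file[idx] = data[idx]
            requests += 1
    return file, requests


def collective_buffered_write(layout, data, num_aggregators):
    """Two-phase write: shuffle to aggregators, then one contiguous write each."""
    n = len(data)
    chunk = -(-n // num_aggregators)  # ceiling division
    # Phase 1: every node sends each element to the aggregator that owns
    # the file range containing that element.
    buffers = [dict() for _ in range(num_aggregators)]
    for owned in layout:
        for idx in owned:
            buffers[idx // chunk][idx] = data[idx]
    # Phase 2: each aggregator writes its contiguous range in a single request.
    file = [None] * n
    requests = 0
    for a, buf in enumerate(buffers):
        start, stop = a * chunk, min((a + 1) * chunk, n)
        if start >= stop:
            continue  # this aggregator has no file range to write
        file[start:stop] = [buf[i] for i in range(start, stop)]
        requests += 1
    return file, requests
```

With 4 nodes, 16 elements and 2 aggregators, both functions produce the same file contents, but the naive version issues one request per element while the buffered version issues one per aggregator. The extra shuffle in phase 1 is the data movement that the abstract identifies as the cost to be optimized.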
ISBN: 081867475X (print)
In this paper we present recovery techniques for distributed main-memory databases, specifically for client-server and shared-disk architectures. We present a recovery scheme for client-server architectures which is based on shipping log records to the server, and two recovery schemes for shared-disk architectures: one based on page shipping, and the other based on broadcasting the log of updates. The schemes offer different tradeoffs, based on factors such as update rates. Our techniques extend to a distributed-memory setting a centralized recovery scheme for main-memory databases, which has been implemented in the Dali main-memory database system. Our centralized as well as distributed-memory recovery schemes have several attractive features: they support an explicit multi-level recovery abstraction for high concurrency, reduce disk I/O by writing only redo log records to disk during normal processing, and use per-transaction redo and undo logs to reduce contention on the system log. Further, the techniques use a fuzzy checkpointing scheme that writes only dirty pages to disk, yet interferes minimally with normal processing; all but one of our recovery schemes do not even require updaters to acquire a latch before updating a page. Our log shipping/broadcasting schemes also support concurrent updates to the same page at different sites.
ISBN: 0769501915 (print)
As more and more object-oriented transaction processing monitors are being developed, users in industries such as banking and telecommunications need systematic and critical evaluations of the strengths and weaknesses of these products. This paper presents the Middleware Evaluation Project (MEP), which aims to provide an impartial evaluation based on rigorously derived tests and benchmarks. The evaluation framework, based on TPC's benchmark C, is presented first, followed by a discussion of the set of evaluation criteria. Preliminary results on the OTM product OrbixOTM are also given.
ISBN: 9780769529172 (print)
A new class of Java multithreading-based parallel approximate inverse preconditioning is introduced for efficiently solving sparse arrow-type linear systems. The parallel Explicit Preconditioned Biconjugate Conjugate Gradient - STAB method for shared memory systems is presented in order to examine the parallel behavior of this scheme using explicit approximate inverses as the suitable preconditioner. Design and implementation issues of Java's multithreading techniques are also discussed. The performance of the method using Java multithreading, in terms of speedups and parallel efficiencies, is illustrated by solving sparse arrow-type linear systems. Static and dynamic workload scheduling systems implemented in Java and the results of their use are presented and discussed.
ISBN: 9781479964536 (print)
Social networks and business analytics typically need to process vast amounts of data that are often modeled as graphs. The scale of the data that such applications have to handle requires large-scale distributed computing systems, together with scalable parallel algorithms, to efficiently process the graphs. Representative of the graph-based analytics class of applications is the Graph 500 benchmark (Murphy, ***., 2010), which is designed to assess the performance of supercomputing systems by solving the Breadth-First Search (BFS) graph traversal problem. In this work, we analyze the network data motion of a Graph 500 MPI version of the graph traversal problem, using a large-scale high-performance computing system, i.e., the MareNostrum III supercomputer (http://***/marenostrum-support-services/mn3). We focus our analysis on the node-to-node communication and show that the application runtime is communication-bound, with communication making up as much as 80% of the execution time of each BFS iteration. We also show that the dominant communication pattern is an overall all-to-all exchange (every process communicates with every other process, and roughly the same amount of data is exchanged between any two processes), thus providing preliminary guidance for future application or network design optimization efforts.
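To make the communication pattern concrete, the following is a minimal single-process sketch of a level-synchronous BFS over a vertex-partitioned graph. It is not the Graph 500 reference code and uses no MPI; the modulo-based vertex ownership, the function name `partitioned_bfs`, and the way traffic is counted are all assumptions chosen for illustration.

```python
from collections import defaultdict


def partitioned_bfs(adj, root, num_procs):
    """Level-synchronous BFS with vertex v owned by process v % num_procs.

    Returns (parent, traffic), where traffic[(src, dst)] counts the vertex
    records that process src would send to a different process dst.
    """
    parent = {root: root}
    frontier = [root]
    traffic = defaultdict(int)
    while frontier:
        # Each owner scans its frontier vertices and addresses every
        # neighbour to the process that owns that neighbour.
        inbox = defaultdict(list)
        for u in frontier:
            for v in adj[u]:
                src, dst = u % num_procs, v % num_procs
                inbox[dst].append((v, u))
                if src != dst:
                    traffic[(src, dst)] += 1
        # After the exchange, each owner claims its unvisited vertices.
        next_frontier = []
        for p in range(num_procs):
            for v, u in inbox[p]:
                if v not in parent:
                    parent[v] = u
                    next_frontier.append(v)
        frontier = next_frontier
    return parent, dict(traffic)
```

On a graph whose edges are spread evenly over the vertex owners, the `traffic` counts come out roughly equal for every ordered pair of processes, which is the all-to-all behaviour the abstract reports.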
ISBN: 9781605589428 (print)
Pattern databases (PDBs) store heuristic estimates that are used to improve the performance of heuristic search algorithms. They are key to the success of heuristic search in many application domains. While it is known [12] that the effectiveness of PDBs critically depends on their size, current implementations use only small PDBs because they require random access to main memory. We present two MapReduce implementations that do not require random memory access and therefore enable larger PDBs than were previously possible. The first one, named MR-BFFS, is a parallel breadth-first frontier search. It is used for generating arbitrarily large PDBs out-of-core. The second one, MR-IDA*, uses out-of-core PDBs to perform a breadth-first iterative-deepening A* search. Both scale perfectly on massively parallel systems and make use of all available resources, such as CPUs, distributed memories, and disks. We demonstrate the performance of our algorithms and provide, as a byproduct of this research, the first complete evaluation of dual additive PDBs for the 8-puzzle. We also provide data on larger problem spaces and discuss the effectiveness of PDBs for improving the search. Copyright 2010 ACM.
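The idea of a breadth-first frontier search can be sketched in a few lines for the 8-puzzle. This is an in-memory, single-process stand-in, not MR-BFFS and not a MapReduce program; it enumerates full puzzle states rather than abstracted patterns, and the function names and goal encoding (0 for the blank) are assumptions for illustration.

```python
def neighbors(state):
    """Yield states reachable by sliding one tile into the blank (0) on a 3x3 board."""
    z = state.index(0)
    r, c = divmod(z, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            s = list(state)
            t = nr * 3 + nc
            s[z], s[t] = s[t], s[z]
            yield tuple(s)


def frontier_bfs(goal):
    """Return the number of states at each depth from `goal`.

    Only the previous and current layers are kept for duplicate detection.
    This suffices because moves are reversible and have unit cost, so every
    neighbour of a depth-d state lies at depth d-1, d, or d+1.
    """
    prev, cur = set(), {goal}
    counts = []
    while cur:
        counts.append(len(cur))
        nxt = set()
        for s in cur:
            for n in neighbors(s):
                if n not in prev and n not in cur:
                    nxt.add(n)
        prev, cur = cur, nxt
    return counts
```

Calling `frontier_bfs((1, 2, 3, 4, 5, 6, 7, 8, 0))` visits all 9!/2 = 181,440 reachable states. Recording the depth of each state (or of each abstracted pattern) as it is generated is what would turn such a traversal into a pattern database; an out-of-core version would keep the layers on disk instead of in Python sets.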
ISBN: 0769522106 (print)
The paper is devoted to the scalability analysis of a typical linear algebra algorithm on heterogeneous clusters. We prove that the traditional scalability metrics proposed for the analysis of linear algebra algorithms are applicable on heterogeneous platforms, and we investigate the influence of three heterogeneous computation-distribution strategies on the scalability of the Scalable Universal Matrix Multiplication Algorithm (SUMMA).
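For readers unfamiliar with SUMMA, the sketch below simulates its structure in a single process: the product is accumulated as a sequence of panel (rank-b) updates, and each process of a logical grid updates only the block of C it owns. There is no real communication here, and the even block split, the function names, and the `panel` parameter are assumptions for illustration; none of the paper's three heterogeneous distribution strategies is reproduced.

```python
def split(n, parts):
    """Split range(n) into `parts` contiguous blocks of near-equal size."""
    base, extra = divmod(n, parts)
    blocks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def summa(A, B, grid_rows, grid_cols, panel=1):
    """Simulate SUMMA: C = A * B as a sequence of rank-`panel` updates.

    A is n x m and B is m x k, both given as lists of lists.
    """
    n, m, k = len(A), len(B), len(B[0])
    C = [[0] * k for _ in range(n)]
    row_blocks = split(n, grid_rows)
    col_blocks = split(k, grid_cols)
    for k0 in range(0, m, panel):
        ks = range(k0, min(k0 + panel, m))
        # In a real run, the column panel A[:, ks] is broadcast along process
        # rows and the row panel B[ks, :] along process columns.
        for rows in row_blocks:
            for cols in col_blocks:
                # Local update on the process owning C[rows, cols].
                for r in rows:
                    for c in cols:
                        C[r][c] += sum(A[r][t] * B[t][c] for t in ks)
    return C
```

In this sketch, `split` is the place where a heterogeneous distribution would assign uneven block sizes, for example in proportion to processor speed, while the panel loop itself stays unchanged.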