ISBN: 9789898111517 (print)
Algorithmic skeletons aim to simplify parallel programming by providing recurring forms of program structure as predefined components. We present a new distributed task-parallel skeleton for a very general class of divide-and-conquer algorithms for MIMD machines with distributed memory. Our approach combines skeleton-internal task parallelism with stream parallelism. We compare it with alternative topologies for a task-parallel divide-and-conquer skeleton with respect to their suitability for solving streams of divide-and-conquer problems. Based on experimental results for matrix chain multiplication problems, we show that our new approach enables better processor load and memory utilization of the engaged solvers, and reduces communication costs.
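To make the idea of a divide-and-conquer skeleton concrete, here is a minimal sequential sketch. It is not the distributed skeleton from the paper: the function names (`divide_and_conquer`, `solve_stream`, `merge_sort`) and the choice of merge sort as the example problem are assumptions made for illustration only. The skeleton fixes the recursion structure, and the user supplies the four problem-specific functions.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, TypeVar

P = TypeVar("P")  # problem type
S = TypeVar("S")  # solution type


def divide_and_conquer(
    problem: P,
    is_simple: Callable[[P], bool],
    solve: Callable[[P], S],
    divide: Callable[[P], Sequence[P]],
    combine: Callable[[P, List[S]], S],
) -> S:
    """Generic skeleton: the recursion is fixed, the four functions are user code."""
    if is_simple(problem):
        return solve(problem)
    # In a task-parallel skeleton, these subproblems could be handed to other solvers.
    partial = [
        divide_and_conquer(sub, is_simple, solve, divide, combine)
        for sub in divide(problem)
    ]
    return combine(problem, partial)


def merge(left: List[int], right: List[int]) -> List[int]:
    """Merge two sorted lists into one sorted list."""
    out: List[int] = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]


def merge_sort(xs: List[int]) -> List[int]:
    """Hypothetical example instance of the skeleton."""
    return divide_and_conquer(
        xs,
        is_simple=lambda p: len(p) <= 1,
        solve=lambda p: list(p),
        divide=lambda p: [p[: len(p) // 2], p[len(p) // 2:]],
        combine=lambda _p, subs: merge(subs[0], subs[1]),
    )


def solve_stream(problems: Iterable[List[int]]) -> Iterator[List[int]]:
    """Process a stream of independent problems one after another."""
    for problem in problems:
        yield merge_sort(problem)
```

Calling `list(solve_stream([[2, 1], [3, 1, 2]]))` sorts each problem in the stream in turn. A stream-parallel variant would let several solvers work on different stream elements at the same time.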
ISBN: 9781479984909 (print)
The ever-increasing complexity of distributed systems mandates formal verification of their design and implementation. Unfortunately, the common approaches and existing tools for formally establishing the correctness of these systems remain hardly applicable to the kind of legacy applications that are commonly found in the HPC community. We present how system-level memory introspection can be achieved directly at runtime without relying on source code analysis. We use this mechanism to detect the equality of the application's state at system level. As storing the system state may be memory-expensive, we compact the memory by sharing unchanged memory pages between snapshots. This enables the automated verification of safety and liveness properties on legacy distributed applications written in Fortran or C/C++ using the MPI standard. We demonstrate the effectiveness of our approach on several programs from the MPICH3 test suite.
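The idea of sharing unchanged memory pages between snapshots can be illustrated with a small content-addressed store. This is only a sketch under assumptions: the class `SnapshotStore`, the 4096-byte page size, and the use of SHA-256 digests to identify pages are hypothetical choices, not the mechanism described in the paper, which operates at system level.

```python
import hashlib
from typing import Dict, List, Tuple

PAGE_SIZE = 4096  # assumed page size, for illustration only


class SnapshotStore:
    """Stores snapshots as tuples of page digests; identical pages are kept once."""

    def __init__(self, page_size: int = PAGE_SIZE) -> None:
        self.page_size = page_size
        self.pages: Dict[bytes, bytes] = {}          # digest -> page content
        self.snapshots: List[Tuple[bytes, ...]] = []  # one digest tuple per snapshot

    def take(self, memory: bytes) -> int:
        """Record a snapshot of `memory` and return its index."""
        digests = []
        for offset in range(0, len(memory), self.page_size):
            page = memory[offset:offset + self.page_size]
            digest = hashlib.sha256(page).digest()
            # Only pages not seen before consume additional storage.
            self.pages.setdefault(digest, page)
            digests.append(digest)
        self.snapshots.append(tuple(digests))
        return len(self.snapshots) - 1

    def equal(self, a: int, b: int) -> bool:
        """State equality, assuming distinct pages never share a digest."""
        return self.snapshots[a] == self.snapshots[b]

    def restore(self, index: int) -> bytes:
        """Rebuild the full memory image of a snapshot."""
        return b"".join(self.pages[d] for d in self.snapshots[index])
```

With four distinct pages in a first snapshot and one modified page in a second, the store holds five pages rather than eight, and comparing two states reduces to comparing two tuples of digests.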
ISBN: 9781479984909 (print)
Parallel video coding has emerged from the need to map video algorithms onto many-core and multi-core architectures and to achieve ever-growing performance goals in video-based applications. Several parallelization methods have been proposed around the H.264 algorithm, but it was not until the new HEVC video standard that two parallelization strategies, Tiles and Wavefront Parallel Processing (WPP), became part of the specification. Effective selection and usage of Tiles or WPP is an open issue. In this paper we evaluate the performance of both strategies in terms of video decoding speed-up, including their correlation with additional optimization possibilities such as parallel filtering and low-level SIMD operations.
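As background on WPP, the sketch below computes an idealized wavefront schedule. It relies on the usual WPP dependency pattern, in which a coding tree unit (CTU) can start once its left neighbour and the top-right CTU of the row above are finished, so each row trails the row above by two CTUs. The function name `wavefront_schedule`, the unit processing time per CTU, and the unlimited number of threads are simplifying assumptions and do not come from the paper.

```python
from typing import Dict, List, Tuple


def wavefront_schedule(rows: int, cols: int) -> Dict[int, List[Tuple[int, int]]]:
    """Map each time step to the CTUs (row, col) that can be processed in it.

    Idealized model: every CTU takes one time step and there is one thread
    per CTU row, so CTU (r, c) starts at step c + 2 * r.
    """
    schedule: Dict[int, List[Tuple[int, int]]] = {}
    for r in range(rows):
        for c in range(cols):
            step = c + 2 * r  # two-CTU lag per row
            schedule.setdefault(step, []).append((r, c))
    return schedule


def ideal_speedup(rows: int, cols: int) -> float:
    """Sequential steps divided by wavefront steps under the model above."""
    steps = len(wavefront_schedule(rows, cols))
    return (rows * cols) / steps
```

For a picture of 3 by 4 CTUs, the schedule needs 8 steps instead of 12, which shows why the ramp-up and ramp-down of the wavefront limit the achievable speed-up on small pictures.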
ISBN: 9781424417513 (print)
Distributed systems must provide some kind of interprocess communication (IPC) mechanism to enable communication between local and, especially, geographically dispersed and physically distributed processes. These mechanisms may be implemented at different levels of a distributed system, namely at the application level, library level, operating system interface level, or kernel level. Upper-level implementations are intuitively simpler to develop but less efficient. This paper provides hard evidence for this intuition. It considers two renowned IPC mechanisms: MPI, implemented at library level, and DIPC, implemented at kernel level. It shows that calculating Pi in parallel on a distributed system programmed with MPI is on average 35% slower than on the same distributed system programmed with DIPC. It is concluded that if distributed systems are to become an appropriate platform for high-performance scientific computing of all kinds, it is necessary to try harder and implement IPC mechanisms at kernel level, even setting aside the many other factors that favor kernel-level implementations, such as safety, privilege, reliability, and primitiveness.
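The abstract does not say which method was used to calculate Pi. A common benchmark, sketched here as an assumption, integrates 4 / (1 + x^2) over [0, 1] with the midpoint rule and gives each worker every `size`-th interval. The names `partial_sum` and `estimate_pi` are hypothetical, and threads stand in for the distributed processes that MPI or DIPC would connect.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(rank: int, size: int, n: int) -> float:
    """Midpoint-rule contribution of worker `rank` out of `size` workers."""
    h = 1.0 / n
    total = 0.0
    # Cyclic distribution: worker `rank` handles intervals rank, rank + size, ...
    for i in range(rank, n, size):
        x = h * (i + 0.5)
        total += 4.0 / (1.0 + x * x)
    return h * total


def estimate_pi(workers: int = 4, n: int = 100_000) -> float:
    """Sum the partial results, the step where IPC would be needed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda rank: partial_sum(rank, workers, n), range(workers))
        return sum(parts)
```

The only communication in this scheme is the final reduction of the partial sums, so in a real distributed run the measured difference between IPC mechanisms would also depend on how the work and results are exchanged.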
DiET provides a framework for experimenting with extended transaction models and for synthesizing new models. As case studies, nested and split-join transaction types have been implemented. DiET is loosely coupled with a distributed storage manager and PVM. This coupling enables DiET to cope with a wide variety of storage managers and distributed process managers without difficulty. The performance measures indicate high speedup for complex applications.
ISBN: 3540297693 (print)
Both TCP and ICMP are used in network measurement, but the differences between their measured results, although important, have received little attention. To compare TCP and ICMP when they are used to measure host connectivity, RTT, and packet loss rate, we designed two groups of comparison programs and, after carefully evaluating the program parameters, ran a large number of experiments on the Internet. The experimental results show significant differences between the host connectivity measured using TCP and that measured using ICMP; in general, the accuracy of TCP is 20%-30% higher than that of ICMP. The case of RTT and packet loss rate is more complicated, since both depend on path load and destination host load, but the values measured using TCP and ICMP are commonly very close. We also give some advice on protocol selection for conducting accurate network measurements.
ISBN: 9781467385237 (print)
Developing applications for modern, complex networked robotic systems is challenging because of the potentially sophisticated communication and coordination aspects they introduce. In this paper, we propose EmSBoT, a lightweight embedded component-based software framework targeting resource-constrained networked robotic systems. EmSBoT provides a unified Application Program Interface (API) that hides the heterogeneous distributed environment from applications. Its OS abstraction layer gives it OS independence and portability. A port-based communication mechanism is adopted to exchange messages between loosely coupled components, giving the system fault-tolerance capability. By isolating the communication channels as separate agents, the framework provides uniform and transparent message passing for agents across node boundaries. We describe the architecture, programming model, and core features of EmSBoT, together with a performance evaluation and behavior validation that demonstrate its efficiency and feasibility.
ISBN: 078036449X (print)
This paper describes a methodological approach that uses Petri Nets (PNs) and Time Petri Nets (TPNs) for the modeling, analysis, and behavior control of fault-tolerant Computer Supported Synchronous Cooperative Work (CSSCW) architectures in which a high level of interactivity between users is required. Modeling allows architectures to be formally studied under different operating conditions (normal communications and deficient communications). Results show that the model is able to predict interlocking and state inconsistencies in the presence of errors. TPNs are used to extend PN models in order to detect communication errors and avoid subsequent dysfunctions. The approach is illustrated through the improvement of a recently presented collaborative application dedicated to biomedical signal visualization and analysis.
General-purpose parallel processing machines are increasingly being used to speed up a variety of VLSI CAD applications. This paper addresses the mapping of logic simulation using the time first algorithm on parallel ...