ISBN: 9781424452910 (print)
A parallel version of a new automatic Harris-based corner detector is presented. A scheduler that dynamically and evenly distributes a heavy computational workload across heterogeneous parallel architectures such as Grid systems has been implemented to speed up the whole procedure. Experimental results show the robustness of the underlying scheduler, which can be easily exploited in various automatic image analysis systems.
ISBN: 9781424452910 (print)
With the advent of multi-core processors, desktop application developers must finally face parallel computing and its challenges. A large portion of the computational load in a program rests within iterative computations. In object-oriented languages these are commonly handled using iterators, which are inadequate for parallel programming. Consequently, the powerful Parallel Iterator concept was developed. This paper presents various developments of the Parallel Iterator, such as parallel traversal of complex collections with partial ordering (such as a tree). Other features include reductions, parallel remove semantics and exception handling. Along with the ease of use, the results reveal great speedup in comparison to traditional Java parallelism approaches.
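The core idea of an iterator shared by several threads can be sketched as follows. This is only an illustration in Python, not the paper's Java API: the names `SharedIterator` and `parallel_sum_of_squares` are hypothetical, and the lock-per-element design is an assumption chosen for brevity.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedIterator:
    """Hypothetical thread-safe wrapper: each element goes to exactly one thread."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # The lock makes "is there a next element, and take it" one atomic step,
        # which is what a plain sequential iterator cannot guarantee.
        with self._lock:
            return next(self._it)


def parallel_sum_of_squares(items, workers=4):
    shared = SharedIterator(items)

    def worker():
        # Each thread keeps a private partial result (a simple reduction).
        partial = 0
        for x in shared:
            partial += x * x
        return partial

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker) for _ in range(workers)]
        # Combine the per-thread partial results into the final value.
        return sum(f.result() for f in futures)
```

All threads run the same loop over the same iterator object, so the loop body looks sequential while the elements are distributed dynamically among the threads.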
ISBN: 9781424452910 (print)
As parallel computing systems scale up, system reliability decreases drastically, so parallel applications running on such systems must tolerate hardware failures. Checkpointing, which periodically saves the state of the computation to stable storage, is widely used in large-scale parallel computing. This produces non-negligible fault tolerance overhead. The traditional speedup only measures the performance of a failure-free system. In this paper, we propose a speedup metric that takes checkpointing overhead into account. The new metric unifies the performance and reliability measures, and evaluates the practical speedup of a parallel application with checkpointing. Furthermore, this paper classifies and analyzes existing parallel systems according to the proposed speedup metric, and makes suggestions for improving system design and fault tolerance techniques. Finally, we validate the analysis of this new speedup metric by experiment. The experimental results indicate that the proposed speedup for parallel applications with checkpointing is an effective metric.
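The abstract does not state the formula of the proposed metric, so the sketch below is not the paper's definition. It is a first-order model under stated assumptions: checkpoints of fixed cost are written at a fixed interval, failures arrive with a given mean time between failures (MTBF), and each failure loses on average half a checkpoint segment plus a restart cost. The function names and parameters are hypothetical.

```python
import math


def effective_speedup(t_serial, t_parallel, ckpt_cost, interval, mtbf,
                      restart_cost=0.0):
    """Speedup of a checkpointed parallel run under a first-order failure model."""
    # Failure-free run time stretched by the periodic checkpoint writes.
    with_ckpt = t_parallel * (1.0 + ckpt_cost / interval)
    # Assumption: a failure loses half a segment on average, plus the restart.
    lost_per_failure = (interval + ckpt_cost) / 2.0 + restart_cost
    expected_time = with_ckpt * (1.0 + lost_per_failure / mtbf)
    return t_serial / expected_time


def young_interval(ckpt_cost, mtbf):
    """Young's first-order approximation of the best checkpoint interval."""
    return math.sqrt(2.0 * ckpt_cost * mtbf)
```

With zero checkpoint cost and an infinite MTBF the function reduces to the traditional speedup `t_serial / t_parallel`; any checkpoint cost or finite MTBF lowers it, which is the gap such a metric is meant to expose.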
ISBN: 9781605583952 (print)
The paper introduces an example of a 'distributed tangible technology' as a new type of technology that enables children in different physical locations to engage in physical interaction. A virtual tug of war game is an example of a distributed tangible technology that is played by groups of children pulling a rope from two separate locations. The game was launched when teams in Finland and South Africa competed during an international science festival. The paper describes the design and implementation of the tug of war game. It explores the challenges of combining tangible user interfaces with distributed computing and distributed technologies that must be overcome in future educational applications for children. Copyright 2009 ACM.
ISBN: 9781424452910 (print)
The main contribution of this paper is to present an efficient parallel sorting routine, "psort", that is compatible with the standard "qsort". Our "psort" is implemented such that its interface is compatible with "qsort" in the C Standard Library. Therefore, any application program that uses the standard "qsort" can be accelerated by simply replacing the "qsort" call with our "psort". Also, "psort" uses the standard "qsort" as a subroutine for local sequential sorting. So, if the performance of "qsort" is improved by anyone in the community, then that of our "psort" is also automatically improved. To evaluate the performance of our "psort", we have implemented it on a Linux server with two Intel quad-core processors (i.e. eight processor cores). The experimental results show that our "psort" is approximately 6 times faster than the standard "qsort" using 8 processors. Since the speedup factor cannot be more than 8 if we use 8 cores, our algorithm is close to optimal. Also, as far as we know, no previously published parallel implementation achieves a speedup factor of more than 4 using 8 cores.
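The structure described above (split the input, sort each part with the standard sequential sort, then combine) can be sketched as follows. This is a Python analogue for illustration, not the paper's C implementation: the function name mirrors the abstract, but the chunking and the k-way merge via `heapq.merge` are assumptions, since the abstract does not say how the sorted parts are combined.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def psort(items, key=None, workers=8):
    """Hypothetical parallel sort returning the same result as sorted(items, key=key)."""
    data = list(items)
    if len(data) < 2 or workers < 2:
        return sorted(data, key=key)
    # Split into at most `workers` contiguous chunks of near-equal size.
    size = -(-len(data) // workers)  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # Local sequential sorting is delegated to the standard sort, so any
    # improvement to the standard sort carries over automatically.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        runs = list(pool.map(lambda chunk: sorted(chunk, key=key), chunks))
    # Combine the sorted runs with a k-way merge (an assumed merge strategy).
    return list(heapq.merge(*runs, key=key))
```

Because the call signature matches the sequential sort it wraps, a caller can switch from `sorted(xs)` to `psort(xs)` without other changes. The sketch shows the structure only; threads in CPython will not reproduce the speedup reported for the C implementation.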
ISBN: 9781424452910 (print)
Topology embedding enables us to execute a protocol designed for a specific (virtual) topology on another (real) topology by embedding the virtual topology on the real topology. In this paper, we propose a self-stabilizing emulation technique that provides reliable communication on a virtual topology in the presence of transient faults. The proposed protocol improves the execution slowdown of previous protocols [7], [8] and provides adaptive message delivery delay on the emulated channels, which is a new type of adaptability against transient faults.
ISBN: 9781424452910 (print)
Many data mining techniques have been proposed for performance analysis of parallel applications, the most interesting being clustering analysis. In most cases, clustering has been used to detect processors with similar behavior. In previous work, we presented a different approach: clustering was used to detect the computation structure of the applications and how the different computation phases behave. In this paper, we present a method to evaluate the accuracy of this structure detection. The new method is based on the Single Program Multiple Data (SPMD) paradigm exhibited by real parallel programs. Assuming an SPMD structure, we expect that all tasks of a parallel application execute the same operation sequence. Using a Multiple Sequence Alignment (MSA) algorithm, we check the sequence ordering of the detected clusters to evaluate the quality of the clustering results.
ISBN: 9789604741342 (print)
The proceedings contain 37 papers. The topics discussed include: stable protocols for the medium access control in wireless networks; new approaches of parallel calculus in groups of firms; mathematical theory of information technology; AI, granular computing, and automata with structured memory; using cloud computing for E-Learning systems; quality model for M-Learning applications; robustness of information systems and technologies; using some web content mining techniques for Arabic text classification; a modified C-Means clustering algorithm; new implementation of unsupervised ID3 algorithm (NIU-ID3) using Visual ***; distributed algorithms for power saving optimization in sensor network; secure automatic ticketing system; and secure distribution of confidential information via self-destructing data.
ISBN: 9781424452910 (print)
Nowadays, more and more supercomputers are built on multi-core processors with shared caches. However, conflicting accesses to the shared cache from different threads or processes become a performance bottleneck for parallel applications. Cache partitioning can be used to allocate cache resources to different processes exclusively, according to the demands of the processes. Conflicting accesses are avoided by restricting each process's cache accesses to a distinct private part of the shared cache. This paper studies the problem of shared cache partitioning for balanced MPI parallel applications on CMP architectures, presenting a performance-oriented cache partitioning framework that includes Spatial-Level Cache Partitioning (SLCP) and Time-Level Cache Partitioning (TLCP), together with their evaluation. We evaluate SLCP and TLCP on a quad-core simulator. Experiments show that SLCP and TLCP outperform the traditional LRU cache replacement policy in IPC throughput and miss rate. Specifically, for large workloads, TLCP outperforms LRU by up to 20% and by 8.7% on average.
ISBN: 9780769537665 (print)
Demand for huge computing capabilities and for technologies that manage complex processes is growing along with the development of large-scale parallel scientific computing applications. Taking Ensemble Prediction in the climate domain as an example, we discuss BPEL-based workflows for managing complex scientific computing. The contributions of this paper are twofold. On the framework side, we propose a job wrapping paradigm that wraps environment-dependent applications with JSDL, and we use a standard job submission system called GridSAM for job submission and monitoring, which enables scientists to compose, monitor and run the applications easily with a workflow system. On the algorithm side, we propose a job scheduling algorithm, based on a job pool, for large-scale parallel applications in a heterogeneous environment. As a result, the resources can be utilized according to their capabilities and load balance across resources can be achieved.
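The abstract gives no details of the scheduling algorithm, so the following is not the paper's method. It is a minimal simulation of one common job-pool policy, assumed here for illustration: each job in the pool goes to whichever resource becomes idle first, so faster resources naturally take more jobs. The function name, the `job_costs` and `worker_speeds` parameters, and the cost model (run time = cost / speed) are all hypothetical.

```python
import heapq


def schedule_from_pool(job_costs, worker_speeds):
    """Assign each pooled job to the worker that becomes idle first."""
    # Heap entries: (time at which the worker becomes idle, worker index).
    idle = [(0.0, w) for w in range(len(worker_speeds))]
    heapq.heapify(idle)
    assignment = {w: [] for w in range(len(worker_speeds))}
    for job, cost in enumerate(job_costs):
        free_at, w = heapq.heappop(idle)
        assignment[w].append(job)
        # Assumed cost model: a worker of speed s runs a job of cost c in c / s.
        heapq.heappush(idle, (free_at + cost / worker_speeds[w], w))
    # The makespan is the time at which the last worker finishes.
    makespan = max(free_at for free_at, _ in idle)
    return assignment, makespan
```

For six unit-cost jobs and two workers with speeds 2.0 and 1.0, the faster worker receives four jobs and the slower one two, and both finish at the same time, which is the load balance the abstract refers to.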