ISBN: 9781479959198 (print)
As distributed applications became more commonplace and more sophisticated, new programming languages and models for distributed programming were created. In the context of continuously increasing data flows in parallel applications, there is renewed interest in the dataflow paradigm. This paper shows why the AGAPIA language is suitable for dataflow programming. AGAPIA can express massive parallelism in a way that is manageable for programmers, and it allows nodes and links in the dataflow graph to be built dynamically at runtime. The nodes of the dataflow graph, also called programs, are modular and reusable. Communication is transparent to users, allowing them to concentrate on the high-level flow and algorithm. A complete application and an analysis in terms of productivity and performance are presented to demonstrate AGAPIA's capabilities.
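To make the dataflow idea concrete, here is a minimal Python sketch of the generic dataflow firing rule: a node runs as soon as all of its inputs are available. It is not AGAPIA syntax and does not reproduce the paper's application; the function `run_dataflow` and the example graph are hypothetical names chosen for illustration.

```python
from collections import deque


def run_dataflow(nodes, edges, inputs):
    """Fire each node once all of its input values are available.

    nodes:  {name: function taking the node's inputs in edge order}
    edges:  list of (source, target) pairs; a source is a node or an input name
    inputs: {input_name: value} for the external inputs
    (All names here are illustrative assumptions, not an AGAPIA API.)
    """
    preds = {name: [] for name in nodes}
    succs = {}
    for src, dst in edges:
        preds[dst].append(src)
        succs.setdefault(src, []).append(dst)

    values = dict(inputs)
    ready = deque(n for n in nodes if all(p in values for p in preds[n]))
    while ready:
        node = ready.popleft()
        if node in values:
            continue  # already fired
        values[node] = nodes[node](*(values[p] for p in preds[node]))
        # A successor becomes ready when its last missing input arrives.
        for nxt in succs.get(node, []):
            if nxt not in values and all(p in values for p in preds[nxt]):
                ready.append(nxt)
    return values


graph_nodes = {
    "sum": lambda a, b: a + b,
    "diff": lambda a, b: a - b,
    "prod": lambda s, d: s * d,
}
graph_edges = [
    ("x", "sum"), ("y", "sum"),
    ("x", "diff"), ("y", "diff"),
    ("sum", "prod"), ("diff", "prod"),
]
result = run_dataflow(graph_nodes, graph_edges, {"x": 5, "y": 3})
```

In this sketch `sum` and `diff` have no dependency on each other, so a parallel runtime could execute them at the same time; only `prod` has to wait for both.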
ISBN: 0780321754 (print)
Advances in communication for parallel programming have yielded one-sided messaging systems. The MPI bindings for Ruby have been augmented to include the remote memory access functions of MPI-2.
ISBN: 9780769538686 (print)
Interest management, which seeks to filter irrelevant messages on the network, is essential for real-time large-scale distributed virtual environments (DVEs). Many existing interest management schemes, such as HLA DDM, focus on providing precise message filtering mechanisms. However, this leads to a second problem: the computational overhead of the interest matching process. If the CPU cost of interest matching is too high, the scheme is unsuitable for real-time applications such as multiplayer online games, for which runtime performance is important. This paper evaluates the performance of existing interest matching algorithms and proposes a new algorithm based on parallel processing. The new algorithm is expected to have better computational efficiency than existing algorithms while maintaining the same message filtering accuracy. Experimental evidence shows that our approach works well in practice.
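The trade-off between matching precision and CPU cost can be illustrated with a small Python sketch. It assumes that regions are axis-aligned boxes given as one `(lo, hi)` pair per dimension, and it compares an exhaustive matcher with a sort-and-sweep matcher that returns the same pairs with fewer overlap tests. This is a sequential illustration only: it is neither the HLA DDM API nor the parallel algorithm the paper proposes, and all function names are hypothetical.

```python
def overlaps(a, b):
    """True if two boxes (one closed (lo, hi) interval per dimension) intersect."""
    return all(alo <= bhi and blo <= ahi for (alo, ahi), (blo, bhi) in zip(a, b))


def match_brute_force(updates, subscriptions):
    """Test every update region against every subscription region."""
    return {
        (i, j)
        for i, u in enumerate(updates)
        for j, s in enumerate(subscriptions)
        if overlaps(u, s)
    }


def match_sweep(updates, subscriptions):
    """Sort on the first dimension, then only test boxes whose x-extents overlap."""
    boxes = [("u", i, b) for i, b in enumerate(updates)]
    boxes += [("s", j, b) for j, b in enumerate(subscriptions)]
    boxes.sort(key=lambda item: item[2][0][0])

    active = {"u": [], "s": []}
    matches = set()
    for kind, idx, box in boxes:
        x_lo = box[0][0]
        other = "s" if kind == "u" else "u"
        # Drop boxes that ended before this one starts on the x axis.
        for k in active:
            active[k] = [(i, b) for i, b in active[k] if b[0][1] >= x_lo]
        # Remaining boxes overlap on x; check the other dimensions.
        for other_idx, other_box in active[other]:
            if overlaps(box[1:], other_box[1:]):
                matches.add((idx, other_idx) if kind == "u" else (other_idx, idx))
        active[kind].append((idx, box))
    return matches


updates = [((0, 2), (0, 2)), ((5, 6), (5, 6))]
subscriptions = [((1, 3), (1, 3)), ((2, 5), (4, 5)), ((10, 11), (0, 1))]
pairs = match_sweep(updates, subscriptions)
```

Both matchers are exact, so filtering accuracy is unchanged; only the number of overlap tests differs.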
ISBN: 9780769538686 (print)
We improve the efficiency of time parallel simulation using some concepts of monotonicity of simulation models. The time parallel simulation technique partitions the simulation timespan into simulation periods which are executed independently. Such a technique relies on strong stochastic assumptions: regeneration, or short influence of the initial point on a sample path. If these assumptions are not satisfied, we obtain only an approximation. We prove that if the model is monotone, we can increase the parallelization of the simulations and prove some bounds on the result.
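The following Python sketch shows how monotonicity can turn an unknown starting state into guaranteed bounds. It assumes a toy model that is not taken from the paper: a finite-capacity queue with update `w' = min(cap, max(0, w + d))`, which is monotone in `w`. Each period after the first is simulated from both the smallest and the largest state, and the exact path is guaranteed to lie between the two. The function names and the model are illustrative assumptions, and the periods are run in a plain loop although they are independent.

```python
import random


def step(w, d, cap):
    """Monotone update: a larger w never yields a smaller next state."""
    return min(cap, max(0.0, w + d))


def simulate(w0, increments, cap):
    """Sequential simulation of the toy queue from initial state w0."""
    path, w = [], w0
    for d in increments:
        w = step(w, d, cap)
        path.append(w)
    return path


def time_parallel_bounds(increments, periods, cap):
    """Simulate each period independently from the smallest and largest state."""
    size = len(increments) // periods
    lower, upper = [], []
    for p in range(periods):  # iterations are independent of each other
        end = len(increments) if p == periods - 1 else (p + 1) * size
        chunk = increments[p * size:end]
        # Period 0 starts from the known initial state 0; later periods bracket it.
        lower += simulate(0.0, chunk, cap)
        upper += simulate(0.0 if p == 0 else cap, chunk, cap)
    return lower, upper


rng = random.Random(0)
increments = [rng.uniform(-1.0, 0.8) for _ in range(1000)]
exact = simulate(0.0, increments, cap=10.0)
lower, upper = time_parallel_bounds(increments, periods=4, cap=10.0)
```

Wherever the lower and upper paths meet inside a period, the result from that point to the end of the period is exact, not merely bounded.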
ISBN: 9780769538686 (print)
This article describes the problems of distributing a real-time, human-in-the-loop simulator. The simulator itself is a dynamics simulator that solves the dynamics of multi-body systems in real time. Techniques used to distribute the computation are presented, and their suitability for general-purpose interactive simulation is discussed.
ISBN: 9781479959198 (print)
We consider energy minimization by speed-scaling of an open shop multiprocessor with n jobs and m machines. The paper studies the complexity of a primal-dual solution algorithm of [4], which was left as an open question in that paper. We prove that, in a neighbourhood of the solution, the complexity of the algorithm is O(mn log(1/epsilon)) if n is not equal to m, where epsilon is the roundoff error of the computer. The paper demonstrates how linearization can be used to investigate the complexity of an algorithm close to the optimum. An estimate of the size of the neighbourhood in which the linearization error is smaller than the computer's roundoff error is also given.
ISBN: 9781479959198 (print)
Specialized encryption processors offer both low latency and high throughput at the expense of higher cost. A modern x86 system that encompasses several compute architectures (SISD/SIMD) might be able to perform well compared to a dedicated encryption unit at a fraction of the cost. This paper presents how one might accelerate 128-bit AES in ECB mode using modern commodity hardware found in today's x86 computers. The target architecture is an AMD A6 5400K coupled with a discrete AMD R7 250 GPU. Benchmark results compare CPU OpenSSL execution, CPU AES-NI acceleration, the integrated GPU, the discrete GPU, and heterogeneous combinations of these processing units. We present multiple test results and attempt to explain where they deviate from what would be expected.
ISBN: 9780769538686 (print)
Scalability is very important for parallel and distributed simulations. Several techniques have been proposed to develop scalable synchronization strategies, communication services, or fundamental algorithms, but little work has addressed the modeling stage of the application. As the experience of HPC (High Performance Computing) shows, the time spent developing a simulation application must be considered when evaluating the scalability of the application. There are many discrete event simulation platforms built for large parallel and distributed simulations, such as SPEEDES (Synchronous Parallel Environment for Emulation and Discrete Event Simulation), GTW (Georgia Tech Time Warp), and YHSUPE. They take Event-Scheduling as their modeling paradigm and have achieved great runtime performance, but they lack efficient modeling methods. To deal with this issue, a component-based specification is presented that supports hierarchical decomposition of large models and facilitates model reuse. This paper extends DEVS (Discrete Event Simulation Specification) and proposes a component-based formalism, called EDEVS (Event-Scheduling Discrete Event Simulation Specification), for the existing Event-Scheduling parallel and distributed simulation platforms.
ISBN: 0780321754 (print)
Increasingly, a number of applications rely on, or can potentially benefit from, analysis and monitoring of data streams. Moreover, many of these applications involve high-volume data streams and require distributed processing of data arising from a distributed set of sources. Thus, we believe that a grid environment is well suited for flexible and adaptive analysis of these streams. This paper reports the design and initial evaluation of a middleware for processing distributed data streams. Our system is referred to as GATES (Grid-based Adaptive Execution on Streams). The system is designed to use existing grid standards and tools to the extent possible. It flexibly achieves the best accuracy that is possible while maintaining the real-time constraint on the analysis. We have developed a self-adaptation algorithm for this purpose. Results from a detailed evaluation of this system demonstrate the benefits of distributed processing and the effectiveness of our self-adaptation algorithm.
ISBN: 9781479959198 (print)
The evolution of hardware and software for general-purpose computation on Graphics Processing Units (GPUs) has changed the way parallel programming issues are addressed. Many applications are being ported to GPUs to achieve performance gains. GPU programmers continuously optimize GPU execution time, while the optimization of pre-GPU computation overheads has only recently attracted the attention of the research community. Because a GPU executes programs handed to it by a CPU, pre-GPU computation overheads do exist and should be optimized for better use of GPUs. The GPU framework proposed in this paper improves the overall performance of the application by optimizing pre-GPU computation overheads along with GPU execution time. This paper proposes a sparse matrix format prediction tool that predicts an optimal sparse matrix format for a given input matrix by analyzing the input sparse matrix and considering pre-GPU computation overheads. The sparse matrix format predicted by the proposed method is compared against the best performing sparse matrix formats reported in the literature. The proposed model is based on static data that is available directly from the input, and hence the prediction overhead is very small. Compared to GPU-specific sparse format prediction, the proposed model is more inclusive and more precise in improving overall application performance.
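As an illustration of predicting a format from static input data only, the Python sketch below picks among the standard COO, CSR, and ELL formats using row-length statistics that can be read directly from the nonzero coordinates. The decision rule and both thresholds are assumptions invented for this example; they are not the paper's model, and they do not account for pre-GPU conversion or transfer overheads.

```python
def predict_format(n_rows, coords, ell_padding_limit=1.5, empty_row_limit=0.5):
    """Pick a sparse format from static row statistics.

    n_rows: number of matrix rows
    coords: list of (row, col) positions of the nonzero entries
    The two thresholds are hypothetical values chosen for illustration.
    """
    counts = [0] * n_rows
    for row, _ in coords:
        counts[row] += 1

    nnz = len(coords)
    if nnz == 0:
        return "COO"

    mean_len = nnz / n_rows
    max_len = max(counts)
    empty_fraction = counts.count(0) / n_rows

    # ELL pads every row to max_len, so it only pays off for uniform rows.
    if max_len <= ell_padding_limit * mean_len:
        return "ELL"
    # With many empty rows, per-row pointers are mostly wasted work.
    if empty_fraction > empty_row_limit:
        return "COO"
    return "CSR"


diagonal = [(i, i) for i in range(4)]
skewed = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 1), (2, 2), (3, 3)]
mostly_empty = [(0, 0), (0, 1), (0, 2), (5, 5)]
choices = (
    predict_format(4, diagonal),
    predict_format(4, skewed),
    predict_format(10, mostly_empty),
)
```

The features need a single pass over the nonzero coordinates, which is why a predictor of this kind adds little overhead before the data reaches the GPU.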