Monte Carlo (MC) techniques are often used to price complex financial derivatives. The computational effort can be substantial when high accuracy is required. However, MC computations are latency tolerant, and are thus easily parallelized even with high communication overheads, such as in a distributed computing environment. A drawback of MC is its relatively slow convergence rate, which can be overcome through the use of quasi-Monte Carlo (QMC) techniques, which use low-discrepancy sequences. We discuss the issues that arise in parallelizing QMC, especially in a heterogeneous computing environment, and present results of empirical studies on arithmetic Asian options, using three parallel QMC techniques that have recently been proposed. We expect the conclusions to be valid for other applications too.
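As a rough illustration of the idea in this abstract, the sketch below prices an arithmetic Asian call with a Halton low-discrepancy sequence and splits the sequence among workers by blocking, where each worker takes a contiguous range of indices. Blocking is used here only as a commonly described partitioning scheme; the abstract does not name its three techniques, so this should not be read as one of them. The function names, the geometric Brownian motion model, and all parameter values are assumptions made for the example, and the "workers" are evaluated sequentially.

```python
from math import exp, sqrt
from statistics import NormalDist

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19]  # one Halton base per monitoring date
NORMAL = NormalDist()                  # standard normal, used for inverse-CDF sampling

def radical_inverse(n, base):
    """Van der Corput radical inverse of the integer n in the given base."""
    inv, f = 0.0, 1.0 / base
    while n > 0:
        inv += (n % base) * f
        n //= base
        f /= base
    return inv

def halton_point(index, dim):
    """The index-th point of the dim-dimensional Halton sequence."""
    return [radical_inverse(index, PRIMES[j]) for j in range(dim)]

def asian_payoff(u, s0, k, r, sigma, t):
    """Discounted arithmetic Asian call payoff for one point u in (0,1)^d."""
    d = len(u)
    dt = t / d
    s, total = s0, 0.0
    for uj in u:
        z = NORMAL.inv_cdf(uj)  # map a uniform coordinate to a normal increment
        s *= exp((r - 0.5 * sigma ** 2) * dt + sigma * sqrt(dt) * z)
        total += s
    return exp(-r * t) * max(total / d - k, 0.0)

def block_sum(start, count, dim, **params):
    """Work of one hypothetical worker: payoffs summed over a contiguous block."""
    return sum(asian_payoff(halton_point(i, dim), **params)
               for i in range(start, start + count))

def qmc_price(n_points, n_workers, dim, **params):
    """Blocking: worker w handles indices 1 + w*block .. (w+1)*block."""
    # Index 0 is skipped: that Halton point is the origin, where inv_cdf is undefined.
    block = n_points // n_workers
    partial = [block_sum(1 + w * block, block, dim, **params)
               for w in range(n_workers)]
    return sum(partial) / (block * n_workers)

if __name__ == "__main__":
    market = dict(s0=100.0, k=100.0, r=0.05, sigma=0.2, t=1.0)
    print(qmc_price(1024, 4, dim=4, **market))
```

Because every worker only needs its starting index and block length, the partial sums could be computed on separate machines and combined with a single reduction, which is the latency-tolerant pattern the abstract describes. In a heterogeneous environment the block lengths would presumably be chosen in proportion to worker speed rather than equal, as they are here.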
Computer simulation is nowadays one of the most important tools for the correct understanding of physical phenomena. We analyse the performance improvement obtained by parallelizing an algorithm used to simulate electronic properties of semiconductor systems.
ISBN: (print) 0769515738
We investigate the average-case scalability of parallel algorithms executing on multicomputer systems whose static networks are k-ary d-cubes. Our performance metrics are the isoefficiency function and isospeed scalability. For the purpose of average-case performance analysis, we formally define the concepts of average-case isoefficiency function and average-case isospeed scalability. By modeling parallel algorithms on multicomputers using task interaction graphs, we are mainly interested in the effects of communication overhead and load imbalance on the performance of parallel computations. We focus on the topology of static networks whose limited connectivities are constraints on high performance. In our probabilistic model, task computation and communication times are treated as random variables, so that we can analyze the average-case performance of parallel computations. We derive the expected parallel execution time on symmetric static networks and apply the result to k-ary d-cubes. We characterize the maximum tolerable communication overhead such that constant average-case efficiency and average-case average-speed can be maintained and the number of tasks has a growth rate of Θ(P log P), where P is the number of processors. It is found that the scalability of a parallel computation is essentially determined by the topology of a static network, i.e., the architecture of a parallel computer system. We also argue that under our probabilistic model, the number of tasks should grow at least at the rate of Θ(P log P), so that constant average-case efficiency and average-speed can be maintained.
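For readers unfamiliar with the metrics named in this abstract, the block below recalls the standard deterministic definitions of speedup, efficiency, and the isoefficiency relation. The paper's own average-case definitions are not reproduced in the abstract; presumably they replace the execution times with their expectations, but that is an assumption. The symbols W, T_1, T_P, T_o, and c are introduced here for illustration only.

```latex
% Standard (deterministic) definitions. W is the problem size (here, the number
% of tasks), measured so that the sequential time is T_1 = W; T_P is the
% parallel time on P processors; T_o is the total overhead.
\begin{align*}
S &= \frac{T_1}{T_P}, \qquad E = \frac{S}{P} = \frac{T_1}{P\,T_P},\\
T_o(W,P) &= P\,T_P - T_1 \;\Longrightarrow\; E = \frac{1}{1 + T_o(W,P)/W},\\
W &= \frac{E}{1-E}\,T_o(W,P) \qquad \text{(isoefficiency relation)}.
\end{align*}
% Illustrative assumption: if T_o(W,P) = c\,P\log P for a constant c, then
% holding E fixed requires W = \frac{E}{1-E}\,c\,P\log P = \Theta(P\log P).
```

Under that illustrative overhead, the problem size must grow as Θ(P log P) to hold efficiency constant, which matches the form of the growth rate quoted in the abstract.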
The work presented in this paper consists of a tool developed to support the prototyping of a TINA system. This tool automatically generates Java code for a general TINA system whose objects were previously described in the SDL language. The generated code is a fully functional distributed system that uses CORBA as the distributed environment.
We consider two models for the structure of the algorithm used for concurrent interpretation of MIMD code sequences on SIMD machines. The single-fetch model shares portions of the instruction execution among all the i...
In this paper we formulate the static load balancing problem in single class job distributed systems as a cooperative game among computers. It is shown that the Nash Bargaining Solution (NBS) provides a Pareto optimal...
Future scalable, high-throughput, and high-performance applications are likely to execute on platforms constructed by clustering multiple autonomous distributed servers, with resource access governed by agreements be...
This paper addresses the issue of fault-tolerance in applications that make use of network storage. A network storage abstraction called the Network Storage Stack is presented, along with its constituent parts. In par...
This research designs a new two-level TLB (translation look-aside buffer) architecture that integrates a 2-way banked filter TLB with a 2-way banked main TLB. One of the main objectives is to reduce power consumption in embedded processors by distributing the accesses to the TLB entries across several banks in a balanced manner. Thus, an advanced filtering technique is devised to reduce power dissipation by adopting a sub-bank structure at the filter TLB. A bank-associative structure is also applied to each level of the TLB hierarchy. Simulation results show that the miss ratio and Energy*Delay product can be improved by 59.26% and 24.9%, respectively, compared with a micro TLB with 4-32 entries, and by 40.81% and 12.18%, compared with a micro TLB with 16-32 entries.
This paper describes the architecture of a high-performance particle simulation machine, DEM-1, for short-range particle interaction computations. All existing particle simulation machines have specialized pipelines to calculate long-range particle interactions effectively. However, their ability to perform particle simulations efficiently diminishes with short-range interactions. The communication cost component of particle simulations will play a significant role in performance when the computation cost becomes O(N). DEM-1's three-dimensional torus high-speed network reduces the communication cost while 2048 local processors perform the time integration. The elimination of pipeline bubbles in DEM-1 is achieved by specially designed cutoff judgment units. Each specialized pipeline consists of a dedicated data path supported by dual-ported position-vector prefetch memory. The performance of DEM-1 is presented with very large-scale Embedded Atom Method (EAM) molecular dynamics simulations.
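To make the notion of a cutoff judgment concrete, here is a minimal software sketch of how short-range pairs are commonly selected: particles are binned into cells whose edge equals the cutoff radius, and only pairs in the same or adjacent cells are tested against the cutoff. For bounded particle density this keeps the work roughly proportional to N. This is a generic cell-list technique shown for illustration; it is not a description of the DEM-1 hardware, and the function name and parameters are hypothetical.

```python
from collections import defaultdict
from itertools import product

def pairs_within_cutoff(positions, r_cut):
    """Return sorted index pairs (i, j), i < j, closer than r_cut, via a cell list."""
    r2 = r_cut * r_cut
    cells = defaultdict(list)
    # Bin every particle into a cubic cell of edge r_cut.
    for idx, (x, y, z) in enumerate(positions):
        cells[(int(x // r_cut), int(y // r_cut), int(z // r_cut))].append(idx)
    found = []
    for (cx, cy, cz), members in cells.items():
        # Only the 27 surrounding cells can hold particles within the cutoff.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            neighbours = cells.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                xi, yi, zi = positions[i]
                for j in neighbours:
                    if j <= i:
                        continue  # count each pair once
                    xj, yj, zj = positions[j]
                    # Cutoff judgment: compare squared distance with squared cutoff.
                    if (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2 < r2:
                        found.append((i, j))
    return sorted(found)
```

Comparing squared distances avoids a square root per candidate pair. Pairs that fail the test contribute no force, which is the kind of rejected work that would leave idle slots in a force pipeline if it were not filtered out beforehand.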