This paper investigates the problem of matching and scheduling an application composed of tasks with precedence constraints, to minimize both the execution time and the probability of failure of the application i...
A novel run-time reconfigurable array-of-multipliers architecture is presented. The processor can be easily reconfigured to trade bitwidth for array size, thus maximizing the utilization of available hardware. Typicall...
It is extremely important to minimize network access time in constructing a high-performance PC cluster system. For an SCI-based PC cluster, it is possible to reduce the network access time by maintaining network cach...
This paper presents the MPC parallel computer and its MPI implementation, carried out at the Laboratoire LIP6 of Univ. Pierre and Marie Curie, Paris. MPC is a low-cost, high-performance parallel computer using standar...
All-to-all communication is one of the densest communication patterns and occurs in many important applications in parallel computing. In this paper, we present a new all-to-all broadcast algorithm for all-port meshes and tori. The algorithm uses controlled message flooding based on a novel broadcast pattern, which balances the traffic load across all dimensions of the network so that the optimal transmission time for all-to-all broadcast can be achieved. The broadcast pattern is described in a formal, generic way for each node in terms of a few simple operations and can be easily built into router hardware. Unlike existing all-to-all broadcast algorithms, the new algorithm overlaps message switching time with transmission time in a pipelined fashion to reduce the total communication delay of all-to-all broadcast. In most cases, the total communication delay is within a small constant of the lower bound for all-to-all broadcast. Finally, the algorithm is conceptually simple and symmetrical for every message and every node, so it can be easily implemented in hardware and achieves the optimum in practice.
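The lower bound mentioned in the abstract above can be illustrated with a simple counting argument: in an all-port torus, every node must receive one message from each of the other nodes, and it can accept at most one message per incoming link per step. The sketch below computes that bound only. It is not the paper's algorithm or its exact bound, and the function names and the unit-time, one-message-per-link model are assumptions made for illustration.

```python
from math import ceil, prod


def all_port_degree(dims):
    """Links per node in a torus with the given side lengths.

    Assumption: every side length is at least 3, so each dimension
    contributes two distinct neighbours.
    """
    if any(d < 3 for d in dims):
        raise ValueError("each torus dimension must have at least 3 nodes")
    return 2 * len(dims)


def broadcast_step_lower_bound(dims):
    """Counting lower bound on all-to-all broadcast steps (hypothetical helper).

    Each node has to receive N - 1 messages and can take in at most
    one message per incoming link per step.
    """
    n_nodes = prod(dims)
    return ceil((n_nodes - 1) / all_port_degree(dims))


if __name__ == "__main__":
    for dims in [(4, 4), (8, 8), (4, 4, 4)]:
        print(dims, broadcast_step_lower_bound(dims))
```

Under these assumptions, a balanced traffic load in every dimension is what lets an algorithm approach this bound, since any idle link in a step wastes part of the per-node receive capacity.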
A major overhead of software DSM is the long remote access latency when the accessed page is not in the local cache. One method for tolerating the remote access latency is to prefetch the pages before they are accesse...
The proposed convergence algorithm quickly and accurately predicts the mean response times of Internet channels operating with request-response communication protocols. At first, how the convergence algorithm ...
We compare three remote visualization strategies used for interactive exploration of large data sets in distributed environments: image-based rendering, parallel visualization servers, and subsampling. We review each ...
This paper proposes a parallel data-mining algorithm and its implementation on a PC cluster. The decision tree is a widely used data-mining algorithm for classifying records in a database. Simple parallelization of de...