ISBN: 9780889866386 (print)
This paper studies Strassen's matrix multiplication algorithm by implementing it in several ways: sequentially, as a workflow, and in parallel. All implementations outperform well-known scientific libraries for medium to large matrices. The sequential recursive program is compared with ATLAS's DGEMM subroutine. A workflow program in the NetSolve system and two parallel programs based on MPI and ScaLAPACK are also implemented. By analyzing the time complexity and memory requirements of each method, we provide insight into how to use Strassen's algorithm to speed up matrix multiplication on top of existing high-performance tools and libraries.
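For reference, the sequential recursive scheme can be sketched as below. This is a textbook formulation of Strassen's seven-product recursion in pure Python, not the paper's implementation; the helper names and the `cutoff` parameter (the size at or below which the classical product is used) are assumptions, and odd-sized blocks simply fall back to the classical product.

```python
def matmul_naive(A, B):
    """Classical O(n^3) product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def _add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def _sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def _split(M):
    """Split an even-sized square matrix into its four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B, cutoff=2):
    """Multiply square matrices with Strassen's recursion (7 products per level)."""
    n = len(A)
    # Small or odd-sized blocks: use the classical product (illustrative choice).
    if n <= cutoff or n % 2:
        return matmul_naive(A, B)
    A11, A12, A21, A22 = _split(A)
    B11, B12, B21, B22 = _split(B)
    M1 = strassen(_add(A11, A22), _add(B11, B22), cutoff)
    M2 = strassen(_add(A21, A22), B11, cutoff)
    M3 = strassen(A11, _sub(B12, B22), cutoff)
    M4 = strassen(A22, _sub(B21, B11), cutoff)
    M5 = strassen(_add(A11, A12), B22, cutoff)
    M6 = strassen(_sub(A21, A11), _add(B11, B12), cutoff)
    M7 = strassen(_sub(A12, A22), _add(B21, B22), cutoff)
    C11 = _add(_sub(_add(M1, M4), M5), M7)
    C12 = _add(M3, M5)
    C21 = _add(M2, M4)
    C22 = _add(_add(_sub(M1, M2), M3), M6)
    # Stitch the four quadrants back together row by row.
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

In practice the cutoff would be tuned so that the recursion hands off to an optimized kernel (such as a DGEMM routine) once the blocks are small enough that the extra additions outweigh the saved multiplication.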
ISBN: 9780889866386 (print)
This paper describes a novel approach to deterministic multithreading for active replication of Java objects. Unlike existing approaches, the presented deterministic thread scheduler fully supports the native Java synchronisation mechanisms, including reentrant locks, condition variables, and time bounds on wait operations. Furthermore, this paper proposes source-code transformation as a novel approach for intercepting Java synchronisation statements. This allows the reuse of existing object implementations and simplifies application development.
ISBN: 9780889866386 (print)
This paper deals with a novel communication timing control for wireless networks and with the radio interference problem. The timing control is based on the mutual synchronization of coupled phase oscillator dynamics with stochastic adaptation. Through local, fully distributed interactions, the coupled phase dynamics self-organize collision-free communication timing. In wireless communication, interfering signals cause unexpected collisions. Therefore, we propose a more effective timing control that selects the interaction nodes according to received signal strength.
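The abstract does not give the oscillator equations, so the following is only a minimal sketch of the general idea, not the paper's model: each node repeatedly nudges its phase toward the midpoint of its two neighbours on the phase circle, which diffuses the gaps until the transmission slots are evenly spaced. The function names, the gain `alpha`, and the synchronous update are assumptions; the stochastic adaptation and the selection of interaction nodes by received signal strength are not modelled.

```python
import math

TWO_PI = 2 * math.pi


def gaps(phases):
    """Cyclic gaps between consecutive phases (phases must be in circular order)."""
    n = len(phases)
    return [(phases[(i + 1) % n] - phases[i]) % TWO_PI for i in range(n)]


def step(phases, alpha=0.5):
    """Move every node toward the midpoint of its two phase neighbours.

    A node with a larger gap ahead than behind advances, and vice versa,
    so unequal gaps are smoothed out (a diffusion on the gap sequence).
    """
    g = gaps(phases)
    return [phases[i] + alpha * (g[i] - g[i - 1]) / 2 for i in range(len(phases))]


def desynchronise(phases, steps=500, alpha=0.5):
    """Iterate the update from arbitrary starting phases (hypothetical driver)."""
    phases = sorted(p % TWO_PI for p in phases)
    for _ in range(steps):
        phases = step(phases, alpha)
    return phases


if __name__ == "__main__":
    # Six nodes that initially transmit almost at the same time.
    final = desynchronise([0.0, 0.05, 0.1, 0.2, 0.3, 0.45])
    print([round(g, 4) for g in gaps(final)])
```

With `alpha` at most 1, each new gap is a convex combination of neighbouring gaps, so the circular order of the nodes is preserved and the gaps approach the common value 2π/n.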
ISBN: 0780350049 (print)
Distributed execution of logic programs requires a match of granularity between a program and the distributed multiprocessor it runs on in order to fully exploit its performance potential. This paper presents methods to control the granularity of tasks on distributed heterogeneous processors effectively. It considers the characteristics of such platforms and relates the amount of local computation to the significant communication overheads by introducing the notion of a collection of parallel tasks. The experimental results indicate that the proposed controls can model all kinds of predicates (recursive, mutually recursive, etc.) satisfactorily and improve the performance of various forms of parallelism (AND, OR, and combinations).
ISBN: 9780889866386 (print)
We have developed a distributed asynchronous Web-based training system. To improve the scalability and robustness of this system, all contents and the function that scores users' answers are realized as mobile agents. These agents are distributed across computers and can be obtained through a P2P network based on a modified Content-Addressable Network. In this system, the service as a whole does not become unavailable even if some computers break down, but contents are lost when an agent disappears. As a solution to this problem, this study distributes backups of agents across computers. If a computer failure is detected, another computer continues the service using the backups of the agents that belonged to the failed computer. The developed algorithm is examined experimentally.
ISBN: 0780350049 (print)
The Nile system is a distributed environment for running very large, data-intensive applications across a network of commodity workstations. These applications process data from elementary particle collisions, generated by the Cornell Electron Storage Ring, and are used by physicists of the CLEO experiment. The applications have a simple data-parallel structure, so Nile executes them using as much parallelism as is available. Nile currently runs at any single site. It is being used by alpha testers and is scheduled for beta release in March 1998. This paper describes how we are adapting this local-area Nile system to allow for wide-area, multiple-site interactions. In particular, we consider the two problems of scaling and fault tolerance.
ISBN: 9780889866386 (print)
Emerging multiple-display infrastructures provide users with a large number of semi-public and private displays. Selecting what information to present on which display becomes a real issue here, especially when multiple users with diverging interests have to be considered. This holds especially for dynamic ensembles of displays. We propose to cast the Display Mapping problem as an optimization task. We develop an explicit criterion for the global quality of a display mapping and then describe a distributed algorithm based on the GRASP framework that is able to approximate the global optimum through local interaction between display devices. We claim that such a distributed optimization approach, based on the definition of an explicit global quality measure, is a general concept for achieving coherent ensemble behavior.
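To make the GRASP idea concrete, here is a minimal, centralised sketch of a greedy randomized adaptive search procedure for assigning items to displays. It is not the paper's distributed algorithm: the function `grasp`, its parameters, and the caller-supplied `quality` function are all hypothetical, and the paper's actual quality criterion is not given in the abstract.

```python
import random


def grasp(items, displays, quality, iterations=50, rcl_size=2, seed=0):
    """Search for a mapping item -> display that maximises quality(mapping).

    quality must accept partial mappings (dicts), because it is also used
    to rank candidate placements while a solution is being constructed.
    """
    rng = random.Random(seed)
    best, best_q = None, float("-inf")
    for _ in range(iterations):
        # Construction phase: place items one at a time, picking at random
        # from the rcl_size best displays (the restricted candidate list).
        mapping = {}
        for item in items:
            ranked = sorted(displays,
                            key=lambda d: quality({**mapping, item: d}),
                            reverse=True)
            mapping[item] = rng.choice(ranked[:rcl_size])
        # Local search phase: move single items while quality improves.
        q = quality(mapping)
        improved = True
        while improved:
            improved = False
            for item in items:
                for d in displays:
                    if d == mapping[item]:
                        continue
                    candidate = {**mapping, item: d}
                    cq = quality(candidate)
                    if cq > q:
                        mapping, q, improved = candidate, cq, True
        # Keep the best local optimum seen over all restarts.
        if q > best_q:
            best, best_q = mapping, q
    return best, best_q
```

A toy quality measure could simply sum a per-user interest score for each item on its assigned display; a realistic one would also account for display capacity and for how visible each display is to each user.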
ISBN: 9780889866386 (print)
For search-intensive applications such as data mining and bioinformatics, a SIMD Processor Array on a Chip may be an effective architecture, and if the application is control-intensive, a Multiple SIMD (MSIMD) architecture may further increase processor utilization. In this paper, we describe the implementation of an associative MSIMD architecture on the MASC Processor. The MASC Processor, implemented using FPGAs, is easily scalable and dynamically assigns tasks to Processing Elements as the program executes.
ISBN: 9780889866386 (print)
Clusters of nondedicated heterogeneous nodes promise high utilization and performance. Market-based resource allocation allows effective, decentralized management and motivates participants to contribute to the functionality of a cluster. The major issues for a user who wants to execute a task on a nondedicated cluster are ensuring data integrity and reasonably evaluating a target processor. We have proposed an auditing mechanism that allows a user to establish suitable prices for cluster computational resources in an untrusted environment of heterogeneous nondedicated clusters. During price evaluation, a user can detect resources that behave incorrectly.
ISBN: 9781467345651; 9780769549033 (print)
With on-chip resources greatly increased, modern architectures differ considerably from traditional ones. The relation between temporal computing and spatial computing is becoming increasingly intricate. In this paper, we first analyze and compare the abstract computing models of modern architectures. Then, a runtime-reconfigurable architecture called the programmable dataflow computing architecture (ProDFA) is proposed. The architecture of ProDFA is summarized, and the process by which applications are mapped and executed is briefly introduced. As a case study, a specific reconfigurable structure for symmetric ciphers is implemented, and the performance of several typical symmetric ciphers is evaluated. The experimental results show the high performance and efficiency of ProDFA.