ISBN: 3540664432 (print)
The code GAM numerically solves initial value problems for ordinary differential equations by means of a family of variable-step, variable-order block Boundary Value Methods. Here we consider the possibility of running the code on parallel machines. Some numerical tests and comparisons are presented.
In this paper, we design an efficient parallel algorithm for 3D image reconstruction. In our framework, the 3D image is reconstructed from a series of 2D images produced using ultrasonography, computed tomography, etc. The paper discusses a general parallel algorithm for 3D image reconstruction on the CRCW, CREW and EREW PRAM models. We have developed efficient implementations of this algorithm on a vector machine, on a distributed system comprising a cluster of workstations, and on various interconnection networks such as a mesh network and a reconfigurable bus network. The performance of these algorithms is tested using simulation experiments on 3D image reconstruction of the vitreous region of the eye from ophthalmic ultrasonograms. A novel approximation scheme is also proposed that drastically improves performance for specific kinds of images. Results indicate that the time complexities of the algorithms agree with the expected theoretical values and that the reconstructed image retains an uncompromised level of accuracy.
ISBN: 3540663630 (print)
In this paper, various methods of CNAM learning (synthesis) are compared in order to find their common features. This makes it possible to transfer important characteristics among the methods and to make some assumptions about their capabilities. The influence of the learning parameters of some methods on CNAM stability is also investigated, and recommendations on their choice are given.
ISBN: 3540664432 (print)
This paper describes a library of platform-independent functions for performing modular arithmetic on a range of parallel hardware. It is based around an approximate Chinese remainder reconstruction which allows the most significant bits of the stored number to be calculated without the cost of a full reconstruction. We describe how this can be used to calculate the length of a modular number, and also its applications to comparison and division.
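The idea of recovering only the leading bits of a number stored as residues can be illustrated with a minimal Python sketch. It is not the library described in the abstract: the moduli, the function names, and the fixed-point width `bits` are all hypothetical choices. The sketch relies on the standard identity x/M = (sum of a_i/m_i) mod 1, where a_i = x_i * (M/m_i)^(-1) mod m_i, and truncates each fraction to `bits` bits.

```python
from math import prod

# Hypothetical pairwise-coprime moduli chosen only for illustration.
MODULI = (13, 17, 19, 23)
M = prod(MODULI)  # dynamic range of the residue representation


def to_residues(x):
    """Represent 0 <= x < M by its residues modulo each m_i."""
    return tuple(x % m for m in MODULI)


def crt_coefficients(residues):
    """a_i = x_i * (M/m_i)^(-1) mod m_i, the usual CRT coefficients."""
    return tuple((r * pow(M // m, -1, m)) % m for r, m in zip(residues, MODULI))


def full_reconstruct(residues):
    """Exact Chinese remainder reconstruction (needs multi-word arithmetic)."""
    a = crt_coefficients(residues)
    return sum(ai * (M // m) for ai, m in zip(a, MODULI)) % M


def approx_fraction(residues, bits=16):
    """Return s such that x/M is approximately s / 2**bits.

    Each term is truncated, so s underestimates the true value by fewer
    than len(MODULI) units in the last place (modulo wrap-around near 0).
    """
    a = crt_coefficients(residues)
    return sum((ai << bits) // m for ai, m in zip(a, MODULI)) % (1 << bits)
```

Because the estimate carries a bounded error, two numbers whose estimates differ by more than that bound can be compared without a full reconstruction; values that are too close still need an exact fallback.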
ISBN: 3540663630 (print)
The semidiscretization of a time-dependent nonlinear partial differential equation leads to a large-scale initial value problem for ordinary differential equations which often cannot be solved in a reasonable time on a sequential computer. We investigate to what extent the idea of parallelism across the method can be practically exploited for such large problems using a distributed computational system.
ISBN: 3540664432 (print)
Maintaining coherence is becoming one of the most serious problems faced when designing today's machines. Initially, this problem was relatively simple because the interconnection network of Symmetric MultiProcessors (SMP) was an atomic bus, which simplified the implementation of invalidation coherence protocols. However, due to increasing bandwidth demand, atomic busses have been progressively replaced by split busses that decouple the request and response phases of a transaction. Split busses allow new requests to be initiated before the responses to those already in progress have been received, but they complicate the preservation of coherence. Indeed, a new request induces a conflict when it concerns a block address involved in another outstanding request and one of the two requests is a WRITE miss. Several solutions to this problem exist. The one used in the SGI machine is based on a shared data bus which traces the completion of transactions. Unfortunately, it becomes impracticable in recent machines, which replace data busses with more efficient networks (again because of bandwidth constraints), ultimately a crossbar. This work describes and quantitatively evaluates two possible solutions to the coherence problem for the new architectures, where the data responses cannot all be traced by each processor.
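The conflict condition stated in the abstract (same block address, and at least one of the two requests is a WRITE miss) can be written down as a small Python sketch. The `Request` record and the `conflicts` helper are hypothetical names introduced here for illustration; they do not represent either of the solutions the paper evaluates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    processor: int   # id of the requesting processor
    block: int       # block address targeted by the miss
    is_write: bool   # True for a WRITE miss, False for a READ miss


def conflicts(new, outstanding):
    """Return the outstanding requests that conflict with the new request.

    Two requests conflict when they target the same block and at least
    one of them is a WRITE miss; two READ misses never conflict.
    """
    return [r for r in outstanding
            if r.block == new.block and (r.is_write or new.is_write)]
```

On an atomic bus the list of outstanding requests never holds more than one entry, which is why the check only becomes necessary once request and response phases are decoupled.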
ISBN: 3540664432 (print)
Data dependences are known to hamper efficient parallelization of programs. Memory expansion is a general method for removing dependences by assigning distinct memory locations to dependent writes. Parallelization via memory expansion requires both moderation in the expansion degree and efficiency at run-time. We present a general storage mapping optimization framework for imperative programs, applicable to most loop nest parallelization techniques.
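A minimal Python sketch of memory expansion on a single loop follows. It only illustrates the general idea (scalar expansion of a temporary) and is not the storage mapping framework of the paper; the function names are hypothetical. In the first version every iteration writes the same location `t`, so iterations cannot safely run concurrently if `t` is shared; in the second, `t` is expanded into one cell per iteration and the iterations become independent.

```python
from concurrent.futures import ThreadPoolExecutor


def squares_plus_one_scalar(a):
    """Sequential loop: all iterations reuse the single temporary t."""
    out = [0] * len(a)
    for i in range(len(a)):
        t = a[i] * a[i]      # write to the one shared location
        out[i] = t + 1       # read it back in the same iteration
    return out


def squares_plus_one_expanded(a):
    """Same computation after expanding t into one cell per iteration."""
    n = len(a)
    t = [0] * n              # expanded storage: t[i] belongs to iteration i
    out = [0] * n

    def body(i):
        t[i] = a[i] * a[i]
        out[i] = t[i] + 1

    # No two iterations touch the same cell, so they may run in any order
    # or concurrently.
    with ThreadPoolExecutor() as pool:
        list(pool.map(body, range(n)))
    return out
```

The expanded version uses n cells where the original used one, which is the cost the abstract refers to when it asks for moderation in the expansion degree.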
ISBN: 3540664432 (print)
The paper introduces the concepts of collective breakpoints and macrosteps. Based on collective breakpoints, the macrostep-by-macrostep execution mode is defined. After the concepts of the execution tree and meta-breakpoints are introduced, the systematic debugging of message-passing parallel programs is explained. The main features and distributed structure of DIWIDE, a macrostep debugger, are described. The integration of DIWIDE into the GRADE and WINPAR parallel programming environments is outlined.
ISBN: 3540664432 (print)
In this paper, we present a new, efficient, parallel text retrieval algorithm suited to the concurrent processing of multiple text queries with very low total execution time. Our approach is based on the widely known vector space text processing model and assumes the existence of a high-capacity interconnection network, such as the fully connected hypercube. We exploit the large number of communication links available in the hypercube to develop efficient parallel protocols under heavy query load, give their detailed theoretical analysis, and prove their optimal performance. We also prove that our protocols on hypercubes perform better than (a) other high-capacity interconnection topologies such as ideal fat-trees and (b) single-query parallel text retrieval methods. All of the above results are also demonstrated experimentally via suitable embeddings on the Parsytec GCel3/512 supercomputer.
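For readers unfamiliar with the vector space model mentioned above, a minimal sequential Python sketch is given here. It shows only the scoring step (term-frequency vectors compared by cosine similarity) for a batch of queries; the parallel hypercube protocols of the paper are not reproduced, and the function names and whitespace tokenization are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def vectorize(text):
    """Term-frequency vector of a text, using naive whitespace tokens."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(w * v[t] for t, w in u.items())
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def best_matches(docs, queries):
    """For each query, return the index of the highest-scoring document."""
    doc_vecs = [vectorize(d) for d in docs]
    results = []
    for q in queries:
        qv = vectorize(q)
        scores = [cosine(qv, dv) for dv in doc_vecs]
        results.append(max(range(len(docs)), key=scores.__getitem__))
    return results
```

Each query is scored against every document independently, which is what makes the workload amenable to concurrent processing of many queries.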
ISBN: 3540664432 (print)
No current task scheduler for distributed-memory MIMD machines produces optimal schedules in general, so optimizations can still be performed after scheduling. Message merging, communication reordering, and task duplication are shown to be effective post-scheduling optimizations. The percentage decrease in execution time depends on the original schedule, so improvements are not achievable for every schedule. However, significant decreases in execution time (up to 50%) are possible, which makes the investment in extra processing time worthwhile.
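As a rough illustration of why message merging can pay off, the following Python sketch combines messages that share a source and destination processor under a simple linear cost model (fixed startup cost plus a per-unit transfer cost). The cost model, its parameter values, and the function names are assumptions made here, not details from the paper. The sketch also ignores timing: a real post-scheduling pass would have to check that delaying earlier data until the merged message is ready does not lengthen the schedule.

```python
from collections import defaultdict


def comm_cost(messages, startup=10.0, per_unit=1.0):
    """Total cost of (src, dst, size) messages under a linear cost model."""
    return sum(startup + per_unit * size for _, _, size in messages)


def merge_messages(messages):
    """Combine all messages that share the same (src, dst) pair."""
    merged = defaultdict(int)
    for src, dst, size in messages:
        merged[(src, dst)] += size
    return [(src, dst, size) for (src, dst), size in merged.items()]
```

Under this model, merging k messages between the same pair of processors saves k - 1 startup costs while leaving the transferred volume unchanged.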