The paper summarizes the author's investigations into tuning a multithreaded interval branch-and-prune algorithm for nonlinear systems and presents the resulting solver. New results are presented for the box-consistency enforcing operator and for a new variant of the initial exclusion phase. A new heuristic for choosing the coordinate for bisection is also considered. Extensive numerical experiments are analyzed to identify a satisfactory version of the algorithm.
One major problem with nonrigid image registration techniques is their high computational cost. Because of this, these methods have found limited application in clinical situations where fast execution is required, e.g., intraoperative imaging. This paper presents a parallel implementation of a nonrigid image registration algorithm. It takes advantage of shared-memory multiprocessor computer architectures using multithreaded programming, partitioning either the data or the tasks depending on the computational subproblem. For three different biomedical applications (intraoperative brain deformation, contrast-enhanced MR mammography, intersubject brain registration), the scaling behavior of the algorithm is quantitatively analyzed. The method is demonstrated to compute intraoperative brain deformation in less than a minute using 64 CPUs on a 128-CPU shared-memory supercomputer (SGI Origin 3800). It is shown that its serial component is no more than 2% of the total computation time, allowing a speedup of at least a factor of 50. In most cases, the theoretical limit of the speedup is substantially higher (up to 132-fold in the application examples presented in this paper). The parallel implementation of our algorithm is, therefore, capable of solving nonrigid registration problems with short execution time requirements and may be considered an important step in the application of such techniques to clinically important problems such as the computation of brain deformation during cranial image-guided surgery.
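The link between a 2% serial component and a 50-fold speedup limit can be checked with a short calculation. The sketch below assumes the figures follow Amdahl's law, which the abstract does not name; the function names `amdahl_speedup` and `amdahl_limit` are hypothetical and chosen only for illustration.

```python
def amdahl_speedup(serial_fraction: float, n_cpus: int) -> float:
    """Speedup predicted by Amdahl's law (an assumed model, not stated in the abstract).

    serial_fraction: share of the run time that cannot be parallelized (0..1).
    n_cpus: number of processors working on the parallel part.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cpus)


def amdahl_limit(serial_fraction: float) -> float:
    """Upper bound on the speedup as the processor count grows without bound."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if serial_fraction == 0.0:
        return float("inf")
    return 1.0 / serial_fraction


if __name__ == "__main__":
    # A serial share of 2% bounds the speedup at 1 / 0.02.
    print(amdahl_limit(0.02))
    # Illustrative only: the model's prediction for 64 CPUs at a 2% serial share.
    print(amdahl_speedup(0.02, 64))
```

Under this model, a serial share of at most 2% gives a limit of at least 1/0.02 = 50, and a 132-fold limit corresponds to a serial share of roughly 1/132, or about 0.76%. The 64-CPU value printed by the sketch is a model prediction, not a measurement reported in the paper.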
Load balancing is a key issue in the development of parallel algorithms with irregular structures. Existing load balancing systems each support only one specific programming paradigm and thus are of limited use. The VDS system presented here allows concurrent use of various paradigms such as fork-join, weighted tasks, and static DAGs (directed acyclic graphs that are known in advance). It provides visual performance evaluation tools that help users apply the system efficiently. VDS supports various communication interfaces, including PVM and MPI. Thus, VDS applications can be run on architectures ranging from workstation clusters to massively parallel systems. (C) 2000 Elsevier Science B.V. All rights reserved.