ISBN: (print) 9781467329255; 9781467329224
Effective and fast localization of anatomical structures is a crucial first step towards automated analysis of medical volumes. In this paper, we propose an iterative approach for structure localization in medical volumes based on the adaptive bandwidth mean-shift algorithm for object detection (ABMSOD). We extend and tune the ABMSOD algorithm, originally used to detect 2D objects in non-medical images, to localize 3D anatomical structures in medical volumes. For fast localization, we design and develop optimized parallel implementations of the proposed algorithm on multi-cores using OpenMP, and on GPUs using CUDA. We evaluate the quality, performance and scalability of the proposed algorithm on Computed Tomography (CT) volumes for various structures.
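As a rough illustration of the mean-shift idea that the abstract above builds on, the following Python sketch moves a 3D window toward a local density mode of a point set. It is a minimal, fixed-bandwidth, flat-kernel version under stated assumptions: the function name `mean_shift_mode`, the toy point cloud, and the bandwidth value are hypothetical, and the adaptive bandwidth, the object-detection scoring of ABMSOD, and the OpenMP/CUDA parallelisation are not reproduced here.

```python
import math

def mean_shift_mode(points, start, bandwidth, max_iter=100, tol=1e-6):
    """Shift `start` toward a local density mode of `points` (flat kernel)."""
    centre = tuple(start)
    for _ in range(max_iter):
        # Keep only the points that fall inside the current spherical window.
        window = [p for p in points if math.dist(p, centre) <= bandwidth]
        if not window:
            break  # empty window: nothing to shift toward
        # The mean of the window becomes the new centre (the "mean shift").
        new_centre = tuple(sum(axis) / len(window) for axis in zip(*window))
        shift = math.dist(new_centre, centre)
        centre = new_centre
        if shift < tol:
            break  # converged
    return centre

# Toy data (hypothetical): a 3x3x3 block of points around (10, 10, 10)
# plus one far-away outlier that should not influence the result.
points = [(10.0 + dx, 10.0 + dy, 10.0 + dz)
          for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)]
points.append((0.0, 0.0, 0.0))

mode = mean_shift_mode(points, start=(8.0, 8.0, 8.0), bandwidth=3.0)
```

In this toy setting the window drifts from the starting position to the centre of the block, which is the kind of iterative localization behaviour the abstract describes at a much larger scale on CT volumes.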
This paper presents the various mechanisms for virtual machine image distribution within a large batch farm and between sites that offer cloud computing services. The work is presented within the context of the Large ...
ISBN: (print) 9781467329255; 9781467329224
Virtualization, as a technology that enables easy and effective resource sharing with a low cost and energy footprint, is becoming increasingly popular not only in enterprises but also in high performance computing. Applications with stringent performance needs often make use of graphics processors for accelerating their computations. Hence virtualization solutions that support GPU acceleration are gaining importance. This paper performs a detailed evaluation of three frameworks: rCUDA, gVirtuS and Xen, which support GPU acceleration through CUDA, within a virtual machine. We describe the architectures of these three solutions and compare and contrast them in terms of their fidelity, performance, multiplexing and interposition characteristics.
An optimal resource allocation is desired to increase the efficiency of systems processing business or scientific workflows. These systems include the processing of workflows in distributed computing environments such...
ISBN: (print) 9781509024032
Building massively parallel numerical simulations is difficult because parallel programming models keep changing and many different software technologies are required. We develop a component-based graphical parallel programming approach that lowers the difficulty of coding applications in scientific and engineering computing and supports rapid development of large-scale simulations on top of a domain-specific framework. Parallel applications can be constructed simply by configuring components and assembling them interactively in predefined flowcharts. A large part of the code is generated automatically from the graphical configuration of an application. The approach facilitates the rapid design and development of parallel numerical simulations by shielding domain experts from much of the knowledge and technology otherwise required. Real applications demonstrate that the approach to developing complex numerical simulations is both practical and efficient.
ISBN: (print) 9781467329255
In this paper we investigate Monte Carlo optimisation of a fitness function on a multi-GPU cluster. Our main goal is to develop auto-tuning techniques for the GPU cluster. Monte Carlo, or random sampling, is a technique that optimises a fitness function by assigning random values to the function's parameters. When evaluating the fitness function requires a large amount of computation, Monte Carlo sampling becomes very costly in both time and computing power. A developer who is not familiar with the application, the hardware, and the CUDA runtime cannot determine the optimal execution parameters. This makes GPU auto-tuning well suited to achieving better performance and reducing computing time. Finally, we compare the execution time with that of a sequential CPU implementation and a multi-core CPU implementation.
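The random-sampling optimisation described in the abstract above can be sketched in a few lines of Python. This is only an illustrative CPU version under stated assumptions: the function `monte_carlo_optimise`, the example fitness function, the parameter bounds, and the sample count are all hypothetical, and the multi-GPU execution and auto-tuning that the paper studies are not shown.

```python
import random

def monte_carlo_optimise(fitness, bounds, n_samples=10_000, seed=0):
    """Maximise `fitness` by drawing uniform random parameter values.

    `bounds` is a list of (low, high) pairs, one per parameter.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best_params, best_value = None, float("-inf")
    for _ in range(n_samples):
        # Give each parameter a random value inside its allowed range.
        params = [rng.uniform(low, high) for low, high in bounds]
        value = fitness(params)
        if value > best_value:
            best_params, best_value = params, value
    return best_params, best_value

def example_fitness(params):
    # Hypothetical fitness with its maximum (value 0) at x = 1, y = -2.
    x, y = params
    return -((x - 1.0) ** 2 + (y + 2.0) ** 2)

best_params, best_value = monte_carlo_optimise(
    example_fitness, bounds=[(-5.0, 5.0), (-5.0, 5.0)]
)
```

Each sample is independent of the others, which is why this kind of search maps naturally onto many GPU threads; how the samples are divided among devices is exactly the sort of execution parameter an auto-tuner would choose.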
Computing paradigms have evolved over the years in the form of parallel, distributed, grid, cloud, and fog computing. Scheduling is a fundamental issue in distributed environments where several tasks compete among ava...
ISBN: (print) 9783037852828
Cloud computing is a novel parallel platform. This paper proposes a simple parallel genetic algorithm (PGA) using cloud computing, called SMRPGA. Compared with traditional PGAs that run on high performance computers (HPC), clusters or grids, SMRPGA is simple and easy to implement. Another advantage is that a PGA using cloud computing is easy to extend to a larger scale, which is very useful for solving time-consuming problems. A prototype is implemented on Hadoop, an open-source cloud computing platform. Results from running two benchmark functions show that, because of the long communication time, the speed-up of the PGA using cloud computing is not obvious, and that the approach is suited to time-consuming problems.
In this paper, we present a performance evaluation of a 256-core cluster based on the Intel Xeon Processor E5-2680. This is the new version of Sandy Bridge processor for server and workstation market. It employs an i...
ISBN: (print) 9781538636848
Cloud computing lets end users run high performance computing applications by allocating resources on demand. This avoids large capital expenditure for small and medium-sized enterprises that have limited resources for obtaining High Performance Computing (HPC). In this paper, we propose a new distributed HPC model in a self-built OpenStack public cloud under SDN infrastructure. A domain decomposition strategy is used to partition any HPC task. We analyse different parallel computation topologies and benchmark their performance. The public cloud built on the OpenStack platform is integrated with the OpenDaylight SDN controller for network control and monitoring. We also analyse and compare the performance of the OpenStack cloud with and without the proposed distributed HPC system. The results show that the speed of the OpenStack cloud under SDN infrastructure is improved by our HPC system based on the Hypercube and Mesh algorithms.