ISBN (digital): 9783642314643
ISBN (print): 9783642314636; 9783642314643
We consider the deconvolution of 3D fluorescence microscopy RGB images and describe the benefits of tackling medical imaging problems on modern graphics processing units (GPUs), which are inexpensive parallel processing devices available on many up-to-date personal computers. We found that the execution time of the CUDA version is about two orders of magnitude lower than that of the sequential algorithm. The experiments also prompt some reflections on the best settings for the CUDA-based algorithm: we note the need to model GPU architectures and their characteristics in order to better describe the performance of GPU algorithms and what we can expect from them.
Matrix multiplication is an essential building block of many linear algebra operations and applications. This paper presents parallel algorithms with shared A or B matrix in the memory for the special massively multit...
ISBN (print): 9783642281501
The simulation of complex problems in the field of plasma deposition technology requires parallel code running on modern multicore architectures. The in-house developed Particle-in-Cell Monte-Carlo (PIC-MC) simulation environment has recently been ported from PVM to MPI, which is the de facto standard for parallelization by message passing. We measured a shorter latency for MPI than for PVM and determined its impact on PIC-MC performance.
The main contribution of this paper is to show optimal algorithms computing the sum and the prefix-sums on two memory machine models, the Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM). The DMM and...
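For readers unfamiliar with the two operations named in this abstract, the following is a minimal Python sketch of an inclusive prefix-sums computation organized in doubling rounds (a Hillis-Steele style scan). It is a generic illustration only, not the DMM or UMM algorithms of the paper; the function names `prefix_sums` and `total` are hypothetical, and the inner loop stands in for work that a parallel machine would perform concurrently within each round.

```python
def prefix_sums(values):
    """Inclusive prefix-sums computed in about log2(n) doubling rounds."""
    result = list(values)
    n = len(result)
    stride = 1
    while stride < n:
        # Each round reads only the previous round's array, so all updates
        # of one round are independent and could run in parallel.
        previous = list(result)
        for i in range(stride, n):
            result[i] = previous[i] + previous[i - stride]
        stride *= 2
    return result


def total(values):
    """The sum is the last entry of the inclusive prefix-sums (0 if empty)."""
    sums = prefix_sums(values)
    return sums[-1] if sums else 0
```

For example, `prefix_sums([3, 1, 4, 1, 5])` yields `[3, 4, 8, 9, 14]`, and `total` of the same list is 14.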
ISBN (print): 9780769549569; 9781467362184
Processing extremely large polygonal (vector-based) spatial datasets has been a long-standing research challenge for scientists in the Geographic Information Systems and Science (GIS) community. Surprisingly, this is not for lack of individual parallel algorithms; we found that the irregular and data-intensive nature of the underlying processing is the main reason for the meager amount of work on system design and implementation. Furthermore, of all the systems reported in the literature, very few deal with the complexities of vector-based datasets, and none, including commercial systems, does so on a cloud platform. We have designed and implemented an open-architecture-based system named Crayons for the Windows Azure cloud platform using state-of-the-art techniques. We have implemented three different architectures of Crayons with different load-balancing schemes. Crayons scales well for sufficiently large datasets, achieving an end-to-end absolute speedup of over 28-fold using 100 Azure processors. For smaller and more irregular workloads, it still yields over 10-fold speedup.
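The scaling figure quoted in this abstract can be put in perspective with the usual definition of parallel efficiency, speedup divided by processor count. The short Python sketch below applies it to the 28-fold speedup on 100 processors; the function name `parallel_efficiency` is hypothetical, and because the abstract says "over 28-fold", the result is a lower bound rather than a measured value.

```python
def parallel_efficiency(speedup, processors):
    """Efficiency = speedup / processors; 1.0 would be ideal linear scaling."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Lower bound implied by "over 28-fold" speedup on 100 Azure processors.
large_dataset_efficiency = parallel_efficiency(28, 100)
```

Under this definition, the large-dataset run corresponds to an efficiency of at least 0.28.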
ISBN (print): 9781467308595
Modern electronic systems call for Analog-to-Digital (A/D) interfaces with demanding requirements in terms of resolution and frequency. Among the several A/D architectures that aim to meet these hard specifications, Time-Interleaved Analog-to-Digital Converters (TIADCs) arise as a competitive candidate. TIADCs offer a higher sampling frequency with moderate power consumption. However, their architecture introduces mismatch errors that affect the resolution of data conversion. Calibration methods can significantly reduce the impact of these errors. A possible solution is to insert additional circuitry into the A/D conversion system: a Built-In Self-Calibration (BISC) system. A BISC system aims to compensate for imperfections of the TIADC, such as offset, gain and timing errors. Despite its benefits, the BISC system also introduces errors into the overall data conversion system. This paper proposes a case study of a mixed-signal BISC TIADC, highlighting the strengths and weaknesses of using built-in calibration circuits. Supplementary calibration methods will be explored to mitigate the impact of the BISC and to improve the A/D conversion performance. The debate is open: how must calibration systems themselves be calibrated?
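To make the offset and gain mismatch mentioned in this abstract concrete, here is a minimal Python sketch of an idealized interleaved converter and a purely statistical correction. It is not the BISC circuit studied in the paper: the function names, the per-channel error model (sample n goes to channel n mod M and is scaled and shifted), and the assumption that the input is zero-mean with equal power on every channel are all illustrative assumptions, and timing errors are not modelled.

```python
import math


def interleave_sample(signal, offsets, gains):
    """Sample `signal` with M = len(offsets) channels; sample n uses channel n % M."""
    m = len(offsets)
    return [gains[n % m] * x + offsets[n % m] for n, x in enumerate(signal)]


def calibrate(samples, m):
    """Estimate per-channel offset (mean) and gain (RMS relative to channel 0),
    then remove both. Assumes a zero-mean input with equal power per channel."""
    channels = [samples[k::m] for k in range(m)]
    offsets = [sum(c) / len(c) for c in channels]
    rms = [
        math.sqrt(sum((v - o) ** 2 for v in c) / len(c))
        for c, o in zip(channels, offsets)
    ]
    gains = [r / rms[0] for r in rms]
    return [(v - offsets[n % m]) / gains[n % m] for n, v in enumerate(samples)]
```

With a sine input spanning whole periods on each channel, `calibrate` brings the second channel back in line with the first. Any error in channel 0 itself is not corrected by this scheme, which echoes the abstract's closing question about calibrating the calibrator.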
ISBN (print): 9783642281501
Accurate prediction of parallel applications' performance is becoming increasingly complex. We seek to characterize the behavior of message-passing applications by extracting a signature that can be used to predict their performance on different target systems. We have developed a tool called Parallel Application Signature for Performance Prediction (PAS2P) that strives to describe an application based on its behavior. From the application's message-passing activity, we identify and extract representative phases, from which we create a signature. In experiments with scientific applications, we predicted execution times on multicore architectures with an average accuracy of over 97%.
ISBN (print): 9783642281501; 9783642281518
We optimize codes implementing Monte Carlo simulations of spin-glass systems for some multi-core CPU and GPU architectures. We consider both the binary Ising and the floating-point Heisenberg spin-glass models in 3 dimensions. We provide performance figures for the Intel Nehalem quad-core and IBM Cell/BE CPUs and for the Nvidia Tesla C1060 GPU; for the binary model we also draw a comparison with the performance of dedicated computers, such as the Janus machine.
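As a point of reference for the binary model in this abstract, the following is a plain, unoptimized Python sketch of a Metropolis sweep for a 3D Ising spin glass with random couplings of +1 or -1 on a periodic cubic lattice. It is not the optimized code of the paper; the choice of the Metropolis update, the coupling distribution, and all function names are assumptions made for illustration.

```python
import math
import random


def make_lattice(size, rng):
    """Random +/-1 spins and +/-1 couplings on a periodic size^3 lattice."""
    sites = [(x, y, z) for x in range(size) for y in range(size) for z in range(size)]
    spins = {s: rng.choice((-1, 1)) for s in sites}
    # One coupling per site and positive direction (0: x, 1: y, 2: z).
    couplings = {(s, d): rng.choice((-1, 1)) for s in sites for d in range(3)}
    return spins, couplings


def shift(site, axis, step, size):
    """Neighbour of `site` along `axis`, with periodic boundaries."""
    moved = list(site)
    moved[axis] = (moved[axis] + step) % size
    return tuple(moved)


def local_field(site, spins, couplings, size):
    """Sum of J_ij * s_j over the six neighbours of `site`."""
    field = 0
    for axis in range(3):
        forward = shift(site, axis, 1, size)
        backward = shift(site, axis, -1, size)
        field += couplings[(site, axis)] * spins[forward]
        field += couplings[(backward, axis)] * spins[backward]
    return field


def energy(spins, couplings, size):
    """H = -sum over bonds of J_ij * s_i * s_j."""
    return -sum(
        j * spins[s] * spins[shift(s, d, 1, size)] for (s, d), j in couplings.items()
    )


def metropolis_sweep(spins, couplings, size, beta, rng):
    """Visit every site once; flip with probability min(1, exp(-beta * dE))."""
    for site in spins:
        delta = 2 * spins[site] * local_field(site, spins, couplings, size)
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            spins[site] = -spins[site]
```

A typical use would call `make_lattice(8, random.Random(0))` and then run `metropolis_sweep` repeatedly at the inverse temperature of interest. Because each update touches only a spin and its six neighbours, the update is local, which is the property that makes such simulations amenable to multi-core and GPU execution.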
We show that developing an optimal parallelization of the two-list algorithm is much easier than we once thought. All it takes is to observe that the steps of the search phase of the two-list algorithm are closely rel...
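For context, the sketch below shows a sequential version of a two-list search in Python, assuming that "the two-list algorithm" here refers to the classic Horowitz-Sahni method for the subset-sum problem. That identification, the function names, and the restriction to a yes/no answer are assumptions; the parallelization discussed in the paper is not reproduced.

```python
def subset_sums(items):
    """All 2^k subset sums of `items` (generation phase)."""
    sums = [0]
    for item in items:
        sums += [s + item for s in sums]
    return sums


def two_list_subset_sum(items, target):
    """Return True if some subset of `items` sums to `target`."""
    half = len(items) // 2
    left = sorted(subset_sums(items[:half]))                 # ascending
    right = sorted(subset_sums(items[half:]), reverse=True)  # descending
    i = j = 0
    # Search phase: walk the two sorted lists towards each other.
    while i < len(left) and j < len(right):
        current = left[i] + right[j]
        if current == target:
            return True
        if current < target:
            i += 1  # need a larger sum: advance in the ascending list
        else:
            j += 1  # need a smaller sum: advance in the descending list
    return False
```

For instance, `two_list_subset_sum([3, 34, 4, 12, 5, 2], 9)` returns `True`, while a target of 30 returns `False` for the same items.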
ISBN (print): 9781467311762
The purpose of this paper is to evaluate the performance of a GPU used in a router of a DiffServ-based network for video conferencing, with the G729 and H264 standards as voice codec and video codec, respectively. Speech and video quality is improved by assigning a priority to each voice and video packet at the edge router of the network. The importance of a packet is determined differently for voice and for video. A priority is assigned to each packet according to its importance, which makes important packets more robust during transmission. The priority is computed in parallel on graphics processing units (GPUs) using the Compute Unified Device Architecture (CUDA), which increases performance.