ISBN: 9783642120251 (print)
Multi-core processors have become increasingly popular and are altering the course of computing. Traditional XPath query evaluation algorithms cannot take full advantage of multi-cores, and it is not straightforward to adapt such algorithms to multi-cores. In this paper, we propose an efficient parallel PathStack algorithm, named P-PathStack, for processing XML twig queries. The algorithm first efficiently partitions input element lists into multiple buckets, and then processes the data in each bucket in parallel. With an efficient partitioning method, our proposed algorithm can avoid many useless elements and achieve a very good speedup ratio. We have implemented the algorithm, and experimental results show that it achieves high performance and a high speedup ratio.
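The abstract does not give the details of P-PathStack, so the following is only a rough sketch of the general partition-then-process-in-parallel idea, not the authors' algorithm. It assumes a region encoding in which each element is a `(start, end)` pair and an ancestor contains a descendant when `a.start < d.start` and `d.end < a.end`. The function names (`partition`, `join_bucket`, `parallel_join`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_buckets):
    """Split a list into at most n_buckets contiguous chunks."""
    size = max(1, -(-len(items) // n_buckets))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def join_bucket(ancestors, bucket):
    """Return (ancestor, descendant) pairs for one bucket of descendants."""
    if not bucket:
        return []
    lo = min(d[0] for d in bucket)
    hi = max(d[1] for d in bucket)
    # Skip ancestors whose region cannot overlap this bucket at all.
    candidates = [a for a in ancestors if a[0] < hi and a[1] > lo]
    return [(a, d) for a in candidates for d in bucket
            if a[0] < d[0] and d[1] < a[1]]


def parallel_join(ancestors, descendants, n_buckets=4):
    """Partition the descendant list, then join each bucket concurrently."""
    buckets = partition(descendants, n_buckets)
    with ThreadPoolExecutor(max_workers=n_buckets) as pool:
        parts = pool.map(lambda b: join_bucket(ancestors, b), buckets)
        return [pair for part in parts for pair in part]
```

Threads are used here only to keep the sketch short; in CPython a process pool or a native implementation would be needed to see real multi-core speedup.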
ISBN: 9781424473359 (print)
This paper presents preliminary PhD research towards developing a framework to evaluate and optimize application mapping algorithms for Network-on-Chip (NoC) architectures. Several such algorithms have been proposed for mapping the threads of a parallel application onto a NoC architecture. However, the performance of those algorithms has been evaluated only on some specific NoC designs. A unified approach for evaluating such algorithms allows a better comparison of their performance and can potentially lead to some optimizations. The proposed framework is intended to be flexible so that the algorithms can be tested on different NoC designs. To this end, a scalable and flexible Network-on-Chip simulator is proposed. Some preliminary results obtained with this simulator are also presented. They show the flexibility of the simulator and its feasibility for addressing the application mapping problem in a unified manner.
ISBN: 9783642115141 (print)
Graphics processors are increasingly used in scientific applications due to their high computational power, which comes from hardware with multiple levels of parallelism and a memory hierarchy. Sparse matrix computations frequently arise in scientific applications, for example, when solving PDEs on unstructured grids. However, traditional sparse matrix algorithms are difficult to parallelize efficiently for GPUs due to irregular patterns of memory references. In this paper we present a new storage format for sparse matrices that better exploits locality, has a low memory footprint, and enables automatic specialization for various matrices and future devices via parameter tuning. Experimental evaluation demonstrates significant speedups compared to previously published results.
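For context, the sketch below shows a sparse matrix-vector product in the widely used compressed sparse row (CSR) layout. This is not the new format the abstract proposes, whose details are not given here. It is included only to show where irregular memory references come from: the read `x[col_idx[k]]` jumps around `x` according to the sparsity pattern. The function name `csr_matvec` is an illustrative choice.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each nonzero
    row_ptr : row_ptr[r]..row_ptr[r+1] delimits the nonzeros of row r
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect, data-dependent access into x: the irregular part.
            y[row] += values[k] * x[col_idx[k]]
    return y
```

For example, the matrix `[[4, 0, 1], [0, 2, 0], [3, 0, 5]]` is stored as `values = [4, 1, 2, 3, 5]`, `col_idx = [0, 2, 1, 0, 2]`, `row_ptr = [0, 2, 3, 5]`.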
ISBN: 3642030947 (print)
The proceedings contain 81 papers. The topics discussed include: length encoded secondary structure profile for remote homologous protein detection; a process scheduling analysis model based on grid environment; a resource broker with cross grid information services on computational multi-grid environments; implementation of a performance-based loop scheduling on heterogeneous clusters; a software transactional memory service for grids; an empirical study on the performance issues on the clustered client-server computing environment; energy-efficient clustering in wireless sensor networks; maximally local connectivity on augmented cubes; a cluster-based data routing for wireless sensor networks; and a energy efficient scheduling base on dynamic voltage and frequency scaling for multi-core embedded real-time system.
ISBN: 9780791849019 (print)
This work concentrates on the issue of rigid body collision detection, a critical component of any software package employed to approximate the dynamics of multibody systems with frictional contact. This paper presents a scalable collision detection algorithm designed for massively parallel computing architectures. The proposed approach is implemented on a ubiquitous Graphics Processing Unit (GPU) card and shown to achieve a 40x speedup over state-of-the-art Central Processing Unit (CPU) implementations when handling multi-million object collision detection. GPUs are composed of many (on the order of hundreds) scalar processors that can simultaneously execute an operation; this strength is leveraged in the proposed algorithm. The approach can detect collisions between five million objects in less than two seconds; with newer GPUs, it is expected to detect collisions between eighty million objects in less than thirty seconds. The proposed methodology is expected to have an impact on a wide range of granular flow dynamics and smoothed particle hydrodynamics applications, e.g. sand, gravel and fluid simulations, where the number of contacts can reach into the hundreds of millions.
ISBN: 9781450304535 (print)
Scientific Visualization is a computer-based field concerned with techniques that allow scientists to create graphical representations from datasets generated by computational simulations or acquisition instruments. To address the computational cost of visualization tasks, especially for large datasets, researchers have explored grid environments as a platform for their parallel evaluation. It is, however, not trivial to adapt each different visualization technique to run in grid environments. A desirable alternative would separate the specificities of data and process distribution in grids from visualization computation logic. In this work we claim that the QEF (query evaluation framework) supports scientific visualization computation with the above-mentioned characteristics. Visualization computation techniques are modeled as operators in an algebra and integrated with a set of control operators that manage data distribution, leading to a parallel QEP (query execution plan). We show the benefits of parallelization for two of those techniques: particle tracing and volume rendering. For these techniques, our experiments demonstrate many positive aspects of the solution presented, as well as opportunities for future work. Copyright 2010 ACM.
Most cryptographic systems are based on modular exponentiation. It is performed using successive modular multiplications. One way of improving the throughput of a cryptographic system implementation is reducing the...
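As a minimal illustration of modular exponentiation built from successive modular multiplications, the sketch below uses the standard binary (square-and-multiply) method. The truncated abstract does not say which method the paper itself uses, so this is a generic textbook example; the name `mod_exp` is an illustrative choice.

```python
def mod_exp(base, exponent, modulus):
    """Compute (base ** exponent) % modulus by square-and-multiply."""
    if modulus <= 0 or exponent < 0:
        raise ValueError("modulus must be positive and exponent non-negative")
    result = 1 % modulus          # handles modulus == 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:          # current exponent bit is set: multiply
            result = (result * base) % modulus
        base = (base * base) % modulus   # square for the next bit
        exponent >>= 1
    return result
```

The loop performs one squaring per exponent bit plus one extra multiplication per set bit, so the cost is counted in modular multiplications. Python's built-in three-argument `pow(base, exponent, modulus)` computes the same value and can serve as a cross-check.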
ISBN: 9783642143892 (print)
In this work we present the implementation of an application to simulate the evolution of pressure and temperature inside a cavity when acoustic energy is injected, a physical system currently under intensive research. The particular features of the model's equations make the simulation problem very stiff and time-consuming. However, intrinsic parallelism makes the application suitable for implementation on GPUs, providing researchers with a very useful tool to study the problem at a very reasonable price. In our experiments the problem was solved in less than half the time required by CPUs.
ISBN: 9780769540184 (print)
This study describes performance results from testing MATLAB applications using the parallel computing and distributed computing toolboxes on different platforms with different hardware and operating systems. Each trial was executed keeping the hardware fixed and changing the operating system to obtain unbiased results. To standardize the benchmarking test, Fast Fourier Transform (FFT), discrete cosine transform (DCT), edge detection and matrix multiplication algorithms were executed. The results show that leveraging multicore platforms through the parallel computing tools in MATLAB can considerably speed up image processing. Two different system hardware platforms (systems 1 and 2) were used in a series of experiments. Four rounds of experiments were performed benchmarking the FFT algorithm using the parallel toolbox, varying the system platform, number of workers, image size and number of images. The results of the ANOVA test suggest that although the operating system (OS) factor is not statistically significant on system 1, the OS plays a significant role on system 2. Moreover, on both systems the number of workers utilized and the number of images processed are statistically significant factors, with more than a 500% performance increase obtained by using 8 MATLAB workers on a dual quad-core machine.
We are interested in varying the vocabulary size in the image categorization task with a bag-of-visual-words to investigate its influence on the classification accuracy in two cases: in the first one, both the test-se...