It is likely that 2005 will be viewed as the year that parallelism came to the masses, with multiple vendors shipping dual/multi-core platforms into the mainstream consumer and enterprise markets. Assuming that this t...
ISBN: 3540292357 (print)
In this paper, a parallel algorithm for data clustering is presented on a multi-computer with star topology. This algorithm is fast and requires a small amount of memory per processing element, which makes it suitable even for SIMD implementation. The proposed parallel algorithm completes in $O(K + S^2 - T^2)$ steps for a clustering problem of $N$ data patterns with $M$ features per pattern and $K$ clusters, where $N \cdot M = S!$, $K \cdot M = T!$, and $M = R!$, on an $S$-star interconnection network.
Author: Schikuta, E. (Univ Vienna, Res Lab Computat Technol & Appl, Inst Knowledge & Business Engn, A-1010 Vienna, Austria)
ISBN: 3540292357 (print)
We developed a concise but comprehensive analytical model for the well-known sort-merge join algorithm on cost-effective cluster architectures. We concentrate on a limited number of characteristic parameters to keep the analytical model clear and focused. We believe that a meaningful model can be built upon only three characteristic parameter sets, describing the main memory size, the I/O bandwidth, and the disk bandwidth. We justify our approach with a practical implementation and a comparison of theoretical and measured performance values.
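Note: for readers unfamiliar with the algorithm being modeled, the following is a minimal single-node sketch of a sort-merge join in Python. It is an illustration only, not the authors' implementation or cost model; the function name `sort_merge_join` and the `(key, payload)` record layout are assumptions made for this example.

```python
def sort_merge_join(left, right):
    """Equi-join two lists of (key, payload) records on key.

    Hypothetical helper for illustration; records are plain tuples.
    """
    # Sort phase: order both inputs by join key.
    left = sorted(left, key=lambda rec: rec[0])
    right = sorted(right, key=lambda rec: rec[0])

    result = []
    i = j = 0
    # Merge phase: advance whichever side has the smaller key.
    while i < len(left) and j < len(right):
        lkey, rkey = left[i][0], right[j][0]
        if lkey < rkey:
            i += 1
        elif lkey > rkey:
            j += 1
        else:
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lkey:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lkey:
                j_end += 1
            # Emit the cross product of the two runs.
            for lrec in left[i:i_end]:
                for rrec in right[j:j_end]:
                    result.append((lkey, lrec[1], rrec[1]))
            i, j = i_end, j_end
    return result
```

In a cluster setting, the sort phase is the part that typically depends on main memory size and disk or I/O bandwidth, which is presumably why those parameters appear in the model described above.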
The rapid development of modern parallel computing platforms, and the advent and availability of HPC techniques, including communication and parallel numerical libraries, provide an incomparable opportunity to face a...
ISBN: 076952429X (print)
Helper threading is a technique that utilizes a second core or logical processor in a multi-threaded system to improve the performance of the main thread. A helper thread executes in parallel with the main thread that it attempts to accelerate. In this paper the helper thread merely prefetches data into a shared cache and does not incur any other programmer-visible effects. Helper thread prefetching has been proposed as a viable solution in various scenarios where it is difficult to prefetch efficiently within the main thread itself. This paper presents our helper threading experience on Sun's second dual-core SPARC microprocessor, the UltraSPARC IV+. The two cores on this processor share an on-chip L2 and an off-chip L3 cache. We present a compiler framework to automatically construct helper threads and evaluate our scheme on the UltraSPARC IV+ processor. Our preliminary results using helper threads on the SPEC CPU2000 suite show gains of up to 22% on programs that suffer substantial L2 cache misses, while incurring negligible losses on programs that do not suffer L2 cache misses.
ISBN: 076952429X (print)
A number of compute-intensive applications suffer from performance loss due to the lack of instruction-level parallelism in sequences of dependent instructions. This is particularly true on wide-issue architectures with large register banks, when the memory hierarchy (locality and bandwidth) is not the dominant bottleneck. We consider two real applications from computational biology and from cryptanalysis, characterized by long sequences of dependent instructions, irregular control flow, and intricate scalar and array dependence patterns. Although these applications exhibit excellent memory locality and branch-prediction behavior, state-of-the-art loop transformations and back-end optimizations are unable to exploit much instruction-level parallelism. We show that good speedups can be achieved through deep jam, a new transformation of the program control- and data-flow. Deep jam combines scalar and array renaming with a generalized form of recursive unroll-and-jam; it brings together independent instructions across irregular control structures, removing memory-based dependences. This optimization contributes to the extraction of fine-grain parallelism in irregular applications. We propose a feedback-directed deep jam algorithm that selects a jamming strategy as a function of the architecture and application characteristics.
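Note: deep jam generalizes classic unroll-and-jam, so a small sketch of the classic transformation may help. The Python example below shows only plain unroll-and-jam with scalar renaming on a regular matrix-vector loop nest; it is not the deep jam algorithm, and the function names are assumptions made for this illustration. Python itself does not expose instruction-level parallelism, so the code shows the shape of the transformation rather than its performance effect.

```python
def matvec_baseline(a, x):
    """Reference loop nest: y[i] = sum_j a[i][j] * x[j]."""
    n, m = len(a), len(x)
    y = [0.0] * n
    for i in range(n):
        for j in range(m):
            y[i] += a[i][j] * x[j]
    return y


def matvec_unroll_and_jam(a, x):
    """Same computation with the outer loop unrolled by 2 and the
    two copies of the inner loop fused ("jammed") into one body."""
    n, m = len(a), len(x)
    y = [0.0] * n
    i = 0
    while i + 1 < n:
        # Scalar renaming: each unrolled iteration gets its own
        # accumulator, so the two dependence chains are independent.
        acc0 = 0.0
        acc1 = 0.0
        for j in range(m):
            xj = x[j]  # loaded once, reused by both chains
            acc0 += a[i][j] * xj
            acc1 += a[i + 1][j] * xj
        y[i] = acc0
        y[i + 1] = acc1
        i += 2
    if i < n:
        # Remainder iteration when n is odd.
        acc = 0.0
        for j in range(m):
            acc += a[i][j] * x[j]
        y[i] = acc
    return y
```

After jamming, the inner loop body contains two independent multiply-add chains that a wide-issue processor could overlap. According to the abstract, deep jam extends this idea to irregular control structures.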
ISBN: 3540292357 (print)
With the increasing importance of multiple multiplatform remote sensing missions, digital image registration has been applied in many fields and plays an especially important role in remotely sensed data processing. First, a brief introduction to existing parallel methods for wavelet-based global registration is given. Then the communication optimization for the GP method is described. The optimized algorithm is named Group-Optimized-parallel (GOP for short). To explain why GOP is occasionally less efficient than other methods, a more careful theoretical analysis is presented and confirmed by experiments. Moreover, we give a quantitative criterion, called Remainder Items, for choosing the best solution under different input conditions.
ISBN: 3540260315 (print)
Successful participation in task-oriented, inference-rich dialogs requires, among other things, understanding of specifications implicitly conveyed through the exploitation of parallel structures. Several linguistic operators create specifications of this kind, including "the other way (a)round", "vice-versa", and "analogously"; unfortunately, automatic reconstruction of the intended specification is difficult due to the inherent dependence on the given context and domain. We address this problem with a well-informed reasoning process. The techniques applied include building deep semantic representations, applying categories of patterns underlying a formal reconstruction, and using pragmatically motivated and domain-justified preferences. Our approach is not only suitable for improving understanding in everyday discourse, but also specifically aims at extending capabilities in a tutorial dialog system, where stressing generalities and analogies is a major concern.
ISBN: 9783540320715 (digital)
ISBN: 3540292357 (print)
Large-scale distributed computing infrastructure involves a high number of nodes, poor communication performance, and continuously varying resources that are not available at all times. In this paper, we focus on the different tools available for mining traces of the activities of such architectures. We propose new techniques for fast management of a parallel frequent itemset mining algorithm. The technique allows us to exhibit statistical results about the activity of more than one hundred PCs connected to the web.
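Note: the abstract does not describe the mining algorithm itself. As background, the sketch below shows one generic way to parallelize frequent itemset counting: partition the transactions, count candidate itemsets locally, then merge the counts. It is not the authors' technique; the names `count_local` and `frequent_itemsets`, the thread-based workers, and the sample data are all assumptions made for this illustration. Threads are used only to show the partition-and-merge structure, not to obtain a speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def count_local(transactions, size):
    """Count every itemset of the given size in one partition."""
    counts = Counter()
    for items in transactions:
        # Sort so that each itemset has one canonical tuple form.
        for itemset in combinations(sorted(set(items)), size):
            counts[itemset] += 1
    return counts


def frequent_itemsets(transactions, size, min_support, workers=2):
    """Return {itemset: count} for itemsets seen at least min_support times."""
    # Partition the transactions round-robin across the workers.
    chunks = [transactions[w::workers] for w in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda chunk: count_local(chunk, size), chunks))
    # Merge the per-partition counts into global counts.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return {s: n for s, n in total.items() if n >= min_support}
```

For example, `frequent_itemsets([["a", "b"], ["a", "b", "c"], ["a", "c"], ["a", "b"]], size=2, min_support=2)` keeps the pairs `("a", "b")` and `("a", "c")` and drops `("b", "c")`, which occurs only once.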
Redundancy is a basic technique for achieving fault tolerance, but the overhead introduced by redundancy may degrade the system's performance. In this paper, we propose efficient replication-based algorithms for fault...