Search Results - Inner Mongolia University Library

4th International Conference on Parallel Computing in Electrical Engineering (PARELECT 2004)

Authors: Bozejko, W; Wodecki, M (Wroclaw Univ Technol, Inst Engn, PL-50372 Wroclaw, Poland)

ISBN: (print) 0769520804

There are many problems to which scheduling can be applied, in computer systems (single- and multiprocessor) as well as in production systems. Scheduling problems belong in most cases to the NP-hard class. For most of the classical scheduling problems published in the last 10 years (for example, the benchmarks of Taillard [14] for the flow shop problem), there are still no optimal solutions. In this paper we propose a very effective method of constructing parallel algorithms based on the tabu search metaheuristic. We apply block properties, which enable the parallel algorithm to distribute calculations and reduce communication between processors. The algorithms are implemented in Ada95 and MPI.

Keywords: parallel processing systems
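
For readers unfamiliar with the tabu search metaheuristic named in the abstract above, the following is a minimal sequential sketch for the permutation flow shop problem. It is not the authors' method: it omits block properties and parallelism, and the swap neighbourhood, the `tenure` parameter, and the function names `makespan` and `tabu_search` are assumptions made for illustration only.

```python
from itertools import combinations


def makespan(perm, p):
    """Completion time of the last job on the last machine.

    p[j][m] is the processing time of job j on machine m.
    """
    machines = len(p[0])
    finish = [0] * machines  # finish[m]: time at which machine m becomes free
    for j in perm:
        finish[0] += p[j][0]
        for m in range(1, machines):
            finish[m] = max(finish[m], finish[m - 1]) + p[j][m]
    return finish[-1]


def tabu_search(p, iterations=100, tenure=5):
    """Illustrative tabu search over job permutations (swap neighbourhood)."""
    current = list(range(len(p)))
    best, best_cost = current[:], makespan(current, p)
    tabu = {}  # swapped job pair -> last iteration in which it stays forbidden
    for it in range(iterations):
        candidates = []
        for i, j in combinations(range(len(current)), 2):
            a, b = current[i], current[j]
            key = (min(a, b), max(a, b))
            neighbor = current[:]
            neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
            cost = makespan(neighbor, p)
            # Aspiration: a tabu move is still allowed if it beats the best.
            if tabu.get(key, -1) < it or cost < best_cost:
                candidates.append((cost, key, neighbor))
        if not candidates:
            break
        # Move to the best admissible neighbour, even if it is worse.
        cost, key, current = min(candidates, key=lambda c: c[0])
        tabu[key] = it + tenure
        if cost < best_cost:
            best, best_cost = current[:], cost
    return best, best_cost
```

Calling `tabu_search(p)` with a list of per-job processing-time rows returns the best permutation found and its makespan. Accepting a worse neighbour while forbidding recent swaps is what lets the search leave local optima.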


Implementing malleability on MPI jobs


13th International Conference on Parallel Architecture and Compilation Techniques

Authors: Utrera, G; Corbalán, J; Labarta, J (Univ Politecn Catalunya, Dept Arquitectura Computadors, E-08028 Barcelona, Spain)

ISBN: (print) 0769522297

Parallel jobs are characterized by processes that communicate and synchronize with each other frequently. A processor allocation strategy widely used in parallel supercomputers is Space-Sharing, that is, assigning a partition of processors to each job for its exclusive use. In this article we present a global solution to offer virtual Malleability on message-passing parallel jobs by applying a processor allocation strategy, Folding by JobType (FJT). This technique is based on the Folding and Moldability concepts and tries to decide the optimal initial number of processes, when to fold jobs, and the number of folding times by analyzing current and past system information. At the processor level, we apply Co-Scheduling. We implement and evaluate FJT under several workloads with different job sizes, classes, and machine utilization. Results show that FJT adapts easily to load changes and can obtain better performance than the other strategies evaluated, on workloads with a high coefficient of variation and especially with burst arrivals.

Keywords: parallel processing systems


From heterogeneous task scheduling to heterogeneous mixed parallel scheduling


10th International European Conference on Parallel Processing, Euro-Par 2004

Authors: Suter, Frédéric; Desprez, Frédéric; Casanova, Henri (Dept. of CSE, Univ. of California, San Diego, United States; LIP, ENS Lyon, UMR CNRS-ENS Lyon UCB Lyon-INRIA 5668, France; San Diego Supercomputer Center, Univ. of California, San Diego, United States)

ISBN: (print) 3540229248

Mixed parallelism, the combination of data and task parallelism, is a powerful way of increasing the scalability of entire classes of parallel applications on platforms comprising multiple compute clusters. While multi-cluster platforms are predominantly heterogeneous, previous work on mixed-parallel application scheduling targets only homogeneous platforms. In this paper we develop a method for extending existing scheduling algorithms for task-parallel applications on heterogeneous platforms to the mixed-parallel case. © Springer-Verlag Berlin Heidelberg 2004.

Keywords: Scheduling algorithms


Merging, sorting and matrix operations on the SOME-Bus multiprocessor architecture


FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2004, Vol. 20, No. 4, pp. 643-661

Author: Katsinis, C (Drexel Univ, Philadelphia, PA 19104, USA)

Due to advances in fiber optics and VLSI technology, interconnection networks which allow multiple simultaneous broadcasts are becoming feasible. This paper presents the multiprocessor architecture of the Simultaneous Optical Multiprocessor Exchange Bus (SOME-Bus), and examines the performance of representative algorithms for matrix operations, merging, and sorting, using the message-passing and distributed-shared-memory (DSM) paradigms. It shows that simple enhancements to the network interface and the cache and directory controllers can result in a communication time of O(1) for the matrix-vector multiplication algorithm using DSM. The SOME-Bus is a low-latency, high-bandwidth, fiber-optic interconnection network which directly links arbitrary pairs of processor nodes without contention, and can efficiently interconnect over 100 nodes. It contains a dedicated channel for the data output of each node, eliminating the need for global arbitration and providing bandwidth that scales directly with the number of nodes in the system. Each of the P nodes has an array of receivers, with one receiver dedicated to each node output channel. No node is ever blocked from transmitting by another transmitter or due to contention for shared switching logic. The entire P-receiver array can be integrated on a single chip at a comparatively minor cost, resulting in O(P) complexity. The SOME-Bus has much more functionality than a crossbar because it supports multiple simultaneous broadcasts of messages, allowing cache consistency protocols to complete much faster. (C) 2003 Elsevier B.V. All rights reserved.

Keywords: multiprocessors; broadcast architectures; numerical algorithms


Targeting heterogeneous architectures in ASSIST: Experimental results


10th International European Conference on Parallel Processing, Euro-Par 2004

Authors: Aldinucci, M.; Campa, S.; Coppola, M.; Magini, S.; Pesciullesi, P.; Potiti, L.; Ravazzolo, R.; Torquati, M.; Zoccolo, C. (Dept. of Computer Science, University of Pisa, Viale Buonarroti 2, Pisa, Italy; Inst. of Information Science and Technologies, CNR, Via Moruzzi 1, Pisa, Italy)

ISBN: (print) 3540229248

We describe how the ASSIST parallel programming environment can be used to run parallel programs on collections of heterogeneous workstations, and we evaluate the scalability of one real task-farm application and a data-parallel benchmark, comparing the actual performance figures measured when using homogeneous and heterogeneous workstation clusters. We also describe the ASSIST approach to heterogeneous distributed shared memory and provide preliminary performance figures for the current implementation. © Springer-Verlag Berlin Heidelberg 2004.

Keywords: parallel programming


Static Placement, Dynamic Issue (SPDI) scheduling for EDGE architectures


Proceedings - 13th International Conference on Parallel Architectures and Compilation Techniques (PACT 2004)

Authors: Nagarajan, Ramadass; Kushwaha, Sundeep K.; Burger, Doug; McKinley, Kathryn S.; Lin, Calvin; Keckler, Stephen W. (Comp. Arch. and Technol. Laboratory, Department of Computer Sciences, University of Texas, Austin)

ISBN: (print) 0769522297

Technology trends present new challenges for processor architectures and their instruction schedulers. Growing transistor density will increase the number of execution units on a single chip, and decreasing wire transmission speeds will cause long and variable on-chip latencies. These trends will severely limit the two dominant conventional architectures: dynamic issue superscalars, and static placement and issue VLIWs. We present a new execution model, called Static Placement Dynamic Issue (SPDI), in which the hardware and static scheduler instead work cooperatively. This paper focuses on the static instruction scheduler for SPDI. We identify and explore three issues SPDI schedulers must consider: locality, contention, and depth of speculation. We evaluate a range of SPDI scheduling algorithms executing on an Explicit Data Graph Execution (EDGE) architecture. We find that a surprisingly simple one achieves an average of 5.6 instructions per cycle (IPC) for SPEC2000 on a 64-wide issue machine, and is within 80% of the performance without on-chip latencies. These results suggest that the compiler is effective at balancing on-chip latency and parallelism, and that the division of responsibilities between the compiler and the architecture is well suited to future systems.

Keywords: Computer architecture


A self-controlled and dynamically reconfigurable architecture


Working Conference on Distributed and Parallel Embedded Systems (DIPES 2004), held at the 18th World Computer Congress

Authors: Dittmann, F; Rettberg, A (Univ Gesamthsch Paderborn, D-4790 Paderborn, Germany)

ISBN: (print) 1402081480

Reconfigurable systems have the potential to combine the performance of ASICs with the flexibility of software. The architecture presented in this paper offers a new concept for reconfiguration by operating in a self-timed and self-controlling manner. Data is routed together with its control information in a so-called packet through the operator network to make local decisions concerning the behavior of the network. Therefore, we can realize different paths without a central control unit. In this paper, we describe the architecture from the aspect of reconfiguration. An example shows the architecture in practical operation.

Keywords: high-level synthesis; reconfigurable architectures; embedded systems


Decentralized reactive clustering for collaborative processing in sensor networks


10th International Conference on Parallel and Distributed Systems (ICPADS 2004)

Authors: Xu, YY; Qi, HR (Univ Tennessee, Dept Elect & Comp Engn, Knoxville, TN 37996, USA)

ISBN: (print) 0769521525

A sensor network forms a loosely coupled distributed environment where collaborative processing among multiple sensor nodes is essential in order to compensate for the limitations of each sensor node in processing capability, sensing capability, and energy usage, as well as to improve the degree of fault tolerance. Due to the sheer number of nodes deployed, collaboration is usually carried out among nodes within the same cluster. Different clustering protocols can affect the performance of the network to a great extent. Most existing clustering protocols either do not adequately address the energy-constraint problem or derive clusters proactively, which may not be suitable for event-driven collaborative processing in sensor networks. This paper focuses on the design of clustering protocols for collaborative processing. We propose a decentralized reactive clustering (DRC) protocol in which the clustering procedure is initiated only when events are detected. It uses a power control technique to minimize energy usage in forming clusters. We compare the performance of DRC with another popular clustering algorithm, LEACH. Simulation results show that DRC yields considerable improvement over LEACH in energy conservation and network lifetime.

Keywords: Sensor data fusion


Efficient parallel hierarchical clustering


10th International European Conference on Parallel Processing, Euro-Par 2004

Authors: Dash, Manoranjan; Petrutiu, Simona; Scheuermann, Peter (Department of Information Systems, School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore; Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, United States)

ISBN: (print) 3540229248

Hierarchical agglomerative clustering (HAC) is a common clustering method that outputs a dendrogram showing all N levels of agglomerations, where N is the number of objects in the data set. High time and memory complexities are some of the major bottlenecks in its application to real-world problems. Parallel algorithms have been proposed in the literature to overcome these limitations. But, as this paper shows, existing parallel HAC algorithms are inefficient due to ineffective partitioning of the data. We first show how HAC follows a rule where most agglomerations have very small dissimilarity and only a small portion towards the end have large dissimilarity. Partially overlapping partitioning (POP) exploits this principle and obtains efficient yet accurate HAC algorithms. The total number of dissimilarities is reduced by a factor close to the number of cells in the partition. We present pPOP, the parallel version of POP, which is implemented on a shared-memory multiprocessor architecture. Extensive theoretical analysis and experimental results are presented and show that pPOP gives close to linear speedup and significantly outperforms the existing parallel algorithms in both CPU time and memory requirements. © Springer-Verlag Berlin Heidelberg 2004.

Keywords: parallel algorithms
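
The observation in the abstract above, that most agglomerations have small dissimilarity and only the last few are large, can be seen with a naive sequential HAC sketch. This is not POP or pPOP: single linkage, 1-D integer data, and the function name `single_link_hac` are assumptions chosen only to keep the illustration short.

```python
def single_link_hac(points):
    """Naive single-linkage agglomerative clustering of 1-D points.

    Returns the merges in the order they are performed, each as a
    (dissimilarity, cluster_a, cluster_b) tuple. N points give N - 1 merges.
    """
    clusters = [[x] for x in points]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest single-link distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((d, clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


if __name__ == "__main__":
    for d, a, b in single_link_hac([0, 1, 2, 50, 51, 200]):
        print(d, a, b)
```

On this toy input the first three merges have dissimilarity 1 and only the last two are large (48 and 149), so distant points rarely need to be compared early on. That is the property a partitioning scheme can exploit.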


Preconditioned iterative linear solvers for unstructured grids on the Earth Simulator


7th International Conference on High Performance Computing and Grid in Asia Pacific Region (HPCAsia 2004)

Author: Nakajima, K (Univ Tokyo, Dept Earth & Planetary Sci, Tokyo, Japan)

ISBN: (print) 076952138X

Efficient parallel preconditioned iterative linear solvers for unstructured grids have been developed for symmetric multiprocessor (SMP) cluster architectures with vector processors, such as the Earth Simulator. Three types of preconditioning methods (ICCG, multigrid, and selective blocking for contact problems) have been developed, and performance has been demonstrated on the Earth Simulator using flat-MPI and hybrid parallel programming models, where each of the three preconditioning methods corresponds to typical finite-element type applications in solid earth simulation developed in the GeoFEM project. Simple 3D linear elastic problems with more than 2.2×10^9 DOF have been solved using the 3×3 block ICCG(0) method and PDJDS/CM-RCM reordering on 176 nodes of the Earth Simulator, achieving performance of 3.80 TFLOPS. Multicolor and RCM ordering provide excellent parallel and vector performance of the three preconditioned methods on the Earth Simulator.

Keywords: parallel processing systems
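
As background for the preconditioned iterative solvers named in the abstract above, here is a small sketch of the preconditioned conjugate gradient iteration. It deliberately swaps in a Jacobi (diagonal) preconditioner instead of the incomplete Cholesky factorization used by ICCG, works on dense Python lists, and is sequential. The function name `pcg_jacobi` and its parameters are assumptions for illustration, not the paper's implementation.

```python
def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A (list of rows)
    using conjugate gradients with a diagonal (Jacobi) preconditioner."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n
    r = list(b)                              # residual b - A x for x = 0
    z = [r[i] / A[i][i] for i in range(n)]   # z = M^{-1} r with M = diag(A)
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:           # converged: residual norm small
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)              # step length along direction p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [r[i] / A[i][i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz                   # keeps directions A-conjugate
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x
```

Replacing the line that computes `z` with a solve against an incomplete Cholesky factor would turn this skeleton into ICCG. The rest of the loop stays the same, which is why the choice of preconditioner can be studied separately from the iteration.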

