Due to advances in fiber-optics and VLSI technology, interconnection networks which allow multiple simultaneous broadcasts are becoming feasible. This paper presents the multiprocessor architecture of the Simultaneous Optical Multiprocessor Exchange Bus (SOME-Bus) and examines the performance of representative algorithms for matrix operations, merging and sorting, using the message-passing and distributed-shared-memory (DSM) paradigms. It shows that simple enhancements to the network interface and to the cache and directory controllers can result in a communication time of O(1) for the matrix-vector multiplication algorithm using DSM. The SOME-Bus is a low-latency, high-bandwidth, fiber-optic interconnection network which directly links arbitrary pairs of processor nodes without contention, and can efficiently interconnect over 100 nodes. It contains a dedicated channel for the data output of each node, eliminating the need for global arbitration and providing bandwidth that scales directly with the number of nodes in the system. Each of the P nodes has an array of receivers, with one receiver dedicated to each node output channel. No node is ever blocked from transmitting by another transmitter or by contention for shared switching logic. The entire P-receiver array can be integrated on a single chip at a comparatively minor cost, resulting in O(P) complexity. The SOME-Bus offers more functionality than a crossbar because it supports multiple simultaneous broadcasts of messages, which allows cache consistency protocols to complete much faster. (C) 2003 Elsevier B.V. All rights reserved.
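The O(1) communication claim can be illustrated with a toy model. The sketch below is not the paper's algorithm: it assumes node i owns row i of the matrix and element x[i] of the vector, and it only counts communication steps. The function names `matvec_simultaneous_broadcast` and `matvec_shared_bus` are hypothetical. With one dedicated output channel per node, all P broadcasts happen in the same step; on a single shared bus (used here only as a contrast), they must be serialized.

```python
def matvec_simultaneous_broadcast(A, x):
    """Toy model: node i owns row A[i] and element x[i] (an assumption)."""
    P = len(x)
    # One communication step: every node broadcasts x[i] on its own
    # dedicated channel, and each node's receiver array collects all
    # P values at the same time.
    received = [list(x) for _ in range(P)]
    comm_steps = 1
    # Local computation: node i forms the dot product of its row with x.
    y = [sum(A[i][j] * received[i][j] for j in range(P)) for i in range(P)]
    return y, comm_steps


def matvec_shared_bus(A, x):
    """Contrast case: one shared bus, so only one sender per step."""
    P = len(x)
    received = [[0] * P for _ in range(P)]
    comm_steps = 0
    for sender in range(P):
        # The sender occupies the bus; every node stores the value.
        for node in range(P):
            received[node][sender] = x[sender]
        comm_steps += 1
    y = [sum(A[i][j] * received[i][j] for j in range(P)) for i in range(P)]
    return y, comm_steps
```

Both functions return the same product; only the step count differs (1 versus P), which is the scaling behavior the abstract attributes to the dedicated per-node channels.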
In the last decade various soft computing techniques have been developed. They include neural networks, fuzzy systems, evolutionary algorithms, rough sets and others. In many applications it is desirable that soft com...
ISBN: 3540229248 (print)
The proceedings contain 144 papers. The special focus in this conference is on Support Tools, Performance Evaluation, Scheduling, Load Balancing and Compilers for High Performance. The topics include: an approach based on components, web services and workflow tools; some techniques for automated, resource-aware distributed and mobile computing in a multi-paradigm programming system; support tools and environments; efficient pattern search in large traces through successive refinement; adaptive control system with hardware performance counters; a tool for source-to-source transformations and real-life compiler tests; a time-coherent model for the steering of parallel simulations; dynamic performance tuning environment; imprecise exceptions in distributed parallel components; a data structure oriented monitoring environment for Fortran OpenMP programs; an approach for symbolic mapping of memory references; evaluating OpenMP performance analysis tools with the APART test suite; understanding the behavior and performance of non-blocking communications in MPI; generation of simple analytical models for message passing applications; parallel PEPS tool performance analysis using stochastic automata networks; scheduling under conditions of uncertainty; scheduling tasks sharing files from distributed repositories; lookahead scheduling for reconfigurable grid systems; more legal transformations for locality; a polyhedral approach to ease the composition of program transformations; using data compression to increase energy savings in multi-bank memories; architecture-independent meta-optimization by aggressive tail splitting; parallel and distributed databases, data mining and knowledge discovery; and a large-scale digital library system to integrate heterogeneous data of distributed databases.
In the era of future embedded systems the designer is confronted with multi-processor architectures both for performance and energy reasons. Exploiting (sub)task-level parallelism is becoming crucial because the instr...
Distributed Video-on-Demand (DVoD) systems are proposed as a solution to the limited streaming capacity and lack of scalability of centralized systems. In such fully decentralized architectures with storage constraints, s...
ISBN: 0819453226 (print)
New principles for organizing the parallel input-output of optical information from fiber-optic measuring lines into a personal computer are considered. The device has a block structure and two modes of operation: a calibration mode and a working mode. In the calibration mode, the computing system adapts to the conditions of the problem being solved, namely reconstructing information about the parameters of the monitored physical fields. In the working mode, the device performs adaptive processing of the incoming optical radiation.
ISBN: 354022856X (print)
We consider the problem of scheduling n jobs with release dates on m identical parallel batch processing machines so as to minimize the maximum lateness. Each batch processing machine can process up to B (B < n) jobs simultaneously as a batch, and the processing time of a batch is the largest processing time among the jobs in the batch. Jobs processed in the same batch start and complete at the same time. We present a polynomial time approximation scheme (PTAS) for this problem.
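The batch model above can be made concrete with a small evaluator. This is only a sketch of the objective, not the PTAS from the paper: it computes the maximum lateness of a schedule that is already given. The `Job` fields and the `max_lateness` function are hypothetical names, and the due date per job is an assumption implied by the lateness objective rather than stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Job:
    release: float     # earliest time the job may start
    processing: float  # processing time of the job
    due: float         # due date (assumed; needed to define lateness)


def max_lateness(machines, B):
    """Evaluate a given schedule.

    machines: one entry per machine; each entry is a list of batches in
    processing order; each batch is a non-empty list of at most B jobs.
    """
    worst = float("-inf")
    for batches in machines:
        t = 0.0  # time at which this machine becomes free
        for batch in batches:
            if not batch or len(batch) > B:
                raise ValueError("each batch must hold between 1 and B jobs")
            # A batch starts once the machine is free and all its jobs are released.
            start = max(t, max(j.release for j in batch))
            # Batch processing time is the largest processing time in the batch.
            finish = start + max(j.processing for j in batch)
            # All jobs in the batch complete together at `finish`.
            worst = max(worst, max(finish - j.due for j in batch))
            t = finish
    return worst
```

For example, with B = 2 and one machine running the batches [J1, J2] then [J3], where J1 = Job(0, 2, 3), J2 = Job(1, 3, 4) and J3 = Job(0, 1, 10), the first batch runs from time 1 to 4 and the second from 4 to 5.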
Scientific computing is evolving from parallel processing to distributed computing with the availability of new computing infrastructures such as computational grids. We investigate the design of a component model for...
ISBN: 9783540246442 (print)
The proceedings contain 35 papers. The special focus in this conference is on Languages and Compilers for Parallel Computing. The topics include: search space properties for mapping coarse-grain pipelined FPGA applications; adapting convergent scheduling using machine-learning; time-sensitive, flow-specific profiling at runtime; a hierarchical model of reference affinity; cache optimization for coarse grain task parallel processing using inter-array padding; compiler-assisted cache replacement; memory-constrained data locality optimization for tensor contractions; compositional development of parallel programs; supporting high-level abstractions through XML technology; applications of HPJava; programming for locality and parallelism with hierarchically tiled arrays; evaluating the impact of programming language features on the performance of parallel applications on cluster architectures; putting polyhedral loop transformations to work; index-association based dependence analysis and its application in automatic parallelization; improving the performance of Morton layout by array alignment and loop unrolling; space-aware programming for networks of embedded systems; memory redundancy elimination to improve application energy efficiency; adaptive MPI; polynomial-time algorithms for enforcing sequential consistency in SPMD programs with arrays; a system for automating application-level checkpointing of MPI programs; the power of Belady's algorithm in register allocation for long basic blocks; load elimination in the presence of side effects, concurrency and precise exceptions; a preliminary study on the vectorization of multimedia applications for multimedia extensions; a data cache with dynamic mapping; compiler-based code partitioning for intelligent embedded disk processing; compilation for nanocontrollers; and slice-hoisting for array-size inference in MATLAB.
Parallel software reuse and easy integration between parallel programs and other sequential/parallel applications and software layers can be obtained by exploiting the software component paradigm. In this paper we descri...