Many problems of distributed object-oriented applications can be resolved uniformly within an approach based on the concept of a cover. The cover is defined as an environment that transparently controls all aspects of the objects' community life: creation, interaction, etc. To enable transparency, an object-oriented application must obey the principle of late binding: the client obtains a reference to a server object at run time from the system environment. Cover services are implemented with the technique of metaobject control, which extends a program's semantics without changing the program code by attaching additional method calls to each application object invocation. A special language (TL), in which the user can incrementally define new metaservices, is described and illustrated by numerous examples.
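The interception idea in this abstract can be illustrated with a minimal Python sketch. It is not the TL language or the authors' system: the names `Cover`, `_Proxy`, `register`, `lookup`, and `Counter` are hypothetical, and the "metaservice" here is simply a call log. The sketch only shows a client obtaining its reference at run time and extra calls being attached around each invocation without touching the application class.

```python
class Cover:
    """Hypothetical cover: resolves names at run time and wraps every object it hands out."""

    def __init__(self):
        self._servers = {}
        self.log = []  # stands in for a metaservice (assumption for illustration)

    def register(self, name, obj):
        self._servers[name] = obj

    def lookup(self, name):
        # Late binding: the client asks the environment for its reference at run time.
        return _Proxy(self._servers[name], name, self)


class _Proxy:
    """Forwards attribute access to the target and wraps method calls."""

    def __init__(self, target, name, cover):
        self._target = target
        self._name = name
        self._cover = cover

    def __getattr__(self, attr):
        # Only called for attributes not found on the proxy itself.
        member = getattr(self._target, attr)
        if not callable(member):
            return member

        def invoke(*args, **kwargs):
            # Additional calls attached before and after the application call.
            self._cover.log.append(("before", self._name, attr))
            result = member(*args, **kwargs)
            self._cover.log.append(("after", self._name, attr))
            return result

        return invoke


class Counter:
    """Ordinary application class; it knows nothing about the cover."""

    def __init__(self):
        self.value = 0

    def increment(self, step=1):
        self.value += step
        return self.value


cover = Cover()
cover.register("counter", Counter())
counter = cover.lookup("counter")
total = counter.increment(2)
```

Because `Counter` is unchanged, a different metaservice (tracing, access control, replication) could be swapped in by changing only what the proxy does around `member(...)`.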
To compute the fast Fourier transform (FFT) efficiently, various parallel algorithms and their implementations on multiprocessors and multicomputers have been developed. In general, a local interconnection network is faster than a global one, but its capability depends on the network architecture. A global interconnection network is slower, but it does not depend on the network architecture and provides a flexible communication interface to the programmer. In this paper, we discuss parallel radix-R FFT algorithms on a multiprocessor or multicomputer system with a global interconnection network. We propose two algorithms: a stage-by-stage method and a multi-stage method. We also estimate the communication time and show that it is highly sensitive to the data exchange strategy. Finally, we implement these algorithms on two commercial massively parallel computers (nCUBE/2 and CM5) and measure the communication time.
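For readers unfamiliar with the stage structure the abstract refers to, the following is a minimal sequential sketch of an iterative FFT for the special case R = 2. It is a textbook decimation-in-time formulation, not the authors' parallel algorithm, and the function name `fft_stage_by_stage` is an assumption. Each pass of the outer `while` loop is one butterfly stage; in a distributed setting, the pairing of elements `span` apart is where data would have to be exchanged between processors.

```python
import cmath


def fft_stage_by_stage(x):
    """Iterative radix-2 FFT; the input length must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1

    # Bit-reversal permutation so that every stage can work in place.
    a = [0j] * n
    for i in range(n):
        rev = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        a[rev] = complex(x[i])

    span = 1  # distance between the two inputs of a butterfly
    while span < n:  # one iteration per stage, log2(n) stages in total
        step = cmath.exp(-1j * cmath.pi / span)  # twiddle factor increment
        for start in range(0, n, 2 * span):
            w = 1 + 0j
            for k in range(span):
                u = a[start + k]
                v = a[start + k + span] * w
                a[start + k] = u + v
                a[start + k + span] = u - v
                w *= step
        span *= 2
    return a
```

Merging several consecutive stages between exchanges, so that fewer communication steps are needed, is the general idea a multi-stage variant would exploit; the sketch above does not attempt that.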
ISBN: 0818679778 (print)
Over the last decade, significant advances have been made in compilation technology for capitalizing on instruction-level parallelism (ILP). The vast majority of ILP compilation research has been conducted in the context of general-purpose computing, and more specifically the SPEC benchmark suite. At the same time, a number of microprocessor architectures have emerged with VLIW and SIMD structures that are well matched to the needs of ILP compilers. Most of these processors are targeted at embedded applications such as multimedia and communications, rather than general-purpose systems. Conventional wisdom, and a history of hand optimization of inner loops, suggests that ILP compilation techniques are well suited to these applications. Unfortunately, there is currently a gap between the compiler community and embedded applications developers. This paper presents MediaBench, a benchmark suite designed to fill this gap. The suite was constructed through a three-step process: intuition- and market-driven initial selection, experimental measurement to establish uniqueness, and integration with system synthesis algorithms to establish usefulness.
ISBN: 9783540627883 (print)
The proceedings contain 37 papers. The special focus in this conference is on Evolutionary Methods for Modeling, Training and Alternative Frameworks for the Computational Study of Evolutionary Social Systems. The topics include: complexity formalisms, order and disorder in the structure of art; the application of evolutionary computation to selected problems in molecular biology; parallel evolutionary programming for constructing artificial neural networks; scaling behavior of the evolution strategy when evolving neuronal control architectures for autonomous agents; an object oriented simulation platform applied to markets and organizations; an agent-based computational model for the evolution of trade networks; performance-enhanced genetic programming; comparing subtree crossover with macromutation; composing 16th-century counterpoint with genetic programming and symbiosis; design of a high-gain operational amplifier and other circuits by means of genetic programming; modeling speculators with genetic programming; fast evolution strategies; airspace congestion smoothing by stochastic optimization; evolutionary optimization based on Lagrangian with constraint scaling; solving static and dynamic fuzzy constraint networks using evolutionary hill-climbing; applying family competition to evolution strategies for constrained optimization; supporting polyploidy in genetic algorithms using dominance vectors; an individually variable mutation-rate strategy for genetic algorithms; inductive learning of mutation step-size in evolutionary parameter optimization; a note on the escape probabilities for two alternative methods of selection under Gaussian mutation; raising theoretical questions about the utility of genetic algorithms; some geometric and algebraic results on crossover and an analysis of evolutionary algorithms based on neighborhood and step sizes.
With increasing on-chip hardware, concurrency is a way to bridge the gap between the computational power demanded by applications and that afforded by computer platforms. Although parallel systems are increasingly popular, they remain very difficult to program. In fact, most compilers require the programmer to specify how to partition data or map program code to the system's processors. To ensure an effective program, cache locality is important because of the large speed gap between microprocessors and memory systems. It is also important to use local communication whenever possible, since it is cheaper, faster, and less power-hungry than global communication. To exploit these locality properties, we present a systematic operation placement and scheduling scheme for fine-grain parallel architectures. The key advantages are twofold: (1) This multiprojection method, which deals with multidimensional parallelism systematically, can alleviate the burden on the programmer in coding and data partitioning. (2) It addresses the memory/communication bandwidth bottleneck and can lead to faster program execution. On a design example of the motion estimation block-matching algorithm, which requires the most intensive computation and memory accesses in video coding, our method leads to a reduction of external memory accesses by two to three orders of magnitude.
We propose "Asymmetric Distributed Shared Memory" (ADSM), which provides users with an efficient shared memory model. The ADSM is a hybrid system that needs not only operating system support but also compiler support. The ADSM executes a load instruction as a shared read with the assistance of the virtual memory mechanism. For a shared write, the ADSM executes a sequence of consistency management instructions after the corresponding store instruction. We describe an algorithm to reduce the overhead of consistency management. The algorithm coalesces sequences of consistency management instructions using information about affine memory accesses. The coalescing algorithm is evaluated using the SPLASH-2 benchmark. The performance evaluation shows that the coalescing algorithm achieves an execution time improvement over the non-optimized result ranging from 76% to 85%.
ISBN: 0818678704 (print)
Data parallel languages were proposed to address the programming problems of distributed memory machines at the programming language level. Among data parallel languages, HPF is a standard across a variety of high-performance architectures. Most HPF compilers are source-to-source translators because these are easy to implement. However, such source-to-source compilers produce a significant amount of inefficient code. In particular, the FORALL construct is converted into several DO loops, which increases loop overhead. We therefore propose techniques for converting a FORALL construct into an optimized DO loop. For this, we define and use a relation distance vector, which can represent both data dependence information and flow information. We then evaluate and analyze the execution time of the code converted by our method and by the PARADIGM method.
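The overhead described in this abstract comes from FORALL semantics: every right-hand side is evaluated before any element is assigned. The Python sketch below, written as an illustration and not taken from the paper, models the hypothetical statement `FORALL (i = 2:n) a(i) = a(i-1) + 1` (0-based in Python). The first function is the conservative translation with a temporary and two loops; the second fuses them into one loop by choosing an iteration order that respects the dependence. It uses plain reasoning about the dependence direction, not the relation distance vector defined by the authors, and both function names are assumptions.

```python
def forall_two_loops(a):
    """Conservative translation: compute all right-hand sides, then assign."""
    n = len(a)
    tmp = [0] * n
    for i in range(1, n):  # loop 1: evaluate every right-hand side
        tmp[i] = a[i - 1] + 1
    out = list(a)
    for i in range(1, n):  # loop 2: perform every assignment
        out[i] = tmp[i]
    return out


def forall_single_loop(a):
    """Fused translation: one loop, no temporary array.

    Element i reads element i-1, so iterating downward guarantees that
    each value is read before it is overwritten.
    """
    out = list(a)
    for i in range(len(a) - 1, 0, -1):
        out[i] = out[i - 1] + 1
    return out


def forall_wrong_order(a):
    """Same fusion with upward iteration, shown only to illustrate the hazard."""
    out = list(a)
    for i in range(1, len(a)):
        out[i] = out[i - 1] + 1  # reads a value already overwritten
    return out
```

For the input `[5, 1, 7, 3]`, the first two functions agree, while the upward-iterating version propagates the new values and produces a different result, which is why a translator needs dependence information before fusing the loops.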
Parallel applications with inconstant usage patterns present a big challenge to programmers, in that the spawning of tasks and the communication between them may be conditional (termed "conditional parallel programming"). Ideally, the programmer should not be burdened by operational issues that have little relationship to the application itself. This paper proposes a new parallel programming environment, ATME, to automate task scheduling in conditional parallel programming. By adaptively producing accurate estimates of the task model prior to execution, ATME modifies task distribution to improve system and application performance.
The proceedings contain 31 papers from the XVII International Conference of the Chilean Computer Science Society. Topics discussed include: paraconsistent evidential logic programming languages; software architectures; constrained Steiner tree problems; visualization systems; visual query systems; deductive databases; fault-tolerant routings; parallel/distributed implementation environments; domain analysis support tools; balanced data structures; HyperRed-Black trees; Seljuk-Amoeba operating environments; superscalar sorting algorithms; interactive hypermedia literary stories; vehicle routing problems; object-oriented database systems; generalized edge-toughness; meeting scheduling systems; and cooperative databases.
ISBN: 0818678313 (print)
The proceedings contain 42 papers. The topics discussed include: managing dependencies - a key problem in fault-tolerant distributed algorithms; an approach to fault-tolerant parallel processing on intermittently idle, heterogeneous workstations; renegotiable quality of service - a new scheme for fault tolerance in wireless networks; evaluation of a 32-bit microprocessor with built-in concurrent error-detection; probabilistic checkpointing; portable checkpointing for heterogeneous architectures; a communication-induced checkpointing protocol that ensures rollback-dependency trackability; a method to automate user interface testing using variable finite state machines; towards a statistical approach to testing object-oriented programs; and experimental evaluation of failure-detection schemes in real-time communication networks.