ISBN: 9780897918169 (print)
This paper presents an extensive empirical evaluation of an interprocedural parallelizing compiler, developed as part of the Stanford SUIF compiler system. The system incorporates a comprehensive and integrated collection of analyses, including privatization and reduction recognition for both array and scalar variables, and symbolic analysis of array subscripts. The interprocedural analysis framework is designed to provide analysis results nearly as precise as full inlining but without its associated costs. Experimentation with this system shows that it is capable of detecting a coarser granularity of parallelism than previously possible. Specifically, it can parallelize loops that span numerous procedures and hundreds of lines of code, frequently requiring modifications to array data structures such as privatization and reduction transformations. Measurements from several standard benchmark suites demonstrate that an integrated combination of interprocedural analyses can substantially advance the capability of automatic parallelization technology.
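To make the two transformations named in this abstract concrete, the following is a minimal sketch of scalar privatization and reduction recognition applied by hand to a toy loop. It is not taken from the SUIF system; the function names (`sequential`, `parallel`, `_chunk_sum`), the chunking scheme, and the use of a thread pool are all assumptions chosen for illustration, and the thread pool only shows the structure of the transformed loop rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential(a):
    """Original loop: `t` is reused in every iteration, `total` is accumulated."""
    total = 0
    for x in a:
        t = x * x        # scalar temporary, always written before it is read
        total += t + 1   # sum reduction into `total`
    return total


def _chunk_sum(chunk):
    """One worker: its own private `t` and its own partial sum."""
    partial = 0
    for x in chunk:
        t = x * x        # private copy, so no cross-iteration dependence on `t`
        partial += t + 1
    return partial


def parallel(a, workers=4):
    """Hand-transformed loop: privatize `t`, split `total` into partial sums."""
    size = max(1, (len(a) + workers - 1) // workers)
    chunks = [a[i:i + size] for i in range(0, len(a), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The partial sums are combined after all workers finish.
        return sum(pool.map(_chunk_sum, chunks))
```

Integer data is used so that reordering the additions cannot change the result; with floating-point values a reduction transformation may alter rounding.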
This paper presents the computational market of SIGMA (System of Information Gathering Market-based Agents) as a model of decentralized decision making for the task of information filtering in multidimensional spaces ...
Parallel simulation has been an active research area for more than a decade. The parallel simulation community needs a common benchmark suite for performance evaluation of parallel simulation environments. Performance evaluation of a parallel simulation environment is harder than evaluating a parallel processing system, since the underlying system is composed not only of the architecture and operating system, but also of the simulation kernel. Thus, simulation kernel designers often confront a twofold task: (i) to evaluate how efficiently their simulation kernel runs on certain architectures; and (ii) to evaluate how simulation problems scale using this kernel. In this paper we advocate an incremental benchmarking methodology that focuses on the evaluation of a parallel simulation system based on Time Warp. We start from a reduced set of ping models that can effectively estimate the various overheads, contention and latencies of Time Warp running on a multiprocessor. The benchmark suite has been used to locate several sources of overhead in an existing Time Warp implementation. Using this benchmark suite we also compare the performance of the improved version of the Time Warp implementation with the original one.
In this paper we propose a general framework for viewing a class of heuristics for track assignment in channel routing from a purely graph theoretic angle. Within this framework we propose algorithms for computing routing solutions using an optimal or near-optimal number of tracks for several well-known benchmark channels in the two-layer VH, three-layer HVH, and multi-layer V_iH_i and V_iH_{i+1} routing models. Within the same framework we also design an algorithm for minimizing the total wire length in the two-layer VH and three-layer HVH routing models.
A concept of a two-level distributed processing model for local area networks is proposed. Level 1 concerns analysis of the current workload of each workstation. Dynamic task distribution, aimed at balancing the load of the whole network, is determined at level 2. To estimate the required parameters, two experiments are proposed; they determine the relative speeds of the workstations and the normalized task processing times. The proposed approach is the basis of the model's implementation in a Unix local area network.
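One way the two measured parameters could feed a task-distribution decision is sketched below. The abstract does not give the paper's actual policy, so everything here is an assumption: the `Workstation` class, the `assign` function, and the greedy rule of placing each task on the workstation with the earliest estimated finish time, computed as queued normalized work divided by relative speed.

```python
from dataclasses import dataclass


@dataclass
class Workstation:
    name: str
    relative_speed: float      # hypothetical: measured speed relative to a reference machine
    queued_work: float = 0.0   # hypothetical: sum of normalized processing times already assigned

    def finish_time(self, extra=0.0):
        """Estimated time to drain the queue plus `extra` normalized work."""
        return (self.queued_work + extra) / self.relative_speed


def assign(tasks, stations):
    """Greedily place each (task, normalized_time) pair on the earliest-finishing station."""
    placement = {}
    for task, cost in tasks:
        best = min(stations, key=lambda s: s.finish_time(cost))
        best.queued_work += cost
        placement[task] = best.name
    return placement
```

For example, with a workstation of relative speed 2.0 and another of speed 1.0, tasks with normalized times 4, 2, and 4 go to the fast, slow, and fast machine respectively under this rule.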
The performance of multicomputers is highly dependent on the underlying communication mechanisms, especially for a large-scale multicomputer machine with hundreds or thousands of processing elements. It is often requ...
The coupled simulation equivalence is slightly larger than observation equivalence. Where observation equivalence is based on weak bisimulations, coupled simulation equivalence is based on pairs of simulations which c...
This paper presents a hardware architecture and a software tool needed for future autonomous robots. Specific attention is given to the execution of artificial neural networks and to the need for a good inspection and...
Transition systems are a basic semantic model for formal description, specification, and analysis of concurrent and distributed systems. In order to describe and analyze aspects of reliability, such as the likelihood ...
In this paper, we assert that the window system plays a fundamental role in supporting multiple interaction channels distributed over a finite number of I/O devices. For historical as well as technical reasons, window sys...