The goal of the Performance Oriented End-to-end Modeling System (POEMS) project is to create and experimentally evaluate a problem solving environment for end-to-end performance modeling of complex parallel/distribute...
The steady growth in size and complexity of communication networks has necessitated corresponding advances in the underlying networking technologies, including communication protocols. This multi-faceted growth has made the analysis of today's ultra-large networks a complex task. Simulations have been used to model and analyze communication networks. Complete models of ultra-large networks need to be simulated in order to study crucial scalability and performance issues. Discrete event simulation of such ultra-large networks with limited hardware resources is complex due to their sheer size. This paper presents the issues involved in the design of a framework to enable ultra-large simulations consisting of millions of nodes. The parallel simulation techniques used, the application program interface needed for model development, and the results from experiments conducted using the framework are also presented.
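For readers unfamiliar with the technique named in this abstract, the following is a minimal sequential sketch of a discrete event simulation loop. It is not the framework's API, which the abstract does not describe: the names `run` and `forward`, the event tuple layout, and the one-time-unit hop delay are all assumptions made for illustration.

```python
import heapq
from itertools import count


def run(events, handle, until):
    """Process events in timestamp order until the queue empties or `until` is passed.

    events: iterable of (time, node, kind) tuples (hypothetical layout).
    handle: callback returning a list of new (time, node, kind) events.
    """
    tie = count()  # tie-breaker so events with equal timestamps pop in insertion order
    queue = [(t, next(tie), node, kind) for t, node, kind in events]
    heapq.heapify(queue)
    log = []
    while queue and queue[0][0] <= until:
        t, _, node, kind = heapq.heappop(queue)
        log.append((t, node, kind))
        for new_t, new_node, new_kind in handle(t, node, kind):
            heapq.heappush(queue, (new_t, next(tie), new_node, new_kind))
    return log


def forward(t, node, kind):
    # Hypothetical handler: a packet hops to the next node after 1.0 time units,
    # stopping once it reaches node 3.
    return [(t + 1.0, node + 1, "packet")] if node < 3 else []


log = run([(0.0, 0, "packet")], forward, until=10.0)
```

A parallel simulator of the kind the abstract discusses would partition the nodes across processors and synchronize their event queues; this sketch covers only the single-queue core.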
ISBN: 3540658211 (print)
The main features of gridless finite-size-particle (FSP) codes are discussed from the point of view of the performance that can be obtained, with respect to both the spatial-resolution level and the efficiency of parallel particle simulations. It is shown that such codes are particularly suited to particle-decomposition parallelization on distributed-memory architectures, as they present a strong reduction, in comparison with particle-in-cell (PIC) codes, of the memory and computational offsets related to storing and updating the replicated fluctuating-field arrays.
ISBN: 0769500439 (print)
The fast simulation of large networks of spiking neurons is a major task in the examination of biology-inspired vision systems. Networks of this type label features by synchronization of spikes, and there is strong demand to simulate those effects in a real-world environment. Because of the quite complex calculations required for one model neuron, the simulation of thousands or millions of these neurons is not efficient on existing hardware platforms. In order to simulate closer to the real-time requirement, it is necessary to implement dedicated hardware. Our aim is a hardware system consisting mainly of standard components which is as flexible as possible concerning the model neuron but as specialized as necessary to meet our performance requirements. Thus we decided to implement a parallel system with Digital Signal Processors (DSPs) offering a large on-chip memory. One main task of this work is the optimization of the simulation algorithm for the neurons distributed to the DSPs, that is, the sequential part of the simulation. This optimization benefits from the fact that only a very low percentage of neurons are simultaneously active in vision networks. For communication between the nodes, only spikes are distributed via a spike switching network. Processing of the network topology is realized by two different concepts. One idea is to compute the synapses autonomously on the processing node by representing a regular connection scheme with one connection mask for many neurons. Additional connections requiring adaptability and irregular connection schemes are stored in a shared memory. To avoid a bottleneck, synapse caching is used within each processing node. This paper describes the architecture of a DSP accelerator and shows its advantages with simulation results from a typical large vision network.
ISBN: 076950468X (print)
Parallel database systems are the key to high-performance database processing. In this paper, we propose parallel join algorithms for shared disk parallel database systems, where all coupled nodes are connected via a high-speed network and share a common database at the disk level. The proposed algorithms are novel in the sense that they provide a higher potential for dynamic load balancing by exploiting the inherent flexibility of the shared disk architecture. Using a parallel database simulation model, we evaluate the performance of the proposed algorithms under a wide variety of system configurations and database workloads.
In this paper, we design an efficient parallel algorithm for 3D image reconstruction. In our framework, the 3D image is reconstructed from a series of 2D images produced using ultrasonography, computed tomography, etc. The paper discusses a general parallel algorithm for 3D image reconstruction over the CRCW, CREW and EREW PRAM models. We have developed efficient implementations of this algorithm on a vector machine, a distributed system comprising a cluster of workstations, and various interconnection networks such as a mesh network and a reconfigurable bus network. The performance of the above algorithms is tested using simulation experiments performed for 3D image reconstruction of the vitreous region of the eye using ophthalmic ultrasonograms. A novel approximation scheme is also proposed that drastically improves performance for specific kinds of images. Results indicate that the time complexities of the algorithms agree with the expected theoretical values and that the images obtained have an uncompromised level of accuracy.
This paper addresses issues of task clustering (the coalition of several fine-grain tasks into single coarser-grain tasks called task clusters) and task cluster scheduling on distributed processors. The performance of various scheduling schemes is studied and compared for a variety of workloads. Simulation results indicate that the scheduling policy that gives priority to the cluster with the smallest cumulative service demand of all its tasks performs better than the other policies examined.
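The best-performing policy named in this abstract can be illustrated with a short sketch. The `TaskCluster` class, its field names, and the sample demands below are assumptions for illustration only; the abstract does not give the paper's data structures or workloads.

```python
from dataclasses import dataclass


@dataclass
class TaskCluster:
    name: str
    task_demands: list  # service demand of each task in the cluster (hypothetical units)

    @property
    def cumulative_demand(self):
        # Sum of the service demands of all tasks in the cluster.
        return sum(self.task_demands)


def order_by_smallest_cumulative_demand(clusters):
    """Return clusters in dispatch order: smallest cumulative service demand first."""
    return sorted(clusters, key=lambda c: c.cumulative_demand)


clusters = [
    TaskCluster("A", [4.0, 3.0]),       # cumulative demand 7.0
    TaskCluster("B", [1.0, 1.5]),       # cumulative demand 2.5
    TaskCluster("C", [2.0, 2.0, 2.0]),  # cumulative demand 6.0
]
order = [c.name for c in order_by_smallest_cumulative_demand(clusters)]
```

With these sample values the dispatch order is B, C, A. The sketch only ranks clusters; it does not model queueing, processor assignment, or the other policies the paper compares.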
This paper investigates the performance of the IEEE 802.11 wireless local area network (WLAN) protocol's Distributed Coordination Function (DCF) in the presence of mobile and hidden terminals. In order to study the joint effect of hidden terminals and user mobility on the performance of IEEE 802.11 DCF, we extend Tobagi and Kleinrock's hearing graph framework to model hidden terminals in a static environment. We derive a combined mobility and hidden terminal model using a Markov chain from the hearing graph of a given physical layout. The simple model uses two parameters: α, which controls the number of hidden terminals in the steady state, and λ, which controls the rate of mobility of each terminal. By varying the values of α and λ, we can systematically generate scenarios with different numbers of hidden terminals and different mobility rates for a particular physical layout with static obstructions. We have developed a discrete event simulator which uses the parameterized model to obtain the throughput and blocking probability behavior of an IEEE 802.11-based ad hoc network in the presence of certain static obstructions. Our simulations suggest that the IEEE 802.11 DCF protocol is robust enough to handle moderate conditions of hidden terminals and mobility, but that performance may degrade under extreme conditions. Carefully selecting protocol parameters (RTS and fragmentation thresholds) can help improve performance even under extreme conditions.
This paper presents a concept called the hierarchically grouped message to improve the performance of geographically distributed timed cosimulation. In the proposed method, messages that are transferred between simulators within a short period of simulated time are hierarchically grouped into one physical message, reducing both the number of rollbacks in optimistic simulation and the communication overhead of message transfer. Experiments show the efficiency of the proposed method in an internationally distributed cosimulation environment.
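The basic idea of packing messages that fall within a short span of simulated time into one physical message can be sketched as follows. This is a flat, single-level grouping only: the abstract does not describe how the hierarchy is built, so none is attempted here. The function name `group_messages`, the `(timestamp, payload)` layout, and the `window` parameter are assumptions for illustration.

```python
def group_messages(messages, window):
    """Pack (timestamp, payload) messages into batches.

    Each batch holds messages whose simulated timestamps lie within `window`
    time units of the first message in that batch; one batch would travel
    as one physical message.
    """
    batches = []
    current = []
    for ts, payload in sorted(messages, key=lambda m: m[0]):
        if current and ts - current[0][0] > window:
            batches.append(current)  # close the batch once the window is exceeded
            current = []
        current.append((ts, payload))
    if current:
        batches.append(current)
    return batches


messages = [(0.0, "a"), (0.4, "b"), (0.9, "c"), (2.0, "d"), (2.5, "e")]
batches = group_messages(messages, window=1.0)
```

Here five logical messages become two physical ones, which is the source of the reduced transfer overhead; the effect on rollbacks depends on the optimistic synchronization protocol and is not modeled.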
ISBN: 3540658211 (print)
From improved crash simulation to acoustic optimisation to innovative company-saving designs, distributed computing and meta-applications are enabling European industry to compete more effectively in many areas. Three such meta-applications are PROMENVIR, Optimus and TOOLSHED, all developed in recent ESPRIT projects. The PROMENVIR product provides users with all the functionality needed to perform Monte-Carlo analyses of complex problems, from satellite deployment to crash simulation. Optimus, from LMS, also enables the user to perform sensitivity analyses, but with a broader focus encompassing classical design-of-experiment techniques. TOOLSHED is a problem solving environment (PSE) in the truest sense of the word. The package will automate any analysis process, providing the user with CAD importation, mesh generation, computational steering and visualisation in addition to the standard PSE requirements of seamless data transfer and transparent execution of tasks. The parallel implementation of LUSAS, an FE solver from FEA Ltd. developed in the ESPRIT project PARACOMP, has been designed to run on a dual-use cluster of NT machines, and has many requirements in common with the other meta-applications to be discussed. All of these products have interfaces to Intrepid, the Intelligent Resource Manager from PAC. Intrepid provides each of these packages with full meta-computing management, from a single point of control for the definition of a meta-computer to a clean, intuitive API through which they can control the execution of jobs. It makes use of performance models of applications to determine the CPU load, disk and memory requirements. These parameters are used to ensure both that tasks have the resources they need to execute and that execution of the entire problem is carried out in the most efficient manner. This paper will present an overview of the design of Intrepid, its interaction with these three packages, and industrial examples of its application.