ISBN: (print) 9781467323703; 9781467323727
In Parallel Discrete Event Simulation (PDES), the simulation model is partitioned into a set of distinct Logical Processes (LPs), which are allowed to execute simulation events concurrently. In this work we present an innovative approach to load-sharing on multi-core/multiprocessor machines, targeted at the optimistic PDES paradigm, where LPs are speculatively allowed to process simulation events with no preventive verification of causal consistency, and actual consistency violations (if any) are recovered via rollback techniques. In our approach, each simulation kernel instance, in charge of hosting and executing a specific set of LPs, runs a set of worker threads, which can be dynamically activated or deactivated on the basis of a distributed algorithm. The latter relies in turn on an analytical model that indicates how to reassign processor/core usage across the kernels in order to handle the simulation workload as efficiently as possible. We also present a real implementation of our load-sharing architecture within the ROme OpTimistic Simulator (ROOT-Sim), an open-source C-based simulation platform implemented according to the PDES paradigm and the optimistic synchronization approach. Experimental results assessing the validity of our proposal are presented as well.
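The optimistic execution and rollback described in this abstract can be illustrated with a toy sketch. This is not ROOT-Sim code and it does not cover the load-sharing algorithm; the class `LogicalProcess`, its methods, and the order-dependent toy state update are all hypothetical, and anti-messages to other LPs are deliberately left out.

```python
class LogicalProcess:
    """Toy optimistic LP (hypothetical, not the ROOT-Sim API)."""

    def __init__(self):
        self.lvt = 0.0        # local virtual time
        self.state = 0        # toy state; the update below is order-dependent
        self.processed = []   # log of (timestamp, payload, state_before)
        self.rollbacks = 0

    def _execute(self, timestamp, payload):
        # Save the state before the event so that it can be restored later.
        self.processed.append((timestamp, payload, self.state))
        self.state = self.state * 2 + payload
        self.lvt = timestamp

    def _rollback(self, timestamp):
        # Undo every event with a timestamp greater than the straggler's.
        undone = []
        while self.processed and self.processed[-1][0] > timestamp:
            ts, payload, state_before = self.processed.pop()
            self.state = state_before
            undone.append((ts, payload))
        self.lvt = self.processed[-1][0] if self.processed else 0.0
        self.rollbacks += 1
        undone.reverse()      # back to increasing timestamp order
        return undone

    def process(self, timestamp, payload):
        # Speculative execution: no causality check before processing.
        undone = []
        if timestamp < self.lvt:
            # Straggler event: a causal violation was detected after the fact.
            undone = self._rollback(timestamp)
        self._execute(timestamp, payload)
        for ts, pl in undone:
            self._execute(ts, pl)   # re-execute the undone events


lp = LogicalProcess()
lp.process(1.0, 1)
lp.process(3.0, 3)   # executed speculatively
lp.process(2.0, 2)   # straggler: forces a rollback of the event at 3.0
```

After the straggler is handled, the LP state matches what a strictly timestamp-ordered execution would have produced, at the cost of one rollback.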
As the complexity of chip designs increases, simulation time also increases. Unit and variable delay simulation takes the most simulation time in the IC design process; however, parallel processing performs inefficiently due to the large amount of synchronization required. In this paper, techniques to reduce the number of synchronization points in synchronous designs are proposed, along with a partitioner that partitions designs along flip-flop boundaries so that these techniques can be employed on real designs.
This paper presents an evolutionary prototyping methodology for the modeling, design, and implementation of concurrent distributed systems. The methodology comprises two stages: LeMSiDiC, a modeling language for concurrent distributed systems (a graphical modeling language that provides UML-like structuring capabilities and a precise syntax and semantics for automatic source code generation for these types of systems); and GeCSiDiC, a source code generator (able to construct the objects associated with a model specified in LeMSiDiC using the object-oriented paradigm). The methodology can interoperate either with an architecture oriented to the management of concurrent distributed systems or with concurrent distributed systems that lack specialized support.
ISBN: (print) 9780889867741
In this paper, we provide an "Improved Channel Aware QoS Scheduling Architecture for WiMAX Base Stations". In this scheme, we provide intelligence to the classifier through feedback from the compensation block. We capitalize on the design of WiMAX, which differentiates connections by separate connection IDs (CIDs) and classifies traffic using a class-based approach. We propose a design in which the Classifier, in order to provide excellent QoS, admits frames only for those CIDs that can be serviced without any significant delay to real-time voice and video applications, thus improving the overall QoS. In other words, it rejects a packet whose CID is experiencing bad channel quality and allows unmarked CIDs to utilize the channel at its maximum efficiency.
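The admission rule in this abstract (reject packets whose CID is marked as having bad channel quality, admit unmarked CIDs) can be sketched as follows. This is a minimal illustration only: the class `ChannelAwareClassifier`, the methods `feedback` and `admit`, and the boolean channel report are hypothetical names and assumptions, not the authors' design or any WiMAX API.

```python
class ChannelAwareClassifier:
    """Toy classifier driven by feedback from a compensation block (hypothetical)."""

    def __init__(self):
        self.marked_cids = set()   # CIDs currently reported as having a bad channel

    def feedback(self, cid, channel_ok):
        # Assumed feedback path: the compensation block reports channel state per CID.
        if channel_ok:
            self.marked_cids.discard(cid)
        else:
            self.marked_cids.add(cid)

    def admit(self, cid):
        # Admit a packet only if its CID is not marked as experiencing a bad channel.
        return cid not in self.marked_cids


classifier = ChannelAwareClassifier()
classifier.feedback(cid=7, channel_ok=False)
decisions = [classifier.admit(7), classifier.admit(8)]   # [False, True]
classifier.feedback(cid=7, channel_ok=True)               # channel has recovered
recovered = classifier.admit(7)                           # True
```

A real scheduler would also weigh service class and delay budgets; the sketch isolates only the mark-and-reject step.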
This paper presents the coordinated virtual partition (CVP) for Grid computing systems. The CVP is a way of regulating the resources supplied to different components of an application in unison, according to an agreed relative proportion. This study shows that coordinated resource provisioning has several benefits, including: (a) reducing the wait times experienced by an application and (b) improving the overall application performance by reducing the wait times. The CVP achieves these benefits by releasing resources from "fast" running application components so that the Grid can reallocate them to other applications.
ISBN: (print) 9780769543284
Group recommender systems provide groups of users with shared recommendations. They have great potential for easing group decision processes, while at the same time entailing new technological and user-centred challenges. In this paper we present the GroupRecoPF platform, which provides developers with support for building scalable and user-friendly group recommender systems. It leverages advanced concepts for group recommendation merging strategy exploration, performance optimisation, session persistence and mobility, as well as open interfaces.
The SPACE RIP technique is one of the parallel imaging methods that has the potential to revolutionize the field of fast MR imaging. The image reconstruction problem of SPACE RIP is a computation-intensive task that needs to be parallelized to further reduce the reconstruction time. In this paper, we analyze the algorithm and identify the program bottleneck to be parallelized. The loop-level parallelization is implemented with Pthreads, OpenMP and MPI. Furthermore, since the reconstruction uses Singular Value Decomposition (SVD) to solve the matrix pseudoinverse problem, we implement the one-sided Jacobi parallel SVD on the state-of-the-art cellular computer architecture Cyclops64 to speed up the problem at the fine-grain level.
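For readers unfamiliar with the one-sided Jacobi method mentioned in this abstract, the following is a minimal sequential sketch of the textbook (Hestenes) variant, computing singular values only. It is not the authors' Cyclops64 implementation; the function name, the tolerance, and the sweep limit are assumptions chosen for illustration.

```python
import math


def one_sided_jacobi_singular_values(a, tol=1e-12, max_sweeps=50):
    """Singular values of a dense matrix (list of rows) by one-sided Jacobi."""
    m, n = len(a), len(a[0])
    u = [row[:] for row in a]          # work on a copy of the input
    for _ in range(max_sweeps):
        rotated = False
        for i in range(n - 1):
            for j in range(i + 1, n):
                # Squared norms and inner product of columns i and j.
                alpha = sum(u[k][i] ** 2 for k in range(m))
                beta = sum(u[k][j] ** 2 for k in range(m))
                gamma = sum(u[k][i] * u[k][j] for k in range(m))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue           # columns already numerically orthogonal
                rotated = True
                # Plane rotation that makes the two columns orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                sign = 1.0 if zeta >= 0 else -1.0
                t = sign / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for k in range(m):
                    x, y = u[k][i], u[k][j]
                    u[k][i] = c * x - s * y
                    u[k][j] = s * x + c * y
        if not rotated:
            break                      # a full sweep without rotations: converged
    # After convergence the column norms are the singular values.
    norms = [math.sqrt(sum(u[k][j] ** 2 for k in range(m))) for j in range(n)]
    return sorted(norms, reverse=True)


sv = one_sided_jacobi_singular_values([[2.0, 0.0], [1.0, 1.0]])
```

Each rotation touches only two columns, so rotations on disjoint column pairs are independent of one another; this property is what makes the method a natural candidate for fine-grain parallelization.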
Resource management is an important aspect of open distributed systems, as these systems are persistent and ubiquitous. To be scalable, the resource management itself must be distributed, just as the resources are. This paper focuses on the LINDA coordination model of open distributed systems. One limited resource is memory, and garbage collection has already been proposed for the standard LINDA with multiple tuple-spaces (TSs) to avoid memory exhaustion. The implementation, however, was restricted to garbage collection of TSs. Taking into account the need for garbage collection not only of TSs but also of tuples, this paper demonstrates how the approach can be extended to tuples with the introduction of multicapabilities, which generalise capabilities to collections of objects. We also illustrate the use of multicapabilities in two other applications related to resource management: managing deadlocks and information caching.
The balance between CPU speed and interconnection network throughput in distributed memory parallel computers varies with each generation of systems, but the trend is that CPUs are gaining performance faster than the interconnection networks. This means that remote data accesses are becoming more expensive relative to local accesses in terms of CPU cycles. Therefore, remote memory access mechanisms that were suited to a previous generation of parallel machines may be less appropriate for current clusters. This research evaluates a multithreaded programming paradigm with cached remote memory accesses and thread migration to exploit array locality on a cluster with Myrinet. The approach, called Nomadic Threads, was originally developed for the CM5, but has been adapted to use MPI on Linux clusters. The results show that the current surfeit of CPU power relative to network throughput dramatically changes the scaling characteristics of some programs, while others behave much as they did on the decade-old CM5.
A methodology is presented that allows for the distributed execution of systems on several microcontrollers and an FPGA (Field Programmable Gate Array). By using an FPGA, the system performance can be increased significantly by means of parallel processing. The focus is on hybrid electronic systems, which contain both state-based and continuous model parts. In order to fulfill real-time requirements, a real-time operating system is used. For measuring the system performance, a method for analyzing the timing behavior is presented; it provides a graphical representation of the execution time intervals and execution points in time of the tasks, allows idle times to be recognized, and thus supports optimization of the task scheduling. The data exchange is realized with CAN (Controller Area Network).