SRAM (static random access memory)-based pipelined algorithmic solutions have become competitive alternatives to TCAMs (ternary content addressable memories) for high-throughput IP lookup. Multiple pipelines can be utilized in parallel to improve the throughput further. However, several challenges must be addressed to make such solutions feasible. First, the memory distribution over different pipelines, as well as across different stages of each pipeline, must be balanced. Second, the traffic among these pipelines should be balanced. Third, the intra-flow packet order (i.e. the sequence) must be preserved. In this paper, we propose a parallel SRAM-based multi-pipeline architecture for IP lookup. A two-level mapping scheme is developed to balance the memory requirement among the pipelines as well as across the stages in each pipeline. To balance the traffic, we propose an early caching scheme to exploit the data locality inherent in the architecture. Our technique uses neither a large reorder buffer nor complex reorder logic. Instead, a flow-aware queuing scheme exploiting the flow information is used to maintain the intra-flow sequence. Extensive simulation using real-life traffic traces shows that the proposed architecture with 8 pipelines can achieve a throughput of up to 10 billion packets per second, i.e. 3.2 Tbps for minimum size (40 bytes) packets, while preserving intra-flow packet order. (c) 2009 Elsevier Inc. All rights reserved.
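The throughput figure quoted in the abstract above can be checked with a short calculation. This is only a sketch of the arithmetic; the function name `line_rate_bps` is illustrative and not taken from the paper.

```python
def line_rate_bps(packets_per_second: float, packet_bytes: int) -> float:
    """Convert a lookup rate into a bit rate for fixed-size packets."""
    return packets_per_second * packet_bytes * 8  # 8 bits per byte


# 10 billion lookups per second with minimum-size 40-byte packets.
rate = line_rate_bps(10e9, 40)
print(f"{rate / 1e12:.1f} Tbps")  # prints "3.2 Tbps"
```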
ISBN: 9780769536422 (print)
Congestion control algorithms in existing reliable multicast protocols are mainly derived from the end-to-end model, which has high resource requirements and sometimes throttles packet sending too aggressively. Many-to-many reliable multicast requires efficient congestion control built on a one-to-many model. Many-to-many multicast in a LAN (local area network) is an important mechanism in distributed simulation. In this paper, a congestion control algorithm based on loss trend is proposed for many-to-many reliable multicast. It predicts the future packet loss of receivers by analyzing historical loss and buffer variation, and then controls congestion by adjusting the sending rate in advance. The algorithm targets reliable multicast in a LAN. Its main idea is to lower the probability of multicast packet loss so that the nodes can afford the cost of packet recovery; it alleviates congestion by regulating the sending rate to reduce that probability. Experimental results indicate that the algorithm sustains high throughput for many-to-many reliable multicast with relatively real-time performance.
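The idea of adjusting the sending rate ahead of predicted loss might be sketched as follows. The abstract does not give the predictor or the rate rule, so everything here is an assumption: the least-squares trend, the use of a buffer fill fraction as the buffer signal, and all thresholds and names (`loss_trend`, `next_rate`, `loss_limit`, `buffer_limit`, `backoff`, `step`) are hypothetical.

```python
from typing import Sequence


def loss_trend(history: Sequence[float]) -> float:
    """Least-squares slope of the loss rate per reporting interval."""
    n = len(history)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(history))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def next_rate(rate: float, history: Sequence[float], buffer_fill: float, *,
              min_rate: float = 1.0, max_rate: float = 1000.0,
              loss_limit: float = 0.01, buffer_limit: float = 0.8,
              backoff: float = 0.75, step: float = 10.0) -> float:
    """Pick the next sending rate from receiver loss reports (assumed policy)."""
    # Extrapolate the loss rate one interval ahead using the fitted trend.
    predicted = history[-1] + loss_trend(history) if history else 0.0
    if predicted > loss_limit or buffer_fill > buffer_limit:
        # Back off before the predicted loss materializes.
        return max(min_rate, rate * backoff)
    # Otherwise probe for more bandwidth additively.
    return min(max_rate, rate + step)
```

With a rising loss history such as `[0.0, 0.005, 0.009]`, the extrapolated loss exceeds the assumed 1% limit and the rate is cut before that limit is actually reached; with a flat history the rate increases by one step.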
ISBN: 9783905674156 (print)
The proceedings contain 14 papers. The topics discussed include: time-constrained high-fidelity rendering on local desktop grids; interactive physical simulation on multicore architectures; dynamic grid refinement for fluid simulations on parallel graphics architectures; simulation of radio wave propagation by beam tracing; parallelized matrix factorization for fast BTF compression; fast parallel unbiased diffeomorphic atlas construction on multi-graphics processing units; a flexible adaptation service for distributed rendering; wait-free shared-memory irradiance cache; data-parallel hierarchical link creation for radiosity; and a decomposition approach for optimizing large-scale parallel image composition on multi-core MPP systems.
ISBN: 9780769536422 (print)
As real-time software systems grow in scale, component-based real-time software is becoming mainstream. Neither the timed pi calculus nor the higher-order pi calculus, as software calculi, can describe the dynamic nature of component-based real-time software. The authors present the THO-pi calculus, which adds timing characteristics by placing constraints on higher-order process activities. They give the evolution rules of the THO-pi calculus, a new weak timed bisimulation among higher-order processes, and the concept of multi-resolution time constraints for system-level application of the calculus. This work paves the way for the dynamic architecture of component-based real-time software and for the establishment of a description language for such software.
ISBN: 9780769538099 (print)
P systems, or membrane systems, provide a high-level computational modeling framework that combines the structural and dynamic aspects of biological systems in a relevant and understandable way. P systems are massively parallel, distributed, and non-deterministic systems. In this paper, we describe the implementation of a simulator for the class of recognizer P systems with active membranes by using the GPU (Graphics Processing Unit). We compare the high-performance parallel simulator for the GPU to the simulator developed on a single CPU (Central Processing Unit), and we show that the GPU is better suited than the CPU to simulate P systems due to its highly parallel nature.
ISBN: 9781605587172 (print)
The proceedings contain 14 papers. The topics discussed include: publish/subscribe as a model for scientific workflow interoperability; a navigation model for exploring scientific workflow provenance graphs; scheduling data-intensive workflows on storage constrained resources; pipeline-centric provenance model; web enabling desktop workflow applications; a simulation toolkit to investigate the effects of grid characteristics on workflow completion time; a data-driven workflow language for grids based on array programming principles; plasma fusion code coupling using scalable I/O services and scientific workflows; workflow management for paramedical emergency operations within a mobile-static distributed environment; workflow representation and runtime based on lazy functional streams; composing and executing parallel data-flow graphs with shell pipes; and towards scientific workflow patterns.
We consider the problem of verifying stochastic models of biochemical networks against behavioral properties expressed in temporal logic terms. Exact probabilistic verification approaches such as, for example, CSL/PCT...
ISBN: 9780769537139 (print)
Parallel discrete event simulation has been established as a technique with great potential to speed up the execution of gate-level circuit simulation. A fundamental problem posed by a parallel environment is deciding whether it is best to simulate a particular circuit sequentially or on a parallel platform. Furthermore, if a circuit should be simulated on a parallel platform, it is necessary to decide how many computing nodes should be used on the given platform. In this paper we propose a machine learning algorithm as an aid in making these decisions. The algorithm is based on the well-known K-Nearest Neighbor algorithm. After an extensive training regime, it was shown to make a correct prediction 99% of the time on whether to use a parallel or sequential simulator. The predicted number of nodes to use on a parallel platform was shown to produce an average execution time no more than 12% above the smallest execution time. The configuration that resulted in the minimal execution time was picked 61% of the time.
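A plain K-Nearest Neighbor vote, the technique the abstract above builds on, might look like the sketch below. The abstract does not list the circuit features or the value of K, so the feature tuple (gate count in thousands, average fan-out), the labels, and `k=3` are assumptions made purely for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_tuple, label) pairs; distances are Euclidean.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (gate count in thousands, average fan-out).
train = [
    ((1.0, 2.0), "sequential"),
    ((1.5, 2.2), "sequential"),
    ((50.0, 3.0), "parallel"),
    ((60.0, 3.5), "parallel"),
    ((55.0, 2.8), "parallel"),
]
print(knn_predict(train, (52.0, 3.1)))  # prints "parallel"
```

The same function could predict a node count by using integers as labels. In practice the features would need to be scaled to comparable ranges first, since Euclidean distance is otherwise dominated by the largest-valued feature.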
ISBN: 9780769537139 (print)
The Lightweight Time Warp (LTW) protocol offers a novel approach to high-performance optimistic parallel discrete-event simulation, especially when a large number of simultaneous events need to be executed at each virtual time. With LTW, the local simulation space on each node is partitioned into two sub-domains, allowing purely optimistic simulation to be driven by only a few full-fledged logical processes (LPs), while most processes are turned into lightweight LPs, free from the burden associated with Time Warp (TW) execution. This paper presents a comparative performance evaluation of the TW and LTW protocols for simulating several DEVS-based environmental models. The experiments indicate that the LTW protocol improves performance in terms of shortened execution time, reduced memory usage, lowered operational cost, and enhanced system stability.
ISBN: 9780769537139 (print)
Simulation replication is a necessity for all stochastic simulations. Its efficient execution is particularly important when additional techniques, such as optimization or sensitivity analysis, are applied on top. One way to improve replication efficiency is to ensure that the best configuration of the simulation system is used for execution. Selecting the best configuration is possible when the number of required replications is sufficiently high, even without any prior knowledge of simulator performance or the problem instance. We present an adaptive replication mechanism that combines portfolio theory with reinforcement learning: it adapts itself to the given problem instance at runtime and can be restricted to an efficient algorithm portfolio.
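Learning the best configuration while the replications run can be illustrated with an epsilon-greedy bandit. This is a stand-in for the reinforcement learning part only: the abstract names neither the policy nor the portfolio construction, so the class `ConfigSelector`, the helper `run_replications`, the `epsilon` value, and the configuration names are all hypothetical.

```python
import random


class ConfigSelector:
    """Epsilon-greedy choice of the simulator configuration with lowest mean run time."""

    def __init__(self, configs, epsilon=0.1, seed=0):
        self.configs = list(configs)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {c: 0 for c in self.configs}
        self.mean_time = {c: 0.0 for c in self.configs}

    def choose(self):
        # Try every configuration once before trusting the averages.
        untried = [c for c in self.configs if self.counts[c] == 0]
        if untried:
            return untried[0]
        # Explore occasionally, otherwise exploit the fastest configuration so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.configs)
        return min(self.configs, key=self.mean_time.__getitem__)

    def update(self, config, run_time):
        # Incremental running mean of the observed execution time.
        self.counts[config] += 1
        n = self.counts[config]
        self.mean_time[config] += (run_time - self.mean_time[config]) / n


def run_replications(selector, measure, replications):
    """Run the replications, letting the selector pick a configuration for each one."""
    for _ in range(replications):
        config = selector.choose()
        selector.update(config, measure(config))
    return selector.counts
```

Here `measure` is any callable that executes one replication under the given configuration and returns its run time. Restricting `configs` to a pre-selected subset would correspond loosely to limiting the mechanism to a portfolio, which this sketch does not otherwise model.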