Search results - Inner Mongolia University Library

Sequence-preserving parallel IP lookup using multiple SRAM-based pipelines

Journal of Parallel and Distributed Computing, 2009, Vol. 69, No. 9, pp. 778-789

Authors: Jiang, Weirong; Prasanna, Viktor K. Affiliation: Univ So Calif Ming Hsieh Dept Elect Engn Los Angeles CA 90089 USA

SRAM (static random access memory)-based pipelined algorithmic solutions have become competitive alternatives to TCAMs (ternary content addressable memories) for high-throughput IP lookup. Multiple pipelines can be utilized in parallel to improve the throughput further. However, several challenges must be addressed to make such solutions feasible. First, the memory distribution over different pipelines, as well as across different stages of each pipeline, must be balanced. Second, the traffic among these pipelines should be balanced. Third, the intra-flow packet order (i.e. the sequence) must be preserved. In this paper, we propose a parallel SRAM-based multi-pipeline architecture for IP lookup. A two-level mapping scheme is developed to balance the memory requirement among the pipelines as well as across the stages in each pipeline. To balance the traffic, we propose an early caching scheme to exploit the data locality inherent in the architecture. Our technique uses neither a large reorder buffer nor complex reorder logic. Instead, a flow-aware queuing scheme exploiting the flow information is used to maintain the intra-flow sequence. Extensive simulation using real-life traffic traces shows that the proposed architecture with 8 pipelines can achieve a throughput of up to 10 billion packets per second, i.e. 3.2 Tbps for minimum size (40 bytes) packets, while preserving intra-flow packet order. (c) 2009 Elsevier Inc. All rights reserved.
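The throughput figure quoted in the abstract can be checked with simple arithmetic. The sketch below uses only the two numbers the abstract states (10 billion packets per second, 40-byte minimum-size packets); the function name is illustrative and not taken from the paper.

```python
def line_rate_tbps(packets_per_second: float, packet_bytes: int) -> float:
    """Convert a packet rate into a bit rate in terabits per second (10**12 bit/s)."""
    bits_per_second = packets_per_second * packet_bytes * 8
    return bits_per_second / 1e12

# 10 billion lookups per second on 40-byte packets, as stated in the abstract.
rate = line_rate_tbps(10e9, 40)
print(f"{rate:.1f} Tbps")
```

Each lookup corresponds to one packet, so the lookup rate multiplied by the packet size in bits gives the equivalent line rate.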

Keywords: IP lookup Pipeline SRAM Router

Load Balancing of Parallel Block Overlapped Incomplete Cholesky Preconditioning

10th International Conference on Parallel Computing Technologies

Authors: Kaporin, Igor; Konshin, Igor. Affiliation: Russian Acad Sci Dorodnicyn Comp Ctr Moscow 119333 Russia

ISBN: (print) 9783642032745

A modification of the second-order Incomplete Cholesky (IC) factorization with a controllable amount of fill-in is described and analyzed. This algorithm is applied to the construction of well-balanced coarse-grain parallel preconditioning for the Conjugate Gradient (CG) iterative solution of linear systems with a symmetric positive definite matrix. The efficiency of the resulting parallel algorithm is illustrated by a series of numerical experiments using large-scale ill-conditioned test matrices taken from the University of Florida collection.
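For readers unfamiliar with the underlying technique, here is a minimal serial sketch of CG preconditioned by an incomplete Cholesky factor. It is not the authors' method: it uses plain zero-fill IC(0) instead of the second-order factorization with controlled fill-in, has no block overlapping or parallelism, and is tried on a small 2D Laplacian chosen only for illustration. All function names are assumptions.

```python
import math

def ic0(A):
    """Zero-fill incomplete Cholesky: L keeps only the lower-triangular nonzero pattern of A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        L[j][j] = math.sqrt(pivot)  # assumes no breakdown (pivot > 0)
        for i in range(j + 1, n):
            if A[i][j] != 0.0:      # drop any fill-in outside the pattern of A
                s = sum(L[i][k] * L[j][k] for k in range(j))
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def apply_preconditioner(L, r):
    """Solve (L L^T) z = r by forward then backward substitution."""
    n = len(r)
    y = [0.0] * n
    for i in range(n):
        y[i] = (r[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (y[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(A, b, L, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient; returns the solution and the iteration count."""
    x = [0.0] * len(b)
    r = list(b)
    z = apply_preconditioner(L, r)
    p = list(z)
    rz = dot(r, z)
    for iteration in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) < tol:
            return x, iteration
        z = apply_preconditioner(L, r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

def laplacian_2d(m):
    """Dense 5-point Laplacian on an m-by-m grid (illustrative SPD test matrix)."""
    n = m * m
    A = [[0.0] * n for _ in range(n)]
    for row in range(m):
        for col in range(m):
            i = row * m + col
            A[i][i] = 4.0
            if col > 0:
                A[i][i - 1] = -1.0
            if col < m - 1:
                A[i][i + 1] = -1.0
            if row > 0:
                A[i][i - m] = -1.0
            if row < m - 1:
                A[i][i + m] = -1.0
    return A

A = laplacian_2d(4)
b = [1.0] * 16
L = ic0(A)
x, iterations = pcg(A, b, L)
residual = max(abs(ri - bi) for ri, bi in zip(matvec(A, x), b))
```

IC(0) can break down with a non-positive pivot on general symmetric positive definite matrices; the Laplacian used here does not trigger that case.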

Keywords: symmetric positive definite matrix incomplete Cholesky factorization conjugate gradient method parallel preconditioning

Modeling of Network Computing Systems for Decision Tree Induction Tasks

10th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2009)

Authors: Walkowiak, Krzysztof; Wozniak, Michal. Affiliation: Wroclaw Univ Technol Fac Elect Chair Syst & Comp Networks PL-50370 Wroclaw Poland

ISBN: (print) 9783642043932

Since the amount of information is rapidly growing, there is an overwhelming interest in efficient network computing systems including Grids, public-resource computing systems, P2P systems and Cloud computing. In this paper we take a detailed look at the problem of modeling and optimization of network computing systems for parallel decision tree induction methods. Firstly, we present a comprehensive discussion on the mentioned induction methods with a special focus on their parallel versions. Next, we propose a generic optimization model of a network computing system that can be used for distributed implementation of parallel decision trees. To illustrate our work we provide results of numerical experiments showing that the distributed approach enables significant improvement of the system throughput.

Keywords: Machine Learning Network computing Grids Modeling Optimization parallel Decision Tree

A Parallel 3D Code for Simulation of Self-gravitating Gas-Dust Systems

10th International Conference on Parallel Computing Technologies

Authors: Kireev, Sergei. Affiliation: ICMMG SB RAS Novosibirsk Russia

ISBN: (print) 9783642032745

A parallel 3D code for simulation of galaxies and protoplanetary discs is developed. The model includes dust, gas, gravitation and friction between dust and gas. The kinetic equation for dust particles is solved by the PIC method. Gas dynamics equations are solved by the FLIC method. In the parallel implementation, a domain decomposition technique is used where each subdomain is processed by a group of processors. Results on parallelization efficiency are presented.

Keywords: Dust

Efficiency Analysis of Parallel Batch Pattern NN Training Algorithm on General-Purpose Supercomputer

10th International Work-Conference on Artificial Neural Networks (IWANN 2009)

Authors: Turchenko, Volodymyr; Grandinetti, Lucio. Affiliation: Univ Calabria Ctr Excellence High Performance Comp I-87036 Arcavacata Di Rende CS Italy

ISBN: (print) 9783642024801

The theoretical and algorithmic description of the parallel batch pattern back propagation (BP) training algorithm for a multilayer perceptron is presented in this paper. The efficiency of the developed parallel algorithm is studied for a progressively increasing dimension of the parallelized problem on the general-purpose parallel computer NEC TX-7.

Keywords: Batch pattern training neural network parallelization efficiency

Mercury: A reflective middleware for automatic parallelization of Bags-of-Tasks

ARM 2009 - 8th Workshop on Adaptive and Reflective Middleware, co-located with the 10th ACM/IFIP/USENIX International Middleware Conference

Authors: Silva, João Nuno; Veiga, Luís; Ferreira, Paulo. Affiliation: INESCID/Technical University of Lisbon Distributed Systems Group Lisboa Portugal

ISBN: (print) 9781605588506

Today, the development of Bag-of-Tasks, i.e. embarrassingly parallel, applications for execution on multiprocessors or clusters requires the use of APIs not designed for this kind of problem. For instance, MPI allows the parallel execution of tasks, but was developed for much more complex parallel applications, with high data communication between tasks. The use of such APIs requires programmers to learn them, and adds complexity to the final parallel solution. Mercury provides a platform for the transformation of serial applications into parallel Bag-of-Tasks. Mercury reads a configuration file stating which methods and classes should be parallelized, loads the application, and at run time transforms it so that the specified methods are executed concurrently. This transformation is performed without user intervention. Its modular design allows the integration of Mercury with different parallel environments. The initial experiments show that the overhead is minimal, and that it is possible to take advantage of parallel processing environments (multiprocessors/multicores, clusters, ...) without the use of complex APIs. Copyright 2009 ACM.
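Mercury itself is not shown here. The sketch below only illustrates why a Bag-of-Tasks workload is easy to parallelize, using Python's standard `concurrent.futures` module in place of Mercury's configuration-driven transformation. The task function `simulate` and the helper names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(seed: int) -> int:
    """Hypothetical independent task: a small deterministic computation."""
    value = seed
    for _ in range(1000):
        value = (value * 1103515245 + 12345) % 2**31
    return value

def run_serial(seeds):
    """Baseline: run the tasks one after another."""
    return [simulate(s) for s in seeds]

def run_bag_of_tasks(seeds, workers=4):
    """Run the same tasks concurrently; they share no state and exchange no data."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order regardless of completion order.
        return list(pool.map(simulate, seeds))

seeds = list(range(8))
serial_results = run_serial(seeds)
parallel_results = run_bag_of_tasks(seeds)
```

Because the tasks are independent, the concurrent run returns the same results as the serial one. Threads keep the sketch simple; for CPU-bound tasks in CPython, a process pool would be the more realistic choice.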

Keywords: Middleware

Fragmentation of Numerical Algorithms for the Parallel Subroutines Library

10th International Conference on Parallel Computing Technologies

Authors: Malyshkin, Victor E.; Sorokin, Sergey B.; Chajuk, Ksenia G. Affiliation: Russian Acad Sci Inst Computat Math & Math Geophys Novosibirsk Russia

ISBN: (print) 9783642032745

Fragmentation of frequently used numerical algorithms for inclusion in a library of parallel numerical subroutines is considered. Fragmentation of algorithms and programs makes it possible to create parallel programs that can be executed on parallel computers of different types (multiprocessors and/or multicomputers) and can be dynamically tuned to all the available resources. Program fragmentation is a way of automatically providing the dynamic properties of parallel programs, such as dynamic load balancing. Algorithm fragmentation is a technological method for parallelizing numerical algorithms that provides their effective parallel implementation.

Keywords: Asynchronous programming parallel program numerical algorithm fragments based programming dynamic programs' properties

Asynchronous Language and System of Numerical Algorithms Fragmented Programming

10th International Conference on Parallel Computing Technologies

Authors: Arykov, Sergey; Malyshkin, Victor. Affiliation: Russian Acad Sci Inst Computat Math & Math Geophys Supercomp Software Dept Novosibirsk 630090 Russia

ISBN: (print) 9783642032745

A fragmented approach to parallel programming of numerical methods and its implementation in the asynchronous programming system Aspect are considered. The approach provides several important advantages, such as automatic implementation of the dynamic properties of an application program (tuning to available resources, dynamic load balancing, dynamic resource distribution, etc.). The asynchronous parallel programming system Aspect, which implements the concept of fragmented programming on supercomputers with shared memory architecture, is considered.

Keywords: fragmented technology of programming asynchronous languages and programming systems dynamic program's properties automation of parallel realization of numerical models

Solution of Large-Scale Problems of Global Optimization on the Basis of Parallel Algorithms and Cluster Implementation of Computing Processes

10th International Conference on Parallel Computing Technologies

Authors: Koshur, Vladimir; Kuzmin, Dmitriy; Legalov, Aleksandr; Pushkaryov, Kirill. Affiliation: Siberian Fed Univ Inst Space & Informat Technol Krasnoyarsk 660074 Russia

ISBN: (print) 9783642032745

The parallel hybrid inverse neural network coordinate approximations algorithm (PHINNCA) for the solution of large-scale global optimization problems is proposed in this work. The algorithm maps a trial value of an objective function into values of the objective function's arguments. It decreases the trial value step by step to find a global minimum. Dual generalized regression neural networks are used to perform the mapping. The algorithm is intended for cluster systems. A search is carried out concurrently. When there are multiple processes, they share information about their progress and apply a simulated annealing procedure to it.

Keywords: optimization global optimization large-scale problems solution cluster neural networks

The GCA-w Massively Parallel Model

10th International Conference on Parallel Computing Technologies

Authors: Hoffmann, Rolf. Affiliation: Tech Univ Darmstadt D-64289 Darmstadt Germany

ISBN: (print) 9783642032745

We introduce the GCA-w model (Global Cellular Automata with write access) that is an extension of the GCA (Global Cellular Automata) model, which is in turn an extension of the cellular automata (CA) model. All three models are called "massively parallel" because the models are based on cells that are updated synchronously in parallel. In the CA model, the cells have static links to their local neighbors whereas in the GCA model, the links are dynamic to any global neighbor. In both models, the access is "read-only". Therefore no write conflict can occur, which reduces the complexity of the model and its implementation. The GCA model can be used for many parallel problems that can be described with a changing global (or locally restricted) neighborhood. The main restriction of the GCA model is the forbidden write access to neighboring cells. Although the write access can be emulated in O(log n) time, this slowdown is not desired in practical applications. Therefore, the GCA-w model was developed. The GCA-w model allows a cell to change its own state as well as the states of its neighboring cells. As a result, parallel algorithms can be executed faster and the activity of the cells can be controlled in order, e.g., to reduce power consumption or to use inactive cells for other purposes. The application of the GCA-w model is demonstrated for some parallel algorithms: pointer inversion, sorting with pointers, synchronization and Pascal's triangle. In addition, a hardware architecture is outlined which can execute this model.
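Pointer inversion, one of the applications the abstract lists, gives a feel for what write access buys. The following is a sequential sketch of a single synchronous step, not the author's formulation: it assumes the pointers form a permutation (so no two cells write to the same target), and the function name is illustrative.

```python
def invert_pointers_gcaw(pointers):
    """One GCA-w-style step: cell i writes its own index into the cell it points to."""
    n = len(pointers)
    inverse = [None] * n  # next-generation state, filled only by neighbor writes
    for i, target in enumerate(pointers):
        if inverse[target] is not None:
            # Two cells writing to the same target would be a write conflict.
            raise ValueError("write conflict: two cells point at the same target")
        inverse[target] = i
    return inverse

pointers = [2, 0, 3, 1]
inverse = invert_pointers_gcaw(pointers)
print(inverse)  # [1, 3, 0, 2]
```

In a real GCA-w all cells would perform their writes in the same generation; the loop here only simulates that step on one processor.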

Keywords: Massively parallel Model Global Cellular Automata GCA with Write Access Dynamic Neighborhood Dynamic Cell Activation GCA-w applications GCA-w Architecture
