Search results - Inner Mongolia University Library

A hybrid MPI design using SCTP and iWARP

10th Workshop on Advances in Parallel and Distributed Computational Models / 22nd IEEE International Parallel and Distributed Processing Symposium

Authors: Tsai, Mike; Penoff, Brad; Wagner, Alan (Univ British Columbia, Dept Comp Sci, Vancouver, BC V6T 1W5, Canada)

ISBN: (print) 9781424416936

Remote Direct Memory Access (RDMA) and point-to-point network fabrics both have their own advantages. MPI middleware implementations typically use one or the other; however, the appearance of the Internet Wide Area RDMA Protocol (iWARP), RDMA over IP, and protocol off-load devices introduces the opportunity to use a hybrid design for MPI middleware that uses both iWARP and a transport protocol directly. We explore the design of a new MPICH2 channel device based on iWARP and the Stream Control Transmission Protocol (SCTP) that uses SCTP for all point-to-point MPI routines and iWARP for all remote memory access routines (i.e., one-sided communication). The design extends the Ohio Supercomputer Center software-based iWARP stack and our MPICH2 SCTP-based channel device. The hybrid channel device aligns the semantics of each MPI routine with the underlying protocol that best supports the routine and also allows the MPI API to exploit the potential performance benefits of the underlying hardware more directly. We describe the design and issues related to the progress engine design and connection setup. We demonstrate how to implement iWARP over SCTP rather than TCP and discuss its advantages and disadvantages. We are not aware of any other software implementations of iWARP over SCTP, nor of MPI middleware that uses both iWARP verbs and the SCTP API.

Keywords: Internet protocols

Continuous answering holistic queries over sensor networks

10th Workshop on Advances in Parallel and Distributed Computational Models / 22nd IEEE International Parallel and Distributed Processing Symposium

Authors: Liu, Kebin; Chen, Lei; Li, Minglu; Liu, Yunhao (Shanghai Jiao Tong Univ, Shanghai, Peoples R China; Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China; Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON K1N 6N5, Canada)

ISBN: (print) 9781424416936

Wireless sensor networks (WSNs) are widely used for various monitoring applications. Users issue queries to sensors and collect sensing data. Due to low-quality sensing devices or random link failures, sensor data are often noisy. To increase the reliability of the query results, continuous queries are often employed. In this work we focus on continuous holistic queries such as Median. Existing approaches are mainly designed for non-holistic queries such as Average. However, it is not trivial to answer holistic queries because they are non-decomposable. We propose two schemes for answering queries under different data changing conditions. When sensor data change slowly, we propose an algorithm that exploits the data correlation between different rounds to obtain exact answers. When the data change quickly, we propose another approach that derives approximate results. We evaluate both designs through extensive simulations. The results demonstrate that our approach significantly reduces the traffic cost compared with previous works while maintaining the same accuracy.

Keywords: Wireless sensor networks
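The non-decomposable property mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' algorithm; the readings and the helper `merge_average` are hypothetical and serve only to show why Average can be merged from constant-size partial states while Median cannot.

```python
from statistics import mean, median

# Hypothetical readings held by two sensor subtrees (illustrative values only).
left = [1, 2, 9]
right = [3, 4, 5, 6]


def merge_average(partials):
    """Combine (sum, count) partial states into the exact global average."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


# Average is decomposable: each subtree forwards only a (sum, count) pair.
avg_from_partials = merge_average(
    [(sum(left), len(left)), (sum(right), len(right))]
)
true_average = mean(left + right)

# Median is holistic: combining per-subtree medians does not, in general,
# give the median of all readings.
median_of_medians = median([median(left), median(right)])
true_median = median(left + right)
```

In this sketch the merged average matches the average of all readings, whereas the median of the two subtree medians differs from the true median, so an exact median needs more than a fixed-size summary per subtree.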

Improving efficiency and performance of distributed file-systems

7th IEEE International Symposium on Networking Computing and Applications, NCA 2008

Authors: Galizia, Micah; Lutfiyya, Hanan (Department of Computer Science, University of Western Ontario, London, ON N6A 5B7, Canada)

ISBN: (print) 9780769531922

This paper presents a distributed file-system for the present-day medium-sized network. Existing servers and workstations pool their unused storage resources to form a communal share. Erasure codes provide fault tolerance and eliminate the need for replication. Middleware libraries facilitate object routing on an overlay network. © 2008 IEEE.

Keywords: Middleware
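As a minimal sketch of how an erasure code can provide fault tolerance without full replication, the example below uses single XOR parity, the simplest erasure code. The abstract does not say which code the authors use, so this choice, the helper `xor_blocks`, and the sample blocks are assumptions made for illustration.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Three hypothetical data blocks, each stored on a different machine.
data = [b"node", b"pool", b"disk"]

# One parity block stored on a fourth machine.
parity = xor_blocks(data)

# Suppose the machine holding data[1] fails: XOR of the survivors and the
# parity block reconstructs the lost block.
recovered = xor_blocks([data[0], data[2], parity])
```

Here four stored blocks tolerate the loss of any one block, whereas keeping two full copies of the same three data blocks would require six.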

Effect of parallel TCP stream equalizer on real long fat-pipe network

7th IEEE International Symposium on Networking Computing and Applications, NCA 2008

Authors: Sugawara, Yutaka; Tezuka, Hiroshi; Inaba, Mary; Hiraki, Kei; Yoshino, Takeshi (University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan; Google Japan Inc.)

ISBN: (print) 9780769531922

With the rapid progress of high-performance cluster applications, data transfer between clusters in distant locations is becoming more important. However, it is difficult to transfer data using parallel TCP streams on a long-distance, high-bandwidth network. In this paper, we microscopically observe parallel TCP streams on a 10 Gbps network using our network analyzer, and we propose, implement, and evaluate the "Stream Equalizer", which relaxes self-congestion and balances throughput among streams. We evaluate it using a real wide-area network over the Pacific Ocean. The network analyzer and the Stream Equalizer are implemented on TGNLE-1, an FPGA-based programmable high-speed network testbed. © 2008 IEEE.

Keywords: Equalizers

Fast nonlocal filtering applied to electron cryomicroscopy

5th IEEE International Symposium on Biomedical Imaging

Authors: Darbon, Jerome; Cunha, Alexandre; Chan, Tony F.; Osher, Stanley; Jensen, Grant J. (Univ Calif Los Angeles, Dept Math, Los Angeles, CA 90024, USA; CALTECH, Ctr Adv Comp Res, Pasadena, CA 91125, USA; CALTECH, Div Biol, Pasadena, CA 91125, USA)

ISBN: (print) 9781424420025

We present an efficient algorithm for nonlocal image filtering with applications in electron cryomicroscopy. Our denoising algorithm is a rewriting of the recently proposed nonlocal means filter. It builds on the separable property of neighborhood filtering to offer a fast parallel and vectorized implementation on contemporary shared-memory computer architectures while reducing the theoretical computational complexity of the original filter. In practice, our approach is much faster than a serial, non-vectorized implementation, and it scales linearly with image size. We demonstrate its efficiency on data sets from Caulobacter crescentus tomograms and a cryoimage containing viruses, and we provide visual evidence attesting to the remarkable quality of the nonlocal means scheme in the context of cryoimaging. With this development we provide biologists with an attractive filtering tool to facilitate their scientific discoveries.

Keywords: nonlocal mean filtering; image denoising; electron cryomicroscopy; image vectorization; SIMD; parallel image processing

Scalable Data Gathering for Real-time Monitoring Systems on Distributed Computing

8th IEEE International Symposium on Cluster Computing and the Grid

Authors: Kamoshida, Yoshikazu; Taura, Kenjiro (Univ Tokyo, Bunkyo Ku, Tokyo 113, Japan)

ISBN: (print) 9781424442379

Real-time monitoring is becoming increasingly important in many aspects of large-scale, multi-site distributed/parallel computing, e.g., understanding the behavior of systems, scheduling resources, and debugging applications. Dedicated networks for inter-site communication are rarely available for monitoring purposes. Therefore, for real-time monitoring systems, reducing communication cost is important in order to handle a large number of nodes with limited network resources. We implemented a real-time Grid monitoring system called VGXP with techniques for low-cost data gathering. It tries to send only diffs against recent data, and it adapts to the requested data freshness and tolerable errors to minimize the required communication. We evaluate the monitoring overheads of the proposed method on a distributed environment consisting of 8 sites with 500 nodes. In a realistic setting where the sampling interval is set to 0.5 seconds and the tolerable error to 2%, the CPU usage of the server gathering data from all nodes was 0.2% and the transfer rate was less than 5 kbps. The transfer rate did not exceed 50 kbps even when gathering detailed per-process statistics.

Keywords: Data Gathering
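The idea of sending only diffs within a tolerable error, as described in the abstract above, can be sketched as follows. This is an illustrative assumption about how such filtering might look, not VGXP's actual implementation; the class `DeltaReporter`, its absolute-error threshold, and the metric names are hypothetical.

```python
class DeltaReporter:
    """Report a metric only when it drifts beyond a tolerable error.

    Hypothetical helper: not taken from VGXP.
    """

    def __init__(self, tolerance):
        self.tolerance = tolerance  # absolute tolerable error per metric
        self.last_sent = {}         # metric name -> value the server holds

    def diff(self, sample):
        """Return only the entries that changed by more than the tolerance."""
        changed = {}
        for name, value in sample.items():
            previous = self.last_sent.get(name)
            if previous is None or abs(value - previous) > self.tolerance:
                changed[name] = value
                self.last_sent[name] = value
        return changed


reporter = DeltaReporter(tolerance=2.0)
first = reporter.diff({"cpu": 10.0, "mem": 50.0})   # everything is new
second = reporter.diff({"cpu": 11.0, "mem": 55.0})  # only mem moved enough
third = reporter.diff({"cpu": 12.5, "mem": 55.5})   # cpu drifted past 2.0
```

Each sample is compared against the last value actually sent rather than the previous sample, so the value held by the server never differs from the current reading by more than the tolerance, even when the metric drifts slowly.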

Parallel memory architecture for elliptic curve cryptography over GF(p) aimed at efficient FPGA implementation

Journal of Signal Processing Systems for Signal, Image, and Video Technology, 2008, vol. 51, no. 1, pp. 39-55

Authors: Laue, Ralf; Huss, Sorin A. (Tech Univ Darmstadt, Dept Comp Sci, Integrated Circuits & Syst Lab, Darmstadt, Germany)

Parallelization of operations is of utmost importance for efficient implementation of public key cryptography algorithms. Starting with a classification of parallelization methods at different abstraction levels of public key algorithms, we propose a novel memory architecture for elliptic curve implementations with multiple modular multiplier units. This architecture is well suited for implementing different point addition and doubling algorithms over GF(p) on FPGAs. It allows the execution time to scale with the number of modular multipliers and exhibits nearly no overhead compared to the mere runtime of the multipliers. The advantages of this distributed memory architecture are demonstrated by means of two different point addition and doubling algorithms.

Keywords: elliptic curve cryptography; parallelization; memory architecture; FPGA

Distributed mobility management for fast terminals in an IP-based micro-cellular network along roads

6th International Symposium on Communication Systems, Networks and Digital Signal Processing

Authors: Yamada, Takahiko; Yamashita, Satoshi; Okumura, Takashi; Hoa, Phan Thanh (Ritsumeikan Univ, Coll Sci & Engn, Kusatsu, Shiga 5258577, Japan)

ISBN: (print) 9781424418756

This paper presents a scalable control system for a unified micro-cellular network named MM-MAN (Mobile Multimedia Metropolitan Area Network), in which fast terminals are provided with high-bit-rate IP packet transfer. In our previous papers, we proposed two schemes that guarantee smooth connections to fast movers in spite of frequent movement: the LMC (Logical Macro Cell) and parallel polling. An LMC is a multicast group of adjacent micro-cells; pollings emitted from all base stations (BSs) of the same LMC create a symmetric environment that acts as a virtual single cell, so the cell-to-cell movement of a mobile terminal within an LMC can be passed over. This article describes the distributed control for mobility management in detail. An extended LMC is introduced to pre-download packets and to allow distributed processing of the LMC switchover. However, because of the overhead of parallel polling, the active radio channel is manipulated only at BSs within the LMC range and not in the extended LMC, in order to save radio resources. If the polling response arrives at a BS other than the BS of the LMC's central cell, that BS becomes the central cell of the new LMC, and the polling acknowledgement is multicast to the new extended LMC. The neighboring BS in the movement direction of the target mobile terminal (MT) can detect the movement of the MT and starts actions to join the new LMC by itself, without help from centralized control. These procedures hide the delay of the terminal's cell-to-cell movement even when it leaves the LMC, and they guarantee scalable, high-performance control over the micro-cellular network. The simulation results show that the handover latency is less than 5 ms and that the throughput for an MT carrying continuous multimedia such as moving pictures is 2 Mbps over the 54 Mbps wireless interface.

Keywords: micro-cellular network; IP packet transfer; distributed control; polling-based packet transfer; Logical Macro-cell; virtual single cell

Parallel FFT algorithms on network-on-chips

5th International Conference on Information Technology - New Generations

Authors: Bahn, Jun Ho; Yang, Jungsook; Bagherzadeh, Nader (Univ Calif Irvine, Dept Elect Engn & Comp Engn, Irvine, CA 92697, USA)

ISBN: (print) 9780769530994

This paper presents several parallel FFT algorithms with different degrees of communication overhead for multiprocessors in a Network-on-Chip (NoC) environment. Three different methods of parallel FFT are presented. One is the reference parallel FFT for comparison, and the other two offer well-distributed computation as well as reduced communication overhead. By evenly distributing parallel computation tasks that exploit data locality, the execution time for completing each stage of the FFT can be reduced. Moreover, by optimizing data exchanges we minimize the communication overhead. Depending on the communication regularity, one can select an appropriate parallel FFT algorithm. Using simulation results from our cycle-accurate SystemC NoC model with a parameterizable 2-D mesh architecture, together with a performance analysis of time and complexity, we show that our proposed algorithms outperform other parallel FFT algorithms and high-speed DSP implementations.

Keywords: parallel algorithms

Flexible software-hardware Network Intrusion Detection System

19th IEEE/IFIP International Symposium on Rapid System Prototyping

Authors: Proudfoot, Ryan; Kent, Kenneth; Aubanel, Eric; Chen, Nan (Univ New Brunswick, Fac Comp Sci, Fredericton, NB E3B 5A3, Canada)

ISBN: (print) 9780769531809

Network Intrusion Detection System (NIDS) demands have been steadily increasing over the past few years. Current software solutions become inefficient on high-speed, high-volume networks and end up dropping packets. Hardware solutions are available and achieve much higher efficiency but present problems such as flexibility and cost. Our proposed system uses a modified version of Snort, a robust, widely deployed open-source NIDS. Snort spends a significant fraction of its processing time doing pattern matching. Our proposed system runs Snort in software until it reaches the pattern matching function and then offloads that processing to a Field Programmable Gate Array (FPGA). The hardware is able to process data at up to 1.7 GB/s on one Xilinx XC2VP100 FPGA. Our system is more flexible than other FPGA string matching designs in that the rules are not hard-coded. The design is scalable and allows FPGAs to be used in parallel to increase the processing speed even further.

Keywords: Pattern matching
