Search results - Inner Mongolia University Library

International Symposium on Parallel Processing

Authors: C.M.P. Santos, J.S. Aude; NCE and COPPE, Federal University of Rio de Janeiro, Brazil

PM-PVM is a portable implementation of PVM designed to work on SMP architectures supporting multithreading. PM-PVM portability is achieved through the implementation of the PVM functionality on top of a reduced set of parallel programming primitives. Within PM-PVM, PVM tasks are mapped onto threads and the message passing functions are implemented using shared memory. Three implementation approaches for the PVM message passing functions have been adopted. In the first one, a single message copy in memory is shared by all destination tasks. The second one replicates the message for every destination task but requires less synchronization. Finally, the third approach uses a combination of features from the two previous ones. Experimental results comparing the performance of PM-PVM and PVM applications running on a 4-processor SPARCstation 20 under Solaris 2.5 show that PM-PVM can produce execution times up to 54% shorter than PVM.

Keywords: Yarn; Message passing; Parallel programming; Application software; Parallel processing; Computer networks; Operating systems; Data structures; Signal generators; User interfaces
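
The first two delivery approaches named in the abstract (one shared message copy versus one copy per destination) can be sketched with Python threads. This is a minimal sketch under stated assumptions, not PM-PVM's implementation: the `Mailboxes` class, its method names, and `run_demo` are hypothetical, and the sketch does not model the extra synchronization a shared copy would need.

```python
import copy
import queue
import threading


class Mailboxes:
    """One FIFO inbox per task (thread). All names here are illustrative."""

    def __init__(self, task_ids):
        self.inbox = {tid: queue.Queue() for tid in task_ids}

    def send_shared(self, payload, destinations):
        # Approach 1: every destination receives a reference to the same object.
        for tid in destinations:
            self.inbox[tid].put(payload)

    def send_replicated(self, payload, destinations):
        # Approach 2: every destination receives its own private copy.
        for tid in destinations:
            self.inbox[tid].put(copy.deepcopy(payload))


def run_demo(send_name):
    """Deliver one message to two receiver threads.

    Returns True when both receivers ended up holding the same object.
    """
    task_ids = ["t1", "t2"]
    boxes = Mailboxes(task_ids)
    received = {}

    def worker(tid):
        # Blocks until the sender has put a message into this task's inbox.
        received[tid] = boxes.inbox[tid].get()

    threads = [threading.Thread(target=worker, args=(tid,)) for tid in task_ids]
    for t in threads:
        t.start()
    getattr(boxes, send_name)([1, 2, 3], task_ids)
    for t in threads:
        t.join()
    return received["t1"] is received["t2"]
```

Calling `run_demo("send_shared")` reports that both receivers hold the same object, while `run_demo("send_replicated")` reports distinct objects, which is the memory-versus-synchronization trade-off the abstract describes.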

PM-PVM: A portable multithreaded PVM

Proceedings of the International Parallel Processing Symposium, IPPS, 1999, pp. 191-195

Authors: Santos, C.M.P.; Aude, J.S.; Federal Univ of Rio de Janeiro, Brazil

Keywords: Parallel processing systems

A comparison of router architectures for virtual cut-through and wormhole switching in a NOW environment

International Symposium on Parallel Processing

Authors: J. Duato, A. Robles, F. Silla, R. Beivide; Universidad Politécnica de Valencia, Valencia, Spain; Universidad de Cantabria, Santander, Spain

Most commercial routers designed for networks of workstations (NOWs) implement wormhole switching. However, wormhole switching is not well suited for NOWs. The long wires required in this environment lead to large buffers to prevent buffer overflow during flow control signaling. Moreover, wire length is limited by buffer size. Virtual cut-through (VCT) achieves a higher throughput than wormhole switching. Moreover, the traditional disadvantages of VCT switching, such as buffer requirements and packetizing overhead, disappear in NOWs. In this paper, we show that VCT routers can be simpler than wormhole ones, while still achieving the advantages of using virtual channels and adaptive routing. We also propose a fully adaptive routing algorithm for VCT switching in NOWs. Moreover, we show that VCT routers outperform wormhole routers in a NOW environment at a lower cost.

Keywords: Routing; Buffer storage; Throughput; Postal services; Computer worms; Costs; Pipelines; Packet switching; Wire; Computer networks
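
The abstract's point that long wires force large buffers can be illustrated with a back-of-the-envelope calculation. The sketch below assumes a simple stop-and-go flow control model, in which flits keep arriving for one round-trip time after the receiver signals "stop". That model, the function name `min_buffer_flits`, and the default of 5 ns of propagation delay per metre are assumptions for illustration, not figures from the paper.

```python
def min_buffer_flits(wire_length_m, flit_time_ns, delay_ns_per_m=5):
    """Flits that can still arrive after the receiver asks the sender to stop.

    Assumed model: the stop signal takes one wire delay to reach the sender,
    and the flits already sent take one more wire delay to drain, so the
    receiver must absorb one round trip's worth of flits.
    Integer nanoseconds are used to avoid floating-point rounding.
    """
    round_trip_ns = 2 * wire_length_m * delay_ns_per_m
    # Ceiling division: a partially arrived flit still needs a whole slot.
    return -(-round_trip_ns // flit_time_ns)
```

Under these assumptions the required buffer grows linearly with wire length: with a 4 ns flit time, a 1 m link needs room for 3 flits and a 10 m link for 25.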

Comparison of router architectures for virtual cut-through and wormhole switching in a NOW environment

Proceedings of the International Parallel Processing Symposium, IPPS, 1999, pp. 240-247

Authors: Duato, J.; Robles, A.; Silla, F.; Beivide, R.; Universidad Politecnica de Valencia, Valencia, Spain

Most commercial routers designed for networks of workstations (NOWs) implement wormhole switching. However, wormhole switching is not well suited for NOWs. The long wires required in this environment lead to large buffers to prevent buffer overflow during flow control signaling. Moreover, wire length is limited by buffer size. Virtual cut-through (VCT) achieves a higher throughput than wormhole switching. Moreover, the traditional disadvantages of VCT switching, such as buffer requirements and packetizing overhead, disappear in NOWs. In this paper, we show that VCT routers can be simpler than wormhole ones, while still achieving the advantages of using virtual channels and adaptive routing. We also propose a fully adaptive routing algorithm for VCT switching in NOWs. Moreover, we show that VCT routers outperform wormhole routers in a NOW environment at a lower cost.

Keywords: Pipeline processing systems

SMARTS: Exploiting temporal locality and parallelism through vertical execution

Proceedings of the International Conference on Supercomputing, 1999, pp. 302-310

Authors: Vajracharya, Suvas; Karmesin, Steve; Beckman, Peter; Crotinger, James; Malony, Allen; Shende, Sameer; Oldehoeft, Rod; Smith, Stephen; Los Alamos Natl Lab, Los Alamos, United States

In the solution of large-scale numerical problems, parallel computing is becoming simultaneously more important and more difficult. The complex organization of today's multiprocessors with several memory hierarchies has forced the scientific programmer to make a choice between simple but unscalable code and scalable but extremely complex code that does not port to other architectures. This paper describes how the SMARTS runtime system and the POOMA C++ class library for high-performance scientific computing work together to exploit data parallelism in scientific applications while hiding the details of managing parallelism and data locality from the user. We present innovative algorithms, based on the macro-dataflow model, for detecting data parallelism and efficiently executing data-parallel statements on shared-memory multiprocessors. We also describe how these algorithms can be implemented on clusters of SMPs.

Keywords: Parallel processing systems
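
The general idea behind dataflow-style detection of parallelism can be sketched as follows: statements that touch disjoint data may run concurrently, while a statement that reads or overwrites data touched by an earlier one must wait. This is a generic read/write-set dependence analysis, not the SMARTS algorithm itself; the function name `dependence_waves` and the `(name, reads, writes)` task format are hypothetical, and task names are assumed to be unique.

```python
def dependence_waves(tasks):
    """Group tasks into waves whose members can run concurrently.

    tasks: list of (name, reads, writes) tuples in program order.
    A task depends on an earlier task if it writes something the earlier
    task read or wrote, or reads something the earlier task wrote.
    """
    deps = {}
    for idx, (name, reads, writes) in enumerate(tasks):
        deps[name] = set()
        for prev_name, prev_reads, prev_writes in tasks[:idx]:
            conflict = (
                set(writes) & (set(prev_reads) | set(prev_writes))
            ) or (set(reads) & set(prev_writes))
            if conflict:
                deps[name].add(prev_name)

    waves, done = [], set()
    # Dependences only point backwards in program order, so every pass
    # finds at least one ready task and the loop terminates.
    while len(done) < len(tasks):
        wave = [n for n, _, _ in tasks if n not in done and deps[n] <= done]
        waves.append(wave)
        done.update(wave)
    return waves
```

For example, two statements writing `x` and `y` can share a wave, while a later statement reading both has to wait for the next one.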

The Wiener filter and regularization methods for image restoration problems

International Conference on Image Analysis and Processing

Authors: A. Murli, L. D'Amore, V. De Simone; Center for Research on Parallel Computing and Supercomputers, University of Naples Federico II, Naples, Italy

Discretization of image restoration problems often leads to a discrete inverse ill-posed problem: the discretized operator is so badly conditioned that it can actually be considered as undetermined. In this case one should single out the solution which is the nearest to the desired solution. The usual way to do it is to regularize the problem. In this paper we focus on the computational aspects of the Wiener filter within the framework of the regularization methods. The emphasis is on its reliability and its efficiency, both of which become more and more important as the size and the complexity of the real problem grow and the demand for advanced real-time processing increases.

Keywords: Wiener filter; Image restoration; Degradation; Filtering; Parallel processing; Supercomputers; Gaussian noise; Random variables; Convolution; Inverse problems
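
The link between the Wiener filter and regularization can be seen in the standard frequency-domain form of the filter, where a constant `k` (an assumed noise-to-signal power ratio) plays the same role as a Tikhonov regularization parameter. The sketch below uses that textbook form on a per-frequency basis. The function names are hypothetical and nothing here reproduces the paper's own computational scheme.

```python
def wiener_gain(h, k):
    """Per-frequency Wiener gain conj(H) / (|H|^2 + k).

    h: complex transfer-function value of the blur at one frequency.
    k: assumed noise-to-signal power ratio; k = 0 gives plain inverse
       filtering, which blows up where |H| is close to zero.
    """
    return h.conjugate() / (abs(h) ** 2 + k)


def restore(spectrum, transfer, k):
    """Apply the gain to each frequency sample of the degraded spectrum."""
    return [wiener_gain(h, k) * y for h, y in zip(transfer, spectrum)]
```

With `k > 0`, frequencies where the blur nearly vanishes are damped instead of amplified, which is how the regularization keeps the badly conditioned inversion stable.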

VPSF: A parallel signature file technique using vertical partitioning and extendable hashing

10th International Conference on Database and Expert Systems Applications, DEXA 1999

Authors: Kim, Jeong-Ki; Chang, Jae-Woo; Real-time Computing Dept, ETRI, Yusong P.O. Box 106, Taejon 305-600, Republic of Korea; Chonbuk National University, Chonju, Chonbuk 560-756, Republic of Korea

ISBN: (print) 3540664483

In this paper, we propose a Vertically-partitioned Parallel Signature File (VPSF) method which can partition a signature file vertically. Our VPSF method uses an extendable hashing technique for dynamic environments and a frame-sliced signature file technique for efficient retrieval. Our VPSF method can also eliminate data skew and execution skew by allocating each frame to a processing node. To prove the efficiency of our VPSF method, we compare its performance with that of the conventional parallel signature file methods, i.e., HPSF and HF, in terms of retrieval time, storage overhead, and insertion time. The experiments run on several distributions with normal, half, and double standard deviations of the real data. The results show that our VPSF achieves about 40% better retrieval performance than HF in all cases. In addition, we show that our VPSF gains about 20-50% improvement in retrieval time compared with HF and HPSF on record sets with the half deviation. As a result, our VPSF generally outperforms the other methods in retrieval performance when the records of a database are uniform in size. © Springer-Verlag Berlin Heidelberg 1999.

Keywords: Digital storage
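
The frame-sliced signature idea mentioned in the abstract can be sketched in a few lines: each word hashes to one frame and sets a few bits inside it, so a single-word query only has to inspect one frame. This is a minimal sketch of the general technique, not the VPSF method. The constants, the SHA-256 based hash, and all function names are assumptions, and the extendable hashing and node allocation parts are not modelled.

```python
import hashlib

FRAMES = 4         # number of frames per signature (assumed)
FRAME_BITS = 16    # bits per frame (assumed)
BITS_PER_WORD = 3  # bit positions hashed per word (assumed)


def _hash(word, salt):
    digest = hashlib.sha256(f"{salt}:{word}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def word_signature(word):
    """Return (frame index, bit mask inside that frame) for one word."""
    frame = _hash(word, "frame") % FRAMES
    mask = 0
    for i in range(BITS_PER_WORD):
        mask |= 1 << (_hash(word, i) % FRAME_BITS)
    return frame, mask


def record_signature(words):
    """Superimpose (bitwise OR) the word masks into per-frame bit strings."""
    frames = [0] * FRAMES
    for word in words:
        frame, mask = word_signature(word)
        frames[frame] |= mask
    return frames


def may_contain(record_frames, word):
    """False means the word is definitely absent; True may be a false drop."""
    frame, mask = word_signature(word)
    return (record_frames[frame] & mask) == mask
```

Because a word's bits live in a single frame, storing each frame on a different processing node (as the abstract describes) would let a one-word lookup be answered by one node. A `True` result still needs a check against the actual record, since superimposed bits can collide.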

Graph based framework to detect optimal memory layouts for improving data locality

Proceedings of the International Parallel Processing Symposium, IPPS, 1999, pp. 738-743

Authors: Kandemir, Mahmut; Choudhary, Alok; Ramanujam, J.; Banerjee, Prith; Northwestern Univ, Evanston, United States

In order to extract high levels of performance from modern parallel architectures, the effective management of deep memory hierarchies is very important. While architectural advances in caches help in better utilization of the memory hierarchy, compiler-directed locality enhancement techniques are also important. In this paper we propose a locality improvement technique that uses data space (array layout) transformations in contrast to most of the previous work based on iteration space (loop) transformations. In other words, rather than changing the order of loop iterations, our technique modifies the memory layouts of multi-dimensional arrays. In comparison with previous work on data transformations it brings two novelties. First, we formulate the problem on a special graph structure called the layout graph (LG) and use integer linear programming (ILP) methods to determine optimal layouts. Second, in addition to static layout detection, our approach also enables the compiler to determine optimal dynamic layouts; that is, the layouts that can be changed across loop nest boundaries. We believe that this is the first attempt to determine optimal dynamic memory layouts. We also present preliminary experimental results on the SGI Origin 2000 distributed shared memory multiprocessor. Our results so far are encouraging and indicate that the additional compilation time taken by the solver is tolerable.

Keywords: Parallel processing systems

A graph based framework to detect optimal memory layouts for improving data locality

International Symposium on Parallel Processing

Authors: M. Kandemir, A. Choudhary, J. Ramanujam, P. Banerjee; CPDC, Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, USA; Department of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, USA

In order to extract high levels of performance from modern parallel architectures, the effective management of deep memory hierarchies is very important. While architectural advances in caches help in better utilization of the memory hierarchy, compiler-directed locality enhancement techniques are also important. In this paper we propose a locality improvement technique that uses data space (array layout) transformations in contrast to most of the previous work based on iteration space (loop) transformations. In other words, rather than changing the order of loop iterations, our technique modifies the memory layouts of multi-dimensional arrays. In comparison with previous work on data transformations it brings two novelties. First, we formulate the problem on a special graph structure called the layout graph (LG) and use integer linear programming (ILP) methods to determine optimal layouts. Second, in addition to static layout detection, our approach also enables the compiler to determine optimal dynamic layouts; that is, the layouts that can be changed across loop nest boundaries. We believe that this is the first attempt to determine optimal dynamic memory layouts. We also present preliminary experimental results on the SGI Origin 2000 distributed shared memory multiprocessor. Our results so far are encouraging and indicate that the additional compilation time taken by the solver is tolerable.

Keywords: Random access memory; Data structures; Data mining; Memory management; Integer linear programming; Parallel machines; Cache memory; Optimizing compilers; Program processors; Law
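
Why changing an array's memory layout can help locality without touching the loop order can be shown with a small address calculation. The sketch below compares the distance in memory between consecutive accesses of a loop that walks down one column of a two-dimensional array under row-major and column-major layouts. It only illustrates the motivation: the layout graph and ILP formulation from the abstract are not modelled, and the function names are hypothetical.

```python
def linear_offset(i, j, rows, cols, layout):
    """Element offset of A[i][j] in a rows x cols array for a given layout."""
    if layout == "row-major":
        return i * cols + j
    if layout == "column-major":
        return j * rows + i
    raise ValueError(f"unknown layout: {layout}")


def inner_stride(rows, cols, layout):
    """Distance in elements between A[i][j] and A[i+1][j].

    This is the step taken by an innermost loop over i with j fixed,
    i.e. a loop that walks down one column.
    """
    return (
        linear_offset(1, 0, rows, cols, layout)
        - linear_offset(0, 0, rows, cols, layout)
    )
```

For a 4 x 8 array the same loop strides by 8 elements under a row-major layout but by 1 under a column-major layout, so switching the layout gives the loop unit-stride access while the iteration order stays unchanged.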

Morphological filters implementation based on a co-design approach

International Conference on Image Analysis and Processing

Authors: R. Sasportas, J.-C. Klein

In the framework of object-based video coding, morphological filtering has become an important technique for image simplification. Morphological filters are used in a pre-processing step to improve the chances of obtaining meaningful segmentation results. However, the computation cost of these filters limits their practical use. An improvement of the processing time is then required for their use in real-time applications. In this paper the implementation issue is addressed. Improvement of the processing time is achieved by exploiting the local parallelism of these filters and by reducing the number of clock cycles used to access a specific data structure called the hierarchical queue. As various morphological filters can be used, an architecture based on hardware/software partitioning is proposed. Performance results showing the benefits of the proposed architecture are provided.

Keywords: Filters; Computer architecture; Video coding; Filtering; Image segmentation; Computational efficiency; Parallel processing; Clocks; Data structures; Hardware
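
A hierarchical queue, the data structure named in the abstract, is commonly understood as a set of FIFO queues indexed by priority level (for images, typically the grey level), served from the most urgent non-empty level. The sketch below is a plain software version under that assumption. The class and method names are hypothetical, and the linear scan in `pop` is a simplification that does not reflect the access optimizations the paper targets.

```python
from collections import deque


class HierarchicalQueue:
    """Fixed number of priority levels, FIFO order inside each level."""

    def __init__(self, levels):
        self._queues = [deque() for _ in range(levels)]
        self._size = 0

    def push(self, level, item):
        self._queues[level].append(item)
        self._size += 1

    def pop(self):
        """Return (level, item) from the lowest-numbered non-empty level."""
        for level, q in enumerate(self._queues):
            if q:
                self._size -= 1
                return level, q.popleft()
        raise IndexError("pop from empty hierarchical queue")

    def __len__(self):
        return self._size
```

Items pushed at the same level come back in insertion order, and all items of a lower level come out before any item of a higher one.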
