Search Results - Inner Mongolia University Library

Proceedings of the International Parallel Processing Symposium, IPPS, 1999, pp. 555-562

Authors: Sreenivas, Mahesh K.; AlSabti, Khaled; Ranka, Sanjay. Univ of Florida, Gainesville, United States

Classification is an important problem in the field of data mining. Construction of good classifiers is computationally intensive and offers plenty of scope for parallelization. The divide-and-conquer paradigm can be used to efficiently construct decision tree classifiers. We discuss in detail various techniques for parallel divide-and-conquer and extend these techniques to efficiently handle disk-resident data. Furthermore, a generic technique for parallel out-of-core divide-and-conquer problems is suggested. We present pCLOUDS, the parallel version of the decision tree classifier algorithm CLOUDS, capable of handling large out-of-core data sets. pCLOUDS exhibits excellent speedup, sizeup and scaleup properties, which make it a competitive tool for data mining applications. We evaluate the performance of pCLOUDS for a range of synthetic data sets on the IBM SP2.

Keywords: Parallel processing systems

PM-PVM: A portable multithreaded PVM

International Symposium on Parallel Processing

Authors: C.M.P. Santos; J.S. Ande. NCE and COPPE, Federal University of Rio de Janeiro, Brazil

PM-PVM is a portable implementation of PVM designed to work on SMP architectures supporting multithreading. PM-PVM portability is achieved through the implementation of the PVM functionality on top of a reduced set of parallel programming primitives. Within PM-PVM, PVM tasks are mapped onto threads and the message passing functions are implemented using shared memory. Three implementation approaches of the PVM message passing functions have been adopted. In the first one, a single message copy in memory is shared by all destination tasks. The second one replicates the message for every destination task but requires less synchronization. Finally, the third approach uses a combination of features from the two previous ones. Experimental results comparing the performance of PM-PVM and PVM applications running on a 4-processor Sparcstation 20 under Solaris 2.5 show that PM-PVM can produce execution times up to 54% smaller than PVM.

Keywords: Yarn; Message passing; Parallel programming; Application software; Parallel processing; Computer networks; Operating systems; Data structures; Signal generators; User interfaces

Implementation of the limited-area numerical weather prediction model Aladin in distributed memory

5th International Conference on Parallel Processing, Euro-Par 1999

Authors: Fischer, Claude; Estrade, Jean-François; Jerman, Jure. Météo-France/CNRM/GMAP/EXT, 42 Avenue Gustave Coriolis, 31057 Toulouse Cedex, France; Météo-France/SCEM/TTI, France; Slovenian Hydrometeorological Institute, Ljubljana, Slovenia

ISBN: (print) 3540664432

The technical challenges for the limited-area model Aladin are various: Aladin is the French operational mesoscale model, run on the Météo-France VPP-700E on 4 processors twice a day. Also, the model is used in operational or pre-operational mode in several European countries, plus Morocco, and therefore the model has to be portable and computationally efficient on several platforms. In the presentation, the general choices for the porting of Aladin in a parallel environment will be discussed, with an emphasis on Aladin-specific aspects, the general choices being closely related to those of Arpège. Finally, the question of applications to workstation clusters will be addressed. © Springer-Verlag Berlin Heidelberg 1999.

Keywords: Weather forecasting

Systolic algorithm to process compressed binary images

Proceedings of the International Parallel Processing Symposium, IPPS, 1999, pp. 477-484

Authors: Ercal, Fikret; Allen, Mark; Feng, Hao. Univ of Missouri - Rolla, Rolla, United States

A new systolic algorithm which computes image differences in run-length encoded (RLE) format is described. The binary image difference operation is commonly used in many image processing applications including automated inspection systems, character recognition, fingerprint analysis, and motion detection. The efficiency of these operations can be improved significantly with the availability of a fast systolic system that computes the image difference as described in this paper. It is shown that for images with a high similarity measure, the time complexity of the systolic algorithm is small and in some cases constant with respect to the image size. The time for the systolic algorithm is proportional to the difference between the number of runs in the two images, while the time for the sequential algorithm is proportional to the total number of runs in the two images together. A formal proof of correctness for the algorithm is also given.
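
For orientation, the sequential baseline mentioned in the abstract can be sketched as a single merge over run boundaries. This is not the paper's systolic algorithm. The run representation (sorted, non-overlapping half-open intervals of 1-pixels) and the name `rle_difference` are assumptions made for illustration only.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # assumed encoding: half-open [start, end) interval of 1-pixels


def rle_difference(row_a: List[Run], row_b: List[Run]) -> List[Run]:
    """Return the XOR (pixel-wise difference) of two RLE-encoded binary rows.

    Each row is a sorted list of non-overlapping runs. The work done is
    proportional to the total number of runs in both rows.
    """
    # A run contributes two boundaries: where the pixel value flips to 1
    # and where it flips back to 0.
    edges_a = [p for run in row_a for p in run]
    edges_b = [p for run in row_b for p in run]

    merged: List[int] = []
    i = j = 0
    while i < len(edges_a) or j < len(edges_b):
        if j == len(edges_b) or (i < len(edges_a) and edges_a[i] < edges_b[j]):
            merged.append(edges_a[i])
            i += 1
        elif i == len(edges_a) or edges_b[j] < edges_a[i]:
            merged.append(edges_b[j])
            j += 1
        else:
            # Both rows flip at the same position, so the XOR does not flip.
            i += 1
            j += 1

    # Remaining boundaries alternate between run starts and run ends.
    return [(merged[k], merged[k + 1]) for k in range(0, len(merged), 2)]
```

Two identical rows yield an empty result, which is the high-similarity case where the abstract reports the systolic version doing very little work.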

Keywords: Parallel processing systems

Low-latency message passing on workstation clusters using SCRAMNet

Proceedings of the International Parallel Processing Symposium, IPPS, 1999, pp. 148-152

Authors: Moorthy, Vijay; Jacunski, Matthew G.; Pillai, Manoj; Ware, Peter P.; Panda, Dhabaleswar K.; Page Jr., Thomas W.; Sadayappan, P.; Nagarajan, V.; Daniel, Johns. Ohio State Univ, Columbus, United States

Clusters of workstations have emerged as a popular platform for parallel and distributed computing. Commodity high-speed networks which are used to connect workstation clusters provide high bandwidth, but also have high latency. SCRAMNet is an extremely low-latency replicated non-coherent shared memory network, so far used only for real-time applications. This paper reports our early experiences with using SCRAMNet for cluster computing. We have implemented a user-level zero-copy message passing protocol for SCRAMNet called the BillBoard Protocol (BBP). The one-way latency for sending a 4-byte message between two nodes using the BBP is measured to be as low as 7.8 μs. Since SCRAMNet supports hardware-level replication of messages, it is possible to implement multicast with almost the same latency as point-to-point communication. Using the BBP, the latency for broadcasting short messages to 4 nodes is measured to be 10.1 μs and the latency for a 4-node barrier is measured to be 37 μs. We have also built an MPI library on top of the BBP which makes use of multicast support from the BBP. Our results demonstrate the potential of SCRAMNet as a high-performance interconnect for building scalable workstation clusters supporting message passing.

Keywords: Parallel processing systems

Lazy logging and prefetch-based crash recovery in software distributed shared memory systems

International Symposium on Parallel Processing

Authors: A. Kongmunvattana; Nian-Feng Tzeng. Center for Advanced Computer Studies, University of Southwestern Louisiana, Lafayette, LA, USA

In this paper we propose a new, efficient logging protocol, called lazy logging, and a fast crash recovery protocol, called prefetch-based crash recovery (PCR), for software distributed shared memory (SDSM). Our lazy logging protocol minimizes failure-free overhead by logging only data indispensable for correct recovery, while our PCR protocol reduces the recovery time by prefetching data according to the future memory access patterns, thus eliminating memory miss penalty during the recovery process. We have performed experiments on workstation clusters, comparing our protocols against the earlier reduced-stable logging (RSL) protocol by actually implementing both protocols in TreadMarks, a state-of-the-art SDSM system. The experimental results show that our lazy logging protocol consistently outperforms the RSL protocol. Our protocol increases the execution time slightly, by 1% to 4%, during failure-free execution, while the RSL protocol results in an execution time overhead of 6% to 21% due to its larger log size and higher disk access frequency. Our PCR protocol also outperforms the widely used simple crash recovery protocol by 18% to 57% under all applications examined.

Keywords: Prefetching; Computer crashes; Access protocols; Delay; Software systems; Distributed computing; Ash; Ear; Application software; Workstations

Scalable hardware-algorithms for binary prefix sums

13th International Parallel Processing Symposium, IPPS 1999, held in conjunction with the 10th Symposium on Parallel and Distributed Processing, SPDP 1999

Authors: Lin, R.; Nakano, K.; Olariu, S.; Pinotti, M.C.; Schwing, J.L.; Zomaya, A.Y. Department of Computer Science, SUNY Geneseo, Geneseo, NY 14454, United States; Department of Electrical and Computer Engineering, Nagoya Institute of Technology, Showa-ku, Nagoya 466-8555, Japan; Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States; I.E.I, C.N.R, Pisa, Italy; Department of Computer Science, Central Washington University, Ellensburg, WA 98926, United States; Parallel Computing Research Lab, Dept of Electrical and Electronic Eng, University of Western Australia, Perth, Australia

ISBN: (print) 3540658319

The main contribution of this work is to propose a number of broadcast-efficient VLSI architectures for computing the sum and the prefix sums of a $w^k$-bit, $k \ge 2$, binary sequence using, as basic building blocks, linear arrays of at most $w^2$ shift switches. An immediate consequence of this feature is that in our designs broadcasts are limited to buses of length at most $w^2$, making them eminently practical. Using our design, the sum of a $w^k$-bit binary sequence can be obtained in the time of $2k-2$ broadcasts, using $2w^{k-2} + O(w^{k-3})$ blocks, while the corresponding prefix sums can be computed in $3k-4$ broadcasts using $(k+2)w^{k-2} + O(kw^{k-3})$ blocks. © Springer-Verlag Berlin Heidelberg 1999.
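
As a reminder of what is being computed, the following software sketch defines the prefix sums of a binary sequence and evaluates the broadcast counts quoted in the abstract for a given k. It says nothing about the VLSI design itself, and the function names are hypothetical.

```python
from itertools import accumulate
from typing import Dict, List


def binary_prefix_sums(bits: List[int]) -> List[int]:
    """Prefix sums: entry i is the number of 1-bits among bits[0..i]."""
    return list(accumulate(bits))


def broadcast_counts(k: int) -> Dict[str, int]:
    """Broadcast counts stated in the abstract for an input of w**k bits."""
    if k < 2:
        raise ValueError("the abstract assumes k >= 2")
    return {"sum": 2 * k - 2, "prefix_sums": 3 * k - 4}
```

The last prefix sum equals the sum of the whole sequence, which is why the two problems are treated together.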

Keywords: Binary sequences

Hardwired-clusters partial-crossbar: A hierarchical routing architecture for multi-FPGA systems

13th International Parallel Processing Symposium, IPPS 1999, held in conjunction with the 10th Symposium on Parallel and Distributed Processing, SPDP 1999

Authors: Khalid, Mohammed A.S.; Rose, Jonathan. Quickturn Design Systems, 55 West Trimble Road, San Jose, CA 95131-1013, United States; Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON M5S 3G4, Canada

ISBN: (print) 3540658319

Multi-FPGA systems (MFSs) are used as custom computing machines, logic emulators and rapid prototyping vehicles. A key aspect of these systems is their programmable routing architecture, which is the manner in which wires, FPGAs and Field-Programmable Interconnect Devices (FPIDs) are connected. Several routing architectures for MFSs have been proposed [Amo92] [Butt92] [Hauc94] [Apti96] [Vuil96] [Babb97] and previous research has shown that the partial crossbar is one of the best existing architectures [Kim96] [Khal97]. Recently, the Hybrid Complete-Graph Partial-Crossbar Architecture (HCGP) was proposed [Khal98], which was shown to be superior to the Partial Crossbar. In this paper we propose a new routing architecture, called the Hardwired-Clusters Partial-Crossbar (HWCP), which is better suited for large MFSs implemented using multiple boards. The HWCP architecture is compared to the HCGP and Partial Crossbar and we show that it gives substantially better manufacturability. We compare the performance and cost of the HWCP, HCGP and Partial Crossbar architectures experimentally, by mapping a set of 15 large benchmark circuits into each architecture. We show that the HWCP architecture gives reasonably good cost and speed compared to the HCGP and Partial Crossbar architectures. © Springer-Verlag Berlin Heidelberg 1999.

Keywords: Field programmable gate arrays (FPGA)

Novel compilation framework for supporting semi-regular distributions in hybrid applications

Proceedings of the International Parallel Processing Symposium, IPPS, 1999, pp. 597-602

Authors: Chakrabarti, Dhruva R.; Banerjee, Prithviraj. Northwestern Univ, Evanston, United States

This paper explains how efficient support for semi-regular distributions can be incorporated in a uniform compilation framework for hybrid applications. The key focus of this work is in showing how, unlike other existing schemes, our scheme is able to minimize preprocessing overheads and maintain sophisticated communication optimizations (such as reduction of inter-processor communication during schedule generation and sharing of communicated information between regular and irregular accesses) even in the presence of semi-regular distributions. It is only natural that preprocessing overheads associated with semi-regular distributions be intermediate between those involved for regular and irregular distributions. This paper shows how various properties can be inferred for semi-regular distributions. These allow the use of the interval representation, which in turn reduces the preprocessing overhead and makes possible compatible code generation for hybrid references. Experimental results on a 16-processor IBM SP-2 for a number of sparse applications using semi-regular distributions show that our scheme is feasible.

Keywords: Program compilers

Computational Co-op: Gathering clusters into a metacomputer

Proceedings of the International Parallel Processing Symposium, IPPS, 1999, pp. 160-166

Authors: Cirne, Walfredo; Marzullo, Keith. Univ of California San Diego, La Jolla, United States

We explore the creation of a metacomputer by the aggregation of independent sites. Joining a metacomputer is voluntary, and hence it has to be an endeavor that mutually benefits all parties involved. We identify proportional-share allocation as a key component of such a mutual benefit. Proportional-share allocation is the basis for enforcing the agreement reached among the sites on how to use the metacomputer's resources. We introduce a resource manager that provides proportional-share allocation over a cluster of workstations, assuming applications to be master-slave. This manager is novel because it performs non-preemptive proportional scheduling of multiple processors. A prototype has been implemented and we report on preliminary results. Finally, we discuss how tickets (first-class entities that encapsulate allocation endowments) can be used in practice to enforce the metacomputer agreement, and also how they can ease the site selection to be performed by the application.
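
To make the idea of ticket-based proportional sharing concrete, here is a minimal sketch that hands out equal-length, run-to-completion task slots using stride scheduling. Stride scheduling is a well-known proportional-share technique substituted here for illustration. The abstract does not say which mechanism the authors' resource manager uses, and the name `allocate_slots` and the ticket values are assumptions.

```python
import heapq
from fractions import Fraction
from typing import Dict, List


def allocate_slots(tickets: Dict[str, int], slots: int) -> List[str]:
    """Assign `slots` non-preemptive task slots to sites in ticket proportion.

    Each site advances by a stride of 1/tickets every time it is served,
    and the site with the smallest accumulated pass value is served next.
    Ties are broken by site name.
    """
    strides = {site: Fraction(1, count) for site, count in tickets.items()}
    # Heap entries are (pass value, site); exact fractions avoid float drift.
    heap = [(strides[site], site) for site in tickets]
    heapq.heapify(heap)

    order: List[str] = []
    for _ in range(slots):
        pass_value, site = heapq.heappop(heap)
        order.append(site)  # the task runs to completion, no preemption
        heapq.heappush(heap, (pass_value + strides[site], site))
    return order
```

With hypothetical endowments of 3 tickets for site "A" and 1 for site "B", eight slots split 6 to 2, matching the 3:1 ticket ratio.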

Keywords: Parallel processing systems
