Recently, graphics hardware architectures have begun to emphasize versatility, offering rich new ways to programmatically reconfigure the graphics pipeline. In this paper we explore whether current graphics architectures can be applied to problems where general-purpose vector processors might traditionally be used. We develop a programming framework and apply it to a variety of problems, including matrix multiplication and 3-SAT. Comparing the speed of our graphics card implementations to standard CPU implementations, we demonstrate startling performance improvements in many cases, as well as room for improvement in others. We analyze the bottlenecks and propose minor extensions to current graphics architectures which would improve their effectiveness for solving general-purpose problems. Based on our results and current trends in microarchitecture, we believe that efficient use of graphics hardware will become increasingly important to high-performance computing on commodity hardware.
ISBN: 9780769515823 (print)
The aim of the CatNet project is to combine economic and computer science research to provide new coordination mechanisms for large-scale application-layer networks. The ability of a free-market economy to balance and satisfy the conflicting needs of millions of human agents recommends it as a decentralized organizational principle. CatNet will evaluate a decentralized mechanism for resource allocation in computer networks, which is based on the economic paradigm of the Catallaxy. The technical realization of the paradigm builds on software agents which buy and sell network services and resources. This concept is applied both to initial service deployment and service access and to provisioning during the network's lifecycle.
ISBN: 0769518524 (print)
Checkpointing protocols for distributed computing systems can also be applied to mobile computing systems, but the unique characteristics of the mobile environment need to be taken into account. In this paper, an improved time-based checkpointing protocol is proposed, which is suitable for mobile computing systems based on Mobile IP. The main improvement over a traditional time-based protocol is that our protocol reduces the number of checkpoints per checkpointing process to nearly the minimum, so that fewer checkpoints need to be transmitted through the bandwidth-limited wireless links. The proposed protocol also performs very well in minimizing the number and the size of messages transmitted in the wireless network. Therefore, the protocol brings very little overhead to a mobile host, which has limited resources. Additionally, by integrating the improved timer synchronization technique, our protocol can also be applied to wide area networks.
ISBN: 0769517919 (print)
Recently, substantial research has been devoted to Unmanned Aerial Vehicles (UAVs). One of a UAV's most demanding subsystems is vision. The vision subsystem must dynamically combine different algorithms as the UAV's goal and surroundings change. To fully utilize the available hardware, a run time system must be able to vary the quality and the size of the regions the algorithms are applied to, as the number of image processing tasks changes. To allow this, the run time system and the underlying computational model must be integrated. In this paper we present a computational model suitable for integration with a run time system. The computational model is called Image Processing Data Flow Graph (IP-DFG). IP-DFG has been developed for modeling of complex image processing algorithms. IP-DFG is based on data flow graphs, but has been extended with hierarchy and new rules for token consumption, which makes the computational model more flexible and more suitable for human interaction. In this paper we also show that IP-DFGs are suitable for modeling expressions, including data dependent decisions and iterations, which are common in complex image processing algorithms.
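As background for the abstract above, the following is a minimal sketch of the classic data flow firing rule that data flow graphs build on: a node fires once every input port holds a token, consumes one token per port, and passes its result downstream. It does not implement IP-DFG's hierarchy or its extended token consumption rules, which the abstract does not spell out. The names `Node`, `ready`, `fire`, and `run` are hypothetical and chosen only for illustration.

```python
from collections import deque


class Node:
    """A data flow node with one FIFO token queue per input port (hypothetical helper)."""

    def __init__(self, name, func, n_inputs):
        self.name = name
        self.func = func
        self.inputs = [deque() for _ in range(n_inputs)]
        self.targets = []  # (node, port) pairs that receive this node's output

    def ready(self):
        # Classic firing rule: every input port must hold at least one token.
        return all(self.inputs)

    def fire(self):
        # Consume exactly one token from each input port.
        args = [queue.popleft() for queue in self.inputs]
        result = self.func(*args)
        for node, port in self.targets:
            node.inputs[port].append(result)
        return result


def run(nodes):
    """Fire ready nodes until no node can fire; return the last result per node."""
    results = {}
    progress = True
    while progress:
        progress = False
        for node in nodes:
            # Nodes without input ports are skipped so the loop terminates.
            if node.inputs and node.ready():
                results[node.name] = node.fire()
                progress = True
    return results


# Usage: add two tokens, then double the sum.
add = Node("add", lambda a, b: a + b, 2)
double = Node("double", lambda x: 2 * x, 1)
add.targets.append((double, 0))
add.inputs[0].append(3)
add.inputs[1].append(4)
out = run([add, double])
```

The order in which nodes are listed does not affect the result here, because a node only fires when its tokens are present.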
ISBN: 9780769515823 (print)
It is envisaged that the grid infrastructure will be a large-scale distributed software system that will provide high-end computational and storage capabilities to differentiated users. A number of distributed computing technologies are being applied to grid development work, including CORBA and Jini. In this work, we introduce an A4 (Agile Architecture and Autonomous Agents) methodology, which can be used for resource management for grid computing. An initial system implementation utilises the performance prediction techniques of the PACE toolkit to provide quantitative data regarding the performance of complex applications running on local grid resources. At the meta-level, a hierarchy of identical agents is used to provide an abstraction of the system architecture. Each agent is able to cooperate with other agents to provide service advertisement and discovery to schedule applications that need to utilise grid resources. A performance monitor and advisor (PMA) is in development to optimize the performance of agent behaviours.
ISBN: 0819447323 (print)
The concept of macro scale synthetic jets has been applied to the low Reynolds number channel flows associated with biosensor microfluidics. The current numerical investigation utilizes a hybrid approach of the lattice Boltzmann method for flow field computations and the convection-diffusion equation for passive scalar transport. The study presents results for various synthetic jet geometries, jet inlet conditions, scaling issues and Reynolds numbers. The results indicate limited effects due to synthetic jet cavity-slot geometry and that the synthetic jet imparts momentum to the channel flow thus enhancing fluid mixing.
ISBN: 9783540478676 (digital), 9783540436768 (print)
This volume contains the papers selected for presentation at IPCO 2002, the Ninth International Conference on Integer Programming and Combinatorial Optimization, Cambridge, MA (USA), May 27–29, 2002. The IPCO series of conferences highlights recent developments in theory, computation, and application of integer programming and combinatorial optimization. IPCO was established in 1988 when the first IPCO program committee was formed. IPCO is held every year in which no International Symposium on Mathematical Programming (ISMP) takes place. The ISMP is triennial, so IPCO conferences are held twice in every three-year period. The eight previous IPCO conferences were held in Waterloo (Canada) 1990, Pittsburgh (USA) 1992, Erice (Italy) 1993, Copenhagen (Denmark) 1995, Vancouver (Canada) 1996, Houston (USA) 1998, Graz (Austria) 1999, and Utrecht (The Netherlands) 2001. In response to the call for papers for IPCO 2002, the program committee received 110 submissions, a record number for IPCO. The program committee met on January 7 and 8, 2002, in Aussois (France), and selected 33 papers for inclusion in the scientific program of IPCO 2002. The selection was based on originality and quality, and reflects many of the current directions in integer programming and combinatorial optimization research.
Driven by the need to solve linear systems arising from problems posed on extremely large, unstructured grids, there has been a recent resurgence of interest in algebraic multigrid (AMG). AMG is attractive in that it holds out the possibility of multigrid-like performance on unstructured grids. The sheer size of many modern physics and simulation problems has led to the development of massively parallel computers, and has sparked much research into developing algorithms for them. Parallelizing AMG is a difficult task, however. While much of the AMG method parallelizes readily, the process of coarse-grid selection, in particular, is fundamentally sequential in nature. We have previously introduced a parallel algorithm [A.J. Cleary, R.D. Falgout, V.E. Henson, J.E. Jones, in: Proceedings of the Fifth International Symposium on Solving Irregularly Structured Problems in Parallel, Springer, New York, 1998] for the selection of coarse-grid points, based on modifications of certain parallel independent set algorithms and the application of heuristics designed to ensure the quality of the coarse grids, and shown results from a prototype serial version of the algorithm. In this paper we describe an implementation of a parallel AMG code, using the algorithm of A.J. Cleary, R.D. Falgout, V.E. Henson, J.E. Jones [in: Proceedings of the Fifth International Symposium on Solving Irregularly Structured Problems in Parallel, Springer, New York, 1998] as well as other approaches to parallelizing the coarse-grid selection. We consider three basic coarsening schemes and certain modifications to the basic schemes, designed to address specific performance issues. We present numerical results for a broad range of problem sizes and descriptions, and draw conclusions regarding the efficacy of the method. Finally, we indicate the current directions of the research. © 2002 IMACS. Published by Elsevier Science B.V. All rights reserved.
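The abstract above refers to parallel independent set algorithms without describing them. As an illustration only, the sketch below shows a generic Luby-style randomized maximal independent set selection, where each node decides using only its own random weight and those of its neighbours, which is what makes the approach amenable to parallel execution. It is not the authors' coarsening algorithm and omits their quality heuristics; the function name `luby_independent_set`, the adjacency-dictionary representation, and the `seed` parameter are assumptions made for this example. In a coarsening context, the selected nodes would play the role of candidate coarse-grid points.

```python
import random


def luby_independent_set(adjacency, seed=0):
    """Return a maximal independent set of an undirected graph.

    adjacency maps each node to the set of its neighbours (hypothetical
    representation; nodes are assumed to be orderable, e.g. integers).
    """
    rng = random.Random(seed)
    undecided = set(adjacency)
    selected = set()
    while undecided:
        # Every undecided node draws a random weight; the node id breaks ties.
        weight = {v: (rng.random(), v) for v in sorted(undecided)}
        # A node wins if its weight exceeds that of all undecided neighbours.
        winners = {
            v for v in undecided
            if all(weight[v] > weight[u] for u in adjacency[v] if u in undecided)
        }
        selected |= winners
        # Winners and their neighbours are decided and leave the graph.
        decided = set(winners)
        for v in winners:
            decided |= adjacency[v]
        undecided -= decided
    return selected


# Usage: a 4-cycle 0-1-2-3-0.
graph = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
chosen = luby_independent_set(graph)
```

Each round is guaranteed to select at least the undecided node with the largest weight, so the loop terminates; the serial `while` loop here only simulates rounds that a parallel code would execute concurrently across nodes.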
ISBN: 9780769515823 (print)
This paper describes the performance benefits attained using enhanced network interfaces to achieve low latency communication. We use DMA communication mode to send data to other nodes while the CPU performs useful calculations. Zero-copy communication is achieved through pinned-down physical memory regions, provided by the NIC's driver modules. Our testbed concerns the parallel execution of tiled nested loops on a Linux PC cluster with PCI-SCI NICs (Dolphin D330). Tiles essentially exchange data and should also have a large computational grain, so that their parallel execution becomes beneficial. We schedule tiles much more efficiently by exploiting the inherent overlapping between communication and computation phases among successive, atomic tile executions. The applied nonblocking schedule resembles a pipelined data-path where computation phases are overlapped with communication ones, instead of being interleaved with them. Experimental evaluation illustrates that when using enhanced communication features such as DMA transfers, memory-mapped interfaces and zero-copy mechanisms, overall performance is considerably improved compared to using conventional CPU- and kernel-bound communication primitives.
ISBN: 3540440402 (print)
The proceedings contain 53 papers. The special focus in this conference is on Mathematical Foundations of Computer Science. The topics include: Global development via local observational construction steps; towards a concise proof of the four-colour theorem of planar maps; applications of finite automata; an algorithmic challenge; low stretch spanning trees; on radiocoloring hierarchically specified planar graphs; finite domain constraint satisfaction using quantum computation; fast algorithms with algebraic Monge properties; packing edges in random regular graphs; a lower bound technique for nondeterministic graph-driven read-once-branching programs and its applications; matroid intersections, polymatroid inequalities, and related problems; accessibility in automata on scattered linear orderings; on infinite terms having a decidable monadic theory; a Chomsky-like hierarchy of infinite graphs; competitive analysis of on-line stream merging algorithms; coloring k-colorable semirandom graphs in polynomial expected time via semidefinite programming; on word equations in one variable; a sharp bound on the density of guessed bits; two-way finite state transducers with nested pebbles; optimal non-preemptive semi-online scheduling on two related machines; more on weighted servers or FIFO is better than LRU; on maximizing the throughput of multiprocessor tasks; some results on random unsatisfiable k-SAT instances and approximation algorithms applied to random structures; evolutive tandem repeats using Hamming distance; subgraph isomorphism, log-bounded fragmentation and graphs of locally bounded treewidth; and algorithms for computing small NFAs.