In this paper, we give an algorithm for the node-to-set disjoint paths problem in transposition graphs. The algorithm is of polynomial order of n for an n-transposition graph. It is based on recursion and divided into...
详细信息
ISBN:
(纸本)1932415262
In this paper, we give an algorithm for the node-to-set disjoint paths problem in transposition graphs. The algorithm is of polynomial order of n for an n-transposition graph. It is based on recursion and divided into two cases according to the distribution of destination nodes. The maximum length of each path and the time complexity of the algorithm are estimated and the average performance is evaluated based on computer experiment.
The development of a distributed policy managed storage architecture, based on an overlay network, is discussed. DIET, a platform developed at BTExact, was employed to build the overlay network. The DIET platform is a...
详细信息
ISBN:
(纸本)1932415262
The development of a distributed policy managed storage architecture, based on an overlay network, is discussed. DIET, a platform developed at BTExact, was employed to build the overlay network. The DIET platform is a robust, adoptable and scalable platform for hosting multi-agent systems. The first level of DIET is the core, which enforces the constraints on agents. The second layer contains application reusable components and at the top, is the application layer, where developers can build upon the given platform functionality. The benefits of DIET include its usage as a data archival repository for files such as music, pictures and film.
Action system formalisms describe distributed system behavior in terms of objects that participate in joint actions. High-level action system specifications are nondeterministic and generic since the execution order o...
详细信息
ISBN:
(纸本)1892512459
Action system formalisms describe distributed system behavior in terms of objects that participate in joint actions. High-level action system specifications are nondeterministic and generic since the execution order of actions is left open and classes are used as templates of objects. In this paper, action systems are designed using DisCo, a specification method for distributed and reactive systems. We present a tool and methodology for deriving VHDL descriptions from high-level DisCo specifications. Compiler features and principles of the DisCo-VHDL mapping are discussed in detail.
An important component in compiling for distributed memory machines is data partitioning, While a number of automatic analysis techniques have been proposed for this phase, none of them is applicable for irregular pro...
详细信息
ISBN:
(纸本)0818686510
An important component in compiling for distributed memory machines is data partitioning, While a number of automatic analysis techniques have been proposed for this phase, none of them is applicable for irregular problems. In this paper we present compile-time analysis for determining data partitioning for such applications. We have developed a set of cost functions for determining communication and redistribution costs in irregular codes. We first determine the appropriate distributions for a single data parallel statement and then use the cost functions with a greedy algorithm for computing distributions for the full program. Initial performance results on a 16 processor IBM SP-2 are also presented.
Neural networks offer an intriguing set of techniques for learning based on the adjustment of weights of connections between processing units. However, the power and limitations of connectionist methods for learning, ...
详细信息
Neural networks offer an intriguing set of techniques for learning based on the adjustment of weights of connections between processing units. However, the power and limitations of connectionist methods for learning, such as the method of back propagation in paralleldistributedprocessing networks, are not yet entirely clear. A set of experiments that more precisely identify the power and limitations of the method of back propagation is reported on. The experiment on learning to compute the exclusive-or function suggests that the computational efficiency of learning by the method of back propagation depends on the initial weights in the network. The experiment on learning to play tic-tac-toe suggests that the information content of what is learned by the back propagation method is dependent on the initial abstractions in the network. It also suggests that these abstractions are a major source of power for learning in paralleldistributedprocessing networks. In addition, it is shown that the learning task addressed by connectionist methods, including the back propagation method, is computationally intractable. These experimental and theoretical results strongly indicate that current connectionist methods may be too limited for the complex task of learning they seek to solve. It is proposed that the power of neural networks may be enhanced by developing task-specific connectionist methods.
In distributed computer systems, processors often need to be synchronized to maintain correctness and consistency. Unlike shared-memory parallel systems, the lack of shared memory and a clock considerably complicates ...
详细信息
ISBN:
(纸本)1932415262
In distributed computer systems, processors often need to be synchronized to maintain correctness and consistency. Unlike shared-memory parallel systems, the lack of shared memory and a clock considerably complicates the task of synchronization is distributed systems. The objective of this paper is to present a new Self-Stabilization algorithm (arbiter-based and broadcast-based synchronization) for an acyclic directed-graph structured distributed systems. This new fault-tolerant algorithm survives all imaginable faults in distributed systems.
We present gossip-based distributed algorithms by which each node of a cluster or a grid can find an estimate of the global average load of the system. Knowledge of this estimate can improve the performance by allowin...
详细信息
ISBN:
(纸本)1932415262
We present gossip-based distributed algorithms by which each node of a cluster or a grid can find an estimate of the global average load of the system. Knowledge of this estimate can improve the performance by allowing each node to make better scheduling decisions. Our algorithms are based on asynchronous exchanges (gossip) of load messages between either a random or a fixed pairs of nodes. We show that based on these messages, each node can find an estimate of the average load of all the nodes. We present the algorithms and prove their convergence.
We present a powerful distributed SAT procedure for Microchip PIC microcontrollers. The algorithm is an adaption of the state-of-the-art solver CHAFF optimised for the limited resources of the Microchip processors. It...
详细信息
ISBN:
(纸本)1932415610
We present a powerful distributed SAT procedure for Microchip PIC microcontrollers. The algorithm is an adaption of the state-of-the-art solver CHAFF optimised for the limited resources of the Microchip processors. It contains features of modern SAT engines like conflict-driven learning and non-chronological backtracking as well as an efficient work stealing method to run several processors in parallel. The underlying hardware environment is a special multiprocessor system based on a PC ISA slot card holding up to 9 PIC microcontrollers. Thereby the communication topology between the computing units can be reconfigured during runtime. Knowledge sharing between processors working on different parts of the same problem instance is important. In particular, we introduce filtering of conflict clauses and checking for so-called single literal clauses. Experimental results demonstrate that with both features enabled a further speedup of about 49% on average is possible.
The problem of deepening memory hierarchy towards exascale is becoming serious for applications such as those based on stencil kernels, as it is difficult to satisfy both high memory bandwidth and capacity requirement...
详细信息
ISBN:
(纸本)9780769557854
The problem of deepening memory hierarchy towards exascale is becoming serious for applications such as those based on stencil kernels, as it is difficult to satisfy both high memory bandwidth and capacity requirements simultaneously. This is evident even today, where problem sizes of stencil-based applications on GPU supercomputers are limited by aggregated capacity of GPU device memory. There have been locality improvement techniques such as temporal blocking to enhance performance, but integrating those techniques into existing stencil applications results requires substantially higher programming cost, especially for complex applications and as a result are not typically utilized. We alleviate this problem with a run-time GPU-MPI process oversubscription library we call HHRT that automates data movement across the memory hierarchy, and a systematic methodology to convert, and optimize the code to accommodate temporal blocking. The proposed methodology has shown to significantly eases the adaptation of real applications, such as the whole-city airflow simulator embodying more than 12,000 lines of code;with careful tuning, we successfully maintain up to 85% performance even with problems whose footprint is four time larger than GPU device memory capacity, and scale to hundreds of GPUs on the TSUBAME2.5 supercomputer.
Autonomous intelligent robotic systems that can survive in dynamic changing environments have yet to be realized. The robot must process incoming information, make decisions, and learn from previous actions concurrent...
详细信息
ISBN:
(纸本)1932415610
Autonomous intelligent robotic systems that can survive in dynamic changing environments have yet to be realized. The robot must process incoming information, make decisions, and learn from previous actions concurrently and in real time. Reliance on one all-encompassing system with one overburdened processor, no matter how fast, makes this goal unreachable. One of the primary reasons the goal is unreachable is that the knowledge base that the robot must access to make its decisions grows so large over time that simply searching the knowledge base for a decision becomes too time consuming for acceptable robotic behavior. One solution is to distribute the knowledge base among several processors and search in parallel. An experiment was contrived to test parallel searching algorithms which must search large databases of information, and provide some insight on performance of the underlying programming environment.
暂无评论