A parallel implementation of the bzip2 block-sorting lossless compression program is described. The performance of the parallel implementation is compared to the sequential bzip2 program running on various shared-memory parallel architectures. The parallel bzip2 algorithm works by taking the blocks of input data and running them through the Burrows-Wheeler Transform (BWT) simultaneously on multiple processors using pthreads. The output of the algorithm is fully compatible with the sequential version of bzip2, which is in wide use today. The results show that a significant, near-linear speedup is achieved by using the parallel bzip2 program on systems with multiple processors. This will greatly reduce the time it takes to compress large amounts of data while remaining fully compatible with the sequential version of bzip2.
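The block-parallel idea in this abstract can be illustrated with a minimal Python sketch. It is not the paper's pthreads implementation: the function name `parallel_bz2_compress`, the block size, and the worker count are assumptions chosen for illustration. Each block is compressed as an independent bzip2 stream and the streams are concatenated in input order, which a standard multi-stream-aware bzip2 decompressor can read back.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 900_000  # bytes per independently compressed block (assumed value)


def parallel_bz2_compress(data: bytes, workers: int = 4,
                          block_size: int = BLOCK_SIZE) -> bytes:
    """Compress fixed-size blocks concurrently and concatenate the streams."""
    # Split the input into blocks; an empty input still yields one empty stream.
    blocks = [data[i:i + block_size]
              for i in range(0, len(data), block_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so streams are joined in original order.
        streams = list(pool.map(bz2.compress, blocks))
    return b"".join(streams)


if __name__ == "__main__":
    payload = b"block-sorting compression " * 200_000
    packed = parallel_bz2_compress(payload)
    # bz2.decompress accepts concatenated streams (Python 3.3 and later).
    assert bz2.decompress(packed) == payload
```

Compressing blocks independently trades a little compression ratio for parallelism, since no block can reference data in another block.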
Recently, self-simulation algorithms have been developed to execute algorithms on a reconfigurable mesh (RM) of a size smaller than recommended in those algorithms. Optimal slowdown in self-simulation has been achieved with the compromise that the resultant algorithms fail to remain AT^2 optimal. In this paper, we introduce, for the first time, the idea of an adaptive algorithm that runs on RMs of variable sizes without compromising AT^2 optimality. We support our idea by developing adaptive algorithms for sorting items and for computing the contour of maximal elements of a set of planar points on an RM. (C) 2000 Published by Elsevier Science B.V. All rights reserved.
A wide range of planning applications are combinatorial in nature, making the design of general-purpose planning algorithms still a very challenging endeavor. To cope with this combinatorial complexity, some of the most recent work in artificial intelligence (AI) planning focuses on the use of sophisticated heuristics, domain search control knowledge, random search, and efficient abstract state space encodings such as binary decision diagrams. The additional performance needed by complex planning applications can be provided by adopting massively parallel computing systems, such as networks of clusters. This paper describes a simple, general approach for turning backtrack-search-based planners into more powerful distributed systems that run on networks of clusters. Our approach consists of distributing backtrack search points to different processes on the network. We illustrate its potential using DSHOP, a distributed version of the SHOP planner.
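The idea of handing backtrack points to separate workers can be sketched on a toy problem. The example below is an assumption-laden illustration, not DSHOP: it counts N-Queens solutions rather than running a planner, uses local threads as stand-ins for processes on a network, and the names `count_from` and `distributed_count` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def count_from(n: int, cols: tuple) -> int:
    """Sequential backtracking: count completions of a partial placement.

    cols[r] is the column of the queen already placed in row r.
    """
    row = len(cols)
    if row == n:
        return 1
    total = 0
    for c in range(n):
        # A column is safe if no earlier queen shares it or a diagonal.
        if all(c != pc and abs(c - pc) != row - pr
               for pr, pc in enumerate(cols)):
            total += count_from(n, cols + (c,))
    return total


def distributed_count(n: int, workers: int = 4) -> int:
    """Give each first-row choice (a backtrack point) to a separate worker."""
    choice_points = [(c,) for c in range(n)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda cp: count_from(n, cp), choice_points))


if __name__ == "__main__":
    print(distributed_count(8))
```

Because each subtree below a choice point is explored independently, the workers need no communication until their results are combined.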
This paper presents an XML-based data format for the parallel numerical integration package PARINT. As with many other numeric computation programs, PARINT accepts a long list of arguments for describing the user's problem, the algorithm to be used, and the parallel run characteristics. Supporting XML input allows platform-independent creation and manipulation of input specifications and simplifies the addition of new integration algorithms. We discuss the purpose of each section in the proposed XML data format and describe how new sections can be added to the XML data structure to support new computing paradigms. We also explain how data are processed efficiently and give some application examples. The format can serve more generally for various software packages.
One implementation of broadcast-based networks is the Simultaneous Optical Multiprocessor Exchange Bus (SOME-Bus). It is a low-latency, high-bandwidth, fiber-optic network that directly connects each processing node to all other nodes without contention. To better utilize the communication network and reduce the completion time of a parallel application, this paper describes the Key Message (KM) approach on SOME-Bus clusters. After presenting the KM algorithm with the SOME-Bus structure, an example is analyzed to evaluate its performance. The analysis shows improved communication performance for a parallel application compared with a system that does not use the KM approach.
Large-scale classification is an important task of machine learning, especially in the smart city field, which is a big data environment. In recent years, single-threaded optimization algorithms can no longer meet the...
This paper proposes a network routing algorithm, REI, that adapts autonomously to network traffic conditions. When a routing node has several different paths to a given destination, we can evaluate these paths in terms of their latency (delay time) given in inbound data packets. By evaluating the scores of the paths, every node works as a distributed autonomous agent for adaptive routing. Through network simulations comparing REI with conventional OSPF and enhanced variants, we show that the multiagent-based routing algorithm adapts well in congested-path avoidance and network load balancing.
ISBN: 9781479920815 (print)
In this paper, we improve the performance of server-side I/O scheduling on parallel file systems by transparently including information about the applications' access patterns. Server-side I/O scheduling is a valuable tool in multi-application scenarios, where the applications' spatial locality suffers from interference caused by concurrent accesses to the file system. We present AGIOS, an I/O scheduling library for parallel file systems. We guide the scheduler's decisions by including information about the applications' future requests. This information is obtained from traces generated by the scheduler itself, without changes to the application or file system. Our approach shows performance improvements under different workloads of 46.3% on average when compared to a scenario without an I/O scheduler, and of 25.1% when compared to a scheduler that does not use information about future accesses.
On instruction-level parallel architectures such as VLIW, performance is affected by the compiler technique. In this paper, we propose an integrated optimization technique that combines register reuse, spilling, and rematerialization. First, we develop a register allocation method that, when registers are insufficient, decides whether a register should be reused, spilled, or rematerialized by predicting the execution timing of the instructions in the program. We evaluate our method against a conventional compiler technique on blocks of programs. Second, spilling and rematerialization are also applied to software pipelining to improve parallelism in loops. The results show that applying spilling and rematerialization during scheduling improves parallelism in loop execution.
ISBN: 9781479920815 (print)
During the past decade, Peer-to-Peer Video-on-Demand (VoD) systems have proved their efficiency for large deployments. They raise new challenges such as peer resource allocation. Most literature on resource allocation tackles the problem with optimal static rules found through offline study of the system. In this paper, we use a dynamic metaheuristic called the Multiple Local-Search Algorithm for Dynamic Optimization (MLSDO) to optimize the problem at hand. The obtained results show that using dynamic resource allocation reduces the rejection rate while enhancing the entropy of the system in the face of a dynamically changing title demand.