An optimal mobile agent moving policy for a mobile agent network where a single mobile agent moves and collects management information, is studied. the network has one management station that maintains many network el...
详细信息
ISBN:
(纸本)9780889867048
An optimal mobile agent moving policy for a mobile agent network where a single mobile agent moves and collects management information, is studied. the network has one management station that maintains many network elements each of which generates the management information (e.g., the updated state of the element). We formulate this network model as a semi-Markov decision process (SMDP) and compute an optimal policy with a policy iteration algorithm. It is shown that an optimal mobile agent moving policy achieved by the policy iteration algorithm has a cost reduction effect for the network in comparison to a heuristic policy.
Graph Partitioning has several important applications in Computer Science, including VLSI circuit layout, image processing, and distributing workloads for parallel computation. It is known to be NP-hard. In this paper...
详细信息
ISBN:
(纸本)9780889867048
Graph Partitioning has several important applications in Computer Science, including VLSI circuit layout, image processing, and distributing workloads for parallel computation. It is known to be NP-hard. In this paper we present in detail the K-Graph Partitioning Problem and the Dynamic distributed Double Guided Genetic Algorithm. this algorithm consists of agents dynamically created and cooperated in order to solve the problem. Each agent performs its own genetic algorithm, guided by the min-conflict-heuristic. the paper also presents the results of application the algorithm for the $K$-Graph Partitioning Problem using a multilevel paradigm.
this paper presents comprehensive evaluations of parallel double Divide and Conquer for singular value decomposition on a super computer, HPC2500. For bidiagonal SVD, double Divide and Conquer was proposed. It first c...
详细信息
ISBN:
(纸本)9780889867048
this paper presents comprehensive evaluations of parallel double Divide and Conquer for singular value decomposition on a super computer, HPC2500. For bidiagonal SVD, double Divide and Conquer was proposed. It first computes singular values by a compact version of Divide and Conquer. the corresponding singular vectors are then computed by twisted factorization. the speed and accuracy of double Divide and Conquer are as good or even better than standard algorithms such as QR and the original Divide and Conquer. Moreover, it shows high scalability even on a PC cluster, distributed memory architecture. parallel algorithm of dDC and numerical results in some architectural options, matrix sizes and types on HPC2500, SMP cluster is shown.
In this paper detailed analysis of discrete state observers from the point of view of potential time expenses required for the observer computation is presented. the aim of this article is to present, what is the infl...
详细信息
ISBN:
(纸本)9780889866331
In this paper detailed analysis of discrete state observers from the point of view of potential time expenses required for the observer computation is presented. the aim of this article is to present, what is the influence of choice of observer parameters on computation time consumption. the attention is paid most of all on the analysis of the possibility of application of the control systems in a distributed version in the context of discrete state reconstruction. All of numerical tests are conducted in the LAM/MPI environment.
Many prediction techniques based on historical data have been proposed to reduce over-estimations of job runtimes provided by users. they were shown to improve the accuracy of runtime estimates and scheduling performa...
详细信息
ISBN:
(纸本)9780889867048
Many prediction techniques based on historical data have been proposed to reduce over-estimations of job runtimes provided by users. they were shown to improve the accuracy of runtime estimates and scheduling performance of backfill policies, according to particular error metrics and average performance measures. However, using a more complete set of performancemeasures and a new error metric, we show potential performance problems of using previous prediction techniques for job scheduling. Furthermore, we show simply adding half of the requested runtime to each initial prediction greatly reduces the problems.
Large-scale graph problems are becoming increasingly important in science and engineering. the irregular, sparse instances are especially challenging to solve on cache-based architectures as they are known to incur er...
详细信息
ISBN:
(纸本)9780889867048
Large-scale graph problems are becoming increasingly important in science and engineering. the irregular, sparse instances are especially challenging to solve on cache-based architectures as they are known to incur erratic memory access patterns. Yet many of the algorithms also exhibit some degree of regularity with memory accesses. It is important to characterize the locality behavior in order to bridge the gap between algorithm and architecture. In our study we quantify the locality of several fundamental graph algorithms, both sequential and parallel, and correlate our observations withthe algorithmic design. Our study of locality behavior brings insight into the impact of different cache architectures on the performance of both sequential and parallel graph algorithms.
Data clustering is a common technique for data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics. Due to the continuous increase of...
详细信息
ISBN:
(纸本)9780889867048
Data clustering is a common technique for data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics. Due to the continuous increase of datasets size and the intensive computation of clustering algorithms when used for analyzing large datasets, developing of efficient clustering algorithms is needed for processing time reduction. this paper describes the design and implementation of a recently developed clustering algorithm RACAL [1], which is a RAdius based Clustering ALgorithm. the proposed parallel algorithm (PRACAL) has the ability to cluster large datasets of high dimensions in a reasonable time, which leads to a higher performance computing.
We are developing a task parallel script language named MegaScript for mega-scale parallel processing. MegaScript regards sequential/parallel programs as tasks, and controls them for massively parallel execution. Alth...
详细信息
ISBN:
(纸本)9780889867048
We are developing a task parallel script language named MegaScript for mega-scale parallel processing. MegaScript regards sequential/parallel programs as tasks, and controls them for massively parallel execution. Although MegaScript programs require optimizations and extensions specific to the application and the computing environment, modifying the runtime system or task programs greatly reduces portability and reusability. To satisfy these conflicting requirements, we propose a user-level dynamic extension scheme named Adapter. In this scheme, the user defines a customization code and hooks to it a specific event. the runtime system calls back the code for the event locally, enabling it to extend or optimize system behavior without modifying the runtime or task programs. the results of our evaluation of the scheme show that the overhead and programming cost are both small enough for practical use.
Speculative Locking protocol (SL) is a concurrency control protocol that allows for parallel execution of conflicting transactions through a method of multilevel lending and versioning. the SL protocol shows performan...
详细信息
ISBN:
(纸本)9780889867048
Speculative Locking protocol (SL) is a concurrency control protocol that allows for parallel execution of conflicting transactions through a method of multilevel lending and versioning. the SL protocol shows performance improvements over the standard two-phase locking (2PL) protocol, but relies on several assumptions that would make it unsuitable in real-world scenarios. In this paper, we have proposed an adaptive speculative locking (ASL) protocol that improves performance of real-time distributed database systems by augmenting the SL protocol with four features: distributed real-time database system support;simultaneous multi-threading or page execution;control of transaction execution through transaction queue management;and restricting system memory through the use of virtual memory. the simulation results demonstrate the superiority of the ASL protocol over the SL protocols through the reduction of data contention caused by finite memory and the overall increase in transaction throughput.
Currently, clusters of shared memory symmetric multiprocessors (SMPs) are one of the most common parallelcomputingsystems, for which some existing environments have between 8 to 32 processors per node. Examples of s...
详细信息
ISBN:
(纸本)9780889867048
Currently, clusters of shared memory symmetric multiprocessors (SMPs) are one of the most common parallelcomputingsystems, for which some existing environments have between 8 to 32 processors per node. Examples of such environments include some supercomputers: DataStar p655 (P655 and P655m) and P690 at the San Diego Supercomputing Center, and Seaborg and Bassi at the DOE National Energy Research Scientific computing Center. In this paper, we quantify the performance gap resulting from using different number of processors per node for application execution (for which we use the term processor partitioning), and conduct detailed performance experiments to identify the major application characteristics that affect processor partitioning. We use the STREAM memory benchmarks and Intel's MPI benchmarks to explore the performance impact of different application characteristics. the results are then utilized to explain the performance results of processor partitioning using three NAS parallel Application benchmarks. the experimental results indicate that processor partitioning can have a significant impact on performance of a parallel scientific application as determined by its communication and memory requirements.
暂无评论