ISBN: (print) 088986392X
We consider the problem of determining the parallel complexity of solving banded triangular linear systems by substitution on a k-dimensional torus network. We present lower bounds on the execution time for solving these systems, taking communication costs into account. Furthermore, optimal algorithms are designed.
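For orientation, the substitution step that the abstract refers to can be sketched sequentially as follows. This is only a minimal illustration, not the paper's parallel algorithm: the function name `solve_banded_lower`, the dense list-of-lists storage, and the bandwidth parameter `m` are assumptions made for the example, and the torus network and communication costs are not modeled.

```python
def solve_banded_lower(L, b, m):
    """Solve L x = b by forward substitution.

    L is assumed to be lower triangular with bandwidth m, i.e.
    L[i][j] == 0 whenever j > i or i - j > m. Dense storage is used
    here only to keep the sketch short.
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # Only the previous m unknowns can contribute to row i.
        first = max(0, i - m)
        partial = sum(L[i][j] * x[j] for j in range(first, i))
        x[i] = (b[i] - partial) / L[i][i]
    return x


if __name__ == "__main__":
    L = [[2, 0, 0],
         [1, 1, 0],
         [0, 3, 4]]
    print(solve_banded_lower(L, [2, 3, 14], m=1))
```

Each `x[i]` depends on up to `m` earlier unknowns, and this chain of dependencies is what limits how much of the work can proceed in parallel.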
High performance computing systems and cluster computers are becoming so cost-effective that even small research groups can afford them. Hence, efforts to take advantage of these widely distributed resources are becoming popular. Although recent projects provide resource management and job scheduling to support groups of computational resources across the country working together on massive problems, they have not yet fully addressed how distributed parallel programs will communicate. Therefore, we propose a new paradigm to support cluster-to-cluster (C2C) communications, which handles run-time communications between parallel programs running on distributed clusters.
ISBN: (print) 9780889866386
The correctness of applications that perform asynchronous message passing typically relies on the underlying hardware having a sufficient amount of memory (message buffers) to hold all undelivered messages; such applications may deadlock when executed on a system with an insufficient number of message buffers. Thus, determining the minimum number of buffers that an application needs to prevent deadlock is an important task when writing or porting parallel applications. Unfortunately, both this problem (called the Buffer Allocation Problem) and the simpler problem of determining whether an application may deadlock for a given number of available message buffers are intractable [1]. We present a new epoch-based polynomial-time approach for approximating the Buffer Allocation Problem. Our approach partitions application executions into epochs and intersperses barrier synchronizations between them, thus limiting the number of message buffers necessary to ensure deadlock freedom. This approach produces near-optimal solutions for many common cases and can be adapted to guide application modifications that ensure deadlock freedom when the application is ported. Lastly, we describe a space-time trade-off between the number of available message buffers and the number of barrier synchronizations, and describe how this trade-off can be used to fine-tune application performance.
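The space-time trade-off mentioned at the end of the abstract can be illustrated with a deliberately simplified toy model. It is not the authors' algorithm. The assumptions are that every message sent within an epoch may remain undelivered until the barrier that closes the epoch, so an epoch may contain at most as many sends as there are buffers, and that the sends form a single sequence. The function name `barriers_needed` is hypothetical.

```python
def barriers_needed(num_messages, buffers):
    """Toy model: number of barriers needed to split a sequence of sends
    into epochs of at most `buffers` messages each.

    Assumes each message occupies one buffer until its epoch's barrier.
    """
    if buffers < 1:
        raise ValueError("at least one message buffer is required")
    if num_messages == 0:
        return 0
    epochs = -(-num_messages // buffers)  # ceiling division
    return epochs - 1  # one barrier between consecutive epochs


if __name__ == "__main__":
    for buffers in (1, 3, 10):
        print(buffers, barriers_needed(10, buffers))
```

Under these assumptions, fewer buffers force more barriers, which is the direction of the trade-off the abstract describes. The paper's actual analysis is more general.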
ISBN: (print) 9780889868205
For an initial study in divisible load scheduling, an optimal computing power allocation problem in a distributed parallel computing grid involving two sources and a sink is considered. The objective is to optimally allocate the computing power of the sink in the grid in such a way that the total parallel computing finish time of the entire load is equalized to the sequential computing finish time while utilizing the full computing power. A numerical method to calculate the optimal adaptive computing power via a deterministic analysis is presented under several computing constraints. Performance of the computing power adaptation is modeled and evaluated. For performance evaluation, we define the average computing finish time.
ISBN: (print) 9780889866386
In this paper, we present the design of SAgent, a general-purpose mobile agent security framework. SAgent is designed for comprehensive protection of mobile agent computations and data in potentially hostile environments and works with the JADE (Java Agent DEvelopment) platform [1], a FIPA-compliant multi-agent environment. Applying the software engineering principles of reusability and abstraction, SAgent allows agent protection protocols and applications to be developed independently of each other. To accomplish this, we present a clean conceptual framework that encapsulates, in several general class interfaces, the common security functionality required by secure agent applications. Since SAgent is designed to generically protect the computations of mobile agent applications, we provide implementations of two secure multi-agent protocols that protect the confidentiality of agent data as well as implementations of four methods that protect the integrity of mobile agent data. Experimental results showing the feasibility of these methods are available in separate publications [2], [3]. The goal of SAgent is to provide a framework where proposed theoretical techniques can be used and experimentally evaluated. SAgent allows a new security provider to implement and experiment with new techniques for protecting mobile agents in a well-defined manner and is generic enough to support both software-based and hardware-based protections.
ISBN: (print) 088986392X
We introduce a new view of distributed computation, called the NavP view, under which a distributed program is composed of multiple sequential self-migrating threads called DSCs. In contrast with those in the conventional SPMD style, programs developed in the NavP view exhibit the nice properties of algorithmic integrity and parallel program composition orthogonality, which make them clean and easy to develop and maintain. The NavP programs are also scalable. We use example code and performance data to demonstrate the advantages of using the NavP view for general-purpose distributed parallel programming.
ISBN: (print) 088986392X
The HBSP (Heterogeneous Bulk-Synchronous Parallel) model is an asynchronous parallel computing model whose communication features are abstracted by a set of parameters. In this paper, we present parallel algorithms for the n × n 2D-data partition on the HBSP model, and we also present a method for converting BSP algorithms that process 2D data into HBSP algorithms. Using our conversion method, some BSP algorithms, such as matrix multiplication or all-pairs shortest paths, can be converted into efficient HBSP algorithms.
We study the behavior of a new load balancing scheme applied to adaptive task partitioning for multivariate integration. Parallel performance and scalability are analyzed. Performance results are given for test families of functions with significant irregular behavior. The effects of certain integrand characteristics can be accounted for in the strategy. The scheme is incorporated in the PARINT package for parallel multivariate integration.
In recent times, power consumption and heat dissipation have become crucial in the design of parallel computer architectures. Multi-level cache memory organization in multiprocessor/multicore systems increases total power consumption, as cache is very power hungry. Studies suggest that there are realistic opportunities to increase the performance/power ratio of parallel architectures by rearranging and multi-using their cache memory organization. In this paper, we propose a novel approach to reduce the total power consumption and mean memory latency of multicore systems by introducing a versatile victim cache (VVC). In addition to functioning as a regular victim cache, the proposed VVC holds block address and miss information (BAMI) entries, supports stream buffering, and entirely eliminates the need for cache locking. We simulate a quad-core system that has private level-1 caches (CL1), a shared level-2 cache (CL2), a VVC between the CL1s and CL2, and the main memory. We run the simulation programs using popular multimedia workloads including MPEG-4 and H.264/AVC. Experimental results show that the quad-core system with the proposed VVC decreases the mean memory latency and total power consumption by up to 38% and 31%, respectively, when compared with a CL2 cache locking system without a VVC.
ISBN: (print) 9780889867048
The severe energy constraints of wireless sensor networks (WSNs) require energy-efficient communication protocols in order to fulfill the objectives of the application. Cross-layer design is a technique that can potentially improve the overall performance of WSNs by jointly optimizing and exploiting the interactions between the various layers of the network protocol stack. In this paper, we propose a cross-layer framework design for the Embedded Middleware in Mobility Applications (EMMA) project. This framework design, based on an optimization agent, provides efficient data exchange between the various protocol layers via a state repository to improve the performance of WSN applications in terms of memory consumption and processing overhead.