ISBN: 0818675519 (print)
Achieving 100 TeraOps performance within a ten-year horizon will require massively parallel architectures that exploit both commodity software and hardware technology for cost efficiency. Increasing clock rates and system diameter in clock periods will make efficient management of communication and coordination increasingly critical. Configurable logic presents a unique opportunity to customize the bindings, mechanisms, and policies which comprise the interaction of processing, memory, I/O, and communication resources. This programming flexibility, or customizability, can provide the key to achieving robust high performance. The MultiprocessOr with Reconfigurable Parallel Hardware (MORPH) uses reconfigurable logic blocks integrated with the system core to control policies, interactions, and interconnections. This integrated configurability can improve the performance of the local memory hierarchy, increase the efficiency of interprocessor coordination, or better utilize the network bisection of the machine. MORPH provides a framework for exploring such integrated application-specific customizability. Rather than complicate the situation, MORPH's configurability supports component software and interoperability frameworks, allowing direct support for application-specified patterns, objects, and structures. This paper reports the motivation and initial design of the MORPH system.
ISBN: 0818675829 (print)
In this paper, we address the problem of supporting High Performance Distributed Computing (HPDC) applications running over ATM networks. For this purpose, we consider a logically separate subnetwork for these applications. After presenting an architectural reference model for the HPDC subnetwork and distinguishing which functions should be installed over the ATM network in order to satisfy the needs of HPDC applications, we propose two mechanisms that aim at optimizing communications by taking advantage of both the special properties of HPDC traffic and the cell-based nature of ATM. The performance of these mechanisms is evaluated and compared with that achieved by the SSCOP protocol. The results show that when the ATM network experiences high load and the HPDC applications make intensive use of arrays, cell-based mechanisms become more robust than standard SSCOP and provide low latency and efficient cell loss recovery. Since both situations are very likely to occur in HPDC environments, we conclude that the introduction of cell-based retransmission mechanisms does help enhance the performance of HPDC systems over ATM networks.
ISBN: 9780897918008 (print)
This paper studies the problem of making distributed decisions with the goal of maximizing a given utility function. The focus is entirely on the design and performance of the voting strategy, and the paper abstracts completely from implementation details, such as how ballots are collected from the voters and how the results of the election are communicated to the voters. The main interest is in giving a tight estimate of the cumulative profit achievable in a worst-case setting.
ISBN: 0818675829 (print)
Fast networks have made it possible to coordinate distributed heterogeneous CPU, memory, and storage resources to provide a powerful platform for executing high-performance applications. However, the performance of these applications on such systems is highly dependent on the allocation and efficient coordination of application tasks. A key component for a performance-efficient allocation strategy is a predictive model which provides a realistic estimate of application performance under varying resource loads. In this paper, we present a model for predicting the effects of contention on application behavior in heterogeneous systems. In particular, our model calculates the slowdown imposed on communication and computation for non-dedicated two-machine heterogeneous platforms. We describe the model for the Sun/CM2 and Sun/Paragon coupled heterogeneous systems. We present experiments on production systems with emulated contention which show the predicted communication and computation costs to be within 15% on average of actual costs.
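As a rough illustration of what a contention-aware prediction and its accuracy check might look like, the sketch below assumes a simple fair time-sharing model in which a task competing with n other CPU-bound jobs takes (n + 1) times its dedicated run time. This is not the model from the paper, whose details the abstract does not give; the function names and the slowdown rule are assumptions made for illustration only.

```python
from typing import Sequence


def predicted_time(dedicated_time: float, competing_jobs: int) -> float:
    """Predict run time under contention.

    Assumption (not from the paper): the CPU is shared fairly, so the task
    receives a 1 / (competing_jobs + 1) share of the processor.
    """
    if competing_jobs < 0:
        raise ValueError("competing_jobs must be non-negative")
    return dedicated_time * (competing_jobs + 1)


def mean_relative_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average of |predicted - actual| / actual over paired measurements."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)
```

A figure such as "within 15% on average" would correspond to `mean_relative_error(...)` returning a value of at most 0.15 over the measured runs.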
ISBN: 9780897918008 (print)
Previously, algebraic properties of operations of abstract data types were used to generalize certain results and to determine exactly when a single operation of a concurrent abstract data type could be made 'fast' in a linearizable implementation; an exact determination would have identified a set of algebraic properties both necessary and sufficient for allowing an operation to be fast. Unfortunately, the determination was not completed because there are operations which are not self-oblivious but are immediately self-commuting. Determining when a single operation can be optimized is an interesting problem because often a specific operation is known to be invoked most frequently in a given application. Optimizing that operation could lead to noticeable performance gains.
ISBN: 0818675829 (print)
The Legion project at the University of Virginia is an architecture for designing and building system services that provide the illusion of a single virtual machine to users, a virtual machine that provides secure shared object and shared name spaces, application-adjustable fault tolerance, improved response time, and greater throughput. Legion targets wide-area assemblies of workstations, supercomputers, and parallel supercomputers. Legion tackles problems not solved by existing workstation-based parallel processing tools; the system will enable fault tolerance, wide-area parallel processing, interoperability, heterogeneity, a single global name space, protection, security, efficient scheduling, and comprehensive resource management. This paper describes the core Legion object model, which specifies the composition and functionality of Legion's core objects: those objects that cooperate to create, locate, manage, and remove objects in the Legion system. The object model facilitates a flexible, extensible implementation, provides a single global name space, grants site autonomy to participating organizations, and scales to millions of sites and trillions of objects.
Recent advances in computing, such as high-speed networks and data compression, make extensible distributed multimedia applications a challenging application domain for distributed systems. Applications such as VoD (Video on Demand) or real-time conferencing are characterized by QoS (quality of service) requirements, which depend on the quality of the video and sound transmitted to the client and on respect for the time constraints associated with video and audio data. Much work has been done to provide system support aimed at meeting these requirements. However, existing proposals do not account for the consequences of failures on the guaranteed QoS. To deal with this issue, we propose a resource reservation model that integrates the availability requirements of multimedia services in addition to the QoS constraints introduced above. Our paper details the resulting model together with its integration in a distributed system. In particular, we show how the model implementation can be customized in the case of a VoD server.
The design philosophy and implementation of the BALANCE system are described in this paper. BALANCE is a flexible, network-independent and computer-architecture-independent load balancing system which is designed to support a wide range of software, including parallel and distributed applications as well as schedulers. The generic server and server system call structures are used as bases to enhance flexibility and to build complex services. BALANCE is not tied to a particular scheduling algorithm; rather, users are allowed to build their own schedulers. To demonstrate the flexibility and power of BALANCE, a set of system services and scheduling algorithms has been implemented and evaluated. A new delay scheduling algorithm, which postpones the execution of jobs in high-load situations, is proposed. It is shown that this algorithm effectively improves system throughput and yet bounds the response times for console commands.
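The idea of postponing jobs under high load can be sketched as follows. This is a minimal illustration only: the class name `DelayScheduler`, the single numeric load threshold, and the rule that each started job adds one unit of load are all assumptions, not details of BALANCE or of the algorithm evaluated in the paper.

```python
from collections import deque
from typing import Deque, List, Optional


class DelayScheduler:
    """Hypothetical delay scheduler: hold jobs back while load is high."""

    def __init__(self, load_threshold: float) -> None:
        self.load_threshold = load_threshold
        self.delayed: Deque[str] = deque()  # postponed jobs, in FIFO order

    def submit(self, job: str, current_load: float) -> Optional[str]:
        """Return the job if it may start now, or None if it was postponed."""
        if current_load >= self.load_threshold:
            self.delayed.append(job)
            return None
        return job

    def release(self, current_load: float) -> List[str]:
        """Start postponed jobs while the load stays below the threshold.

        Assumption: every started job raises the load by one unit.
        """
        started: List[str] = []
        while self.delayed and current_load < self.load_threshold:
            started.append(self.delayed.popleft())
            current_load += 1
        return started
```

In this sketch a caller would invoke `release` whenever the measured load drops, so postponed jobs start in submission order without pushing the load back over the threshold.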
This paper presents a benchmark for dependable systems. The benchmark consists of two metrics, number of catastrophic incidents and performance degradation, which are obtained by a tool that (1) generates synthetic workloads that produce a high level of CPU, memory, and I/O activity and (2) injects CPU, memory, and I/O faults according to an injection strategy. The benchmark has been installed on two TMR-based prototype machines: TMR Prototype A and TMR Prototype B. An implementation for a third prototype, which is based on a duplex architecture, is in progress. The results demonstrate the utility of the benchmark in comparing the system-level fault tolerance of these machines and in providing insight into their design. In particular, the benchmark shows that Prototype B suffers fewer catastrophic incidents than Prototype A under the same workload conditions and fault injection method. However, Prototype B also suffers more performance degradation in the presence of faults, which might be an important concern for time-critical applications.
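One way the two metrics could be computed from a set of fault-injection runs is sketched below. The abstract does not define either metric precisely, so the definitions here are assumptions: a run is flagged catastrophic by the caller, and performance degradation is taken as the relative increase in mean elapsed time of the completed runs over a fault-free baseline. The `RunResult` type and `summarize` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class RunResult:
    elapsed_seconds: float  # wall-clock time of the workload under fault injection
    catastrophic: bool      # True if the run ended in a catastrophic incident


def summarize(baseline_seconds: float,
              runs: Sequence[RunResult]) -> Tuple[int, Optional[float]]:
    """Return (catastrophic incident count, relative performance degradation).

    Degradation is None when no run completed, since it cannot be measured.
    """
    incidents = sum(1 for r in runs if r.catastrophic)
    completed = [r.elapsed_seconds for r in runs if not r.catastrophic]
    if not completed:
        return incidents, None
    mean_elapsed = sum(completed) / len(completed)
    degradation = (mean_elapsed - baseline_seconds) / baseline_seconds
    return incidents, degradation
```

Keeping the two numbers separate mirrors the trade-off reported above, where one machine can do better on incidents while doing worse on degradation.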
Distributed services are often provided by process groups for purposes of reliability, availability, and performance. It is often important for the members of such a group to have a consistent view of the group's membership. For this reason, membership services are an important part of many distributed software systems. Despite their importance, the specification and implementation of membership services in completely asynchronous systems have challenged researchers. Recent papers have demonstrated that earlier specifications are either unsolvable or admit trivial solutions. Informally, membership services require a kind of agreement among processes, and it has been shown that it is impossible to solve many consensus-like problems in completely asynchronous systems. If the specification of a membership service is nearly as strong as that of consensus, the specification will be unsolvable. If it is much weaker, its solutions may be useless. This paper provides an alternative specification of group membership and exhibits an algorithm that satisfies it. The specification is solvable in spite of earlier impossibility results because it permits executions in which all processes are evicted from the process group yet none ever learns that the group has become empty. This represents a weakening of earlier specifications, which required that, at all times, at least one process be aware of a group's membership. However, the new specification cannot be trivially satisfied because it prohibits a potential solution from arbitrarily removing a process for no reason. This specification thus represents an important step towards a better understanding of membership services in completely asynchronous systems.