Contemporary computing systems, especially large-scale systems such as Grids, promise ultra-fast, ubiquitous utility computing, always available at the flip of a switch. A major unresolved issue is the organization and efficient usage of such infrastructure in a commercial context where several entities compete for shared resources. This has long been resolved for conventional utility resources such as gas and electricity through commoditization, a variety of market designs, customization, and decision support for the resulting portfolios of assets and commitments. The paper reviews the state of Grid commercialization and compares it to the commercialization of conventional resources. We draw specific lessons for commercialized Grids and detail them as architecture requirements at each level of the architecture stack. We provide an example to illustrate the benefits of commercialized resources in terms of the financial clarity they bring to decisions for different user groups, namely application users and IT managers.
The limited amount of instruction-level parallelism inherent in applications is a limiting factor for improving the performance of most conventional microprocessors. A promising solution to this problem is to exploit coarser granularities of parallelism. In this paper, we propose exploiting loop-level parallelism in a multithreaded fashion. We use the Shift architecture as a baseline architecture, with improved compiler support and register file. The compiler converts iterations of a loop into threads, to be executed by multiple processing elements. The hardware provides a selective register shifting mechanism to allow the execution of loops containing loop-carried data dependences, which are very difficult to execute on conventional architectures. We simulate and discuss the parameters of major importance for the implementation of this architectural approach. Our initial results show that, on two simple numerical benchmarks, a considerable amount of iteration overlapping can potentially be achieved by an implementation of the Shift architecture, in comparison with a multiprocessor machine.
In systems consisting of multiple clusters of processors such as our Distributed ASCI Supercomputer (DAS), jobs may request co-allocation, i.e., the simultaneous allocation of processors in different clusters. We simulate such systems ignoring communication among the tasks of jobs, and determine the response times for different types and sizes of job requests, and for different numbers and sizes of clusters. In many cases we also compute or approximate the maximum utilization. We find that the numbers and sizes of the clusters and of the job components have a strong impact on performance, and that in many cases co-allocation is a viable choice.
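To make the notion of co-allocation concrete, the following is a minimal sketch of a feasibility check for a job whose components must be placed simultaneously on different clusters. It is not the simulator described in the abstract. The function name `place_components`, the rule that each component goes to a distinct cluster, and the largest-first matching are all assumptions made for illustration.

```python
from typing import List, Optional


def place_components(idle: List[int], components: List[int]) -> Optional[List[int]]:
    """Return the cluster index chosen for each job component, or None if the job does not fit.

    Assumption (not taken from the abstract): every component must be placed
    on a distinct cluster, and all components must start at the same time.
    """
    if len(components) > len(idle):
        return None
    # Match the largest components against the clusters with the most idle processors.
    comp_order = sorted(range(len(components)), key=lambda i: components[i], reverse=True)
    cluster_order = sorted(range(len(idle)), key=lambda c: idle[c], reverse=True)
    placement = [0] * len(components)
    for comp, cluster in zip(comp_order, cluster_order):
        if components[comp] > idle[cluster]:
            return None  # One component cannot be served, so the whole job must wait.
        placement[comp] = cluster
    return placement
```

With idle processor counts `[8, 4, 16]`, a job with components `[4, 10]` fits (the 10-processor component goes to the third cluster), whereas a job with components `[10, 10]` does not, even though 28 processors are idle in total. This is one way in which the sizes of clusters and job components can affect performance.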
Emerging distributed applications require end-to-end support for various quality-of-service (QoS) aspects, including bandwidth, latency, jitter, and dependability. The Common Object Request Broker Architecture (CORBA) is an open distributed object computing infrastructure, standardized by the Object Management Group (OMG). Its goal is to minimize the effort required to develop high-quality systems by composing applications using reusable software components. However, CORBA is not sufficient to provide QoS guarantees in a distributed environment, as it lacks capabilities to specify and enforce QoS constraints. To enforce these constraints, OMG introduced the Real-Time CORBA (RTCORBA) specification, which can guarantee end-to-end predictability in a complex real-time system. Currently, the RTCORBA specification addresses location transparency, but does not address the resource monitoring and management issues critical for meeting QoS constraints in a multi-server distributed processing environment. In this paper, we identify the functionalities that are required to ensure predictability in such an environment and propose a resource management framework for RTCORBA. We present its design, a prototype implementation using ORBit ORB, and an example to illustrate its use.
To improve performance of scientific applications in parallel and distributed environments, dynamic scheduling algorithms for parallel loops have been proposed. Such algorithms address performance degradations due to ...
The adequate occupation of computing resources can decisively influence the global performance of a system. Therefore, to achieve high performance, it is necessary to know all the computing resources involved and their respective occupation levels at a given moment. With the objective of improving system performance, the paper presents the OpenTella model, which updates information about the occupation of resources and analyzes this occupation so that processes can be migrated among computers of the same cluster. To increase scalability and decrease the number of messages exchanged among the computers, this peer-to-peer protocol defines sub-nets, which are clusters that make up a more comprehensive cluster. Thus, groups are defined to interchange information and update the occupation of resources, minimizing communication while computing a load balance that meets the system's needs and results in the migration of processes.
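The kind of decision such occupation information enables can be sketched as follows. This is an illustrative example only and not the OpenTella protocol itself. The function `pick_migration`, the representation of occupation as a fraction between 0 and 1, and the `threshold` parameter are hypothetical.

```python
from typing import Dict, Optional, Tuple


def pick_migration(occupation: Dict[str, float], threshold: float = 0.2) -> Optional[Tuple[str, str]]:
    """Pick (source, target) nodes for one process migration inside a sub-net.

    `occupation` maps a node name to its reported load, assumed here to be a
    fraction in [0, 1]. Returns None when the sub-net is balanced enough.
    """
    if len(occupation) < 2:
        return None
    source = max(occupation, key=occupation.get)  # Most occupied node.
    target = min(occupation, key=occupation.get)  # Least occupied node.
    # Migrate only when the imbalance exceeds the (hypothetical) threshold.
    if occupation[source] - occupation[target] <= threshold:
        return None
    return source, target
```

Because the decision uses only the occupation values reported within one sub-net, no messages to the rest of the cluster are needed to make it.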
The benefits of distributed computation present complex security considerations beyond those associated with the traditional computing paradigm. This paper describes a bandwidth-efficient approach to authenticating distributed Java code. Our system utilizes steganographic techniques to embed a cryptographic checksum as a tamper detection mark into Java class files. The properties of this mark make our system desirable in applications where low bandwidth utilization is a requirement (e.g., wireless networks and low-power devices). We implemented our system in Java and evaluated its performance through an empirical study. The analysis indicates that our system detects any degree of alteration to a marked Java class file and can do so within a reasonable amount of time.
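The tamper-detection idea can be illustrated with a short sketch. It does not reproduce the steganographic embedding described in the abstract: here the mark is simply appended to the payload, which increases its size. The choice of HMAC-SHA256 as the cryptographic checksum, the shared `key`, and the function names `mark` and `verify` are assumptions.

```python
import hashlib
import hmac

MARK_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def mark(payload: bytes, key: bytes) -> bytes:
    """Append a keyed checksum to the payload (a stand-in for embedding it)."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def verify(marked: bytes, key: bytes) -> bool:
    """Recompute the checksum over the payload and compare it with the stored mark."""
    if len(marked) < MARK_LEN:
        return False
    payload, tag = marked[:-MARK_LEN], marked[-MARK_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(tag, expected)
```

Flipping any bit of the payload or of the mark makes `verify` return `False`, except with negligible probability.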
Non-deterministic choice supports efficient parallel speculation, but unrestricted non-determinism destroys the referential transparency of purely declarative languages by removing unfoldability, and it bears the danger of wasting resources on unnecessary computations. While numerous choice mechanisms have been proposed that preserve unfoldability, and some concurrent implementations exist, we believe that no compiled parallel implementation has previously been constructed. This paper presents the design, semantics, implementation and use of a family of bottom-avoiding choice operators for Glasgow parallel Haskell. The subtle semantic properties of our choice operations are described, including a careful classification using an existing framework, together with a discussion of operational semantics issues and the pragmatics of distributed memory implementation. The expressiveness of our choice operators is demonstrated by constructing a branch and bound search, a merge and a speculative conditional. Their effectiveness is demonstrated by comparing the parallel performance of the speculative search with naive and 'perfect' implementations. Their efficiency is assessed by measuring runtime overhead and heap consumption.
The increasing gap between processor and memory performance has led to new architectural models for memory-intensive applications. In this paper, we use a set of memory-intensive benchmarks to evaluate a mixed logic a...