ISBN: 9780889867840 (print)
Recent investigations into the resilience of large-scale high-performance computing (HPC) systems have shown a continuing trend of decreasing reliability and availability. Newly installed systems have a lower mean time to failure (MTTF) and a higher mean time to recover (MTTR) than their predecessors. Modular redundancy is used in many mission-critical systems today to provide resilience, for example in aerospace and command & control systems. The primary argument against modular redundancy for resilience in HPC has always been that the capability of an HPC system, and the respective return on investment, would be significantly reduced. We argue that modular redundancy can significantly increase compute node availability, as it removes the impact of scale from single compute node MTTR. We further argue that single compute nodes can be much less reliable, and therefore less expensive, and still be highly available, if their MTTR/MTTF ratio is maintained.
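The availability argument above can be illustrated with the standard steady-state formula A = MTTF / (MTTF + MTTR), which depends only on the MTTR/MTTF ratio. The following is a minimal sketch, not the authors' model: the function names are hypothetical, and the redundancy formula assumes that replicas fail and recover independently and that one working replica is enough.

```python
def availability(mttf: float, mttr: float) -> float:
    """Steady-state availability of one node: MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)


def redundant_availability(mttf: float, mttr: float, replicas: int) -> float:
    """Availability of `replicas` nodes in parallel.

    Assumption (not from the abstract): failures are independent and the
    group is available as long as at least one replica is up.
    """
    unavailable = 1.0 - availability(mttf, mttr)
    return 1.0 - unavailable ** replicas


# A node that fails ten times as often but is repaired ten times as fast
# keeps the same MTTR/MTTF ratio and therefore the same availability.
reliable = availability(mttf=1000.0, mttr=10.0)
cheap = availability(mttf=100.0, mttr=1.0)

# Adding a second replica of the cheap node raises availability further.
cheap_pair = redundant_availability(mttf=100.0, mttr=1.0, replicas=2)
```

Under these assumptions, two low-cost nodes with the same MTTR/MTTF ratio are more available as a pair than a single node of either kind.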
ISBN: 9780889869790 (print)
A distributed data store can satisfy at most two of three properties: (strict) consistency, availability, and partition tolerance. A distributed data store that satisfies availability and partition tolerance can still provide weak consistency, in particular causal consistency, which is the strongest consistency model that can coexist with the other two properties. Moreover, if the network between nodes is free of faults and has very low latency, the data store can satisfy a consistency model stronger than causal consistency. Sequential consistency is one such model. To satisfy sequential consistency, a distributed data store must apply data changes in the same order on all nodes. In this paper, we propose a distributed data store model containing special nodes called "casting nodes", together with algorithms that decide the order of operations. Thanks to the casting nodes, our model satisfies sequential consistency when all network links are connected, and causal consistency when any link is disconnected.
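The requirement that every node apply data changes in the same order can be sketched with a single ordering node. This is only an illustration of the general idea, not the casting-node algorithm of the paper: the `Sequencer` and `Replica` classes are hypothetical, and the sketch assumes a fault-free network in which every stamped operation eventually reaches every replica.

```python
class Sequencer:
    """Hypothetical ordering node: stamps each write with a global sequence number."""

    def __init__(self):
        self._next = 0

    def stamp(self, op):
        seq = self._next
        self._next += 1
        return seq, op


class Replica:
    """Applies writes strictly in sequence order, buffering early arrivals."""

    def __init__(self):
        self.store = {}
        self._pending = {}
        self._expected = 0

    def receive(self, seq, op):
        self._pending[seq] = op
        # Drain every buffered write that is next in the agreed order.
        while self._expected in self._pending:
            key, value = self._pending.pop(self._expected)
            self.store[key] = value
            self._expected += 1


sequencer = Sequencer()
messages = [sequencer.stamp(("x", 1)), sequencer.stamp(("x", 2)), sequencer.stamp(("y", 3))]

first, second = Replica(), Replica()
for seq, op in messages:            # delivered in order
    first.receive(seq, op)
for seq, op in reversed(messages):  # delivered in reverse order
    second.receive(seq, op)
```

Both replicas end in the same state even though the messages arrived in different orders, because each applies the writes in the order fixed by the sequencer.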
ISBN: 9780889867840 (print)
With the richness of present-day hardware architectures, research effort has gone into tightening the synergy between hardware and software. A large focus has been put on the creation of software tools to facilitate hardware design. Moreover, enormous efforts have been invested in developing high-level methodologies, formal techniques, parallelization procedures, and synthesis tools that target state-of-the-art hardware architectures, including Field-Programmable Gate Arrays (FPGAs). In this paper, we explore the effectiveness of a formal methodology in the design of parallel versions of the current Advanced Encryption Standard (AES), namely the Rijndael cryptographic algorithm. The suggested methodology adopts a functional programming notation for specifying algorithms and for reasoning about them. The parallel behavior of the specification is then derived and mapped onto hardware. Several parallel AES implementations are developed with different performance characteristics. The refined designs are tested on Celoxica's RC-1000 reconfigurable computer with its 2-million-gate Virtex-E FPGA. Performance analysis and evaluation of the proposed implementations are included.
ISBN: 9780889868649 (print)
The idea behind Cloud computing is to deliver Infrastructure-, Platform-, and Software as a Service (IaaS, PaaS, and SaaS) on a simple pay-per-use basis. In this paper, we introduce our work, OSGi Service Platform as a Service (OSPaaS), a PaaS model for running an OSGi service platform in the cloud for e-Learning and teaching purposes. OSPaaS leverages OpenNebula, a virtual infrastructure manager, to dynamically launch virtual machines (VMs) on idle resources or dedicated servers. In addition, OSPaaS uses Shibboleth as a Single Sign-On mechanism for seamless authentication and authorization. To assess the suitability of OSGi for cloud computing, this paper investigates and analyzes three OSGi frameworks, i.e. Knopflerfish, Equinox and Apache Felix. Subsequently, an OSPaaS architecture is presented and described. Finally, this paper shows a use case scenario and advantages of OSPaaS for e-Learning & teaching purposes.
ISBN: 9780889867840 (print)
Permutations are among the communication patterns frequently required in massively parallel computers, especially those of the SIMD type. A permutation is said to be "admissible" to a given interconnection network if it does not cause blocking in that network under a chosen routing algorithm. Determining the admissibility of a given permutation to various static interconnection topologies is a fundamental problem. Based on the notion of congruence from number theory, this paper presents a simple method that solves the admissibility problem for regular permutations on uniaxial 2D and 3D tori under deterministic dimension-order routing, which is commonly used in practice. Here "uniaxial" means that in every routing step all data items participating in a permutation can move along the same axis only. It is assumed that all nodes of the system work synchronously, which is also characteristic of SIMD machines. The efficiency of the method is illustrated by examples that check the admissibility of some permutations frequently used in parallel programming, which belong to either the Omega or the BPC (bit-permute-complement) class.
ISBN: 9780889866379 (print)
In this paper we propose a new load balancing algorithm for the grid computing service. The proposed load balancing scheme is based on the CPU speed of the workers in the grid system. We developed a simulation model using NS2 to evaluate the performance of our load balancing algorithm. Our simulation results show asymptotically optimal behaviour of the algorithm.
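The abstract does not spell out the algorithm, so the following is only a generic sketch of speed-based balancing: tasks are split among workers in proportion to CPU speed. The function `assign_tasks`, the worker names, and the rounding rule (largest fractional remainder first) are all assumptions made for illustration.

```python
def assign_tasks(num_tasks: int, cpu_speeds: dict) -> dict:
    """Split `num_tasks` among workers in proportion to their CPU speed.

    Hypothetical helper: the proportional rule and the rounding strategy
    are illustrative choices, not taken from the paper.
    """
    total_speed = sum(cpu_speeds.values())
    shares = {w: num_tasks * s / total_speed for w, s in cpu_speeds.items()}
    # Start from the integer part of each ideal share.
    counts = {w: int(share) for w, share in shares.items()}
    leftover = num_tasks - sum(counts.values())
    # Hand out the remaining tasks to the largest fractional remainders.
    by_remainder = sorted(cpu_speeds, key=lambda w: shares[w] - counts[w], reverse=True)
    for worker in by_remainder[:leftover]:
        counts[worker] += 1
    return counts


# Worker "b" is three times as fast as "a" and "c", so it gets three times the work.
plan = assign_tasks(10, {"a": 1.0, "b": 3.0, "c": 1.0})
```

With equal task sizes, a split of this kind lets all workers finish at roughly the same time.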
ISBN: 9780889869431 (print)
This paper presents a distributed on-line service selection (probe/access) scheme: an optimal stopping web service selection scheme based on the rate-of-return problem from optimal stopping theory. Our scheme differs from conventional schemes in three ways. First, it does not need to probe all web services; it probes only a few. Second, it focuses on maximizing the average QoS (Quality of Service) return per unit of cost over all stages of probe and access over a long period, rather than maximizing the QoS return of a single stage of probe and access as conventional schemes do. Third, it develops a return function based on three factors that have seldom been considered simultaneously: QoS return, the user's requirement, and probe cost. Through theoretical analysis and computation, we demonstrate that, compared with conventional schemes, our scheme offers additional advantages while achieving equally good performance.
ISBN: 9780889868205 (print)
Many attempts have been made to optimize the median filter through both software and hardware approaches. An architectural design of hardware capable of performing real-time median filtering is presented. The architecture uses the histogram approach to calculate the median, while optimizing the sliding window method to reuse all its calculations. Data is output row by row and every input pixel is processed only once. The design is independent of window size and image size, and supports adding more processing elements to handle wider images. The control unit design is minimized to enable self-adjustment of plug-and-play processing elements. The architecture is implemented in VHDL and synthesized to a Virtex-2 Pro FPGA. The performance and operation of the architecture are compared with previous work.
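The histogram approach with a sliding window can be sketched in software as follows. This is a plain illustration of the general technique, not the hardware architecture described above: it rebuilds the histogram for each output row, so it does not achieve the "every input pixel is processed only once" property. The function names are hypothetical, pixels are assumed to be integers in 0..255, and only windows that fit entirely inside the image are computed.

```python
def median_from_histogram(hist, window_area):
    """Scan the histogram until the cumulative count reaches the middle sample."""
    target = window_area // 2 + 1
    cumulative = 0
    for value, count in enumerate(hist):
        cumulative += count
        if cumulative >= target:
            return value
    raise ValueError("histogram holds fewer samples than window_area")


def median_filter(image, k):
    """k x k median filter (k odd) over the valid region of `image`.

    `image` is a list of equal-length rows of ints in 0..255.
    """
    height, width = len(image), len(image[0])
    output = []
    for top in range(height - k + 1):
        # Build the histogram of the first window in this band of rows.
        hist = [0] * 256
        for r in range(top, top + k):
            for c in range(k):
                hist[image[r][c]] += 1
        row_out = [median_from_histogram(hist, k * k)]
        # Slide right: drop the column that leaves, add the column that enters.
        for left in range(1, width - k + 1):
            for r in range(top, top + k):
                hist[image[r][left - 1]] -= 1
                hist[image[r][left + k - 1]] += 1
            row_out.append(median_from_histogram(hist, k * k))
        output.append(row_out)
    return output


sample = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
filtered = median_filter(sample, 3)
```

Each horizontal step updates only 2k histogram bins instead of re-sorting all k x k pixels, which is the calculation reuse that the sliding window method exploits.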
ISBN: 9780889866379 (print)
Most modern parallel computers are clusters using Myrinet or Ethernet communication networks. Several studies have been published comparing the performance of these two networks for parallel computing. However, these focus on average performance and do not address the distributions of communication times, which can have long tails due to contention effects. In the case of Ethernet with TCP, retransmit timeouts (RTOs) can also occur. Slow communication events may have a significant impact, particularly for applications requiring frequent synchronization, where the performance is determined by the slowest process. We have analysed the distributions of communication times for standard MPI routines on Ethernet with TCP and Myrinet with GM communication networks on the same cluster, and studied the scalability of the distributions as the number of communicating processes is increased, as well as the effect of RTOs for Ethernet with TCP.
ISBN: 9780889866379 (print)
Active and passive replication are powerful techniques to improve the quality of multimedia streaming. Most systems follow either the active or the passive approach. A well-known example of active replication is Content Distribution Networks [8], which replicate data to predefined static locations. In contrast, P2P file sharing networks [2, 1] use passive replication, where identical content is usually provided by different peers. We suggest a system that combines both techniques using Proxy Affinity, Request Affinity and Replication Affinity, taking into account user preferences, user behaviour, hardware resources and network capabilities.