ISBN: 0769520693 (print)
Random walks constitute an attractive technique in distributed computing. In this paper, we present an original method that uses the relationship between electrical resistance and random walks to automatically compute quantities such as the cover time and, more generally, any processing-time measure defined through hitting times. The method draws on electrical circuit theory, specifically Millman's theorem.
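As an illustration of the resistance and random-walk relationship the abstract refers to, the sketch below checks the classical commute-time identity on a small graph: for a simple random walk on a connected graph with m edges and unit resistors on every edge, H(u,v) + H(v,u) = 2m * R_eff(u,v). This is not the paper's Millman-based procedure, which the abstract does not detail. The function names `solve`, `hitting_time`, and `effective_resistance` are hypothetical, and hitting times are obtained by a plain exact linear solve.

```python
from fractions import Fraction


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination (exact)."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def hitting_time(adj, start, target):
    """Expected number of steps for a simple random walk to go from start to target.

    Solves h(v) = 1 + average of h over the neighbours of v, with h(target) = 0.
    """
    if start == target:
        return Fraction(0)
    nodes = [v for v in adj if v != target]
    idx = {v: i for i, v in enumerate(nodes)}
    a = [[Fraction(0)] * len(nodes) for _ in nodes]
    b = [Fraction(1)] * len(nodes)
    for v in nodes:
        a[idx[v]][idx[v]] += 1
        for w in adj[v]:
            if w != target:
                a[idx[v]][idx[w]] -= Fraction(1, len(adj[v]))
    return solve(a, b)[idx[start]]


def effective_resistance(adj, u, v):
    """Effective resistance between u and v with a unit resistor on every edge.

    Grounds v, injects one unit of current at u, and solves Kirchhoff's
    current law for the node potentials; the potential at u is R_eff(u, v).
    """
    nodes = [x for x in adj if x != v]
    idx = {x: i for i, x in enumerate(nodes)}
    a = [[Fraction(0)] * len(nodes) for _ in nodes]
    b = [Fraction(0)] * len(nodes)
    for x in nodes:
        a[idx[x]][idx[x]] += len(adj[x])
        for w in adj[x]:
            if w != v:
                a[idx[x]][idx[w]] -= 1
    b[idx[u]] = Fraction(1)
    return solve(a, b)[idx[u]]


# Assumed example graph: a cycle on four nodes, given as adjacency lists.
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
num_edges = sum(len(nbrs) for nbrs in cycle4.values()) // 2
commute = hitting_time(cycle4, 0, 2) + hitting_time(cycle4, 2, 0)
assert commute == 2 * num_edges * effective_resistance(cycle4, 0, 2)
```

Exact fractions are used so that the identity can be checked with equality rather than a floating-point tolerance.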
ISBN: 9781728112466 (print)
Parallel I/O performance is crucial to sustaining scientific applications on large-scale High-Performance Computing (HPC) systems. However, I/O load imbalance in the underlying distributed and shared storage systems can significantly reduce overall application performance. Mitigating this load imbalance involves two conflicting challenges: (i) optimizing system-wide data placement to maximize the bandwidth advantages of distributed storage servers, i.e., allocating I/O resources efficiently across applications and job runs; and (ii) optimizing client-centric data movement to minimize I/O load request latency between clients and servers, i.e., allocating I/O resources efficiently in service to a single application and job run. Moreover, existing approaches that require application changes limit widespread adoption in commercial or proprietary deployments. We propose iez, an "end-to-end control plane" where clients transparently and adaptively write to a set of selected I/O servers to achieve balanced data placement. Our control plane leverages real-time load information for global data placement across distributed storage servers, while our design model leverages trace-based optimization techniques to minimize I/O load request latency between clients and servers. We evaluate our proposed system on an experimental cluster for two common use cases: the synthetic I/O benchmark IOR for large sequential writes and a scientific application I/O kernel, HACC-I/O. Results show read and write performance improvements of up to 34% and 32%, respectively, compared to the state of the art.
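The idea of writing to a set of selected I/O servers based on load information can be pictured with a very small sketch. It is purely illustrative: the abstract does not describe iez's actual selection policy, so the function `pick_servers`, the server names, and the load values are all hypothetical, and a plain "k least-loaded" rule is assumed.

```python
import heapq


def pick_servers(load_by_server, k):
    """Return the k server ids with the lowest reported load.

    Assumed policy (not taken from the paper): rank by load, break ties by id.
    """
    return heapq.nsmallest(
        k, load_by_server, key=lambda server: (load_by_server[server], server)
    )


# Hypothetical load snapshot: fraction of each server's bandwidth in use.
snapshot = {"oss0": 0.9, "oss1": 0.2, "oss2": 0.5, "oss3": 0.2}
targets = pick_servers(snapshot, 2)
```

A real control plane would also have to weigh client-to-server latency, which this sketch ignores.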
ISBN: 0769525547 (print)
An analysis of the time and realizability of solving complex problems in parallel on distributed computer systems (CS) is presented. The equations for calculating the efficiency indices are derived under the assumption that the problem solution time on a CS is a function of the solution time on one elementary machine, and that this function has a finite number of discontinuities. The discontinuities are probabilistic in character and correspond to CS failures that require reconfiguration of the CS (structural readjustment with regard to working machines only). The notion of complex CS reconfiguration is introduced and investigated. A set of integral equations for calculating the realizability function of problem solution on distributed CSs is derived, and a parallel algorithm for computing it is described.
ISBN: 9781665424615 (print)
Approximate computing (AC) provides an efficient solution for reducing the power, area, and complexity of digital systems. When combined with distributed arithmetic (DA), AC makes it possible to implement inner-product units that are ultra-efficient in terms of area, power, and delay. Such units can be used in any inherently resilient application. This paper presents a novel approximate inner-product scheme based on parallel DA for low-power fault-tolerant applications, supported by a novel in-situ sliding window algorithm. Our model eliminates the need for an explicit error correction scheme, which further reduces the overhead while improving the accuracy. The experimental results show that our model achieves state-of-the-art performance in terms of power delay product (PDP) and area power product (APP), with reductions of 39.26% and 48.83%, respectively.
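For readers unfamiliar with distributed arithmetic, the sketch below shows the exact, textbook form of a DA inner product: the coefficient sums are precomputed in a lookup table, and the result is accumulated one input bit-plane at a time using only lookups, shifts, and additions. It does not reproduce the paper's approximation or its sliding window algorithm, which the abstract does not specify. The function name `da_inner_product` is hypothetical, and inputs are assumed to be unsigned integers that fit in `bits` bits.

```python
def da_inner_product(coeffs, xs, bits):
    """Compute sum(c * x) with distributed arithmetic (no multiplications).

    coeffs: fixed integer coefficients.
    xs:     unsigned integer inputs, each assumed to be below 2**bits.
    """
    n = len(coeffs)
    # Lookup table: entry `addr` holds the sum of the coefficients whose
    # position k has bit k set in `addr`.
    lut = [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]
    acc = 0
    for b in range(bits):
        # Build the table address from bit b of every input.
        addr = 0
        for k, x in enumerate(xs):
            addr |= ((x >> b) & 1) << k
        # Weight the partial sum by the significance of this bit-plane.
        acc += lut[addr] << b
    return acc


# Assumed example values.
result = da_inner_product([3, -1, 4], [5, 7, 2], bits=3)
```

The table has 2**n entries for n coefficients, which is why hardware designs usually split long inner products into several small tables.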
ISBN: 9781450305525 (print)
The Bag-of-Tasks (BoT) behaviour has recently drawn the attention of scheduling researchers [2, 36, 37] and seems to be very common in workloads of parallel systems (up to 70% of jobs [27]) and grids (up to 96% of the total CPU time is consumed by BoTs [11]). To enable a reliable evaluation of BoT-oriented scheduling algorithms, researchers require realistic workload models that take BoTs into account. Regrettably, very few such models are available in the literature. To the best of our knowledge, only two modeling studies incorporate BoTs into their models to generate synthetic workloads, one for parallel systems [27] and one for grids [12]. However, these models focus only on fitting the marginal distributions and neglect several other statistical properties of BoTs, such as periodicity, autocorrelation, and cross-correlation among BoT attributes. We believe that these crucial characteristics deserve to be taken into account in modeling research. Therefore, in this paper we focus on characterising the BoT behaviour to further improve researchers' understanding of this well-known behaviour in parallel system workloads. In addition, we study how BoTs affect parallel system performance. Our experimental results indicate that the presence of BoTs leads to a considerable performance degradation; interestingly, however, a realistic association between job arrivals and job runtimes helps BoTs to improve the performance of parallel systems. Moreover, we show the necessity of using workloads with BoTs in scheduling evaluation to obtain reliable results.