In this work, we introduce slot selection and co-allocation algorithms for parallel jobs in distributed computing with non-dedicated and heterogeneous resources. A single slot is a time span that can be assigned to a task, which is a part of a job. Launching a job requires the co-allocation of a specified number of slots that start synchronously. The challenge is that slots associated with different resources of a distributed computational environment may have arbitrary, non-matching start and finish points. Some existing algorithms assign a job to the first set of slots matching the resource request without any optimization (first-fit), while others rely on exhaustive search. In this paper, slot selection algorithms whose complexity is linear in the number of available slots are studied and compared with known approaches. The novelty of the proposed approach lies in allocating alternative sets of slots, which opens up possibilities for optimizing job scheduling.
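The co-allocation requirement, a fixed number of slots on different resources all covering one synchronized window, can be checked with a single sweep over slots ordered by start time. Below is a minimal Python sketch of that idea; the function and parameter names are ours, and the paper's actual algorithms additionally build alternative slot sets and optimize over them.

```python
import heapq
from typing import List, Optional, Tuple

Slot = Tuple[float, float]  # (start, finish) of one resource's free window

def co_allocate(slots: List[Slot], n: int, d: float) -> Optional[float]:
    """Return the earliest start time t such that at least n slots cover
    the whole window [t, t + d], or None if no such window exists.

    One sweep over slots sorted by start time: O(m log m) for m slots.
    Illustrative sketch only, not the paper's exact algorithm.
    """
    ends: list = []  # min-heap of finish times of currently open slots
    for start, finish in sorted(slots):
        heapq.heappush(ends, finish)
        # Discard open slots that end before the candidate window closes.
        while ends and ends[0] < start + d:
            heapq.heappop(ends)
        if len(ends) >= n:
            return start  # n slots all span [start, start + d]
    return None

print(co_allocate([(0, 10), (2, 9), (3, 12)], n=3, d=5))  # -> 3
```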
Performing real-time multimedia transmissions over existing telecommunications infrastructure can prove economical. Such transmissions require compressing and decompressing multimedia information at a rate higher than that at which the information is generated. This paper describes a method of improving decompression time through distributed computing.
Big Data has become a research focus in industry, banking, social networks, and other fields, and the explosive growth of data and information requires efficient processing solutions. Spark is therefore considered a promising candidate among large-scale distributed computing systems for big data processing. One primary challenge is the straggler problem, which arises in the presence of heterogeneity when a machine takes an exceptionally long time to finish executing a task, decreasing system throughput. To mitigate stragglers, Spark adopts a speculative execution mechanism in which the scheduler launches backup copies of slow tasks to avoid delays and achieve acceleration. In this paper, a new Optimized Straggler Mitigation Framework is proposed. The framework uses a dynamic criterion, based on multiple coefficients, to identify the most likely straggler tasks reliably, and it integrates historical data analysis with online adaptation for intelligent straggler judgment. This guarantees the effectiveness of speculative tasks and improves cluster performance. Experimental results on various benchmarks and applications show that the proposed framework achieves 23.5% to 30.7% reductions in execution time and 25.4% to 46.3% increases in cluster throughput compared with the native Spark engine.
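As a rough illustration of such a dynamic criterion, the sketch below flags a task as a straggler when its runtime exceeds a threshold blended from online peer statistics and a historical average. The coefficients alpha and beta and the blending rule are illustrative assumptions, not the paper's actual formula.

```python
import statistics

def is_straggler(task_runtime: float, peer_runtimes: list,
                 hist_mean: float, alpha: float = 0.5,
                 beta: float = 1.5) -> bool:
    """Hedged sketch of a dynamic straggler criterion: flag a task when
    its runtime exceeds beta times a reference blended (weight alpha)
    from the online median of its peers and a historical per-stage mean.
    """
    online_ref = statistics.median(peer_runtimes) if peer_runtimes else hist_mean
    reference = alpha * online_ref + (1.0 - alpha) * hist_mean
    return task_runtime > beta * reference

print(is_straggler(12.0, [4.1, 4.5, 5.0], hist_mean=4.8))  # -> True
```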
After great advances by industry in process automation, an important challenge still remains: automation under abnormal situations. The first step towards meeting this challenge is Fault Detection and Diagnosis (FDD). This work proposes a batch-incremental adaptive methodology for fault detection and diagnosis based on mixture models trained in a distributed computing environment. The models come from the family of Parsimonious Gaussian Mixture Models (PGMM), whose reduced number of parameters brings important advantages when few data are available, an expected scenario under faulty conditions. On the other hand, the large number of different models raises another challenge: selecting the best model for a given behaviour. To that end, it is proposed to train a large number of models using distributed computing techniques and only then select the best one. This work proposes the use of the Spark framework, which is well suited to iterative computations. The proposed methodology was validated on a simulated process, the Tennessee Eastman Process (TEP), showing good results for both the detection and the diagnosis of faults. Furthermore, numerical experiments show the viability of training a large number of models for a posteriori selection of the best model. (C) 2016 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.
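A minimal sketch of the train-many-then-select pattern on Spark follows, assuming a local Spark installation. Since scikit-learn does not implement the parsimonious family, its GaussianMixture stands in for PGMM, and the candidate grid and selection by BIC are illustrative of the a posteriori best-model selection, not the paper's exact setup.

```python
# Train many mixture-model candidates in parallel; keep the best by BIC.
import numpy as np
from pyspark import SparkContext
from sklearn.mixture import GaussianMixture

def fit_and_score(params, X):
    n_components, cov_type = params
    gm = GaussianMixture(n_components=n_components,
                         covariance_type=cov_type).fit(X)
    return gm.bic(X), params  # lower BIC = better fit/complexity trade-off

if __name__ == "__main__":
    sc = SparkContext("local[*]", "pgmm-selection")
    X = np.random.default_rng(0).normal(size=(500, 4))  # stand-in data
    grid = [(k, c) for k in range(1, 6)
                   for c in ("full", "tied", "diag", "spherical")]
    Xb = sc.broadcast(X)  # ship the training data to every worker once
    best = sc.parallelize(grid).map(lambda p: fit_and_score(p, Xb.value)).min()
    print("best (BIC, params):", best)
    sc.stop()
```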
For the past few years we have seen exponential growth in the number of mobile devices and in their computation, storage, and communication capabilities, as well as an increase in the amount of data generated by mobile devices while performing common tasks. Additionally, the ubiquity of these mobile devices makes it reasonable to consider a different use for them, in which they act as an important part of the computation of more demanding applications rather than relying exclusively on external servers. It is also possible to observe an increase in the number of resource-demanding applications that resort to services offered by infrastructure providers. However, the use of these Cloud services raises many problems, such as considerable energy and bandwidth consumption, high latency, and unavailability of connectivity infrastructures due to congestion or their absence. Considering all of the above, for some applications it makes sense to do part or all of the computation locally on the mobile devices themselves. We propose a distributed computing framework, able to process a batch or a stream of data generated by a cloud composed of mobile devices, that does not require Internet services. Differently from the current state of the art, where both computation and data are offloaded to mobile devices, our system moves the computation to where the data is, significantly reducing the amount of data exchanged between mobile devices. Based on the evaluation performed, in both real and simulated environments, our framework supports scalability, benefiting significantly from the use of several devices to handle computation and supporting multiple devices submitting computation requests without a significant increase in request latency. It also proved able to deal with churn without being heavily penalized by it.
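The "move the computation to the data" idea can be pictured as shipping a small task descriptor to the device that already holds the data and collecting only results, as in the toy sketch below. The Device and submit names are ours; the real framework additionally handles churn, batching, and streaming.

```python
# Toy sketch: dispatch computation to where the data lives, so only the
# task and a small result cross the network, never the raw data.
from typing import Callable, Dict, List

class Device:
    def __init__(self, name: str, local_data: List[float]):
        self.name, self.local_data = name, local_data

    def run(self, task: Callable[[List[float]], float]) -> float:
        return task(self.local_data)  # executes on the data holder

def submit(devices: Dict[str, Device], task: Callable) -> Dict[str, float]:
    return {name: dev.run(task) for name, dev in devices.items()}

devices = {"phone-a": Device("phone-a", [1.0, 2.0, 3.0]),
           "phone-b": Device("phone-b", [4.0, 5.0])}
print(submit(devices, task=lambda xs: sum(xs) / len(xs)))
```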
Coded distributed computing (CDC) can alleviate the communication load in distributed computing systems by leveraging coding opportunities via redundant computation. While the optimal computation-communication tradeoff has been well studied for homogeneous systems, it remains largely unknown for heterogeneous systems where workers have different computation capabilities. This paper characterizes the upper and lower bounds of the optimal communication load as two linear programming problems for a general heterogeneous CDC system using the MapReduce framework. Our achievable scheme first designs a parametric data shuffling strategy for any given mapping strategy, and then jointly optimizes the mapping strategy and the data shuffling strategy to obtain the upper bound. The parametric data shuffling strategy allows adjusting the size of the multicast message intended for each worker set, so that it can largely decrease the number of unicast messages and hence increase the communication efficiency. Numerical results show that our achievable communication load is lower than those achieved in existing works. Our lower bound is established by unifying an improved cut-set bound and a peeling method. The obtained upper and lower bounds degenerate to the existing result in homogeneous systems, and coincide with each other when the system is approximately homogeneous or grouped homogeneous.
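For reference, the well-known homogeneous tradeoff that the paper's bounds degenerate to is L*(r) = (1/r)(1 - r/K) for K workers and computation load r. The snippet below simply evaluates it to show how redundant computation is traded for communication; it does not reproduce the paper's linear programs for the heterogeneous case.

```python
# Optimal normalized communication load of homogeneous CDC (Li et al.),
# the special case to which the paper's heterogeneous bounds reduce.
def homogeneous_cdc_load(K: int, r: int) -> float:
    assert 1 <= r <= K, "computation load r must satisfy 1 <= r <= K"
    return (1.0 / r) * (1.0 - r / K)

K = 10
for r in (1, 2, 5, 10):
    print(f"r = {r:2d}: load = {homogeneous_cdc_load(K, r):.3f}")
```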
This paper introduces a renewed gateway to ENEAGRID distributed computing resources named Fast Access to Remote Objects 2.0 (FARO 2.0). FARO 2.0 is a tool for application and desktop virtualization with a focus on user experience (UX), providing trained as well as untrained users with a collection of centralized services that can be used seamlessly on their clients through a remote desktop protocol. FARO 2.0 is a JavaFX application whose graphical user interface (GUI) and main logic are implemented with well-known Web technologies (HTML5, CSS3, JavaScript) for easier maintainability and customizability, taking full advantage of the WebView component. The FARO 2.0 framework has been deployed both as a general-purpose GUI for remote user access to ENEAGRID resources and as a specialized application- or workflow-oriented GUI. These deployments span a set of application domains ranging from materials science to technologies for energy and industry, environmental modeling, and nuclear fusion. Some examples and results are also presented. (C) 2017 Elsevier B.V. All rights reserved.
Since the amount of information is growing rapidly, there is overwhelming interest in efficient network computing. In this article, we take a detailed look at the problem of modelling and optimizing such systems for the k-nearest neighbour classifier. First, we present a comprehensive discussion of the considered classification methods, with a special focus on improving classification accuracy or response time through partitioning of the original data set for the nearest neighbour rule. Next, we propose a generic optimization model of a network computing system that can be used for distributed implementation of these recognition methods. The objective is to minimize the response time of the computing system applied to tasks related to k-nearest neighbour classifiers. We solve the problem using both the classical branch-and-cut method and an original genetic algorithm, GReTiMA. To illustrate our work, we provide results of numerical experiments showing the performance of the evolutionary approach compared against optimal results. Moreover, we show that the distributed approach enables significant improvement of the system response time.
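The partitioned nearest-neighbour rule itself is simple to sketch: each node returns its local top-k and the coordinator merges them, which yields the exact global k nearest. The sketch below shows only this rule, not the optimization model or GReTiMA; all names are illustrative.

```python
import heapq
from typing import List, Sequence, Tuple

Point = Tuple[Sequence[float], str]  # (feature vector, class label)

def local_knn(partition: List[Point], query: Sequence[float],
              k: int) -> List[Tuple[float, str]]:
    # Each computing node returns only its k best (distance, label) pairs.
    dist = lambda x: sum((a - b) ** 2 for a, b in zip(x, query)) ** 0.5
    return heapq.nsmallest(k, ((dist(x), y) for x, y in partition))

def distributed_knn(partitions: List[List[Point]],
                    query: Sequence[float], k: int) -> str:
    # Merging per-partition top-k gives the exact global k nearest, so
    # partitioning reduces response time without hurting accuracy.
    merged = heapq.nsmallest(
        k, (c for p in partitions for c in local_knn(p, query, k)))
    labels = [y for _, y in merged]
    return max(set(labels), key=labels.count)  # majority vote
```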
Cosmology SAMR simulations have played a prominent role in the field of astrophysics. Emerging distributed computing systems provide an economical alternative to traditional parallel machines and enable scientists to conduct cosmological simulations that require vast computing power. An important issue in conducting distributed cosmological simulations is performance and efficiency. In this paper, we present a dynamic load balancing scheme called DistDLB that is designed to improve the performance of distributed cosmology simulations. Distributed systems, e.g. the Computation Grid, usually consist of heterogeneous resources connected by shared networks. By considering these features of distributed systems and the unique characteristics of cosmology SAMR simulations, DistDLB focuses on reducing the redistribution cost through a hierarchical load balancing approach and a run-time decision-making mechanism. Heuristic methods are proposed to adaptively adjust load balancing strategies based on observation of the current system and application state. Our experiments with real-world cosmology simulations on production systems indicate that the proposed DistDLB scheme can improve the performance of cosmology simulations by 2.56-79.14% as compared to a scheme that does not consider the heterogeneous and dynamic features of distributed systems. (c) 2006 Elsevier Inc. All rights reserved.
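The core of such a run-time decision can be stated in a few lines: repartition only when the predicted saving over the coming steps outweighs the one-off cost of moving data. The sketch below is in the spirit of DistDLB's decision mechanism; the inputs are assumed measurable and the names are ours.

```python
def should_rebalance(step_time_now: float, step_time_balanced: float,
                     steps_until_next_adapt: int,
                     redistribution_cost: float) -> bool:
    """Rebalance iff the predicted cumulative gain from a more balanced
    partition exceeds the one-off cost of redistributing the data."""
    predicted_gain = ((step_time_now - step_time_balanced)
                      * steps_until_next_adapt)
    return predicted_gain > redistribution_cost

# E.g. saving 0.4 s/step over 50 steps justifies a 15 s redistribution.
print(should_rebalance(2.0, 1.6, 50, 15.0))  # -> True
```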
With the development of wireless computing devices, extending distributed computing to wireless networks deserves a closer look. This paper considers distributed computing over unreliable and insecure device-to-device (D2D) networks, in which each device is not always available to perform computation, and the process of distributed devices exchanging calculated results with each other is vulnerable to eavesdropping in wireless environments. To handle unreliable devices, we adopt repetition codes to build a novel system that supports general computations, called the ρ-replication system, where each device has ρ−1 replicas with duplicate data. A coded computation scheme for the ρ-replication system is proposed, which not only achieves the minimum communication load of the system but also ensures weak security of wireless transmissions during data exchange. Furthermore, the replication nature of the system can be exploited for beamforming transmissions, naturally leading to the idea of energy optimization. Simulation results show that increasing ρ does not necessarily improve energy efficiency, as the benefit of increased beamforming gain may be outweighed by the drawback of a heavier communication load.
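A toy example of the coding opportunity that replication creates: when two devices each already hold the block the other needs, a single XOR multicast replaces two unicasts, and each receiver decodes using its local copy. The byte-string blocks below are illustrative; the paper's scheme generalizes this to the full ρ-replication exchange.

```python
# One XOR multicast serves two receivers that each hold one of the blocks.
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

b1, b2 = b"result-of-task-1", b"result-of-task-2"
coded = xor(b1, b2)            # single multicast message
assert xor(coded, b1) == b2    # device holding b1 recovers b2
assert xor(coded, b2) == b1    # device holding b2 recovers b1
```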