For the past few years we have seen exponential growth in the number of mobile devices and in their computation, storage and communication capabilities, as well as an increase in the amount of data these devices generate while performing common tasks. Their ubiquity makes it reasonable to consider a different role for them, in which they become an important part of the computation of more demanding applications rather than relying exclusively on external servers. At the same time, the number of resource-demanding applications keeps increasing, and these typically resort to services offered by infrastructure Clouds. The use of such Cloud services raises several problems: considerable energy and bandwidth consumption, high latency, and unavailability of the connectivity infrastructure due to congestion or its absence. Considering all of the above, for some applications it makes sense to perform part or all of the computation locally on the mobile devices. We propose a distributed computing framework able to process a batch or a stream of data generated by a cloud composed of mobile devices, without requiring Internet services. Unlike the current state of the art, where both computation and data are offloaded to mobile devices, our system moves the computation to where the data is, significantly reducing the amount of data exchanged between mobile devices. In the evaluation performed, in both real and simulated environments, our framework proved to scale, benefiting significantly from the use of several devices to handle computation and supporting multiple devices submitting computation requests without a significant increase in request latency. It also proved able to deal with churn without being heavily penalized by it.
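As an illustration of the move-the-computation-to-the-data idea described above, the following is a minimal hypothetical sketch in Python (not the authors' framework): tasks travel to the devices that hold the data, and only small partial results are exchanged. The `Device` and `submit` names are assumptions made for the example.

```python
# A minimal, hypothetical sketch: computation travels to the devices holding
# the data; only small partial results are exchanged.
from dataclasses import dataclass, field

@dataclass
class Device:
    """A mobile device holding its own locally generated data."""
    name: str
    local_data: list = field(default_factory=list)

    def run_task(self, task):
        # The task (a small function) travels to the device;
        # the raw data never leaves it.
        return task(self.local_data)

def submit(devices, task, combine):
    """Ship the computation to every device and combine the partial results."""
    partial_results = [device.run_task(task) for device in devices]
    return combine(partial_results)

if __name__ == "__main__":
    cloud = [Device("phone-A", [3, 5, 8]), Device("phone-B", [1, 9])]
    # Count sensor readings above a threshold without moving the readings.
    above = submit(cloud,
                   task=lambda data: sum(1 for x in data if x > 4),
                   combine=sum)
    print(above)  # 3
```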
After great advances by industry in process automation, an important challenge still remains: automation under abnormal situations. The first step towards solving this challenge is Fault Detection and Diagnosis (FDD). This work proposes a batch-incremental adaptive methodology for fault detection and diagnosis based on mixture models trained in a distributed computing environment. The models used come from the family of Parsimonious Gaussian Mixture Models (PGMM), whose reduced number of parameters brings important advantages when few data are available, the expected scenario under faulty conditions. On the other hand, the large number of different models raises another challenge: selecting the best model for a given behaviour. To address it, a large number of models is trained using distributed computing techniques, and only then is the best model selected. This work proposes the use of the Spark framework, which is well suited to iterative computations. The proposed methodology was validated on a simulated process, the Tennessee Eastman Process (TEP), showing good results for both the detection and the diagnosis of faults. Furthermore, numerical experiments show the viability of training a large number of models for a posteriori selection of the best model. (C) 2016 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.
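The train-many-models-then-select strategy can be sketched as follows. This is an illustration only: it uses PySpark with scikit-learn's GaussianMixture as a stand-in for the PGMM family and BIC as the selection criterion; the candidate grid, data, and names are assumptions, not the paper's implementation.

```python
# Sketch: fit many candidate mixture models in parallel with Spark, then keep
# the one with the lowest BIC. GaussianMixture stands in for the PGMM family.
import numpy as np
from pyspark.sql import SparkSession
from sklearn.mixture import GaussianMixture

spark = SparkSession.builder.appName("mixture-model-selection").getOrCreate()
X = np.random.RandomState(0).randn(500, 4)          # stand-in process data
bcast_X = spark.sparkContext.broadcast(X)

# Candidate configurations: number of components x covariance structure.
candidates = [(k, cov) for k in range(1, 6)
              for cov in ("full", "tied", "diag", "spherical")]

def fit_and_score(cfg):
    k, cov = cfg
    gmm = GaussianMixture(n_components=k, covariance_type=cov,
                          random_state=0).fit(bcast_X.value)
    return (gmm.bic(bcast_X.value), k, cov)

# Distribute the fits; the best (lowest-BIC) model wins.
best_bic, best_k, best_cov = (spark.sparkContext
                              .parallelize(candidates, numSlices=8)
                              .map(fit_and_score)
                              .min(key=lambda t: t[0]))
print(f"selected model: k={best_k}, covariance={best_cov}, BIC={best_bic:.1f}")
```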
ISBN (digital): 9798331506940
ISBN (print): 9798331506957
Distributed computation is a widely used methodology to overcome the challenges typically faced by applications spanning multiple mobile devices, such as the high complexity of the computation and limited resources. By splitting the required computation and distributing it across multiple devices, it lowers the computation time and the resources required per device and uses the total available resources more effectively. It has therefore emerged as an appropriate approach to the problems faced by recent applications involving deep learning. With the spread of the Internet of Things (IoT) and the development of data collection technology, the deep learning process has to handle much larger datasets, which are hard to transfer over the network and lead to computations too complex for a single device to carry out on its own. In this paper, we consider distributed computation as applied in various fields, and in particular how it is used to distribute the deep learning process, by surveying the research on the topic.
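One common way such a split is realised is data parallelism: each device computes gradients on its own shard of the data and only the much smaller gradients are aggregated. The sketch below illustrates that idea on a toy linear model; it is an assumption-laden illustration, not a method from any of the surveyed works.

```python
# Data-parallel sketch: shards of the dataset stay on their devices; only
# gradients (much smaller than the raw data) are exchanged and averaged.
import numpy as np

rng = np.random.default_rng(0)

def shard(X, y, n_devices):
    """Split the dataset across devices so the raw data never travels."""
    idx = np.array_split(np.arange(len(X)), n_devices)
    return [(X[i], y[i]) for i in idx]

def local_gradient(w, X, y):
    """Mean-squared-error gradient for a linear model on one device's shard."""
    err = X @ w - y
    return X.T @ err / len(y)

# Toy problem: recover w_true from sharded data by averaging local gradients.
w_true = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(6000, 3))
y = X @ w_true
shards = shard(X, y, n_devices=4)

w = np.zeros(3)
for step in range(200):
    grads = [local_gradient(w, Xi, yi) for Xi, yi in shards]  # parallel in practice
    w -= 0.1 * np.mean(grads, axis=0)                         # only gradients are aggregated
print(np.round(w, 3))  # approximately [ 2.  -1.   0.5]
```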
Coded distributed computing can alleviate the communication load in distributed computing by applying coding techniques to redundant storage and computation resources. In this paper, we study a MapReduce-type distributed computing framework over a star-topology network, where all the workers exchange information through a common access point. The optimal tradeoff among the normalized number of stored files (storage load), computed intermediate values (computation load), and transmitted bits in the uplink and downlink (communication loads) is characterized. A coded computing scheme is proposed that achieves the Pareto-optimal tradeoff surface, in which the access point only needs to perform simple chain coding over the signals it receives; an information-theoretic bound matching the surface is also provided.
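A toy example of why coding at a common access point can shrink the downlink load (illustrative only, not the paper's chain-coding scheme): when two workers each hold an intermediate value the other needs, the access point can broadcast a single XOR of the two uplink messages instead of forwarding both.

```python
# Toy illustration: one coded downlink broadcast serves two workers, because
# each worker can cancel the part it already computed locally.
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 256, 16, dtype=np.uint8)  # value held by worker 1, needed by worker 2
b = rng.integers(0, 256, 16, dtype=np.uint8)  # value held by worker 2, needed by worker 1

# Uplink: both workers send their values to the access point.
# Downlink: the access point broadcasts one coded message instead of two.
coded_broadcast = a ^ b

# Each worker subtracts (XORs out) what it already knows.
assert np.array_equal(coded_broadcast ^ a, b)   # worker 1 recovers b
assert np.array_equal(coded_broadcast ^ b, a)   # worker 2 recovers a
print("downlink load halved: one broadcast served two workers")
```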
In order to effectively reduce the running time of fingerprint feature data matching and improve matching accuracy, this paper designs and proposes a fingerprint feature data equivalent matching algorithm based on distributed computing. In the false-minutia removal step, a method combining distance-based and structure-based removal is adopted. The minutiae feature points used for subsequent matching, together with sampling information for the corresponding ridges and the directions of the minutiae, are obtained by a ridge tracking method. On this basis, a fingerprint matching algorithm based on congruent triangles is used to match the extracted minutiae feature points, achieving equivalent matching of the fingerprint feature data. Simulation results show that the proposed algorithm can effectively reduce the running time and improve the matching accuracy, which verifies its practical application value.
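The congruent-triangle matching step can be illustrated with a small sketch: triangles formed by minutia triples are compared by their sorted side lengths within a tolerance. This is a simplified illustration only; it omits the ridge-tracking and false-minutia removal steps, and the function names and tolerance are assumptions.

```python
# Illustrative congruent-triangle matching of minutiae points.
from itertools import combinations
from math import dist

def triangle_sides(p, q, r):
    """Sorted side lengths of the triangle formed by three minutiae."""
    return tuple(sorted((dist(p, q), dist(q, r), dist(r, p))))

def congruent(t1, t2, tol=2.0):
    """Treat two triangles as congruent if all side lengths agree within tol."""
    return all(abs(a - b) <= tol
               for a, b in zip(triangle_sides(*t1), triangle_sides(*t2)))

def match_score(minutiae_a, minutiae_b, tol=2.0):
    """Fraction of triangles from print A with a congruent triangle in print B."""
    tris_a = list(combinations(minutiae_a, 3))
    tris_b = list(combinations(minutiae_b, 3))
    hits = sum(any(congruent(ta, tb, tol) for tb in tris_b) for ta in tris_a)
    return hits / len(tris_a) if tris_a else 0.0

print(match_score([(0, 0), (10, 0), (0, 8), (5, 5)],
                  [(1, 1), (11, 1), (1, 9), (6, 6)]))  # 1.0 for a shifted copy
```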
Since the amount of information is growing rapidly, there is overwhelming interest in efficient network computing. In this article, we take a detailed look at the problem of modelling and optimizing such systems for the k-nearest neighbour classifier. First, we present a comprehensive discussion of the classification methods considered, with a special focus on improving classification accuracy or response time by applying the nearest neighbour rule to partitions of the original data set. Next, we propose a generic optimization model of a network computing system that can be used for distributed implementation of these recognition methods. The objective is to minimize the response time of the computing system for tasks related to k-nearest neighbour classifiers. We solve the problem using the traditional branch-and-cut method as well as GReTiMA, an original algorithm based on a genetic approach. To illustrate our work, we provide results of numerical experiments comparing the performance of the evolutionary approach against optimal results. Moreover, we show that the distributed approach significantly improves the system response time.
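The partition-based nearest-neighbour idea discussed above can be sketched as follows: each computing node evaluates the rule on its own partition and returns its k best candidates, and a coordinator merges them. This is an illustration only; the article's contribution is the optimization model and the GReTiMA algorithm, which are not reproduced here.

```python
# Sketch: each node classifies against its own partition; the coordinator
# merges the per-partition candidates and takes the global k-NN vote.
import numpy as np

def local_candidates(query, X_part, y_part, k):
    """k nearest labelled neighbours of the query inside one partition."""
    d = np.linalg.norm(X_part - query, axis=1)
    idx = np.argsort(d)[:k]
    return list(zip(d[idx], y_part[idx]))

def distributed_knn(query, partitions, k=3):
    """Merge per-partition candidates and return the majority label."""
    merged = sorted(c for part in partitions
                    for c in local_candidates(query, *part, k))[:k]
    labels = [label for _, label in merged]
    return max(set(labels), key=labels.count)

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2)) + np.repeat([[0, 0], [4, 4]], 150, axis=0)
y = np.repeat([0, 1], 150)
parts = [(X[i::3], y[i::3]) for i in range(3)]            # three computing nodes
print(distributed_knn(np.array([3.8, 4.1]), parts, k=5))  # expected label: 1
```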
This paper introduces a renewed gateway to ENEAGRID distributed computing resources named Fast Access to Remote Objects 2.0 (FARO 2.0). FARO 2.0 is a tool for application and desktop virtualization with a focus on user experience (UX), providing trained as well as untrained users with a collection of centralized services that can be used seamlessly on their client through a remote desktop protocol. FARO 2.0 is a JavaFX application whose graphical user interface (GUI) and main logic are implemented with well-known Web technologies (HTML5, CSS3, JavaScript) for easier maintainability and customizability, taking full advantage of the WebView component. The FARO 2.0 framework has been deployed both as a general-purpose GUI for remote user access to ENEAGRID resources and as specialized application- or workflow-oriented GUIs, applied in a range of domains from materials science to technologies for energy and industry, environmental modeling, and nuclear fusion. Some examples and results are also presented. (C) 2017 Elsevier B.V. All rights reserved.
Coded distributed computing (CDC) can alleviate the communication load in distributed computing systems by leveraging coding opportunities via redundant computation. While the optimal computation-communication tradeoff has been well studied for homogeneous systems, it remains largely unknown for heterogeneous systems where workers have different computation capabilities. This paper characterizes the upper and lower bounds of the optimal communication load as two linear programming problems for a general heterogeneous CDC system using the MapReduce framework. Our achievable scheme first designs a parametric data shuffling strategy for any given mapping strategy, and then jointly optimizes the mapping strategy and the data shuffling strategy to obtain the upper bound. The parametric data shuffling strategy allows adjusting the size of the multicast message intended for each worker set, so that it can largely decrease the number of unicast messages and hence increase the communication efficiency. Numerical results show that our achievable communication load is lower than those achieved in existing works. Our lower bound is established by unifying an improved cut-set bound and a peeling method. The obtained upper and lower bounds degenerate to the existing result in homogeneous systems, and coincide with each other when the system is approximately homogeneous or grouped homogeneous.
We describe an environment for distributed computing that uses the concept of well-known paradigms. The main advantage of paradigm-oriented distributed computing (PODC) is that the user only needs to specify application-specific sequential code, while the underlying infrastructure takes care of the parallelization and distribution. The main features of the proposed approach, called PODC, are the following: (1) it is intended for loosely coupled network environments, not specialized multiprocessors; (2) it is based on an infrastructure of mobile agents; (3) it supports programming in C, rather than a functional or special-purpose language; and (4) it provides an interactive graphics interface through which programs are constructed, invoked, and monitored. We discuss five paradigms presently supported in PODC: the bag-of-tasks, branch-and-bound search, genetic programming, finite difference, and individual-based simulation. We demonstrate their use, implementation, and performance within the mobile agent-based PODC environment. (c) 2005 Elsevier Inc. All rights reserved.
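To give a concrete feel for the simplest of the five paradigms, here is a minimal bag-of-tasks sketch: the user writes only sequential per-task code and a worker pool parallelises it. It is a Python stand-in for illustration; PODC itself uses C application code over a mobile-agent infrastructure, and the task below is a made-up placeholder.

```python
# Bag-of-tasks sketch: only solve_task is application-specific; the pool
# plays the role of the parallelizing infrastructure.
from concurrent.futures import ProcessPoolExecutor

def solve_task(task):
    """Application-specific sequential code: here, a toy CPU-bound computation."""
    n = task
    return sum(i * i for i in range(n))

def run_bag_of_tasks(tasks, workers=4):
    """Pull independent tasks from the bag and farm them out to workers."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(solve_task, tasks))

if __name__ == "__main__":
    bag = [100_000 + i for i in range(16)]   # independent tasks of similar size
    print(run_bag_of_tasks(bag)[:2])
```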
Nowadays, Big Data has become a research focus in industry, banking, social networks, and other fields, and the explosive increase of data and information requires efficient processing solutions. Spark is therefore considered a promising candidate among large-scale distributed computing systems for big data processing. One primary challenge is the straggler problem, which occurs in the presence of heterogeneity when a machine takes an unusually long time to finish executing a task, decreasing system throughput. To mitigate straggler tasks, Spark adopts a speculative execution mechanism in which the scheduler launches additional backup tasks to avoid slow task processing and achieve acceleration. In this paper, a new Optimized Straggler Mitigation Framework is proposed. The framework uses a dynamic criterion to identify the most likely straggler tasks; this criterion is based on multiple coefficients to reach a reliable straggler decision, and it integrates historical data analysis with online adaptation for intelligent straggler judgment. This improves cluster performance by ensuring that speculative tasks are effective. Experimental results on various benchmarks and applications show that the proposed framework achieves 23.5% to 30.7% reductions in execution time and 25.4% to 46.3% increases in cluster throughput compared with the Spark engine.
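A toy illustration of the speculative-execution mechanism being optimised: flag a running task whose elapsed time exceeds the median runtime of finished peers by a multiplicative factor. The paper's dynamic criterion combines multiple coefficients with historical data analysis; none of that is reproduced here, and the threshold below is an assumption.

```python
# Basic straggler check behind speculative execution: compare each running
# task's elapsed time against a threshold derived from completed peers.
from statistics import median

def find_stragglers(finished_runtimes, running_elapsed, slow_factor=1.5):
    """Return indices of running tasks slow enough to warrant a backup copy."""
    if not finished_runtimes:
        return []
    threshold = slow_factor * median(finished_runtimes)
    return [i for i, t in enumerate(running_elapsed) if t > threshold]

finished = [10.2, 9.8, 11.0, 10.5]         # seconds taken by completed tasks
running = [4.0, 25.3, 9.1]                 # elapsed time of still-running tasks
print(find_stragglers(finished, running))  # [1] -> launch a speculative copy of task 1
```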