ISBN: 9783319271613; 9783319271606 (print)
Scheduling precedence-constrained stochastic tasks on heterogeneous cluster systems is an important problem that significantly affects cluster performance. Unlike deterministic task models, the stochastic task model assumes that the workload of a task and the quantity of data transmitted between tasks are stochastic variables, which is more realistic than other task models. Scheduling models and algorithms for precedence-constrained stochastic tasks have recently attracted considerable research attention. The SDLS (Stochastic Dynamic Level Scheduling) algorithm has been shown to perform well in scheduling stochastic tasks on heterogeneous clusters. However, its assumption about communication time between tasks is much simpler than its assumptions about task computing time, so it cannot depict the communication cost among heterogeneous links well. In this paper, the quantity of data communicated between tasks is assumed to be a normally distributed stochastic variable, instead of directly assuming that the communication time is the same stochastic variable on all heterogeneous links. Moreover, a modified scheduling model and algorithm, SDLS-HC (Stochastic Dynamic Level Scheduling on Heterogenous Communication links), are proposed. The work focuses on modeling communication cost in much more detail in SDLS-based task scheduling. Experiments on many randomly generated task sets demonstrate that SDLS-HC achieves better performance than SDLS on cluster systems with heterogeneous links.
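The modeling change described in this abstract can be illustrated with a short worked relation. The symbols below are our own assumptions for illustration, not the paper's notation: D_ij is the data volume sent from task i to task j, and b_pq is the bandwidth of the link between processors p and q, taken here to be a fixed constant. Under those assumptions, a normally distributed data volume yields a normally distributed communication time whose mean and variance differ from link to link:

```latex
D_{ij} \sim \mathcal{N}\!\left(\mu_{ij},\, \sigma_{ij}^{2}\right)
\quad\Longrightarrow\quad
C_{ij}^{pq} \;=\; \frac{D_{ij}}{b_{pq}}
\;\sim\; \mathcal{N}\!\left(\frac{\mu_{ij}}{b_{pq}},\; \frac{\sigma_{ij}^{2}}{b_{pq}^{2}}\right)
```

A slower link (smaller b_pq) therefore increases both the expected communication time and its spread, which a single link-independent communication-time variable cannot express.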
ISBN: 9781538683194 (print)
We present a new distributed community detection algorithm for large graphs based on the Louvain method. We exploit distributed delegate partitioning to balance workload and communication among processors. In addition, we design a new heuristic strategy to carefully coordinate the community constitution in a distributed environment and to ensure the convergence of the distributed clustering algorithm. Our extensive experimental study demonstrates the scalability and correctness of our algorithm on various large-scale real-world and synthetic graph datasets using up to 32,768 processors.
ISBN: 9781538627266 (print)
This paper presents a dependable parallel multi-population differential evolutionary particle swarm optimization (DEEPSO) method for on-line optimal operational planning of energy plants. The problem can be formulated as a mixed-integer nonlinear optimization problem (MINLP). Since optimal operational plans for a number of energy plants are calculated simultaneously in a data center, the plans must be generated as rapidly as possible, considering the control intervals and the number of plants treated. One solution to this challenge is speeding up the calculation by parallel and distributed processing (PDP). However, PDP uses a number of processes, and countermeasures for various faults of those processes should be considered. The problem requires successive calculation at every control interval to maintain customer services. Therefore, sustainable (dependable) calculation that keeps appropriate solution quality is required even if some of the calculation results cannot be returned from distributed processes. Multi-population-based evolutionary computation methods have been verified to improve solution quality. Using the proposed dependable parallel multi-population DEEPSO-based method, calculation is observed to be about 4.3 times faster than with the conventional sequential DEEPSO, and appropriate solution quality is kept even if some of the calculation results cannot be returned from distributed processes.
Smart agriculture is one of the most diverse research fields. In addition, the quantity of data to be stored and the choice of the most efficient algorithms to process it are significant elements in this field. Storing data collected from the Internet of Things (IoT), from existing distributed and local databases, and from open data requires a particular infrastructure that federates all these data for complex processing. Storing this wide range of data, which arrives at high frequency and variable throughput, is particularly difficult. In this paper, we propose the use of distributed databases and a high-performance computing architecture in order to efficiently exploit multiple reconfigurable and application-specific processing units such as CPUs, GPUs, TPUs and FPGAs. This exploitation allows accurate training for machine learning, deep learning and unsupervised modeling algorithms. These are used to train supervised algorithms on images, where a set of images is labeled, and unsupervised algorithms on IoT data, which are unlabeled and of variable quality. Data processing is based on Hadoop 3.1 MapReduce to achieve parallel processing and uses containerization technologies to distribute processing across multi-GPU, MIC and FPGA. This architecture allows efficient processing of data coming from several sources with a heterogeneous high-performance cloud architecture. The proposed four-layer infrastructure can also incorporate FPGA and MIC, which are now natively supported by recent versions of Hadoop. Moreover, with the advent of new technologies such as Intel(R) Movidius(TM), it is now possible to deploy CNNs at the fog level in the IoT network and to perform inference with the cloud, which significantly limits network traffic by reducing the movement of large amounts of data to the cloud. (C) 2018 The Authors. Published by Elsevier Ltd.
ISBN: 9781479986484 (print)
Following an exhaustive set of experiments, we identify slowdowns in I/O performance that occur when processor power and frequency are increased. Our initial analyses indicate slowdowns are more likely to occur and more acute when the number of parallel I/O threads increases and the variability between runs is high. We use a microbenchmark-driven methodology to simplify isolation of the root causes of I/O performance loss. We classify the observed performance loss into two categories: file synchronization and file write delays. We introduce LUC, a runtime system to Limit the Unintended Consequences of power scaling and dynamically improve I/O performance. We demonstrate the effectiveness of the LUC system running on two platforms for two critical parallel transaction-oriented workloads including a mail server (varMail) and online transaction processing (oltp).
The paper provides a new approach to designing a massively parallel logic engine for automated reasoning in a distributed decision support system. The design extends classical linear resolution of first-order logic clauses with multi-resolution, in which a set of clauses can be resolved concurrently without sacrificing any inference, thereby speeding up the execution of a logic program. Speed-up and resource utilization rate are used as the metrics to compare the performance of the proposed system with the classical one. A high-level logic architecture of the proposed multi-resolution system is presented to explore possible parallelism and pipelining among the tasks, and thus to determine the execution time of typical logic programs. A possible application of the proposed system to query evaluation in logic-program-based database systems is also introduced.
ISBN: 9781509036202 (print)
Frequent Itemset Mining is one of the most investigated fields of data *** is expensive to mine frequent itemsets for a large-scale data *** when some data is added into the data set, it is still time-consuming to re-compute the complete data set from scratch to update the frequent itemsets of the data *** to improve the performance of frequent itemset mining for large-scale and dynamic data sets, we propose a new incremental Apriori algorithm based on *** reuses existing results from previous computation to modify the frequent itemsets according to the newly added data, which avoid massive *** newly proposed algorithm also takes full advantage of distributed resources with the support of *** in theory ensures that the newly proposed algorithm is *** on the real-world data set demonstrate that the newly proposed algorithm effectively avoids redundant computation and improves the performance of frequent itemset mining with no additional storage overhead.
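The general idea of reusing earlier results when new transactions arrive can be sketched as follows. This is a minimal single-process illustration, not the algorithm proposed in the paper: it assumes support counts for all itemsets up to a small size are cached (which, unlike the paper's claim, does cost extra storage), and the function names and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size):
    """Count every itemset of size 1..max_size that occurs in the transactions."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts


def frequent_itemsets(counts, n_transactions, min_support):
    """Keep itemsets whose relative support reaches min_support."""
    threshold = min_support * n_transactions
    return {itemset: c for itemset, c in counts.items() if c >= threshold}


# Initial batch: counts are computed once and cached (assumed simplification).
old_data = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"]]
cached_counts = count_itemsets(old_data, max_size=2)

# Increment: only the new transactions are scanned, then merged into the cache.
# Counter.update adds counts rather than overwriting them.
new_data = [["a", "b"], ["b", "c"]]
cached_counts.update(count_itemsets(new_data, max_size=2))
total = len(old_data) + len(new_data)

updated = frequent_itemsets(cached_counts, total, min_support=0.5)
```

Only the two new transactions are scanned in the incremental step, yet the resulting frequent itemsets match what a full recount over all six transactions would give.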
ISBN: 0780381386 (print)
We discuss analytic procedures for evaluating the availability of parallel computer systems comprised of P processors with N tasks subject to failures and repairs. In addition, we argue, via analytic and numeric examples, that not incorporating the task-stream into the model is an inadequate approach for evaluating system performance.
ISBN: 9781467365987 (print)
We present a parallel hierarchical graph clustering algorithm that uses modularity as the clustering criterion to effectively extract community structures in large graphs of different types. In order to process a large complex graph (with around 1 billion vertices and edges), we design our algorithm based on the Louvain method by investigating graph partitioning and distribution schemes on distributed memory architectures and conducting clustering in a divide-and-conquer manner. We study the relationship between graph structure properties and clustering quality, carefully handle ghost vertices between graph partitions, and propose a heuristic partition method suitable for the Louvain method. Compared to existing solutions, our method achieves a nearly balanced workload among processors and higher graph clustering accuracy on large real-world graph datasets.
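For readers unfamiliar with the clustering criterion, the following sketch computes the standard Newman modularity of a given partition, the quantity the Louvain method tries to increase. It is a sequential illustration for an undirected, unweighted graph without self-loops; the function name and the toy graph are our own assumptions, and it does not reflect the partitioning or ghost-vertex handling described in the abstract.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every vertex to a community label.
    """
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints in the community
    degree = defaultdict(int)  # sum of vertex degrees in the community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    # Q = sum over communities of (intra-edge fraction - squared degree fraction)
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q = modularity(edges, community)
```

Splitting the toy graph into its two triangles gives Q = 5/14, whereas placing all vertices in one community gives Q = 0, which is why the Louvain method would prefer the split.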
ISBN: 9781509025978 (print)
This paper evaluates the speed-up and dependability of parallel differential evolutionary particle swarm optimization (DEEPSO) for on-line optimal operational planning of energy plants. The planning can be formulated as a mixed-integer nonlinear optimization problem (MINLP). When optimal operational plans for a number of energy plants are calculated simultaneously in a data center, the plans must be generated as rapidly as possible, considering the control intervals and the number of plants treated. One solution to this challenge is speeding up the calculation by parallel and distributed processing (PDP). However, PDP uses a number of processes, and countermeasures for various faults of those processes should be considered. On-line optimal operational planning requires successive calculation at every control interval to maintain customer services. Therefore, sustainable (dependable) calculation that keeps solution quality is required even if some of the calculation results cannot be returned from distributed processes. Using the proposed parallel DEEPSO-based method, calculation is observed to be about 3 times faster than sequential calculation, and high solution quality can be kept even with high fault probabilities.