ISBN (print): 9798350375794; 9798350375800
Driven by the reform of the power system, the spot market has entered a period of rapid development. To handle the complex and diverse users and electricity bill settlement methods in the spot market, this study proposes a trial calculation method for complex, multi-user electricity charges in the spot market based on a concurrent computing architecture. Starting from the current state of the spot market in China, and given the different types of electricity tariff calculation methods (such as fixed time-sharing, tariff, and hybrid) and the complex and diverse user categories, this study establishes a learning model based on the Apriori algorithm to explore the relationship between different electricity bill settlement packages and multiple users. The correlation results indicate that different electricity packages suit different types of users, so a concurrent calculation method can be established to improve the efficiency of electricity bill settlement.
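As an illustration of the Apriori step mentioned in this abstract, the following is a minimal sketch of level-wise frequent-itemset mining over user/package records. The record layout, the item labels (`user:...`, `pkg:...`), and the support threshold are assumptions made for the example and are not taken from the paper.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset that meets min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every individual item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep only candidates whose support reaches the threshold.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical settlement records: one user category and one package each.
records = [
    {"user:industrial", "pkg:time_of_use"},
    {"user:industrial", "pkg:time_of_use"},
    {"user:industrial", "pkg:fixed"},
    {"user:residential", "pkg:fixed"},
    {"user:residential", "pkg:fixed"},
    {"user:commercial", "pkg:hybrid"},
]
frequent = apriori(records, min_support=0.3)
rule_conf = confidence(frequent, {"user:industrial"}, {"pkg:time_of_use"})
```

Groups of users linked to the same package by high-confidence rules could then be settled by separate concurrent workers, which is one plausible reading of how the association results feed the concurrent calculation.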
One of the most relevant and widely studied structural properties of networks is their community structure. Detecting communities is of great importance in social networks, where systems are often represented as graphs. With the advent of web-based social networks like Twitter, Facebook and LinkedIn, community detection became even more difficult due to the massive network size, which can reach hundreds of millions of vertices and edges. Such large graph-structured data cannot be processed without distributed algorithms, both because of the memory constraints of a single machine and because of the need to achieve high performance. In this paper, we present a novel hybrid (shared + distributed memory) parallel algorithm to efficiently detect high-quality communities in massive social networks. For our simulations, we use synthetic graphs ranging from 100K to 16M vertices to show the scalability and quality performance of our algorithm. We also use two massive real-world networks: (a) a section of the Twitter-2010 network with approximately 41M vertices and approximately 1.4B edges, and (b) UK-2007 (.uk web domain) with approximately 105M vertices and approximately 3.3B edges. Simulation results on an MPI setup with 8 compute nodes of 16 cores each show that a speedup of up to approximately 6X is achieved for synthetic graphs in detecting communities without compromising the quality of the results. (C) 2017 Elsevier Inc. All rights reserved.
We develop an efficient parallel distributed algorithm for matrix completion, named NOMAD (Non-locking, stOchastic Multi-machine algorithm for Asynchronous and Decentralized matrix completion). NOMAD is a decentralized algorithm with non-blocking communication between processors. One of the key features of NOMAD is that the ownership of a variable is asynchronously transferred between processors in a decentralized fashion. As a consequence, it is a lock-free parallel algorithm. In spite of being asynchronous, the variable updates of NOMAD are serializable, that is, there is an equivalent update ordering in a serial implementation. NOMAD outperforms synchronous algorithms, which require explicit bulk synchronization after every iteration: our extensive empirical evaluation shows that our algorithm not only performs well in a distributed setting on commodity hardware, but also outperforms state-of-the-art algorithms on an HPC cluster in both multi-core and distributed memory settings.
We propose a parallel distributed model of a hybrid fuzzy genetics-based machine learning (GBML) algorithm to drastically decrease its computation time. Our hybrid algorithm has a Pittsburgh-style GBML framework where a rule set is coded as an individual. A Michigan-style rule-generation mechanism is used as a kind of local search. Our parallel distributed model is an island model where a population of individuals is divided into multiple islands. Training data are also divided into multiple subsets. The main feature of our model is that a different training data subset is assigned to each island. The assigned training data subsets are periodically rotated over the islands. The best rule set in each island also migrates periodically. We demonstrate through computational experiments that our model decreases the computation time of the hybrid fuzzy GBML algorithm by one or two orders of magnitude using seven parallel processors without severely degrading the generalization ability of the obtained fuzzy rule-based classifiers. We also examine the effects of the training data rotation and the rule set migration on the search ability of our model.
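The two periodic operations described in this abstract, training data rotation and rule set migration, can be sketched as follows. This is only an illustration under stated assumptions: the ring direction, the rotation interval, and the policy of replacing the worst individual with the incoming one are hypothetical choices, and rule sets are stood in for by plain numbers with a caller-supplied `fitness` function.

```python
def rotate_assignments(num_islands, generation, rotation_interval):
    """Index of the training data subset held by each island at a generation.

    Assumes one subset per island and a shift of one position around the
    ring every rotation_interval generations.
    """
    shift = generation // rotation_interval
    return [(island + shift) % num_islands for island in range(num_islands)]


def migrate_best(islands, fitness):
    """Ring migration: each island receives the best individual of its
    predecessor, which overwrites the receiving island's worst individual."""
    # Snapshot the bests first so migration uses pre-migration populations.
    bests = [max(pop, key=fitness) for pop in islands]
    for i, pop in enumerate(islands):
        incoming = bests[(i - 1) % len(islands)]
        worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
        pop[worst] = incoming


# Toy usage: three islands whose "rule sets" are just integers.
islands = [[1, 5], [2, 3], [9, 4]]
migrate_best(islands, fitness=lambda individual: individual)
```

Because every island eventually sees every subset through rotation, each island can evaluate fitness on only a fraction of the training data per generation, which is where the reduction in computation time described above would come from.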
We propose a model for describing and predicting the parallel performance of a broad class of parallel numerical software on distributed memory architectures. The purpose of this model is to allow reliable predictions to be made for the performance of the software on large numbers of processors of a given parallel system, by only benchmarking the code on small numbers of processors. Having described the methods used, and emphasized the simplicity of their implementation, the approach is tested on a range of engineering software applications that are built upon the use of multigrid algorithms. Despite their simplicity, the models are demonstrated to provide both accurate and robust predictions across a range of different parallel architectures, partitioning strategies and multigrid codes. In particular, the effectiveness of the predictive methodology is shown for a practical engineering software implementation of an elastohydrodynamic lubrication solver. (C) 2010 Civil-Comp Ltd and Elsevier Ltd. All rights reserved.
ISBN (print): 9781905088294
We propose a model for describing and predicting the parallel performance of multigrid numerical software on distributed memory architectures for which different data partitioning and mapping strategies may be used. The goal of the model is to allow reliable predictions to be made as to the execution time of a given code on a large number of processors of a given parallel system, by only benchmarking the code on small numbers of processors. Despite its relative simplicity, the model is shown to be accurate and robust with respect to both the parallel architectures and the data partitioning strategies that are used.
We present a parallel distributed solver that enables us to solve incremental dense least squares problems arising in some parameter estimation problems. This solver is based on ScaLAPACK [8] and PBLAS [9] kernel routines. In the incremental process, the observations are collected periodically and the solver updates the solution with new observations using a QR factorization algorithm. It uses a recently defined distributed packed format [3] that handles symmetric or triangular matrices in ScaLAPACK-based implementations. We provide a performance analysis on the IBM pSeries 690. We also present an example application in the area of space geodesy for gravity field computations, with some experimental results.
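The idea of updating a least squares solution with new observations through a QR factorization can be sketched serially as below. This is not the ScaLAPACK-based distributed solver of the paper; it is a small pure-Python illustration that uses Givens rotations to fold one observation at a time into the triangular factor. The class and method names are hypothetical.

```python
import math


class IncrementalLeastSquares:
    """Keeps only the triangular factor R and d = Q^T b.

    The observation matrix and Q are never stored, so memory stays at
    O(n^2) however many observations arrive.
    """

    def __init__(self, n):
        self.n = n
        self.R = [[0.0] * n for _ in range(n)]
        self.d = [0.0] * n

    def add_observation(self, row, rhs):
        """Fold the equation row . x = rhs into (R, d) with Givens rotations."""
        row = [float(v) for v in row]
        rhs = float(rhs)
        for k in range(self.n):
            if row[k] == 0.0:
                continue  # nothing to eliminate in this column
            r = math.hypot(self.R[k][k], row[k])
            c, s = self.R[k][k] / r, row[k] / r
            # Rotate row k of R against the new row so that row[k] becomes 0.
            for j in range(k, self.n):
                rkj, aj = self.R[k][j], row[j]
                self.R[k][j] = c * rkj + s * aj
                row[j] = -s * rkj + c * aj
            dk = self.d[k]
            self.d[k] = c * dk + s * rhs
            rhs = -s * dk + c * rhs

    def solve(self):
        """Back substitution on R x = d (assumes R has full rank)."""
        x = [0.0] * self.n
        for i in reversed(range(self.n)):
            tail = sum(self.R[i][j] * x[j] for j in range(i + 1, self.n))
            x[i] = (self.d[i] - tail) / self.R[i][i]
        return x


# Fit y = slope * x + intercept, adding observations in two batches.
solver = IncrementalLeastSquares(2)
for x_value, y_value in [(0, 0), (1, 1)]:
    solver.add_observation([x_value, 1.0], y_value)
first_estimate = solver.solve()
solver.add_observation([2, 1.0], 1)  # a later observation updates the fit
updated_estimate = solver.solve()
```

Each new observation costs O(n^2) work, independent of how many observations were processed earlier, which is the property that makes a QR-based update attractive when data arrive periodically.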
ISBN (print): 978836810149
We propose a model for describing and predicting the performance of parallel numerical software on distributed memory architectures within a multi-cluster environment. The goal of the model is to allow reliable predictions to be made as to the execution time of a given code on a large number of processors of a given parallel system, and on a combination of systems, by only benchmarking the code on small numbers of processors. This has potential applications for the scheduling of jobs in a Grid computing environment where informed decisions about which resources to use in order to maximize the performance and/or minimize the cost of a job will be valuable. The methodology is built and tested for a particular class of numerical code, based upon the multilevel solution of discretized partial differential equations, and despite its simplicity it is demonstrated to be extremely accurate and robust with respect to both the processor and communications architectures considered. Furthermore, results are also presented which demonstrate that excellent predictions may also be obtained for numerical algorithms that are more general than the pure multigrid solver used to motivate the methodology. These are based upon the use of a practical parallel engineering code that is briefly described. The potential significance of this work is illustrated via two scenarios which consider a Grid user who wishes to use the available resources either (i) to obtain a particular result as quickly as possible, or (ii) to obtain results to different levels of accuracy.
In this paper we propose a distributed packed storage format that exploits the symmetry or the triangular structure of a dense matrix. This format stores only half of the matrix while maintaining most of the efficiency of full storage for a wide range of operations. This work has been motivated by the fact that, in contrast to sequential linear algebra libraries (e.g. LAPACK), there is no routine or format that handles packed matrices in the currently available parallel distributed libraries. The proposed algorithms exclusively use the existing ScaLAPACK computational kernels, which demonstrates the generality of the approach, makes the code easy to port, and allows efficient reuse of existing software. The performance results obtained for the Cholesky factorization show that our packed format performs as well as or better than the ScaLAPACK full storage algorithm for a small number of processors. For a larger number of processors, the ScaLAPACK full storage routine performs slightly better until each processor runs out of memory. Copyright (c) 2006 John Wiley & Sons, Ltd.
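For readers unfamiliar with packed storage, the sequential scheme that the abstract contrasts with can be sketched as follows: the lower triangle of a symmetric matrix is stored column by column in a one-dimensional array of length n(n+1)/2. This illustrates the sequential, LAPACK-style layout only; the distributed packed format proposed in the paper is a different, parallel layout that is not reproduced here. The function names are hypothetical.

```python
def packed_index(i, j, n):
    """0-based position of element (i, j) of an n x n symmetric matrix
    in lower-triangular, column-major packed storage."""
    if i < j:
        i, j = j, i  # symmetry: element (i, j) equals element (j, i)
    # Column j starts after the j previous columns of lengths n, n-1, ...
    return i + j * (2 * n - j - 1) // 2


def pack_lower(a):
    """Pack the lower triangle of a square matrix column by column."""
    n = len(a)
    return [a[i][j] for j in range(n) for i in range(j, n)]


a = [
    [4, 1, 2],
    [1, 5, 3],
    [2, 3, 6],
]
packed = pack_lower(a)  # 6 stored values instead of 9
```

The saving approaches one half of the full storage as n grows, which matches the "stores only half of the matrix" claim in the abstract.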
ISBN (print): 9781424436491
The sectioned genetic algorithm (hereafter denoted as sectioned GA), which is presented in this paper, is a modification of the standard GA that deals with large-scale problems (i.e. problems involving pattern spaces with high dimensionalities). Instead of increasing the size of the population searching the pattern space when the problem dimensionality increases, the sectioned GA approach divides each individual into smaller parts (sections) and subsequently applies the genetic operators to each of these parts. Results from the application of the sectioned GA to the problem of automatic morphological analysis are also presented in this article. Morphological analysis is by nature a large-scale problem, since a great number of words need to be segmented into stems and suffixes. The proposed system improves the segmentation accuracy substantially in comparison to standard GAs.
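The core idea of applying genetic operators per section rather than per individual can be sketched as below. The abstract does not specify the operators, so the one-point crossover, the fixed section size, and the function names are assumptions chosen for illustration.

```python
import random


def split_sections(individual, section_size):
    """Cut an individual (a list of genes) into consecutive sections."""
    return [
        individual[k:k + section_size]
        for k in range(0, len(individual), section_size)
    ]


def sectioned_crossover(parent_a, parent_b, section_size, rng):
    """One-point crossover applied independently inside every section.

    A standard GA would pick a single cut point for the whole individual;
    here each section gets its own cut point.
    """
    child = []
    sections_a = split_sections(parent_a, section_size)
    sections_b = split_sections(parent_b, section_size)
    for sec_a, sec_b in zip(sections_a, sections_b):
        cut = rng.randint(0, len(sec_a))  # inclusive bounds
        child.extend(sec_a[:cut] + sec_b[cut:])
    return child


rng = random.Random(0)  # seeded only to make the example repeatable
parent_a = [0] * 12
parent_b = [1] * 12
child = sectioned_crossover(parent_a, parent_b, section_size=4, rng=rng)
```

With per-section cut points, the amount of recombination grows with the length of the individual, which is one way to scale to high-dimensional problems without enlarging the population.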