This paper proposes a simple parallel and distributed computing framework for the conventional Newton-Raphson load flow (NRLF) solution of large interconnected power systems. The proposed approach is based on a message-passing distributed-memory architecture with separate workstations, and involves the piecewise analysis of power systems using the network tearing procedure. The NRLF solution method, applied to each torn system at the selected buses, employs the matrix inversion lemma, which consists of the factorization, forward elimination and back substitution procedures. The proposed approach reduces the computational requirements of the state-of-the-art parallel algorithm for obtaining the correction vector in the back substitution procedure: the back substitution is carried out in parallel according to the split buses, rather than in the order in which the forward elimination is performed. The investigations are carried out on the IEEE 118-bus standard test system in a Redhat Linux-based 100 Mbps Ethernet LAN environment. They reveal that the proposed method is significantly faster than both the conventional NRLF and the NRLF based on the state-of-the-art parallel algorithm, and thus has potential applications in the real-time load flow solution of both regulated and deregulated power systems distributed over large geographical areas. (C) 2015 Elsevier Ltd. All rights reserved.
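For reference, the matrix inversion lemma named in the abstract above is commonly stated as the Woodbury identity. The symbols A, U, C and V below are generic notation, not taken from the paper. Reading A as the block-diagonal matrix of the torn subsystems and UCV as the low-rank coupling introduced at the split buses is an assumption made here for illustration only.

```latex
(A + U C V)^{-1} = A^{-1} - A^{-1} U \left( C^{-1} + V A^{-1} U \right)^{-1} V A^{-1}
```

Under that reading, each diagonal block of A could be factorized independently on its own workstation, and only the small matrix C^{-1} + V A^{-1} U would couple the subsystems.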
This work presents a classification of weak models of distributed computing. We focus on deterministic distributed algorithms, and study models of computing that are weaker versions of the widely-studied port-numbering model. In the port-numbering model, a node of degree d receives messages through d input ports and sends messages through d output ports, both numbered with 1, 2, ..., d. In this work, VVc is the class of all graph problems that can be solved in the standard port-numbering model. We study the following subclasses of VVc. VV: input port i and output port i are not necessarily connected to the same neighbour. MV: input ports are not numbered; algorithms receive a multiset of messages. SV: input ports are not numbered; algorithms receive a set of messages. VB: output ports are not numbered; algorithms send the same message to all output ports. MB: combination of MV and VB. SB: combination of SV and VB. Now we have many trivial containment relations, such as SB ⊆ MB ⊆ VB ⊆ VV ⊆ VVc, but it is not obvious if, for example, either of VB ⊆ SV or SV ⊆ VB should hold. Nevertheless, it turns out that we can identify a linear order on these classes. We prove that SB ⊊ MB = VB ⊊ SV = MV = VV ⊊ VVc. The same holds for the constant-time versions of these classes. We also show that the constant-time variants of these classes can be characterised by a corresponding modal logic. Hence the linear order identified in this work has direct implications in the study of the expressibility of modal logic. Conversely, one can use tools from modal logic to study these classes.
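The difference between receiving a vector, a multiset and a set of messages can be made concrete with a small sketch. The function name `receive` and the model labels `"vector"`, `"multiset"` and `"set"` are hypothetical names chosen for this illustration; they are not notation from the paper.

```python
from collections import Counter


def receive(incoming, model):
    """Return what a node observes, given messages listed by input port.

    incoming: list of messages, where index i holds the message on port i+1.
    model: "vector", "multiset" or "set" (labels assumed for this sketch).
    """
    if model == "vector":
        # Numbered input ports: the order of the messages is visible.
        return tuple(incoming)
    if model == "multiset":
        # Unnumbered input ports: multiplicities survive, order does not.
        return frozenset(Counter(incoming).items())
    if model == "set":
        # Unnumbered input ports: only the distinct messages survive.
        return frozenset(incoming)
    raise ValueError(f"unknown model: {model}")


a = ["x", "y", "x"]
b = ["x", "x", "y"]  # same multiset as a, different port order
c = ["x", "y", "y"]  # same set as a, different multiplicities
print(receive(a, "vector") == receive(b, "vector"))      # False
print(receive(a, "multiset") == receive(b, "multiset"))  # True
print(receive(a, "set") == receive(c, "set"))            # True
```

The sketch only models the receiving side. The broadcast restriction on the sending side would correspond to a node emitting one message that is copied to every output port.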
This paper discusses the simulation of decentralized mathematical models. The authors' approach builds on a theory of design of decentralized computer control systems, and their experience was gained in designing the mathematical models that are simulated. A decomposed control system is required to meet the conditions of observation and control. The methodology of multi-model design is based on the main principles of object orientation, such as abstraction, hierarchy, and modularity. Modelling on a parallel architecture has an impact on the simulator system. The system is defined by the equations shown below. An important part is the analysis of the simulation method, the analytical approach, and the corresponding software implementation tools.
ISBN (print): 9781509013296
We propose a framework for edge-facilitated wireless distributed computing, in which several mobile users connected to an access point collaborate for a distributed computing task. We characterize the minimum communication load, both in uplink (from users to the access point) and downlink (from access point to the users), required for distributed computing. In particular, we develop a communication scheme and a dataset placement strategy that induces a particular overlap of computations at the users, which can then be exploited for coding at both users and the access point to significantly reduce the communication load. We demonstrate that the reduction in communication load (compared to uncoded solutions) can scale linearly with the size of the network (i.e., the number of users), hence our proposed scheme can result in a "scalable" design for edge-facilitated wireless distributed computing (i.e., accommodating any number of users without incurring extra communication load). Furthermore, we establish the optimality of the proposed scheme by developing a tight information theoretic outer-bound, and demonstrate that the proposed scheme achieves the minimum uplink and downlink communication load simultaneously. We also generalize the results to a decentralized setting, in which a random and a priori unknown subset of users may participate in distributed computing at each time, and characterize the minimum communication load for uniformly random dataset placement at users.
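The idea that overlapping computations at the users can be exploited for coding can be illustrated with a toy three-user example. This is a minimal sketch of generic XOR-based coded exchange, not the scheme from the paper: the placement in `STORED`, the 4-byte values produced by `intermediate`, and the assumption that the access point relays each coded packet unchanged are all hypothetical choices made for the illustration.

```python
# Toy placement (assumed): each user stores two of three files,
# so every pair of users shares exactly one file.
STORED = {1: {"A", "B"}, 2: {"A", "C"}, 3: {"B", "C"}}
MISSING = {1: "C", 2: "B", 3: "A"}  # the file each user lacks


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def intermediate(user, file_name):
    """Hypothetical 4-byte map output that `user` needs from `file_name`."""
    return f"{user}{file_name}".encode().ljust(4, b"#")


def holders(user):
    """Users that store the file `user` is missing, in ascending order."""
    return sorted(w for w in STORED if MISSING[user] in STORED[w])


def half_sent(sender, target):
    """The half of target's needed value that `sender` is responsible for."""
    value = intermediate(target, MISSING[target])
    return (value[:2], value[2:])[holders(target).index(sender)]


def coded_packet(sender):
    """XOR of one half for each of the two other users."""
    a, b = (u for u in sorted(STORED) if u != sender)
    return xor_bytes(half_sent(sender, a), half_sent(sender, b))


def decode(receiver, sender, packet):
    """Cancel the half meant for the third user, which receiver can compute."""
    (third,) = set(STORED) - {receiver, sender}
    assert MISSING[third] in STORED[receiver]
    return xor_bytes(packet, half_sent(sender, third))


def run():
    # Assumption: the access point forwards each packet to all users as is.
    packets = {s: coded_packet(s) for s in STORED}
    recovered = {
        r: b"".join(decode(r, s, packets[s]) for s in holders(r))
        for r in STORED
    }
    return packets, recovered


packets, recovered = run()
print(recovered)
print(sum(len(p) for p in packets.values()), "bytes sent; uncoded would need 12")
```

In this toy setting each coded packet is useful to two users at once, so three 2-byte packets replace three 4-byte uncoded transfers per hop.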
Distributed computing, which leverages distributed storage and computing resources, is a promising paradigm for handling large-scale computational tasks. However, its potential is often hindered by high communication latency due to limited network bandwidth. In this paper, we study the computation-communication tradeoff of multi-cluster MapReduce systems where a central server connects to multiple clusters, each comprising a set of workers that jointly perform a MapReduce task. Workers can exchange information directly within their cluster (inner-cluster communication) or indirectly through the central server (cross-cluster communication). To reduce the communication load, we propose a nested coded distributed computing (CDC) scheme that is feasible for the heterogeneous scenario where different clusters may have arbitrary numbers of workers and computation loads. We show that our scheme greatly reduces the communication load compared to all existing schemes, and can achieve the optimal cross-cluster communication load. In addition, the proposed scheme significantly reduces the computational complexity of conventional CDC schemes, whose computational complexity increases exponentially with the computation load.
In this work we present analytic expressions for the expected values of the performance metrics of parallel applications when the distributed computing infrastructure has a complex topology. Through active probing tests we analyse the structure of a real distributed computing environment. Using the resulting network, we both validate the analytic expressions and explore the performance metrics under different conditions through Monte Carlo simulations. In particular, we gauge computing paradigms with different hierarchical structures in computing services. Fully decentralised (i.e., peer-to-peer) environments provide the best performance. Moreover, we show that it is possible to significantly improve the parallel efficiency by implementing more intelligent configurations of computing services and task allocation strategies (e.g., by using a betweenness centrality measure). We qualitatively reproduce results of previous works and provide closed-form solutions that link topology, application structure and allocation parameters when job dependencies and a complex network structure are considered.
ISBN (print): 9781479983988
To manage the smart electric grid of the future, fundamental changes are required in the system operational paradigm. The availability of high-resolution data at higher speeds and advances in computing provide opportunities to bring about this fundamental change. Monitoring and control algorithms need to evolve to match the transition from centralized generation to distributed generation. The intermittency of renewable generation and the push towards real-time control require faster control actions, which are possible with decentralized power grid applications. With the integration of distributed energy resources (DERs), the stability assessment application needs to handle a large number of data points in real time. This requires massive computing resources, and the requirements will increase for possible real-time control action. Decentralized applications need to be coordinated and managed with existing centralized applications. This paper addresses the development of a fault-tolerant distributed computing architecture (DCBlocks) for implementing a decentralized voltage stability monitoring and control application. Results for the IEEE 30-bus system are provided to validate the developed architecture. Distributed computing algorithms are implemented using the open-source platform Akka Java and the DeterLab test bed.
Distributed algorithms are designed to run on interconnected autonomous computing entities to achieve a common task: each entity asynchronously executes the same code and interacts locally with its immediate neighbours. It is widely agreed that the lack of knowledge of the global state makes termination detection one of the most important and complex problems in distributed computing. By relying on refinement, we prove that an algorithm computing a spanning tree with Local Termination Detection (each entity is able to determine only its own termination condition) can be reused and adapted to obtain the same algorithm with Global Termination Detection (at least one entity is aware that the entire computation in the network has completed). The main idea relies on specifying a combination of a well-known algorithm, namely SSP, and the spanning tree algorithm, following a top-down approach. This paper is a starting point towards a general framework for enhancing the termination detection property of distributed algorithms and reusing their proofs.
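The step from local to global termination detection can be sketched with a counter rule in the style of SSP. This is a simplified illustration under stated assumptions, not the refinement-based development from the paper: it uses synchronous rounds, assumes every node knows the network diameter, and treats the local termination flags in `done` as fixed. The names `ssp_round` and `detect` are hypothetical.

```python
def ssp_round(adj, done, a):
    """One synchronous update of the counters a[v].

    adj: node -> list of neighbours; done: node -> local termination flag.
    A node that has not terminated locally resets to -1; otherwise it takes
    1 + the minimum counter over itself and its neighbours.
    """
    new = {}
    for v in adj:
        if not done[v]:
            new[v] = -1
        else:
            new[v] = 1 + min([a[v]] + [a[w] for w in adj[v]])
    return new


def detect(adj, done, diameter, max_rounds=100):
    """Return (round, nodes) once some counter reaches the diameter, else None.

    A counter value k means every node within distance k has terminated
    locally, so a value >= diameter covers the whole network.
    """
    a = {v: -1 for v in adj}
    for rnd in range(1, max_rounds + 1):
        a = ssp_round(adj, done, a)
        aware = [v for v in adj if a[v] >= diameter]
        if aware:
            return rnd, aware
    return None


# Path 0 - 1 - 2 - 3 with diameter 3 (assumed known to every node).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(detect(adj, {v: True for v in adj}, 3))
print(detect(adj, {0: True, 1: True, 2: True, 3: False}, 3, max_rounds=20))
```

When node 3 has not terminated, the counters of the other nodes stay below the diameter, so no node wrongly announces global termination.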
Distributed architecture is expected to be an effective solution for large-scale edge computing tasks in terminal devices. However, it remains a great challenge to resolve the conflict between parallel efficiency and constrained physical resources in a specific embedded structure. This article proposes a universal scalable off-chip parallel computing architecture to maximize the computing efficiency of distributed embedded computing clusters. This architecture is based on an improved Message Passing Interface (Improved-MPI). To address the limited communication speed in embedded environments, a multilevel communication mechanism is employed to alleviate the communication pressure on nodes. Flexible allocation of computing tasks ensures efficient utilization of every embedded cluster node while also avoiding a single point of failure. In addition, to overcome the challenge of limited RAM in embedded devices, the architecture uses an interleaved memory initialization mechanism to run larger computing tasks. Based on this architecture, a specific embedded cluster platform is constructed using the RK3399 board. Various large-scale tasks are deployed on this platform to validate the performance of the architecture. First, a large-scale randomly connected neural network is executed, which verifies the architecture's computational performance and communication capability. Second, a functional model of a Small-World Spiking Neural Network is constructed, achieving real-time and efficient digital speech recognition. Finally, the implementation of Large Language Models demonstrates that the embedded clusters can achieve performance comparable to modern computers.
Cellular Automata (CA) have been established as a dynamic mathematical modeling tool for scientific and engineering applications. Equal Length Cellular Automata (ELCA) are a special class of CA in which all generated CA subspaces (cycles) have equal length. Potential uses of ELCA have been reported for engineering applications [1-4]. This work presents a detailed analysis of ELCA-generating linear and complemented linear rules. General forms of the characteristic matrix and characteristic polynomial for ELCA generation are reported, along with mathematical relationships between the cell length of a CA and the length of the equal-length cycles generated by the explored rules.