This special issue of Concurrency and Computation: Practice and Experience provides a forum for presenting advances in current research and development in all aspects of parallel and distributed computing and communications.
Coded distributed computing is used to mitigate the adverse effect of slow workers on the computation time in distributed computing systems. However, using error-correction codes results in encoding and decoding delays. In this work, we consider a systematic maximum-distance separable (MDS) coded matrix-vector multiplication problem with multi-message communication (MMC), where the master assigns multiple sub-tasks to each worker. In this setup, we show that the received systematic outputs can be used to reduce the decoding time by implementing a proper decoding algorithm. To further reduce the decoding time, we use the MMC property that sub-tasks are executed sequentially to propose an allocation of the systematic sub-tasks that significantly increases the number of received systematic outputs. Our results further demonstrate that the reduction in the decoding time is even more significant in applications that require only a partial recovery. In these applications, it suffices to complete a certain percentage of the computation, and using our approach, we show that decoding may be completely avoided.
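The role of systematic outputs in decoding can be illustrated with a minimal sketch. It assumes real-valued data and a Cauchy-matrix parity part, neither of which is specified in the abstract, and the function names (`systematic_generator`, `encode`, `decode`) are hypothetical. Every received systematic output is used as-is, so the linear system to be solved has one unknown per missing systematic block rather than one per block overall.

```python
import numpy as np

def systematic_generator(n, k):
    """(n, k) generator [I; P]. P is a Cauchy matrix (an assumed choice),
    so any k rows of the generator are linearly independent (MDS)."""
    x = np.arange(k, n, dtype=float) + 0.5   # points for parity rows
    y = np.arange(k, dtype=float)            # points for systematic columns
    parity = 1.0 / (x[:, None] - y[None, :])
    return np.vstack([np.eye(k), parity])

def encode(A, G):
    """Split A into k row blocks and form the n coded blocks."""
    k = G.shape[1]
    blocks = np.stack(np.split(A, k))        # shape (k, rows/k, cols)
    return np.tensordot(G, blocks, axes=1)   # shape (n, rows/k, cols)

def decode(received, G, k):
    """received maps sub-task index -> output vector (at least k entries).
    Returns (A @ x, number of unknown blocks that had to be solved for)."""
    have = [i for i in range(k) if i in received]
    missing = [i for i in range(k) if i not in received]
    parity = [i for i in received if i >= k][:len(missing)]
    if len(parity) < len(missing):
        raise ValueError("fewer than k outputs received")
    blocks = {i: received[i] for i in have}  # systematic outputs need no decoding
    if missing:
        # Remove the known systematic contributions from each parity output,
        # then solve a system whose size is the number of missing blocks.
        rhs = np.stack([
            received[p] - sum(G[p, j] * received[j] for j in have)
            for p in parity
        ])
        solved = np.linalg.solve(G[np.ix_(parity, missing)], rhs)
        blocks.update(zip(missing, solved))
    return np.concatenate([blocks[i] for i in range(k)]), len(missing)
```

When all k systematic outputs arrive, `decode` solves nothing, which corresponds to the case where decoding is avoided entirely.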
The MapReduce model of distributed computation accomplishes a task in three phases: two computation phases, Map and Reduce, with a communication phase, Shuffle, in between. This letter looks at the distributed communication problem in the Shuffle phase under the assumption that the links through which the computing nodes exchange information are error-prone. Under this assumption, an optimal linear error-correcting transmission scheme is designed using index coding techniques.
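As background for the index coding idea, the following sketch shows only the basic error-free case, not the error-correcting scheme of the letter. It assumes two receivers, each of which already holds the intermediate value the other one wants. The helper `xor_bytes` and the sample values are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(p ^ q for p, q in zip(a, b))

# Hypothetical intermediate values produced in the Map phase.
v_a = b"mapA"   # wanted by node 2, already stored at node 1
v_b = b"mapB"   # wanted by node 1, already stored at node 2

# One broadcast serves both receivers instead of two separate transmissions.
broadcast = xor_bytes(v_a, v_b)

recovered_at_node_1 = xor_bytes(broadcast, v_a)   # node 1 cancels v_a
recovered_at_node_2 = xor_bytes(broadcast, v_b)   # node 2 cancels v_b
```

Each node cancels its own side information from the broadcast to obtain the value it is missing.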
In this work, we explore the problem of multi-user linearly-separable distributed computation, where N servers help compute the desired functions (jobs) of K users, and where each desired function can be written as a linear combination of up to L (generally non-linear) subtasks (or sub-functions). Each server computes some of the subtasks, communicates a function of its computed outputs to some of the users, and then each user collects its received data to recover its desired function. We explore the computation and communication relationship between how many servers compute each subtask vs. how much data each user receives. For a matrix F representing the linearly-separable form of the set of requested functions, our problem becomes equivalent to the open problem of sparse matrix factorization F = DE over finite fields, where a sparse decoding matrix D and encoding matrix E imply reduced communication and computation costs respectively. This paper establishes a novel relationship between our distributed computing problem, matrix factorization, syndrome decoding and covering codes. To reduce the computation cost, the above D is drawn from covering codes or from a here-introduced class of so-called 'partial covering' codes, whose study here yields computation cost results that we present. To then reduce the communication cost, these coding-theoretic properties are explored in the regime of codes that have low-density parity check matrices. The work reveals, first for the commonly used one-shot scenario, that in the limit of large $N$, the optimal normalized computation cost $\gamma \in (0, 1)$ lies in the range $\gamma \in \left(H_q^{-1}(\log_q(L)/N),\ H_q^{-1}(K/N)\right)$, where $H_q$ is the $q$-ary entropy function, and that this can be achieved with normalized communication cost that vanishes as $\sqrt{\log_q(N)}/N$. The above reveals an unbounded coding gain over the uncoded scenario, as well as reveals the role of a certain functional rate $\log_q(L)/N$ and functiona
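To get a feel for the stated range of the normalized computation cost, the two endpoints can be evaluated numerically. The sketch below uses the standard definition of the q-ary entropy function and inverts it by bisection on [0, 1 - 1/q], where it is increasing. The function names and the example parameters are illustrative assumptions, and the result only evaluates the expressions in the abstract (which hold in the limit of large N); it does not reproduce the paper's analysis.

```python
import math

def q_ary_entropy(x: float, q: int) -> float:
    """H_q(x) = x log_q(q-1) - x log_q(x) - (1-x) log_q(1-x)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return math.log(q - 1, q)
    return (x * math.log(q - 1, q)
            - x * math.log(x, q)
            - (1 - x) * math.log(1 - x, q))

def q_ary_entropy_inverse(y: float, q: int, tol: float = 1e-12) -> float:
    """Inverse of H_q on [0, 1 - 1/q], found by bisection."""
    if not 0.0 <= y <= 1.0:
        raise ValueError("y must lie in [0, 1]")
    lo, hi = 0.0, 1.0 - 1.0 / q
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if q_ary_entropy(mid, q) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def computation_cost_range(N: int, K: int, L: int, q: int):
    """Endpoints (H_q^{-1}(log_q(L)/N), H_q^{-1}(K/N)) from the abstract."""
    lower = q_ary_entropy_inverse(math.log(L, q) / N, q)
    upper = q_ary_entropy_inverse(K / N, q)
    return lower, upper

# Illustrative parameters only (not taken from the paper).
lower, upper = computation_cost_range(N=100, K=20, L=1000, q=2)
```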
In this paper, an approximate algorithm for optimizing a WAN for distributed computing is proposed. Distributed computing systems have become common tools in many kinds of business, science, and even entertainment. In order to minimize data processing time and utilize spare resources available on remote systems, many companies and institutions decide to build and maintain their own wide area networks (WANs) to ensure reliable and secure distributed processing of data. The design of WANs is concerned with solving different optimization problems, such as routing assignment, channel capacity selection, and resource (i.e., servers, management center) allocation. Due to the peculiar structure of wide area networks and the nature of their protocols, proper optimization methods and algorithms should be constructed for WAN-based distributed computing systems. The paper presents a model of the distributed environment built on WAN infrastructure. Then, the optimization problem for routing assignment, channel capacity assignment, and grid management center (data repository) allocation is formulated. Finally, an approximate algorithm is presented for the formulated problem. The proposed algorithm, observations, and conclusions should help improve the design of distributed computing systems.
Atomic actions are control abstractions that may be used to promote consistency in distributed computer systems, despite system failures due to node crashes or programming errors. However, atomic actions must either result in a new consistent state for the system, or be aborted when a node failure occurs, possibly resulting in the waste of a great deal of work. Distributed systems can be made more resilient to internal failures through the use of nested atomic actions, which form a tree structure, and stable checkpoints. With nested atomic actions, the results of an internal atomic action termination will survive local node failures and not be made permanent until the outermost atomic action commits. By establishing checkpoints in stable storage, each individual atomic action can be recovered in the event of node failure. Nested atomic actions can be integrated with remote call primitives and recovery blocks to enhance program structure.
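The commit rule for nested atomic actions can be sketched in a few lines. This is a single-process illustration under simplifying assumptions: a plain dictionary stands in for permanent state, there is no stable-storage checkpointing or failure handling, and the class `AtomicAction` and its methods are hypothetical names rather than the paper's primitives.

```python
class AtomicAction:
    """A node in a tree of nested atomic actions."""

    def __init__(self, store, parent=None):
        self.store = store      # permanent state (a dict in this sketch)
        self.parent = parent    # None for the outermost action
        self.writes = {}        # tentative updates made by this action
        self.state = "active"

    def begin_nested(self):
        return AtomicAction(self.store, parent=self)

    def read(self, key):
        # Look through tentative writes from this action up to the root,
        # then fall back to the permanent state.
        action = self
        while action is not None:
            if key in action.writes:
                return action.writes[key]
            action = action.parent
        return self.store.get(key)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # A nested commit only hands results to the parent. Results become
        # permanent when the outermost action commits.
        target = self.parent.writes if self.parent is not None else self.store
        target.update(self.writes)
        self.state = "committed"

    def abort(self):
        # Discard this action's tentative updates; the parent is unaffected.
        self.writes.clear()
        self.state = "aborted"
```

Aborting an inner action discards only its own tentative updates, so work already committed into the enclosing action is not wasted.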
This research paper presents an approach to clustering the prevalence of chronic conditions among children with public insurance in the United States. The data consist of prevalence estimates at the community level for 25 pediatric chronic conditions. We employ a spatial clustering algorithm to identify clusters of communities with similar chronic condition prevalences. The primary challenge is the computational effort needed to estimate the spatial clustering for all communities in the U.S. To address this challenge, we develop a distributed computing approach to spatial clustering. Overall, we found that the burden of chronic conditions in rural communities tends to be similar but with wide differences in urban communities. This finding suggests similar interventions for managing chronic conditions in rural communities but targeted interventions in urban areas.
We present a client-server application for the distributed multivariate analysis of time series using standard PCs. We here concentrate on analyses of multichannel EEG/MEG data, but our method can easily be adapted to other time series. Due to the rapid development of new analysis techniques, the focus in the design of our application was not only on computational performance, but also on high flexibility and expandability of both the client and the server programs. For this purpose, the communication between the server and the clients as well as the building of the computational tasks has been realized via the Extensible Markup Language (XML). Running our newly developed method in an asynchronous distributed environment with random availability of remote and heterogeneous resources, we tested the system's performance for a number of different univariate and bivariate analysis techniques. Results indicate that for most of the currently available analysis techniques, calculations can be performed in real time, which, in principle, allows on-line analyses at relatively low cost. (c) 2005 Elsevier B.V. All rights reserved.
We consider a distributed computing setting wherein a central entity seeks power from computational providers by offering a certain reward in return. The computational providers are classified into long-term stakeholders that invest a constant amount of power over time and players that can strategize on their computational investment. In this paper, we model and analyze a stochastic game in such a distributed computing setting, wherein players arrive and depart over time. While our model is formulated with a focus on volunteer computing, it equally applies to certain other distributed computing applications such as mining in blockchain. We prove that, in Markov perfect equilibrium, only players with cost parameters in a relatively low range which collectively satisfy a certain constraint in a given state, invest. We infer that players need not have knowledge about the system state and other players' parameters, if the total power that is being received by the central entity is communicated to the players as part of the system's protocol. If players are homogeneous and the system consists of a reasonably large number of players, we observe that the total power received by the central entity is proportional to the offered reward and does not vary significantly despite the players' arrivals and departures, thus resulting in a robust and reliable system. We then study by way of simulations and mean field approximation, how the players' utilities are influenced by their arrival and departure rates as well as the system parameters such as the reward's amount and dispensing rate. We observe that the players' expected utilities are maximized when their arrival and departure rates are such that the average number of players present in the system is typically between 1 and 2, since this leads to the system being in the condition of least competition with high probability. Further, their expected utilities increase almost linearly with the offered reward and converge to a cons