ISBN: (digital) 9781728190747
ISBN: (print) 9781728183824
In edge computing applications, data is distributed across several nodes. A failed node means losing part of the data, which may hamper edge computing. Because node failures are frequent in edge computing systems, node repair is needed. Codes that have both the combination property (CP) and the binary zigzag decodable (BZD) property are referred to as CP-BZD codes. In this paper, new constructions of CP-BZD codes that add no extra check bits are proposed for repairing a failed node in distributed storage systems. All constructed codes can be decoded by the zigzag-decoding algorithm. Numerical analysis shows that the proposed schemes achieve better repair efficiency than the original CP-BZD codes.
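As a rough illustration of what zigzag decoding means in general, the following toy sketch encodes two equal-length bit packets into two parity packets, one of which XORs in a one-bit-delayed copy of the second packet, and then recovers both packets bit by bit. It is not the CP-BZD construction from the abstract; the function names `encode` and `zigzag_decode` and the choice of a one-bit shift are assumptions made only for this example.

```python
def encode(a, b):
    """Build two parity packets from bit lists a and b (hypothetical toy code).

    p1 = a XOR b
    p2 = a XOR (b delayed by one bit), so p2 is one bit longer than a.
    """
    assert len(a) == len(b)
    p1 = [x ^ y for x, y in zip(a, b)]
    a_padded = a + [0]        # pad a so both operands have length n + 1
    b_delayed = [0] + b       # shift b right by one position
    p2 = [x ^ y for x, y in zip(a_padded, b_delayed)]
    return p1, p2


def zigzag_decode(p1, p2):
    """Recover a and b from the two parity packets, alternating between them."""
    n = len(p1)
    a, b = [], []
    for i in range(n):
        prev_b = b[i - 1] if i > 0 else 0
        a_i = p2[i] ^ prev_b   # p2[i] = a[i] XOR b[i-1], and b[i-1] is known
        b_i = p1[i] ^ a_i      # p1[i] = a[i] XOR b[i], and a[i] is now known
        a.append(a_i)
        b.append(b_i)
    return a, b


a = [1, 0, 1, 1]
b = [0, 1, 1, 0]
p1, p2 = encode(a, b)
recovered_a, recovered_b = zigzag_decode(p1, p2)
```

The first bit of `p2` exposes `a[0]` directly because the delayed copy of `b` contributes nothing there; every later bit is recovered from a bit decoded one step earlier, which gives the decoder its zigzag pattern.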
ISBN: (digital) 9781728174457
ISBN: (print) 9781728174570
In this paper we introduce SpiderWeb, a new methodology for building high-speed soft networks on FPGAs. There are many reasons why greater internal bandwidth is an increasingly important issue for FPGAs. Compute density on FPGAs is growing rapidly, from historical precisions such as single-precision floating point to the massively parallel low-precision operations required by machine learning inference. It is difficult for current FPGA fabrics, with designs developed using standard methods and tool flows, to provide a reliable way of generating wide and/or high-speed data distribution buses. In contrast, SpiderWeb uses a specific NoC generation methodology that provides predictable area and performance for these structures, with area and speed accurately known before compile time. The generated NoCs can be incorporated into large, complex designs implemented with standard design flows without compromising the routability of the system.
ISBN: (digital) 9781728174457
ISBN: (print) 9781728174570
The partitioned global address space memory model has bridged the gap between shared and distributed memory, and with this bridge comes the ability to adapt shared-memory concepts, such as non-blocking programming, to distributed systems such as supercomputers. To enable non-blocking algorithms, we present ways to perform scalable atomic operations on objects in remote memory via remote direct memory access and pointer compression. As a solution to the problem of concurrent-safe memory reclamation in a distributed system, we adapt Epoch-Based Memory Reclamation to distributed memory and implement it such that it supports global-view programming. This construct is designed and implemented for the Chapel programming language but can be adapted and generalized to work with other languages and libraries.
ISBN: (digital) 9781728169903
ISBN: (print) 9781728169910
The multi-parallel chopper system is commonly used in modern MW-level wind turbine converters. Even though the chopper is usually considered robust and reliable, the semiconductor module can reach critical junction temperatures in extreme fault ride-through (FRT) events. This paper proposes a method to monitor and identify whether an individual chopper is functioning, degraded, or failed by collecting and comparing the temperatures of the parallel semiconductors during FRT events. A simulation model is created in PLECS, and its accuracy is further validated by a dedicated experimental setup that includes an infrared camera. The proposed method is verified by PLECS simulations.
This paper considers how to offer the resources of high-performance hybrid computer systems as services for applied and fundamental research within the framework of a unified digital platform. Approaches for the parallel execution of various tasks in a distributed computing cluster are proposed. The problems of organizing the computing process when the resources of a distributed hybrid cluster are shared, and of organizing network interaction using software-defined networks, are formulated. (C) 2019 The Authors. Published by Elsevier B.V.
ISBN: (print) 9781538692912
Coded distributed computing has been considered a promising technique for making large-scale systems robust to "straggler" workers. Yet practical system models for distributed computing that reflect the clustered or grouped structure of real-world computing servers have not been available. In addition, the large variations in computing power and bandwidth across different servers have not been properly modeled. We suggest a group-based model to reflect practical conditions and develop an appropriate coding scheme for this model. The suggested code, called the group code, employs parallel encoding for each group. We show that the suggested coding scheme asymptotically achieves the optimal computing time as n, the number of workers, tends to infinity. While the theoretical analysis is conducted in the asymptotic regime, numerical results also show that the suggested scheme achieves near-optimal computing time for any finite but reasonably large n. Moreover, we demonstrate that the decoding complexity of the suggested scheme is significantly reduced by virtue of parallel decoding.
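To make the straggler idea concrete, here is a minimal sketch of generic coded matrix-vector multiplication with three workers, where the result can be recovered from any two of them. It is not the group code described in the abstract; the single-parity scheme, the helper names `matvec`, `encode_blocks`, and `decode`, and the requirement that the matrix has an even number of rows are all assumptions made for this example.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def encode_blocks(A):
    """Split A into two row blocks and add one parity block (A1 + A2).

    Assumes A has an even number of rows (illustrative restriction).
    """
    assert len(A) % 2 == 0
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    parity = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return [A1, A2, parity]


def decode(results):
    """Recover A @ x from any two worker outputs.

    `results` maps a worker index (0, 1, or 2) to that worker's output vector.
    Worker 2 holds the parity block, so its output equals y1 + y2.
    """
    assert len(results) >= 2
    if 0 in results and 1 in results:
        y1, y2 = results[0], results[1]
    elif 0 in results:
        y1 = results[0]
        y2 = [s - u for s, u in zip(results[2], y1)]
    else:
        y2 = results[1]
        y1 = [s - u for s, u in zip(results[2], y2)]
    return y1 + y2


A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, 1]
blocks = encode_blocks(A)
outputs = {i: matvec(block, x) for i, block in enumerate(blocks)}

# Pretend worker 1 is a straggler and never reports back.
without_straggler = {i: y for i, y in outputs.items() if i != 1}
recovered = decode(without_straggler)
```

The extra parity worker costs one additional block of computation, and in exchange the master no longer has to wait for the slowest of the three workers.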
ISBN: (digital) 9781728196688
ISBN: (print) 9781728196695
In speech emotion recognition, hand-crafted features are generally low-level and may not be enough to distinguish subjective emotions. In addition, speech signals are sequential in time, and each frame plays a different role. To address these problems, this paper proposes a DCNN-BiGRU self-attention model. The model combines the ability of convolutional neural networks to capture spatial characteristics, the strength of recurrent neural networks in learning time-series data, and the ability of attention mechanisms to learn feature weights, thereby improving the accuracy of speech emotion recognition. The model achieved average recognition rates of 89.53% and 91.74% on the EMO-DB and CASIA databases, respectively, and a comparison with other published work shows that it obtains better results in speech emotion recognition.
ISBN: (print) 9781538692912
Many big data algorithms executed on MapReduce-like systems have a shuffle phase that often dominates the overall job execution time. Recent work has demonstrated schemes in which the communication load in the shuffle phase can be traded off against the computation load in the map phase. In this work, we focus on a class of distributed algorithms, broadly used in deep learning, where intermediate computations of the same task can be combined. Even though prior techniques reduce the communication load significantly, they require a number of jobs that grows exponentially in the system parameters. This limitation is crucial and may diminish the load gains as the algorithm scales. We propose a new scheme that achieves the same load as the state of the art while ensuring that both the number of jobs and the number of subfiles into which the data set must be split remain small.
ISBN: (print) 9781728143286
With the recent development of distributed stream processing systems, elastic resource scaling techniques have improved significantly. Many researchers focus on implementing elastic scaling with approaches that predict the trend of the data load. However, existing prediction methods cannot accurately track and predict performance fluctuations online, and they need to exploit more dimensions of the raw data and resources to improve prediction performance. To address these issues, we propose a framework named OMOPredictor that accurately predicts operator performance online. The experimental results show that OMOPredictor improves the prediction of operator performance on three real-world datasets.
ISBN: (digital) 9781728174457
ISBN: (print) 9781728174570
The development of ever larger and more energy-efficient computer systems in recent years has led to more and more systems with heterogeneous computing units (CPUs, GPUs or FPGAs) and systems with heterogeneous storage systems (High Memory Bandwidth). With the rise of persistent memory, attached to the PCIe bus or to the memory DIMMs, the border between storage and memory is becoming more and more fluid. Other systems offer different types of compute nodes, so that a group of nodes forms the accelerator (modular supercomputing). Hierarchical storage architectures, for example those using burst buffers, try to overcome I/O problems. Programming such systems can be a real challenge, as can locality, scheduling, load balancing, concurrency and so on.