Distributed matrix computations (matrix-matrix or matrix-vector multiplications) are well recognized to suffer from the problem of stragglers (slow or failed worker nodes). Much of the prior work in this area is either (i) sub-optimal in terms of straggler resilience, or (ii) suffers from numerical problems, i.e., round-off errors blow up in the decoded result owing to the high condition numbers of the corresponding decoding matrices. Our work presents a convolutional coding approach that removes these limitations. It is optimal in terms of straggler resilience and has excellent numerical robustness as long as the workers' storage capacity is slightly higher than the fundamental lower bound. Moreover, it can be decoded using a fast peeling decoder that only involves add/subtract operations. Our second approach has marginally higher decoding complexity than the first, but allows us to operate arbitrarily close to the storage-capacity lower bound. Its numerical robustness can be theoretically quantified by deriving a computable upper bound on the worst-case condition number over all possible decoding matrices, drawing connections with the properties of large block Toeplitz matrices. All of the above claims are backed by extensive experiments on the AWS cloud platform.
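The add/subtract peeling idea can be illustrated with a deliberately simplified sketch (a single parity block, not the paper's convolutional code): split the matrix into row blocks, store one parity block on an extra worker, and recover a straggler's result with a single subtraction.

```python
import numpy as np

# Minimal sketch, not the paper's scheme: two row blocks plus one parity
# block. If one worker straggles, its result is "peeled" off the parity
# using only a subtraction -- no matrix inversion needed.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
x = rng.standard_normal(3)

A1, A2 = A[:2], A[2:]
A_parity = A1 + A2            # parity block stored on a third worker

y1 = A1 @ x                   # returned by worker 1
yp = A_parity @ x             # returned by the parity worker
# Suppose worker 2 straggles: recover its block from the parity result.
y2_recovered = yp - y1

assert np.allclose(y2_recovered, A2 @ x)
```

The decoding here is exact and involves no ill-conditioned matrices, which is the numerical-robustness advantage the abstract refers to.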
More than two decades ago, combinatorial topology was shown to be useful for analyzing distributed fault-tolerant algorithms in shared memory systems and in message passing systems. In this work, we show that combinatorial topology can also be useful for analyzing distributed algorithms in failure-free networks of arbitrary structure. To illustrate this, we analyze consensus, set-agreement, and approximate agreement in networks, and derive lower bounds for these problems under classical computational settings, such as the LOCAL model and dynamic networks. (C) 2020 The Author(s). Published by Elsevier B.V.
Graph encoding methods have proven exceptionally useful in many classification tasks, from molecule toxicity prediction to social network recommendations. However, most existing methods are designed to work in a centralized environment that requires the whole graph to be kept in memory; moreover, scaling them to very large networks remains a challenge. In this work, we propose a distributed and permutation-invariant graph embedding method, denoted distributed Graph Statistical Distance (DGSD), that extracts graph representations on independently distributed machines. DGSD measures nodes' local proximity by considering only node degrees, common neighbors, and direct connectivity, which allows it to run in a distributed environment. In addition, the linear space complexity of DGSD makes it suitable for processing large graphs. We show the scalability of DGSD on sufficiently large random and real-world networks and evaluate its performance on various bioinformatics and social networks with an implementation in a distributed computing environment. (C) 2021 Elsevier B.V. All rights reserved.
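A sketch in the spirit of DGSD (the exact formula below is invented for illustration, not the paper's) shows why the statistics are local: each pairwise entry needs only the two nodes' degrees, common-neighbor count, and direct connectivity, so it can be computed independently per machine; a histogram of the entries then gives a permutation-invariant descriptor.

```python
import numpy as np

# Hypothetical local pairwise distance: combines degree difference,
# common-neighbor count, and direct connectivity. Each entry depends
# only on the two nodes' neighbor sets, hence it distributes naturally.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
n = len(adj)

D = np.zeros((n, n))
for u in range(n):
    for v in range(n):
        if u == v:
            continue
        deg_diff = abs(len(adj[u]) - len(adj[v]))
        common = len(adj[u] & adj[v])
        connected = 1 if v in adj[u] else 0
        D[u, v] = (deg_diff + 1) / (common + connected + 1)

# A permutation-invariant graph descriptor: histogram of the entries.
embedding, _ = np.histogram(D[D > 0], bins=5)
print(embedding)
```

Relabeling the nodes permutes rows and columns of D but leaves the histogram unchanged, which is where the permutation invariance comes from.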
The Internet of Things (IoT) is one of the key enablers of personalized health. However, IoT devices often have stringent resource constraints, e.g., a limited energy budget, and therefore limited possibilities to exploit state-of-the-art Deep Neural Networks (DNNs). Energy-aware Neural Architecture Search (NAS) has been proposed to tackle this challenge by exploring lightweight DNN architectures on a single IoT device, but it does not leverage the inherently distributed nature of IoT systems. As a result, the joint optimization of DNN architectures and DNN computation partitioning/offloading has not been addressed to date. In this paper, we propose an energy-aware NAS framework for distributed IoT, aiming to search for distributed DNNs that maximize prediction performance subject to Flash memory, Random Access Memory (RAM), and energy constraints. Our framework jointly searches for a lightweight DNN architecture with optimized prediction performance and its corresponding optimal computation partitioning for offloading the partial DNN from edge to fog. We evaluate our framework on two common health applications, namely seizure detection and arrhythmia classification, and demonstrate the effectiveness of the proposed joint optimization framework compared to NAS benchmarks.
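The shape of the joint optimization can be sketched abstractly (all numbers and names below are invented): each candidate is an (architecture, partition) pair, infeasible pairs are filtered against the Flash/RAM/energy budgets, and the best accuracy among the survivors wins.

```python
# Hypothetical search-objective sketch: candidates are (architecture,
# partition) pairs scored by accuracy with resource footprints. The joint
# optimization selects the best feasible pair in one constrained search.
candidates = [
    # (accuracy, flash_kb, ram_kb, energy_mj)
    (0.91, 900, 120, 40),
    (0.93, 1400, 200, 55),   # exceeds the Flash budget below
    (0.89, 600, 80, 30),
]
FLASH_KB, RAM_KB, ENERGY_MJ = 1000, 150, 50

feasible = [c for c in candidates
            if c[1] <= FLASH_KB and c[2] <= RAM_KB and c[3] <= ENERGY_MJ]
best = max(feasible, key=lambda c: c[0])
print(best)
```

Searching architectures and partitions separately could miss pairs like the winner here, which is only feasible because of its particular partitioning; that is the argument for joint optimization.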
Sensor networks are critical for building smart environments that monitor various physical and environmental conditions. Several automated tasks involve continuous and critical monitoring that is practically infeasible for humans to perform with precision. Therefore, wireless sensor networks have emerged as the next-generation technology for bringing technological upgrades into our daily activities. Such intelligent networks, embedded with sensing expertise, are however severely energy-constrained. Sensor networks have to process and transmit large volumes of data from sensors to a sink or base station, which consumes a lot of energy. Since energy is the critical resource that drives all of a sensor network's basic functioning, it needs to be utilized efficiently to prolong network lifetime. This makes energy conservation a primary concern in sensor network design, especially at the sensor node level. Our research proposes an On-balance volume indicator-based Data Prediction (ODP) model for predicting temperature in a sensor network. The proposed model can predict temperature within a permissible error tolerance, which reduces the excessive power expended on redundant transmissions and thereby increases network lifetime. The proposed data prediction model is compared with existing benchmark time-series prediction models, namely Linear Regression (LR) and the Auto-Regressive Integrated Moving Average (ARIMA). Experimental outcomes show that our proposed prediction model outperforms the existing counterparts in terms of prediction accuracy and reduction in the number of transmissions in a clustered architecture.
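The mechanism by which prediction saves transmissions can be sketched with a toy dual-prediction loop (the naive last-value predictor below is a stand-in, not the on-balance-volume indicator): sensor and sink run the same model, and the sensor transmits only when the prediction error exceeds the tolerance.

```python
# Toy dual-prediction sketch: the sensor stays silent while its reading
# agrees with the shared predictor within the tolerance; the sink assumes
# the predicted value in the meantime. Silence is what saves radio energy.
readings = [20.0, 20.1, 20.2, 22.5, 22.6, 22.7, 22.8]
tolerance = 0.5

last_sent = readings[0]
transmissions = [readings[0]]          # first reading always sent
for value in readings[1:]:
    predicted = last_sent              # naive last-value predictor
    if abs(value - predicted) > tolerance:
        transmissions.append(value)    # sink updates its model too
        last_sent = value

print(len(transmissions), "of", len(readings), "readings transmitted")
```

Here 5 of 7 radio transmissions are suppressed while every sink-side estimate stays within the tolerance of the true reading.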
A six-month-long Atlantic hurricane season impacts Florida residents every year and can result in devastating consequences, including loss of life, property damage, and business interruptions. Hurricane risk assessment and loss prediction are critical for uses such as determining homeowner insurance premiums, regulating those premiums, conducting scenario analyses, stress-testing companies, managing disasters, and evaluating the benefits of disaster mitigation techniques. This article describes the Florida Public Hurricane Loss Model (FPHLM): a large-scale catastrophe model with massive databases and analytics tools for business and government decision-making. We discuss the design and implementation of each component of the FPHLM and explain the tools and techniques used to tackle challenges in data availability, data analytics, and the interface between the data, analytical techniques, and computing. Results validate the software system's effectiveness and reliability and illustrate some of its use cases.
Recently, wide-area distributed computing environments have become popular owing to their huge resource capacity. In such environments, joint scheduling of tasks and data is the main strategy for improving system performance. However, the geographically distributed, diverse resources exhibit high variation, making it challenging to design efficient joint scheduling of tasks and data. To accurately adapt to the dynamic variation of these resources and achieve high system performance, this study proposes a hypergraph-partitioning-based online joint scheduling method. The proposed method constructs a hypergraph of geographically distributed tasks, data, and diverse resources to clearly describe the correlations among the three elements and quantitatively reflect the time cost of different processes in the environment. The hypergraph is dynamically updated according to the generated scheduling scheme and the collected information to reflect the dynamic variation of resource states. Then, a hypergraph partition optimization mechanism is proposed to generate efficient joint scheduling schemes, thus reducing the overall completion time in the system. The experimental results indicate that, compared with state-of-the-art joint scheduling methods, the proposed method reduces the overall completion time by up to 25.67% and significantly reduces task waiting time, although it makes a concession in data migration time.
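Why a hypergraph (rather than an ordinary graph) fits joint task/data scheduling can be shown with a small sketch; the vertex names and cut metric below are illustrative, not the paper's exact model. A hyperedge ties a task to all the data it reads, so counting cut hyperedges counts tasks whose inputs end up split across sites.

```python
# Illustrative sketch: each hyperedge groups a task with every data block
# it reads. A hyperedge is "cut" when its members span multiple sites,
# i.e., the task would trigger cross-site data transfer.
hyperedges = {
    "t1": {"t1", "d1", "d2"},   # task t1 reads d1 and d2
    "t2": {"t2", "d2"},
    "t3": {"t3", "d3"},
}
# A candidate assignment of tasks and data blocks to sites.
site_of = {"t1": 0, "d1": 0, "d2": 0, "t2": 1, "d3": 1, "t3": 1}

def cut_cost(hyperedges, site_of):
    # Count hyperedges whose vertices span more than one site.
    return sum(
        1 for members in hyperedges.values()
        if len({site_of[v] for v in members}) > 1
    )

print(cut_cost(hyperedges, site_of))   # only t2 spans two sites
```

A plain graph with pairwise edges would double-count a task reading many blocks; the hypergraph cut counts each task's cross-site penalty once, which is what a partition optimizer should minimize.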
In this paper, we address the problem of supporting stateful workflows following a Function-as-a-Service (FaaS) model in edge networks. In particular, we focus on the problem of data transfer, which can be a performance bottleneck due to the limited speed of communication links in some edge scenarios, and we propose three different schemes: a pure FaaS implementation; StateProp, i.e., propagation of the application state throughout the entire chain of functions; and StateLocal, i.e., a solution where the state is kept local to the workers that run the functions and retrieved only as needed. We then extend the proposed schemes to the more general case of applications modeled as Directed Acyclic Graphs (DAGs), which cover a broad range of practical applications, e.g., in the Internet of Things (IoT) area. Our contribution is validated via a prototype implementation. Experiments in emulated conditions show that applying the data locality principle significantly reduces the volume of network traffic required and improves end-to-end delay, especially with local caching on edge nodes and low link speeds. (c) 2022 Elsevier B.V. All rights reserved.
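A back-of-the-envelope traffic model (illustrative numbers, not the paper's measurements) makes the StateProp/StateLocal trade-off concrete: StateProp ships the whole state on every hop of the chain, while StateLocal ships only the small invocation message and pays for state movement at most once.

```python
# Toy byte-count model for a chain of functions. Assumed parameters:
# every function reads/updates the same application state, and under
# StateLocal a single remote state fetch suffices (best case with caching).
state_size = 1000      # bytes of application state
msg_size = 50          # bytes per invocation message
hops = 4               # chain of 5 functions -> 4 hops

state_prop_bytes = hops * (msg_size + state_size)
state_local_bytes = hops * msg_size + state_size

print(state_prop_bytes, "vs", state_local_bytes)
```

The gap grows linearly with chain length and state size, which matches the intuition that data locality matters most on slow edge links.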
The enumeration of hop-constrained simple paths is a building block in many graph-based areas. Due to the enormous search spaces in large-scale graphs, a single machine can hardly satisfy the requirements of both efficiency and memory, creating an urgent need for efficient distributed methods. In practice, directly extending centralized methods to a distributed environment inevitably produces a large number of intermediate results, causing a memory crisis and degrading query performance. The state-of-the-art distributed method, HybridEnum, designed a hybrid search paradigm to enumerate simple paths. However, it performs massive exploration of redundant vertices that do not lie on any simple path, resulting in poor query performance. To alleviate this problem, we design a distributed approach, DistriEnum, to optimize query performance and scalability with well-bounded memory consumption. First, DistriEnum adopts a graph reduction strategy to rule out redundant vertices that cannot satisfy the hop-number constraint. Then, a core search paradigm is designed to simultaneously reduce the traversal of shared subpaths and the storage of intermediate results. Moreover, DistriEnum is equipped with a task division strategy that theoretically achieves workload balance. Finally, a vertex migration strategy is devised to reduce communication cost during enumeration. Comprehensive experimental results on 10 real-world graphs demonstrate that DistriEnum achieves up to 3 orders of magnitude speedup over HybridEnum in query performance and exhibits superior scalability, communication cost, and memory consumption.
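The graph-reduction idea has a simple single-machine analogue (sketched below; DistriEnum distributes the search, and its exact pruning rule may differ): precompute each vertex's shortest distance to the target, then during DFS skip any vertex whose remaining distance exceeds the leftover hop budget.

```python
from collections import deque

# Sketch: enumerate simple paths from src to dst with at most k hops,
# pruning vertices that provably cannot reach dst within the budget.
graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}

def bfs_dist_to(target, graph):
    # Shortest hop distance from every vertex TO the target (reverse BFS).
    rev = {u: [] for u in graph}
    for u, nbrs in graph.items():
        for v in nbrs:
            rev[v].append(u)
    dist, queue = {target: 0}, deque([target])
    while queue:
        v = queue.popleft()
        for u in rev[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def enumerate_paths(src, dst, k, graph):
    dist = bfs_dist_to(dst, graph)
    paths = []
    def dfs(v, path):
        if v == dst:
            paths.append(list(path))
            return
        for w in graph[v]:
            budget = k - len(path)     # hops left after moving to w
            if w not in path and dist.get(w, k + 1) <= budget:
                path.append(w)
                dfs(w, path)
                path.pop()
    dfs(src, [src])
    return paths

print(enumerate_paths(0, 4, 3, graph))
```

Vertices that survive this filter are exactly those that can still lie on some path within the hop constraint, which is what makes the reduction safe (no valid path is lost).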
Speech recognition refers to the capability of software or hardware to receive a speech signal, identify the speaker's features in the speech signal, and thereafter recognize the speaker. In general, the speech recognition process involves three main steps: acoustic processing, feature extraction, and classification/recognition. The purpose of feature extraction is to represent a speech signal using a predetermined number of signal components, because the full information in the acoustic signal is excessively cumbersome to handle and some of it is irrelevant to the identification task. This study proposes a machine learning-based approach that extracts feature parameters from speech signals to improve the performance of speech recognition applications in real-time smart city environments. Moreover, the principle of mapping blocks of main memory to the cache is exploited to reduce computing time; the cache block size is a parameter that strongly affects cache performance. Implementing such processes in real-time systems requires high computation speed, so processing speed plays an important role and calls for modern technologies and fast algorithms that accelerate the extraction of feature parameters from speech signals. Problems with accelerating the digital processing of speech signals have yet to be completely resolved. The experimental results demonstrate that the proposed method successfully extracts signal features and achieves seamless classification performance compared to other conventional speech recognition algorithms.
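The data-reduction role of feature extraction can be sketched in a few lines (a generic frame-energy feature, not the paper's method): the signal is sliced into overlapping frames and each frame is reduced to one scalar, shrinking what the classifier must handle by orders of magnitude.

```python
import numpy as np

# Minimal frame-based feature extraction sketch: overlapping frames,
# one log-energy value per frame. Frame/hop sizes are illustrative.
rate = 8000
t = np.arange(rate) / rate
signal = np.sin(2 * np.pi * 440 * t)          # 1 s synthetic tone

frame_len, hop = 256, 128
frames = [signal[i:i + frame_len]
          for i in range(0, len(signal) - frame_len + 1, hop)]
features = np.array([np.log(np.sum(f ** 2) + 1e-12) for f in frames])

print(len(signal), "samples ->", len(features), "features")
```

The contiguous per-frame slices also access memory sequentially, which is the kind of cache-friendly layout the block-mapping discussion above is concerned with.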