Pattern matching in big graphs is important for many modern applications. Recently, this problem has been defined in terms of multiple extensions of graph simulation, to reduce complexity and capture more meaningful results. These results were achieved by relaxing the constraints commonly used in subgraph isomorphism pattern matching. Nevertheless, these graph simulation variants are still too strict to provide results in many cases, especially when the analyzed graphs contain anomalies and incomplete information. To deal with this issue, we introduce a new graph pattern matching (GPM) method, called partial simulation, capable of retrieving matches despite missing parts of the pattern graph, such as vertices and/or edges. Furthermore, considering the number and inequality of the outputs, we define a relevance function to compute a value expressing how well each matched vertex respects the pattern graph. Similarly, we define partial dual simulation GPM, which returns vertices that satisfy a part of the dual simulation constraints and assigns a relevance value to them. Additionally, we provide distributed scalable algorithms to evaluate the proposed partial simulation methods based on the distributed vertex-centric programming paradigm. Finally, our experiments on real-world data graphs demonstrate the effectiveness of the proposed models and the efficiency of their associated algorithms.
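To make the baseline concrete, the following is a minimal sketch of plain graph simulation, the model that the abstract says is still too strict. It is not the partial simulation method of the paper, whose definition and relevance function are not given here. The function name `graph_simulation` and the edge-list and label-dictionary input format are assumptions made for illustration.

```python
def graph_simulation(pattern_edges, pattern_labels, data_edges, data_labels):
    """Plain graph simulation (illustrative baseline, not the paper's method).

    pattern_labels / data_labels: dict mapping vertex -> label.
    pattern_edges / data_edges: iterable of directed (source, target) pairs.
    Returns a dict mapping each pattern vertex to its set of matching data vertices.
    """
    # Start with every data vertex that carries the same label.
    sim = {
        u: {v for v, lab in data_labels.items() if lab == pattern_labels[u]}
        for u in pattern_labels
    }
    # Successor sets of the data graph.
    succ = {}
    for a, b in data_edges:
        succ.setdefault(a, set()).add(b)

    # Refine until a fixpoint: a candidate v of u survives only if, for every
    # pattern edge (u, u2), v has a successor that is a candidate of u2.
    pattern_edges = list(pattern_edges)
    changed = True
    while changed:
        changed = False
        for u, u2 in pattern_edges:
            keep = {v for v in sim[u] if succ.get(v, set()) & sim[u2]}
            if keep != sim[u]:
                sim[u] = keep
                changed = True

    # If any pattern vertex has no match, the whole pattern has no match.
    if any(not matches for matches in sim.values()):
        return {u: set() for u in sim}
    return sim


pattern_labels = {"a": "A", "b": "B"}
pattern_edges = [("a", "b")]
data_labels = {1: "A", 2: "B", 3: "A"}
data_edges = [(1, 2)]  # vertex 3 has lost its edge to a B-labelled vertex
result = graph_simulation(pattern_edges, pattern_labels, data_edges, data_labels)
```

In this toy data graph, vertex 3 is discarded outright because one edge is missing, which is the kind of all-or-nothing behaviour that a partial model with relevance values is meant to soften.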
Increasingly, edge devices embed artificial neural networks. In this context, a trend is to simultaneously decentralize their training as much as possible while shrinking their resource requirements, both for inference and for training, tasks that are typically intensive in terms of data, memory, and computation. At the edge's extremity, a specific challenge arises concerning the inclusion of microcontroller-based devices typically deployed in the IoT. So far, no general framework has been provided for that. Such devices not only have extremely challenging resource constraints (weak CPUs, slow network connections, memory budgets measured in kilobytes) but also exhibit high polymorphism, leading to large variability in computational performance among these devices. In this paper, we design and implement TDMiL, a versatile framework for distributed training and transfer learning. TDMiL interconnects and combines logical components including CoAPerator (a central aggregator) and various tiny embedded software runtimes that are specifically tailored for networks comprising heterogeneous, resource-constrained devices built on diverse types of microcontrollers. We report on experiments conducted with the TDMiL framework, which we use to comparatively evaluate several schemes devised to address computational variability among distributed learning microcontroller-based devices, i.e., stragglers. Additionally, we release the code of our implementation of TDMiL as an open-source project, which is compatible with common commercial off-the-shelf IoT hardware and a well-known open-access IoT testbed.
We present Modular Polynomial (MP) Codes for Secure Distributed Matrix Multiplication (SDMM). The construction is based on the observation that one can decode certain proper subsets of the coefficients of a polynomial with fewer evaluations than is necessary to interpolate the entire polynomial. We also present Generalized Gap Additive Secure Polynomial (GGASP) codes. Both MP and GGASP codes are shown experimentally to perform favorably in terms of recovery threshold when compared to other polynomial codes for SDMM that use the grid partition. Both MP and GGASP codes achieve the recovery threshold of Entangled Polynomial Codes for robustness against stragglers, but MP codes can decode below this recovery threshold depending on the set of worker nodes that fail. The decoding complexity of MP codes is shown to be lower than that of other approaches in the literature, because the user is not tasked with interpolating an entire polynomial.
Cybersecurity at the edge requires fast computing in energy-constrained environments. Decision trees can provide an explainable solution for network intrusion detection with high detection accuracy at the packet level. However, their hardware implementation needs to support efficient real-time operation. In this paper, we propose a spatially distributed decision tree for network intrusion detection, using memristor-based chiplet leaves. Each chiplet processes an input by comparing it to a predefined boundary stored in the memristor cell and provides a binary output to select one of the interconnected leaves on the lower level, with an estimated power consumption of 389 $\mu$W in a 130 nm node design. The delay is 2.5 $\mu$s for one inference decision. This chiplet approach is reconfigurable and in line with the natural architecture of decision trees. It also supports prototyping with known good dies, overcoming the non-idealities challenge prevalent in memristor technologies. Our memristor-based decision trees show high intrusion detection accuracy of 82%, 84%, and 73% on the benchmark UNSW, CIC-IDS, and ACI-IoT datasets respectively, considering 6-bit device precision in one memristor vs. three memristors per boundary configurations. This distributed approach opens the way to utilizing memristor technology despite device defects for applications in need of local real-time computing.
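The comparison step described in this abstract can be sketched in software as a tree whose nodes each compare one quantized input against a stored boundary and emit a binary choice. This is only a behavioural illustration, assuming inputs normalized to [0, 1] and a uniform 6-bit quantizer. The names `quantize`, `Node`, and `classify`, the feature layout, and the example boundaries are hypothetical and say nothing about the actual hardware design.

```python
from dataclasses import dataclass
from typing import Optional


def quantize(value, bits=6, lo=0.0, hi=1.0):
    """Map a value in [lo, hi] to an integer level (assumed uniform quantizer)."""
    levels = (1 << bits) - 1  # 63 levels for 6 bits
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


@dataclass
class Node:
    """One comparison unit, or a terminal node when `label` is set."""
    feature: int = 0                 # index of the input feature to compare
    boundary: int = 0                # stored boundary as a quantized level
    left: Optional["Node"] = None    # taken when input <= boundary
    right: Optional["Node"] = None   # taken when input > boundary
    label: Optional[str] = None      # decision at a terminal node


def classify(node, features):
    """Walk the tree: each node yields one binary decision selecting a child."""
    while node.label is None:
        level = quantize(features[node.feature])
        node = node.right if level > node.boundary else node.left
    return node.label


# Hypothetical two-level tree over two normalized packet features.
tree = Node(
    feature=0, boundary=32,
    left=Node(label="benign"),
    right=Node(feature=1, boundary=10,
               left=Node(label="benign"),
               right=Node(label="attack")),
)
```

Each call to `classify` performs one comparison per tree level, which mirrors how a binary output at one level selects the unit to consult at the next.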
Efficient key-value (KV) stores are important for concurrent and distributed systems to deliver high performance. Promising learned indexes leverage deep-learning models to complement existing KV stores and obtain significant performance improvements. However, existing schemes show limited scalability in concurrent systems because of high dependency among data. Practical system performance decreases when a large amount of new data is inserted, because insertion triggers frequent and inefficient retraining operations. Moreover, existing learned indexes become inefficient in distributed systems, since different machines incur high overheads to guarantee data consistency when the index structures change dynamically. To address these problems in concurrent and distributed systems, we propose a fine-grained learned index scheme with high scalability, called FineStore, which constructs independent models with a flattened data structure under the trained data array to concurrently process requests with low overheads. FineStore processes new requests in place with the support of non-blocking retraining, hence adapting to new distributions without blocking the system. In distributed systems, different machines efficiently leverage the extended RCU barrier to guarantee data consistency. We evaluate FineStore via YCSB and real-world datasets, and extensive experimental results demonstrate that FineStore improves performance by up to 1.8x and 2.5x over the state-of-the-art XIndex and Masstree, respectively. We have released the open-source code of FineStore for public use on GitHub.
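For readers unfamiliar with learned indexes, the general idea can be sketched as a model that predicts a key's position in a sorted array, followed by a bounded local search that corrects the prediction. The sketch below uses a single linear model for simplicity and is not FineStore, XIndex, or Masstree. The class name `LinearLearnedIndex` is hypothetical, and the code assumes a non-empty, static set of distinct numeric keys, so it does not cover the insertion and retraining problems the abstract addresses.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index: one linear model plus an error-bounded search."""

    def __init__(self, keys):
        self.keys = sorted(keys)  # assumed non-empty and distinct
        n = len(self.keys)
        lo, hi = self.keys[0], self.keys[-1]
        # "Training" here is just fitting a line through the two end points.
        self.slope = (n - 1) / (hi - lo) if hi != lo else 0.0
        self.intercept = -self.slope * lo
        # Record the worst prediction error so lookups can bound their search.
        self.max_err = max(
            abs(self._predict(k) - i) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        pos = int(self.slope * key + self.intercept)
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of `key` in the sorted array, or None."""
        guess = self._predict(key)
        lo = max(guess - self.max_err, 0)
        hi = min(guess + self.max_err + 1, len(self.keys))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < hi and self.keys[i] == key else None


index = LinearLearnedIndex([2, 3, 5, 8, 13, 21, 34])
position = index.lookup(13)
```

Inserting a key into such a structure shifts positions and can invalidate both the model and `max_err`, which hints at why retraining cost matters for schemes of this kind.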
We introduce pattern models, a dynamic epistemic logic for analyzing distributed systems. First, we present a version of pattern models where the full-information protocol, widely studied in distributed computability, is static in the product definition of pattern models. Next, we parametrize this logic so as to add the capability to model the dynamics of arbitrary deterministic protocols. We thus give a systematic construction of pattern models for a large variety of distributed-computing models called dynamic-network models. Using pattern models, the epistemic dynamics of a proper subclass of dynamic-network models called oblivious can be described using a static pattern model, hence using constant space. For this case, we present a sufficient unsolvability condition for the consensus task that can be easily verified by analyzing the structure of the initial epistemic model and the pattern model for a given oblivious dynamic-network model.
In response to the growing demand for handling unlabeled data in visual tasks, this paper introduces a novel federated self-supervised learning model (FedSSL), which employs a federated learning framework to conduct distributed model training across multiple decentralized datasets. This model effectively harnesses the unlabeled data that is discretely distributed among various terminals, thereby collaboratively training high-performance models. Moreover, comparative experiments conducted under standard experimental parameters and on general datasets demonstrate the model's efficacy. FedSSL not only reduces the computational complexity of the model but also enhances classification accuracy.
Fast detection of power line outages is critical for maintaining the stable operation of the power system. The aim of this article is to address the real-time detection problem of multiple line outages in power systems. To effectively tackle the high computational complexity associated with traditional approaches, we propose a multiple line outage detection algorithm that utilizes a distributed finite-time observer. The proposed method uses only local measurements and information from neighboring buses to update the local observer for each bus. The proposed observer is mathematically proven to converge in finite time, ensuring rapid detection of multiple line outages. Finally, simulation results demonstrate the effectiveness and rapidity of the proposed detection algorithm.
Federated learning (FL) offers a promising solution for effectively leveraging the data scattered across distributed cloud systems. Despite its potential, the huge communication overhead greatly burdens such systems. Federated distillation (FD) is a novel distributed learning technique with low communication cost, in which the clients communicate only the model logits rather than the model parameters. However, FD faces challenges related to data heterogeneity and security. Additionally, the conventional aggregation method in FD is vulnerable to malicious uploads. In this article, we discuss the limitations of FL and the challenges of FD in the context of distributed cloud systems. To address these issues, we propose a blockchain-based framework to achieve secure and robust FD. Specifically, we develop a pre-training data preparation method to reduce data distribution heterogeneity and an aggregation method to enhance the robustness of the aggregation process. Moreover, a committee/workers selection strategy is devised to optimize the task allocation among clients. Experiments are conducted to evaluate the effectiveness of the proposed framework.
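The vulnerability mentioned in this abstract can be illustrated with a small sketch in which clients upload per-class logits and a server combines them. Plain averaging stands in for the conventional aggregation, and a coordinate-wise median is shown as one generic robust alternative. The median is an assumption chosen for illustration and is not the aggregation method proposed in the article. The function name `aggregate_logits` and the example values are hypothetical.

```python
from statistics import mean, median


def aggregate_logits(client_logits, robust=False):
    """Combine per-class logits uploaded by clients.

    client_logits: list of equal-length lists, one list of logits per client.
    robust=False: coordinate-wise mean (conventional averaging).
    robust=True: coordinate-wise median (generic robust statistic, assumed here).
    """
    combine = median if robust else mean
    return [combine(column) for column in zip(*client_logits)]


# Three honest clients agree that class 0 is more likely than class 1.
honest = [[2.0, 0.0], [2.2, 0.2], [1.8, -0.2]]
# One malicious client uploads extreme logits favouring class 1.
uploads = honest + [[-100.0, 100.0]]

averaged = aggregate_logits(uploads)              # dominated by the outlier
medianed = aggregate_logits(uploads, robust=True)  # stays close to the honest values
```

With these example values, a single extreme upload flips the class preferred by the averaged logits, while the median-based result still prefers class 0.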
Authors: Wei, Wei; Li, Haoyi; Zhang, Qinghui
Henan Univ Technol, Key Lab Grain Informat Proc & Control, Minist Educ, Zhengzhou 450001, Peoples R China
Henan Univ Technol, Henan Key Lab Grain Storage Informat Intelligent P, Zhengzhou 450001, Peoples R China
Henan Univ Technol, Henan Grain Big Data Anal & Applicat Engn Res Ctr, Zhengzhou 450001, Peoples R China
With the development of cloud technology, hierarchical and distributed cross-cloud architecture, as used for example in edge (or fog) computing, is gradually replacing traditional centralized architecture. Due to the fluctuation of resource requirements, if a node does not have sufficient resources to process requests, nodes at the same or a higher level can share their resources by accepting offloaded or redirected requests, at the possible cost of reduced service quality. However, it is difficult to effectively optimize the sharing effect based on mean requirements. We formulate the multilevel problem with horizontal and vertical resource sharing using stochastic models, identify the optimal structures with embedded subproblems, and obtain an approximate solution efficiently through dynamic programming. In problem settings covering a wide range of parameters, the proposed algorithm outperforms existing mean-based and heuristic algorithms in all scenarios, improving the total satisfied requirements by up to 26%, and can be hundreds of times faster than these heuristic algorithms.