This article considers a set of sensors, which, as a group, are tasked with taking measurements of the environment and sending a small subset of the measurements to a centralized data fusion center, where the measurements will be used to estimate the overall state of the environment. The sensors' goal is to send the most informative set of measurements so that the estimate is as accurate as possible. This problem is formulated as a submodular maximization problem, for which there exists a well-studied greedy algorithm, where each sensor sequentially chooses a set of measurements from its own local set and communicates its decision to the future sensors in the sequence. In this work, sensors can additionally share measurements with one another, in order to augment the decision set of each sensor. We explore how this increase in communication can be exploited to improve the results of the nominal greedy algorithm. Specifically, we show that this measurement passing can improve the quality of the resulting measurement set by up to a factor of n+1, where n is the number of sensors.
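The nominal sequential greedy algorithm described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the coverage objective, the one-measurement-per-sensor budget, and the names `coverage` and `sequential_greedy` are assumptions chosen for illustration, and measurement passing between sensors is not modeled.

```python
from typing import FrozenSet, List, Set

Measurement = FrozenSet[int]  # hypothetical: the set of environment cells a measurement observes


def coverage(selected: List[Measurement]) -> int:
    """Assumed submodular objective: number of distinct cells covered."""
    covered: Set[int] = set()
    for m in selected:
        covered |= m
    return len(covered)


def sequential_greedy(local_sets: List[List[Measurement]]) -> List[Measurement]:
    """Each sensor, in order, picks the local measurement with the largest
    marginal gain given the choices of all previous sensors."""
    chosen: List[Measurement] = []
    for options in local_sets:
        base = coverage(chosen)
        best = max(options, key=lambda m: coverage(chosen + [m]) - base)
        chosen.append(best)
    return chosen


# Two sensors, each with two candidate measurements.
sensors = [
    [frozenset({1, 2}), frozenset({3})],
    [frozenset({1, 2}), frozenset({4})],
]
picked = sequential_greedy(sensors)
```

In this toy instance the second sensor avoids repeating `{1, 2}` because its marginal gain is zero once the first sensor has already selected it.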
Volume data these days is usually massive in terms of its topology, multiple fields, or temporal component. With the gap between compute and memory performance widening, the memory subsystem becomes the primary bottleneck for scientific volume visualization. Simple, structured, regular representations are often infeasible because the buses and interconnects involved need to accommodate the data required for interactive rendering. In this state-of-the-art report, we review works focusing on large-scale volume rendering beyond those typical structured and regular grid representations. We focus primarily on hierarchical and adaptive mesh refinement representations, unstructured meshes, and compressed representations that gained recent popularity. We review works that approach this kind of data using strategies such as out-of-core rendering, massive parallelism, and other strategies to cope with the sheer size of the ever-increasing volume of data produced by today's supercomputers and acquisition devices. We emphasize the data management side of large-scale volume rendering systems and also include a review of tools that support the various volume data types discussed.
Traditional digital mining algorithms face difficulties in the face of large-scale charging demand and limited charging pile resources, so new methods are needed to optimize the arrangement of charging posts. Aiming at...
ISBN: 9781479956678 (print)
k-truss, a type of cohesive subgraph of a network, is an important measure for social network graphs. However, with the emergence of large online social networks, the running time of traditional batch algorithms for k-truss decomposition is usually prohibitively long on graphs with billions of edges and millions of vertices. Moreover, such a graph becomes too large to load into the main memory of a single machine. Cloud computing has become an essential way to process big data. Thus, our aim is to design a scalable algorithm for k-truss decomposition in a cloud computing setting. In this paper, we first improve the existing distributed k-truss decomposition in the MapReduce framework. We then propose a theoretical basis for k-truss and use it to design an algorithm based on graph-parallel abstractions. Our experimental results show that our method in the graph-parallel abstraction significantly outperforms the methods based on MapReduce in terms of running time and disk usage.
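For readers unfamiliar with the notion, a k-truss is the largest subgraph in which every edge lies in at least k-2 triangles. The following is a minimal single-machine sketch of extracting it by repeated edge peeling. It is not the MapReduce or graph-parallel algorithm of the paper; the function name `k_truss` and the assumption that vertices are comparable values (such as integers) are illustrative choices.

```python
from typing import Dict, Iterable, Set, Tuple


def k_truss(edges: Iterable[Tuple[int, int]], k: int) -> Set[Tuple[int, int]]:
    """Return the edges of the k-truss as (u, v) pairs with u < v."""
    adj: Dict[int, Set[int]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    changed = True
    while changed:  # repeat until no edge violates the support condition
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Support of (u, v) = number of common neighbours = triangles through the edge.
                if u < v and len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return {(u, v) for u in adj for v in adj[u] if u < v}


# A 4-clique on vertices 0..3 with one pendant edge (3, 4).
graph = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
truss4 = k_truss(graph, 4)
```

The pendant edge belongs to no triangle, so it is peeled away for any k of 3 or more, while the clique edges each lie in two triangles and survive up to k = 4.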
Radar systems are used in safety-critical applications in vehicles, so it is necessary to ensure that they function reliably and can be trusted. System-on-chip (SoC) radars, which are commonly used nowadays, are inherently vulnerable to data manipulation and to attacks aimed at extracting the intellectual property (IP) of the system. This article outlines the vulnerabilities of SoC radars and proposes distributed signal processing to improve the resilience of the system. The trustworthiness of the system is improved by partitioning the signal processing into smaller modules. We propose to implement these modules on separate processors, so that the system is made up of multiple application-specific integrated circuits (ASICs). Furthermore, a sparse antenna topology is proposed to limit the information stored in these modules. As a result, it is difficult to execute a successful attack or to gain any knowledge of the targets or the system design from the compromised data in one ASIC. This article introduces the generic structure for partitioning the signal processing steps involved in target detection and the sparse array topology used by the 77-GHz radar. A method for estimating the azimuth and elevation angles for the considered sparse array is also introduced.
Personalized federated learning (PFL) was proposed to address the performance degradation of federated learning (FL) under heterogeneous data distributions. It is designed to produce a dedicated model for each client. However, existing PFL solutions focus only on the performance of the personalized models and ignore the performance of the global model, which affects the willingness of new clients to participate. To solve this problem, this paper proposes a new PFL solution, a two-stage PFL based on sparse pretraining, which can not only train a sparse personalized model for each client but also obtain a sparse global model. The whole training process is divided into sparse pretraining and sparse personalized training, which focus on the performance of the global model and the personalized models, respectively. We also propose a mask sparse aggregation technique to maintain the sparsity of the global model in the sparse personalized training stage. Experimental results show that, compared with existing algorithms, our proposed algorithm improves the accuracy of the global model while maintaining advanced personalized model accuracy, and has higher communication efficiency. To address the current problem of sparse personalized federated learning, where the global model is dense and performs poorly, a new solution is proposed in the authors' work. The two-stage training approach allows a focus on training the global model in the early stage and the personalized models in the late stage. In addition, a sparse mask aggregation technique is proposed to guarantee the sparsity of the global model.
We propose a decentralized optimization algorithm, termed EFPSN, that preserves the privacy of agents' cost functions without sacrificing accuracy. The algorithm adopts the Paillier cryptosystem to construct zero-sum functional perturbations. Then, based on the perturbed cost functions, any existing decentralized optimization algorithm can be used to obtain an accurate solution. We theoretically prove that EFPSN is (epsilon, delta)-differentially private and can achieve infinitesimally small epsilon and delta under deliberate parameter settings. Numerical experiments further confirm the effectiveness of the algorithm.
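The reason zero-sum perturbations do not sacrifice accuracy can be seen in a small sketch. It is not the EFPSN algorithm: encryption is omitted entirely, and the quadratic costs f_i(x) = (a_i / 2)(x - b_i)^2, the linear perturbations p_i * x, and the names `zero_sum_perturbations` and `minimizer` are all assumptions made for illustration.

```python
import random
from fractions import Fraction
from typing import List


def zero_sum_perturbations(n: int, seed: int = 0) -> List[int]:
    """Each pair (i, j) shares a random value r; agent i adds it and agent j
    subtracts it, so the perturbations cancel in the sum over all agents."""
    rng = random.Random(seed)
    p = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            r = rng.randint(-100, 100)
            p[i] += r
            p[j] -= r
    return p


def minimizer(a: List[int], b: List[int], p: List[int]) -> Fraction:
    """Minimizer of sum_i [(a_i / 2)(x - b_i)^2 + p_i * x].
    Stationarity gives sum_i a_i (x - b_i) + sum_i p_i = 0."""
    return Fraction(sum(ai * bi for ai, bi in zip(a, b)) - sum(p), sum(a))


a, b = [1, 2, 3], [4, 0, 2]          # hypothetical private cost parameters
p = zero_sum_perturbations(len(a))   # each agent only reveals its perturbed cost
x_perturbed = minimizer(a, b, p)
x_true = minimizer(a, b, [0, 0, 0])
```

Because the perturbations sum to zero, the aggregate objective is unchanged and the two minimizers coincide, even though each individual perturbed cost differs from the original.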
We study a multi-agent reinforcement learning (MARL) problem where the agents interact over a given network. The goal of the agents is to cooperatively maximize the average of their entropy-regularized long-term rewards. To overcome the curse of dimensionality and to reduce communication, we propose a Localized Policy Iteration (LPI) algorithm that provably learns a near-globally-optimal policy using only local information. In particular, we show that, despite restricting each agent's attention to only its kappa-hop neighborhood, the agents are able to learn a policy with an optimality gap that decays polynomially in kappa. In addition, we show the finite-sample convergence of LPI to the global optimal policy, which explicitly captures the trade-off between optimality and computational complexity in choosing kappa. Numerical simulations demonstrate the effectiveness of LPI.
ISBN: 9783031744976; 9783031744983 (print)
Exploration of different network topologies is one of the fundamental problems of distributed systems. The problem has been studied on networks such as lines, rings, tori, and rectangular grids. In this work, we consider a rectangle-enclosed triangular grid (RETG). An RETG is the part of an infinite triangular grid enclosed by a rectangle, one pair of whose parallel sides aligns with a family of parallel straight lines of the infinite triangular grid. We study the problem of perpetual exploration of an RETG using oblivious robots. The robots have limited visibility, i.e., they are myopic, which is more practical than infinite visibility for a very large network. The robots have neither chirality nor any axis agreement. We provide an algorithm that explores the RETG perpetually without any collision. The algorithm works under a synchronous scheduler and requires three robots with two-hop visibility.
ISBN: 9781665415576 (print)
Graph coloring is often used in parallelizing scientific computations that run in distributed and multi-GPU environments; it identifies sets of independent data that can be updated in parallel. Many algorithms exist for graph coloring on a single GPU or in distributed memory, but hybrid MPI+GPU algorithms have been unexplored until this work, to the best of our knowledge. We present several MPI+GPU coloring approaches that use implementations of the distributed coloring algorithms of Gebremedhin et al. and the shared-memory algorithms of Deveci et al. The on-node parallel coloring uses implementations in KokkosKernels, which provide parallelization for both multicore CPUs and GPUs. We further extend our approaches to solve for distance-2 coloring, giving the first known distributed and multi-GPU algorithm for this problem. In addition, we propose novel methods to reduce communication in distributed graph coloring. Our experiments show that our approaches operate efficiently on inputs too large to fit on a single GPU and scale up to graphs with 76.7 billion edges running on 128 GPUs.
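How a coloring yields sets of independent data can be shown with a minimal serial sketch of first-fit greedy distance-1 coloring. It is not one of the MPI+GPU approaches of the paper and does not use KokkosKernels; the function names and the adjacency-dictionary input format are assumptions for illustration.

```python
from typing import Dict, List, Set


def greedy_coloring(adj: Dict[int, Set[int]]) -> Dict[int, int]:
    """Visit vertices in sorted order and give each the smallest color
    not already used by one of its colored neighbours."""
    color: Dict[int, int] = {}
    for v in sorted(adj):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def color_classes(color: Dict[int, int]) -> Dict[int, List[int]]:
    """Group vertices by color; each group has no internal edges, so its
    members can be updated in parallel."""
    classes: Dict[int, List[int]] = {}
    for v, c in sorted(color.items()):
        classes.setdefault(c, []).append(v)
    return classes


# A 4-cycle: 0 - 1 - 2 - 3 - 0.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
coloring = greedy_coloring(cycle)
classes = color_classes(coloring)
```

On the 4-cycle the sketch uses two colors, and the resulting classes `[0, 2]` and `[1, 3]` are the two batches of vertices that could be processed concurrently.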