With changes in the energy structure and the continuous development of the power system, new loads such as distributed energy and energy storage equipment are gradually being connected to the distribution network. These devices have...
ISBN: (print) 9783031656262; 9783031656279
Satisfiability Modulo Theories (SMT) solving over arithmetic theories has significant applications in many important domains. Previous efforts have been mainly devoted to improving the techniques and heuristics in sequential SMT solvers. With the development of computing resources, a promising direction to boost performance is parallel and even distributed SMT solving. We explore this potential in a divide-and-conquer view and propose a novel dynamic parallel framework with variable-level partitioning. To the best of our knowledge, this is the first attempt to perform variable-level partitioning for arithmetic theories. Moreover, we enhance the interval constraint propagation algorithm, coordinate it with Boolean propagation, and integrate it into our variable-level partitioning strategy. Our partitioning algorithm effectively capitalizes on propagation information, enabling efficient formula simplification and search space pruning. We apply our method to three state-of-the-art SMT solvers, namely CVC5, OpenSMT2, and Z3, resulting in efficient parallel SMT solvers. Experiments are carried out on benchmarks of linear and nonlinear arithmetic over both real and integer variables, and our variable-level partitioning method shows substantial improvements over previous partitioning strategies and is particularly good at nonlinear theories.
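The idea of combining interval constraint propagation with variable-level partitioning can be illustrated with a toy sketch. This is not the authors' algorithm and uses no SMT solver API; the names `Interval`, `propagate_sum_le`, `split`, and `partition_on_x` are hypothetical, and the only constraint handled is the assumed form x + y <= c over the reals. It shows how tightening bounds first, then splitting one variable's domain, yields smaller independent subproblems that could be handed to separate workers.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] over the reals; empty when lo > hi."""
    lo: float
    hi: float

    def is_empty(self) -> bool:
        return self.lo > self.hi


def propagate_sum_le(x: Interval, y: Interval, c: float) -> Tuple[Interval, Interval]:
    """Tighten upper bounds under the (assumed) constraint x + y <= c.

    From x + y <= c it follows that x <= c - y.lo and y <= c - x.lo.
    """
    new_x = Interval(x.lo, min(x.hi, c - y.lo))
    new_y = Interval(y.lo, min(y.hi, c - x.lo))
    return new_x, new_y


def split(domain: Interval) -> Tuple[Interval, Interval]:
    """Split a domain at its midpoint; the halves share the midpoint,
    so together they still cover the whole original domain."""
    mid = (domain.lo + domain.hi) / 2
    return Interval(domain.lo, mid), Interval(mid, domain.hi)


def partition_on_x(x: Interval, y: Interval, c: float) -> List[Tuple[Interval, Interval]]:
    """Propagate, split the domain of x, then propagate again in each half.

    Subproblems whose domains become empty are pruned immediately.
    """
    x, y = propagate_sum_le(x, y, c)
    if x.is_empty() or y.is_empty():
        return []  # infeasible before any split
    subproblems = []
    for part in split(x):
        sub_x, sub_y = propagate_sum_le(part, y, c)
        if not (sub_x.is_empty() or sub_y.is_empty()):
            subproblems.append((sub_x, sub_y))
    return subproblems
```

For example, calling `partition_on_x(Interval(0, 10), Interval(0, 10), 4)` first shrinks both domains to [0, 4], then splits x into [0, 2] and [2, 4]; in the second half, propagation further shrinks y to [0, 2]. A real solver would apply this to many constraints and variables and choose the split variable heuristically.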
For effective energy management in smart grids, this research suggests a FOG computing architecture that bridges the gap between the edge and CLOUD. Facing data congestion in both traditional edge and CLOUD computing,...
ISBN: (print) 9783031506833; 9783031506840
The rise of the Internet of Things and Fog computing has substantially increased the number of interconnected devices at the edge of the network. As a result, a large amount of computation is now performed in the fog, generating vast amounts of data. To process this data in near real time, stream processing is typically employed due to its efficiency in handling continuous streams of information in a scalable manner. However, most stream processing approaches do not consider the underlying network devices as candidate resources for processing data. Moreover, many existing works do not take into account the network latency incurred by performing computations on multiple devices in a distributed way. Consequently, the fog computing resources may not be fully exploited by existing stream processing approaches. To avoid this, we formulate an optimization problem for utilizing the existing fog resources, and we design heuristics for solving this problem efficiently. Furthermore, we integrate our heuristics into Apache Storm, and we perform experiments that show latency-related benefits compared to alternatives.
Federated learning is a promising paradigm that utilizes widely distributed devices to jointly train a machine learning model while maintaining privacy. However, when oriented to distributed resource-constrained edge ...
ISBN: (print) 9798400717932
In this paper, we consider the efficient computation of all eigenvalues and eigenvectors of Symmetric Hierarchically Semiseparable (HSS) matrices, which have an inherent structure: the off-diagonal blocks have hierarchical bases and low ranks. The state of the art is a divide-and-conquer algorithm, SuperDC, which computes eigenvectors and eigenvalues an order of magnitude faster than popular and commercial solvers. We improve on the state of the art and present novel shared- and distributed-memory parallel algorithms for computing eigenvalues of HSS matrices. We take advantage of the recursive divide-and-conquer approach employed in SuperDC to parallelize the eigenvalue computation, present a span and available parallelism analysis, and optimize the original SuperDC algorithm to reduce the storage requirement from O(N^2) to O(N) in the case of banded matrices. We do a systematic evaluation with different parallel programming paradigms, scheduling policies, and scalability configurations. We observe that in the shared-memory parallel implementations, OpenMP implementations perform better than Cilk versions and work stealing offers no significant performance advantage, and that in the distributed-memory implementations, asynchronous communication yields better performance than barrier-based communication. We find the optimal input decomposition at which the parallel implementations provide the best speedup. For input symmetric matrices of different sparsity structures and sizes ranging from 4096 to 256k rows, on up to 512 cores, the implementations scale well and show a significant speedup of up to 147x compared to the available SuperDC implementation.
Edge computing is considered a promising architecture for handling latency-sensitive and computationally intensive tasks. The lack of consideration for the timing of jobs and their unique topology in the existing rese...
ISBN: (print) 9789819628636
The proceedings contain 76 papers. The special focus in this conference is on Network and Parallel Computing. The topics include: AsymFB: Accelerating LLM Training Through Asymmetric Model Parallelism; DaCP: Accelerating Synchronization-Free SpTRSV via GPU-Friendly Data Communication and Parallelism Strategies; Diagnosability of the Lexicographic Product of Paths and Complete Bipartite Graphs Under PMC Model; DTuner: A Construction-Based Optimization Method for Dynamic Tensor Operators Accelerating; Efficient Implementation of the LOBPCG Algorithm on a CPU-GPU Cluster; HP-CSF: An GPU Optimization Method for CP Decomposition of Incomplete Tensors; JediGAN: A Fully Decentralized Training of GAN with Adaptive Discriminator Averaging and Generator Selection; Optimizing Vo-Viso: A Modified Methodology to Parallel Computing with Isolating Data in Memristor Arrays; Parallel Computation of the Combination of Two Point Operations in Conic Curves Cryptosystem over GF(2^n) Using Tile Self-assembly; Parallel Construction of Independent Spanning Trees on 3-ary n-cube Networks; SpecInF: Exploiting Idle GPU Resources in Distributed DL Training via Speculative Inference Filling; swDarknet: A Heterogeneous Parallel Deep Learning Framework Suitable for SW26010 Pro Processor; VConv: Autotiling Convolution Algorithm Based on MLIR for Multi-core Vector Accelerators; ACH-Code: An Efficient Erasure Code to Reduce Average Repair Cost in Cloud Storage Systems of Multiple Availability Zones; CMS: A Computility Resource Status Management and Storage Framework; Fast Memory Disaggregation with SwiftSwap; HASLB: Huge Page Allocation Strategy Optimized for Load-Balance in Parallel Computing Programs; lightFinder: Finding Persistent Items with Small Memory; miDedup: A Restore-Friendly Deduplication Method on Docker Image Storage Systems; SPLR: A Selective Packet Loss Recovery for Improved RDMA Performance; A Cluster-Based Platoon Formation Scheme for Realistic Automated Vehicle Platooning; AnaNET: Anatomical Network fo
The increasing demand for electricity has led to the use of multiple energy generation sources. Among these sources, renewable energy is favored due to its environmentally friendly nature. The integration of distribut...
Edge computing has transformed machine learning by using computing closer to the data sources, thereby reducing latency. The ever-increasing volume of data has necessitated forming clusters of edge devices, possibly w...