The current Single-User Key Derivation (SKD) caters to individual management of blockchain's tree-structured assets but falls short for threshold signatures aimed at multi-party control of blockchain assets. We in...
ISBN: (print) 9798350364613; 9798350364606
The Edmonds Blossom algorithm is implemented here using depth-first search, which is intrinsically serial. By streamlining the code, our serial implementation is consistently three to five times faster than the previously fastest general graph matching code. By extracting parallelism across iterations of the algorithm, with coarse-grain locking, we are able to further reduce the run time on random regular graphs fourfold and obtain a twofold reduction of run time on real-world graphs with similar topology. Solving very sparse graphs (average degree less than four) exhibiting community structure with eight threads led to a threefold slowdown, but this slowdown is replaced by a marginal speedup once the average degree is greater than four. We conclude that our parallel coarse-grain locking implementation performs well when extracting parallelism from this augmenting-path-based algorithm and may work well for similar algorithms.
ISBN: (print) 9789897587474
The proceedings contain 28 papers. The topics discussed include: performance and usability implications of multiplatform and WebAssembly containers; operations patterns for hybrid quantum applications; optimization of cloud-native application execution over the edge cloud continuum enabled by DVFS; energy-aware node selection for cloud-based parallel workloads with machine learning and infrastructure as code; security-aware allocation of replicated data in distributed storage systems; performance analysis of mdx ii: a next-generation cloud platform for cross-disciplinary data science research; data orchestration platform for AI workflows execution across computing continuum; framework for decentralized data strategies in virtual banking: navigating scalability, innovation, and regulatory challenges in Thailand; and anomaly detection for partially observable container systems based on architecture profiling.
In the era of Big Data, the computational demands of machine learning (ML) algorithms have grown exponentially, necessitating the development of efficient parallel computing techniques. This research paper delves into...
ISBN: (print) 9798400701559
We investigate the timestamp allocation schemes used by classical concurrency control algorithms in database management systems (DBMS) on many-core machines. We then discuss a distributed logical timestamp allocation scheme that guarantees uniqueness and fairness, with the aim of improving the performance of DBMS concurrency control algorithms on many-core machines. The proposed logical timestamp generator is free of bottlenecks such as reading the system clock counter, issuing atomic add operations, and synchronization. Finally, we experiment with an optimistic concurrency control algorithm based on the proposed and other allocation schemes. The results show that the optimistic concurrency control algorithm performs better with the proposed timestamp allocation than with the other allocation schemes, and that it scales closer to linearly under heavy loads.
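The abstract does not spell out how timestamps are generated, so the following is only a minimal sketch of one common way to obtain unique logical timestamps without a shared counter: each worker strides through the integers in steps of the worker count. The class name `StridedTimestampAllocator` and the helper `allocate_round_robin` are hypothetical, and this striding scheme is an assumption for illustration, not necessarily the scheme the paper proposes.

```python
class StridedTimestampAllocator:
    """Per-worker logical timestamp generator (illustrative assumption).

    Worker i of n hands out i, i + n, i + 2n, ... so two workers can never
    produce the same value, and no clock read, atomic add, or lock is needed.
    """

    def __init__(self, worker_id: int, num_workers: int) -> None:
        if not 0 <= worker_id < num_workers:
            raise ValueError("worker_id must be in [0, num_workers)")
        self.worker_id = worker_id
        self.num_workers = num_workers
        self._local_counter = 0  # private to this worker, never shared

    def next(self) -> int:
        ts = self._local_counter * self.num_workers + self.worker_id
        self._local_counter += 1
        return ts


def allocate_round_robin(num_workers: int, per_worker: int) -> list[int]:
    """Simulate workers taking turns, each drawing one timestamp per round."""
    allocators = [StridedTimestampAllocator(i, num_workers)
                  for i in range(num_workers)]
    issued = []
    for _ in range(per_worker):
        for allocator in allocators:
            issued.append(allocator.next())
    return issued
```

In this sketch uniqueness follows from the residue modulo the worker count; fairness holds only in the loose sense that workers issuing at similar rates receive interleaved values.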
ISBN: (print) 9781665473156
Deep Neural Network (DNN) models have been widely deployed in a variety of applications. Driven by privacy concerns and large improvements in the computational power of mobile devices, training machine learning models on mobile devices has become increasingly important. Directly applying parallel training frameworks designed for data center networks to train DNN models on mobile devices may not achieve ideal performance, since mobile devices usually have multiple types of computation resources such as ASIC, neural engine, and FPGA. Moreover, the communication time is not negligible when training on mobile devices. With the objective of minimizing DNN training time, we propose to extend pipeline parallelism, which can hide communication time behind computation during DNN training, by integrating resource allocation. Fine-tuning the ratio of resources allocated to forward and backward propagation can improve resource utilization. We focus on homogeneous workers and theoretically analyze the ideal cases where resources are linearly separable. We also discuss model partition and resource allocation for a more realistic case. Additionally, we investigate the heterogeneous worker case. Trace-based simulation results show that our scheme can efficiently reduce the time cost of a training iteration.
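To illustrate why the forward/backward resource ratio matters, here is a toy model that is an assumption of this sketch and not the paper's formulation: forward and backward propagation are treated as two pipeline stages with fixed amounts of work, a stage's time is its work divided by its share of the resources (the "linearly separable" ideal case), and the steady-state time per iteration is set by the slower stage. The function names and the work values are hypothetical.

```python
def stage_times(fwd_work: float, bwd_work: float, fwd_share: float) -> tuple[float, float]:
    """Time of each stage when forward gets fwd_share of the resources.

    Assumes perfectly divisible resources: time = work / share.
    """
    if not 0.0 < fwd_share < 1.0:
        raise ValueError("fwd_share must be strictly between 0 and 1")
    return fwd_work / fwd_share, bwd_work / (1.0 - fwd_share)


def bottleneck_time(fwd_work: float, bwd_work: float, fwd_share: float) -> float:
    """In a two-stage pipeline the slower stage bounds throughput."""
    return max(stage_times(fwd_work, bwd_work, fwd_share))


def balanced_share(fwd_work: float, bwd_work: float) -> float:
    """Share that equalizes both stage times under the toy model.

    Solving fwd_work / r == bwd_work / (1 - r) gives r = fwd / (fwd + bwd).
    """
    return fwd_work / (fwd_work + bwd_work)
```

Under this model, if backward propagation costs twice as much as forward propagation, an even split leaves the forward stage idle for half of each step, while giving forward one third of the resources balances the two stages.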
Parallel/distributed particle filters estimate the states of dynamic systems by using Bayesian inference and stochastic sampling techniques with multiple processing units (PUs). The sampling procedure and the resam...
Distributed antenna systems (DAS) are gaining popularity as a solution for providing cellular coverage and capacity in areas with weak signal strength or high user density. In this paper, we propose and validate a pho...
Operating systems mediate user-device interactions, crucially managing resources, software execution, and application interfaces. In the face of technological advancements, the demand for secure, resilient, and scalab...
ISBN: (digital) 9798350352917
ISBN: (print) 9798350352924; 9798350352917
The primary bottleneck of blockchain is shifting from consensus to execution due to recent advances in DAG-based consensus algorithms supporting over 100k TPS. Many blockchain systems segregate execution from ordering, missing the opportunity to harness potential parallelism in consensus-produced batches. In this paper, we propose a new deterministically orderable concurrency control algorithm, OptME, which improves the performance of the execution phase by exploiting inherent parallelism among transactions. The algorithm analyzes transaction dependencies to extract parallelism and determines the total order of transaction execution. OptME consists of three steps: (1) building a transaction dependency graph, (2) generating a parallel execution schedule, and (3) executing transactions based on the schedule. We employ several optimizations, including parallel dependency graph construction, early abort detection, and efficient reordering with an optimistic assumption. Our evaluation demonstrates that OptME achieves up to 350k TPS and outperforms a state-of-the-art concurrency control algorithm, even under high-contention scenarios.
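The first two of the three steps can be sketched generically as follows. This is not OptME's actual algorithm: it omits reordering, early abort detection, and parallel graph construction, and it assumes that every transaction declares its read and write sets up front. The names `Txn`, `conflicts`, `build_dependency_graph`, and `build_schedule` are hypothetical. Transactions in the same wave have no conflicts with one another, so step (3) could run each wave in parallel while still producing a deterministic result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    tid: int
    reads: frozenset
    writes: frozenset


def conflicts(a: Txn, b: Txn) -> bool:
    """True on a write-write, write-read, or read-write overlap."""
    return bool(a.writes & (b.reads | b.writes) or b.writes & a.reads)


def build_dependency_graph(batch: list[Txn]) -> dict[int, set[int]]:
    """Step 1: each transaction depends on earlier conflicting ones.

    The batch order (as produced by consensus) breaks ties, so the graph
    is acyclic and identical on every node.
    """
    deps: dict[int, set[int]] = {t.tid: set() for t in batch}
    for j, later in enumerate(batch):
        for earlier in batch[:j]:
            if conflicts(earlier, later):
                deps[later.tid].add(earlier.tid)
    return deps


def build_schedule(batch: list[Txn], deps: dict[int, set[int]]) -> list[list[int]]:
    """Step 2: group transactions into waves by longest dependency chain."""
    level: dict[int, int] = {}
    for t in batch:  # dependencies always appear earlier in the batch
        level[t.tid] = 1 + max((level[d] for d in deps[t.tid]), default=-1)
    waves: list[list[int]] = [[] for _ in range(max(level.values(), default=-1) + 1)]
    for t in batch:
        waves[level[t.tid]].append(t.tid)
    return waves
```

For example, with four transactions where the third reads a key written by the first and the fourth touches keys written by the second and third, the schedule has three waves: the first two transactions together, then the third, then the fourth.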