This paper introduces PowerInfer, a high-speed Large Language Model (LLM) inference engine on a personal computer (PC) equipped with a single consumer-grade GPU. The key principle underlying the design of PowerInfer i...
详细信息
Context The growing usage of platforms for distributedcomputing and their workloads are requiring more energy to power data centers, and the current consumption is already high. The increasing availability of green e...
详细信息
ISBN:
(纸本)9798400709968
Context The growing usage of platforms for distributedcomputing and their workloads are requiring more energy to power data centers, and the current consumption is already high. The increasing availability of green energy sources brings opportunities to reduce carbon emissions. Problem It is hard to create software aiming for both performance (makespan) and low brown energy usage. To reduce carbon emissions on distributed platforms, developers need an easy way to program efficiently applications and achieve these objectives. Solution Using OpenMP is an easy way to create and run distributed applications. This paper proposes the use of OpenMP with a new energy-aware scheduling algorithm that aims to minimize brown energy consumption and makespan. IS theory Our multiobjective algorithm (G-MOHEFT) deals with Complexity theory to leverage the Dynamic capabilities of modern distributed platforms. The algorithm implements a heuristic for adapting and redistributing workloads accordingly to different scenarios. Method Containers were used to simulate a distributed OpenMP Cluster (OMPC) platform. Different simulations using previously measured data were used to distribute workloads in different combinations of green energy availability. Summary of Results We study the solution tradeoffs obtained using G-MOHEFT, which varies from saving none to some brown energy consumption by keeping the same or increasing the makespan in exchange. Depending on the scenario, it could even reduce the brown energy consumption to zero. Contributions and impacts to IS Developers can use our algorithm to easily develop distributed software with a reduced carbon footprint using OpenMP in OMPC.
Edge-assisted wireless sensing is increasingly popular, where complex neural network models perform inference tasks on wireless channel state information (CSI) data streamed from IoT devices. However large volumes of ...
详细信息
The rise of deep learning methods has ignited interest in efficient hardware and software systems for tensor-based computing. A question worth investigating is whether other areas in computing can benefit as well from...
详细信息
In recent years, there has been a surge in research on autonomous systems such as unmanned aerial vehicles and autonomous-driving systems. These systems require high computing power with low power consumption and stri...
详细信息
BFT protocols in the synchronous setting rely on a strong assumption: every message sent by a party will arrive at its destination within a known bounded time. To allow some degree of asynchrony while still tolerating...
详细信息
ISBN:
(纸本)9781450385480
BFT protocols in the synchronous setting rely on a strong assumption: every message sent by a party will arrive at its destination within a known bounded time. To allow some degree of asynchrony while still tolerating a minority corruption, recently, in Crypto'19, a weaker synchrony assumption called mobile sluggish faults was introduced. In this work, we investigate the support for mobile sluggish faults in existing synchronous protocols such as Dfinity, Streamlet, Sync HotStuff, OptSync and the optimal latency BFT protocol. We identify key principles that can be used to "compile" these synchronous protocols to tolerate mobile sluggish faults.
We develop a general deterministic distributed method for locally rounding fractional solutions of graph problems for which the analysis can be broken down into analyzing pairs of vertices. Roughly speaking, the metho...
详细信息
ISBN:
(纸本)9781611977554
We develop a general deterministic distributed method for locally rounding fractional solutions of graph problems for which the analysis can be broken down into analyzing pairs of vertices. Roughly speaking, the method can transform fractional/probabilistic label assignments of the vertices into integral/deterministic label assignments for the vertices, while approximately preserving a potential function that is a linear combination of functions, each of which depends on at most two vertices (subject to some conditions usually satisfied in pairwise analyses). The method unifies and significantly generalizes prior work on deterministic local rounding techniques [Ghaffari, Kuhn FOCS'21;Harris FOCS'19;Fischer, Ghaffari, Kuhn FOCS'17;Fischer DISC'17] to obtain polylogarithmic-time deterministic distributed solutions for combinatorial graph problems. Our general rounding result enables us to locally and efficiently derandomize a range of distributed algorithms for local graph problems, including maximal independent set (MIS), maximum-weight independent set approximation, and minimum-cost set cover approximation. As highlights, we in particular obtain the following results. We obtain a deterministic O(log(2)Delta center dot log n)-round algorithm for computing an MIS in the LOCAL model and an almost as efficient O(log(2)Delta center dot log log Delta center dot log n)-round deterministic MIS algorithm in the CONGEST model. As a result, the best known deterministic distributed time complexity of the four most widely studied distributed symmetry breaking problems (MIS, maximal matching, (Delta + 1)-vertex coloring, and (2 Delta - 1)-edge coloring) is now O(log(2)Delta center dot log n). Our new MIS algorithm is also the first direct polylogarithmic-time deterministic distributed MIS algorithm, which is not based on network decomposition. We obtain efficient deterministic distributed algorithms for rounding fractional solutions for maximum (weighted) independent set and mini
distributed online convex optimization (DOCO) has emerged as a promising approach in scenarios where multiple learners collaboratively serve sequential (possibly untrusted) clients using AI models. However, ensuring c...
详细信息
ISBN:
(纸本)9781450399265
distributed online convex optimization (DOCO) has emerged as a promising approach in scenarios where multiple learners collaboratively serve sequential (possibly untrusted) clients using AI models. However, ensuring clients' privacy and minimizing regret while keeping the communication cost reasonable poses a significant challenge. To address this issue, we propose private DOCO algorithms, termed PDOM, for both oblivious and stochastic settings. Our approach involves a mini-batch strategy that optimally balances the effects of slower model updates and differential privacy (DP) perturbation. Our theoretical analysis shows that compared to state-of-the-art algorithms, PDOM reduces the regret bounds and communication cost by an O(d/epsilon) factor for the oblivious setting. For the stochastic setting, the impact of DP perturbation becomes negligible if the learning time T = Omega(d(4)/(n epsilon(4))) provided that the loss functions are Lipschitz, convex, and smooth, where d is the model dimension, n is the number of learners, and epsilon is the privacy budget. Our evaluations validate these results, demonstrating that PDOM can reduce classification error rates of the state-of-the-art methods by up to 20% in distributed online logistic regression tasks while achieving communication savings of above 90%.
Approximate agreement is a weaker version of consensus where two or more processes must agree on a real number within a distance e of each other. Many variants of this task have been considered in the literature: cont...
详细信息
ISBN:
(纸本)9781450385480
Approximate agreement is a weaker version of consensus where two or more processes must agree on a real number within a distance e of each other. Many variants of this task have been considered in the literature: continuous or discrete ones;multi-dimensional ones;as well as agreement on graphs and other spaces. We focus on two variants of approximate agreement on graphs, edge agreement and clique agreement. We show that both tasks arise as special cases of a more general, higher-dimensional, approximate agreement task, where the processes must agree on the vertices of a simplex in a given simplicial complex. This new point of view gives rise to a novel topological perspective on the solvability of clique agreement.
This work focuses on understanding the quantum message complexity of two central problems in distributedcomputing, namely, leader election and agreement in synchronous message-passing communication networks. We show ...
详细信息
ISBN:
(纸本)9798400718854
This work focuses on understanding the quantum message complexity of two central problems in distributedcomputing, namely, leader election and agreement in synchronous message-passing communication networks. We show that quantum communication gives an advantage for both problems by presenting quantum distributed algorithms that significantly outperform their respective classical counterparts under various network *** prior works have studied and analyzed quantum distributed algorithms in the context of (improving) round complexity, a key conceptual contribution of our work is positing a framework to design and analyze the message complexity of quantum distributed algorithms. We present and demonstrate how quantum algorithmic techniques, such as Grover search, quantum counting, and quantum walks, can significantly enhance the message efficiency of distributed *** particular, our leader election protocol for diameter-2 networks uses quantum walks to achieve the improved message complexity. To the best of our knowledge, this is the first such application of quantum walks in distributedcomputing.
暂无评论