ISBN (Print): 9798400701214
In the realm of computer science, it may seem that distributed computing and machine learning exist on opposite ends of the spectrum. However, there are many connections between the two domains, both in theory and practice. Recently, machine learning research has become excited about graphs. And when machine learning meets graphs, researchers familiar with distributed algorithms may experience a sense of déjà vu, as many classic distributed computing paradigms are being rediscovered. It feels a bit like "machine learning + graphs = distributed algorithms." In my talk, I am going to introduce some key concepts in graph machine learning such as underreaching and oversquashing. These concepts have been known in the distributed computing community as the LOCAL and CONGEST models, respectively. In the main part of the talk, I am going to present some recent breakthroughs in this exciting intersection of fields. Finally, I will also present some intriguing open problems.
ISBN (Print): 9798350364613; 9798350364606
Researchers conduct post-processing on simulation results by running an interactive data analysis tool on a High-Performance Computing (HPC) system installed at an HPC center and retrieving the post-processed results. Certain data analysis scenarios require transferring the simulation results directly from the center. In such scenarios, a portion of the data is usually streamed over the network to achieve interactivity. However, two challenges remain in maintaining interactivity: (1) limited network bandwidth and (2) long network latency. To tackle these challenges, we propose a system that enables interactive array analysis over the network. We employ error-bounded lossy compression to increase the effective network bandwidth. Furthermore, we employ multi-level caching to hide the network latency and combine it with prefetching to improve the cache hit ratio. The cache replacement and prefetching policies are designed around the data access patterns of interactive analysis. We compared our proposed system with TileDB, one of the state-of-the-art array databases, by measuring the average latency for various access patterns. With a 10% error bound, the proposed system reduces the average latency by up to 91.6% compared to TileDB: the cache hit ratio improves by more than 40% thanks to the cache replacement and prefetching policies, and network transfer time is reduced by more than 75% through lossy compression.
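As a rough illustration of the client-side caching idea described in this abstract (not the authors' implementation, which uses multi-level caching), the following Python sketch combines an LRU-style tile cache with one-step-ahead sequential prefetching; the fetch_fn callback, which would retrieve an error-bounded, lossily compressed tile over the network, is a hypothetical placeholder.

```python
# Minimal sketch (not the paper's system): an LRU tile cache with
# one-step-ahead sequential prefetching for remote array tiles.
from collections import OrderedDict

class TileCache:
    def __init__(self, fetch_fn, capacity=128):
        self.fetch_fn = fetch_fn      # hypothetical: fetches and decompresses one tile
        self.capacity = capacity
        self.cache = OrderedDict()    # tile_id -> decompressed tile

    def get(self, tile_id):
        if tile_id in self.cache:
            self.cache.move_to_end(tile_id)      # mark as most recently used
        else:
            self._insert(tile_id)                # cache miss: fetch over the network
        self._insert(tile_id + 1)                # prefetch, assuming a linear scan pattern
        return self.cache[tile_id]

    def _insert(self, tile_id):
        if tile_id not in self.cache:
            self.cache[tile_id] = self.fetch_fn(tile_id)
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)   # evict the least recently used tile
```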
ISBN (Print): 9798350386066; 9798350386059
Federated learning (FL) has emerged as a promising solution for training machine learning (ML) models from distributed data sources. In FL, the heterogeneous and imbalanced data distributions of local clients can severely hurt the fairness of the aggregated global model. In this paper, we identify two key obstacles to developing fair FL models w.r.t. the global distribution: the domain shifts from local clients to the global data distribution and the fairness heterogeneity across local clients. Considering these two obstacles, we present a novel fairness-aware FL training framework, Robust-Fair Domain Smoothing (RFDS), which addresses the bias of FL models from a unique domain-shifting perspective. In particular, we design two novel components to build RFDS: 1) local robust-fair training, and 2) reference domain smoothing. Local robust-fair training aims to train robust-fair local models whose fairness is robust against the domain shifts from local distributions to the global distribution. Reference domain smoothing reduces the heterogeneity of fairness across clients to improve the fairness of the aggregated global model. We further provide a theoretical analysis showing the connection between the domain discrepancy of local data distributions and the heterogeneity of fairness across clients. Empirical results on multiple real-world datasets show that RFDS achieves promising gains in demographic fairness compared to state-of-the-art baselines.
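For readers unfamiliar with the fairness terminology, the sketch below shows one plausible way to measure per-client demographic fairness and the fairness heterogeneity across clients; it is a generic illustration under assumed definitions (demographic parity gap per client, standard deviation across clients), not part of the RFDS framework.

```python
# Generic illustration (assumed definitions, not RFDS): per-client demographic
# parity gap and the heterogeneity of fairness across clients.
import numpy as np

def demographic_parity_gap(y_pred, group):
    """|P(y_hat=1 | group=0) - P(y_hat=1 | group=1)| for binary predictions."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def fairness_heterogeneity(client_preds, client_groups):
    """Std. dev. of per-client fairness gaps; lower means more homogeneous fairness."""
    gaps = [demographic_parity_gap(p, g) for p, g in zip(client_preds, client_groups)]
    return np.std(gaps), gaps

# Two synthetic clients with binary predictions and a binary sensitive attribute
preds = [np.array([1, 0, 1, 1]), np.array([0, 0, 1, 0])]
groups = [np.array([0, 0, 1, 1]), np.array([0, 1, 1, 0])]
heterogeneity, per_client_gaps = fairness_heterogeneity(preds, groups)
```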
ISBN (Print): 9789819735556; 9789819735563
In this paper, we review the key features and major drawbacks of the NeuroEvolution of Augmenting Topologies (NEAT) algorithm, such as the slow training speed that limits its range of applications. The main reason for NEAT's performance issues is the huge number of calculations required at the end of each epoch to estimate the fitness of every organism in the population. We propose a software system architecture, based on the Ray cluster-computing framework, that can be implemented to solve these performance problems. Finally, we demonstrate how fitness estimation can be distributed across stateless workers deployed either on-premise or in the cloud using the Ray framework.
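A minimal sketch of the distribution pattern the paper describes, assuming the standard Ray task API; the evaluate_fitness body is a hypothetical placeholder rather than an actual NEAT fitness function.

```python
# Illustrative sketch (assumed, not the paper's code): distributing NEAT fitness
# evaluation across stateless Ray workers at the end of each epoch.
import ray

ray.init()  # or ray.init(address="auto") to join an existing on-premise/cloud cluster

@ray.remote
def evaluate_fitness(genome):
    # Placeholder fitness: in practice, build the network from the genome
    # and run it on the task environment.
    return sum(abs(w) for w in genome)

population = [[0.1, -0.4, 0.7], [0.9, 0.2, -0.3], [-0.5, 0.5, 0.0]]
futures = [evaluate_fitness.remote(genome) for genome in population]
fitnesses = ray.get(futures)  # blocks until all workers have finished
```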
ISBN (Print): 9781665450850
Semi-supervised learning (SSL) has been applied to many practical applications over the past few years. Recently, distributed graph-based semi-supervised learning (DGSSL) has been shown to perform well. Traditional DGSSL algorithms usually suffer from the straggler effect, in which algorithm execution time is limited by the slowest node. To solve this problem, this paper proposes a novel coded DGSSL (CDGSSL) algorithm based on Maximum Distance Separable (MDS) codes, which makes the distributed algorithm straggler-tolerant. Moreover, we provide an optimal parameter design for the proposed algorithm. The superiority of the proposed algorithm is confirmed via experiments on Alibaba Cloud Elastic Compute Service.
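The abstract does not detail the coding scheme, but the general idea of MDS-coded computation can be sketched as follows: encode k data blocks into n coded blocks so that results from any k of the n workers suffice to recover the full result. The generator matrix, block sizes, and matrix-vector workload below are illustrative, not the paper's design.

```python
# Toy sketch of MDS-coded computation for straggler tolerance (generic
# illustration, not the CDGSSL algorithm itself).
import numpy as np

k, n = 3, 5                              # k data blocks, n workers; tolerates n - k stragglers
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))          # data matrix, split row-wise into k blocks
x = rng.standard_normal(4)

blocks = np.split(A, k)                                                 # A_1, ..., A_k
G = np.vander(np.arange(1, n + 1), k, increasing=True).astype(float)    # any k rows invertible

# Worker i holds the coded block sum_j G[i, j] * A_j and returns its product with x.
coded_results = [sum(G[i, j] * blocks[j] for j in range(k)) @ x for i in range(n)]

survivors = [0, 2, 4]                    # any k workers that responded; the rest are stragglers
partials = np.linalg.solve(G[survivors], np.stack([coded_results[i] for i in survivors]))
y = partials.reshape(-1)                 # recovered A @ x from only k responses
assert np.allclose(y, A @ x)
```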
ISBN (Print): 9798350383744; 9798350383737
Swarm Learning (SL) has recently been proposed for distributed learning, where a group of individual centers performs synchronized training. Unlike traditional machine learning models that rely on a central server, swarm learning distributes the learning process across multiple nodes. Each node independently processes data and contributes to the overall learning task, allowing the swarm to benefit from the different data held by individual nodes. Unlike federated learning, model parameters are not handled by a central server but are merged by individual nodes chosen at random. Swarm learning's intrinsic attention to data privacy makes it suitable for distributed healthcare analysis, where a clinical center wants to benefit from all the other centers in the swarm network. However, the benefit for a single center or for the whole network can vary depending on the data distribution. In this paper, we analyze the performance of swarm learning in a network with multiple nodes under different data distribution scenarios. The analysis shows the gain for the whole swarm network and for a specific (reference) node, focusing on scenarios where this node holds a different amount of data than the other nodes. To quantify this gain, we introduce a new Key Performance Indicator (KPI). We then apply this method to ICU data extracted from the MIMIC EHR database and discuss the results obtained by analyzing 5 nodes under different data distribution scenarios.
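The paper's KPI is not reproduced here; as a hypothetical stand-in, a gain metric for the reference node could be as simple as the relative improvement of its swarm-trained model over a locally trained one.

```python
# Hypothetical gain metric (the paper defines its own KPI, not reproduced here):
# relative improvement of a node's swarm-trained model over its local-only model.
def swarm_gain(local_score, swarm_score):
    """Relative performance gain, e.g. for AUROC on the reference node's test set."""
    return (swarm_score - local_score) / local_score

# Example: a reference node with little local data, AUROC 0.71 locally vs. 0.78 after swarm training
print(f"gain = {swarm_gain(0.71, 0.78):.1%}")   # ~9.9%
```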
ISBN (Digital): 9783031626388
ISBN (Print): 9783031626371; 9783031626388
This paper examines the equilibrium between user transaction fees and miner profitability within proof-of-work-based blockchains, specifically focusing on Bitcoin. We analyze the dependency of mining profit on factors such as transaction fee adjustments and operational costs, particularly electricity. By applying a multidimensional profitability model and performing a sensitivity analysis, we evaluate the potential for profit maximization through operational cost reduction versus fee increases. Our model integrates variable electricity costs, market-driven Bitcoin prices, mining hardware efficiency, network hash rate, and transaction fee elasticity. We show that mining strategies aimed at reducing electricity expenses are far more profitable than pursuing transactions with higher fees.
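A highly simplified daily-profit sketch in the spirit of such a model (the function and all parameter values are illustrative, not the paper's calibrated inputs) shows how an electricity-cost reduction can be compared against a fee increase.

```python
# Illustrative miner-profit sketch under stated assumptions; not the paper's model.
def daily_mining_profit(hashrate_ths, network_hashrate_ths, block_reward_btc,
                        avg_fees_per_block_btc, btc_price_usd,
                        power_kw, electricity_usd_per_kwh, blocks_per_day=144):
    share = hashrate_ths / network_hashrate_ths                     # expected share of blocks
    revenue = share * blocks_per_day * (block_reward_btc + avg_fees_per_block_btc) * btc_price_usd
    cost = power_kw * 24 * electricity_usd_per_kwh                  # daily electricity bill
    return revenue - cost

# Sensitivity check: a 20% electricity-cost cut vs. a 20% rise in average fees
base = dict(hashrate_ths=200, network_hashrate_ths=6e8, block_reward_btc=3.125,
            avg_fees_per_block_btc=0.3, btc_price_usd=60_000, power_kw=3.5,
            electricity_usd_per_kwh=0.08)
print(daily_mining_profit(**base))
print(daily_mining_profit(**{**base, "electricity_usd_per_kwh": 0.08 * 0.8}))
print(daily_mining_profit(**{**base, "avg_fees_per_block_btc": 0.3 * 1.2}))
```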
ISBN (Print): 9798350374247; 9798350374230
Large Language Models (LLMs) have changed the way we access and interpret information, communicate with each other and even operate computer systems through autonomous code generation. Typically, these billion-parameter models rely on cloud storage and execution due to their computational demands. In this paper, we challenge this status quo by proposing JARVIS, a distributed LLM framework that splits model layers across edge devices with limited compute resources, trading off computation for increased peer-level communication. JARVIS is robust to individual node failures, including recovery methods for lost layers via peer-level duplication. We evaluate JARVIS using Google's open-source Gemma LLM (2B parameters) deployed over 18 software-defined radios in the NSF Colosseum RF emulator. Our evaluation explores LLM performance degradation from node losses, providing insights into node prioritization in tactical environments. The JARVIS software code is released for community exploration and adoption.
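As a conceptual sketch of layer splitting with peer-level duplication (assumed for illustration, not the released JARVIS code), layers can be assigned to nodes round-robin with one replica each, so that a single node failure leaves every layer recoverable from a peer.

```python
# Conceptual sketch (not the JARVIS implementation): round-robin layer placement
# with peer-level duplication for single-node fault tolerance.
def partition_layers(num_layers, nodes, replication=2):
    placement = {layer: [] for layer in range(num_layers)}
    for layer in range(num_layers):
        for r in range(replication):
            placement[layer].append(nodes[(layer + r) % len(nodes)])
    return placement

nodes = [f"sdr-{i}" for i in range(18)]              # e.g. 18 radios, as in the emulated setup
placement = partition_layers(num_layers=24, nodes=nodes)   # illustrative layer count

def survivors_for(layer, failed, placement):
    """Nodes still able to serve a layer after some nodes have failed."""
    return [n for n in placement[layer] if n not in failed]

# If sdr-3 fails, layer 3 can still be served by its replica on sdr-4.
print(survivors_for(3, failed={"sdr-3"}, placement=placement))
```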
ISBN (Print): 9798400704192
The Event Horizon Telescope (EHT) recently used 10-petabyte-scale observation data to construct the first images of black holes and 100-terabyte-scale simulation data to constrain the plasma properties around supermassive black holes. This work leveraged the Open Science Grid (OSG) high-throughput resources provided by the Partnership to Advance Throughput Computing (PATh). While EHT has successfully utilized PATh to create the most extensive black hole simulation library to date, broad adoption of this resource for data processing has been slower. The sophisticated command-line-driven HTCondor environment creates barriers for less technical researchers, limiting PATh's reach and impact on the broader astronomy and science communities. In May 2023, the Cyberinfrastructure Integration Research Center (CIRC) at Indiana University was awarded an NSF EAGER award to collaborate with EHT and PATh on implementing a targeted science gateway instance that integrates critical EHT application functionality to leverage OSG within the Apache Airavata framework. The project applies state-of-the-art User Experience (UX) techniques and participatory design methods to lower the barrier to adopting OSG resources for researchers trying to discover the properties of black holes.
ISBN (Digital): 9798350352917
ISBN (Print): 9798350352924; 9798350352917
The emergence of programmable data planes (PDPs) has paved the way for in-network computing (INC), a paradigm wherein networking devices actively participate in distributed computations. However, PDPs are still a niche technology, mostly available to network operators, and rely on packet-processing DSLs like P4. This demands substantial networking expertise from INC programmers, who must articulate computational tasks in networking terms and reason about their code accordingly. To lift this barrier to INC, we propose a unified compute interface for the data plane. We introduce C/C++ extensions that allow INC to be expressed as kernel functions processing in-flight messages, along with APIs for establishing INC-aware communication. We develop a compiler that translates kernels into P4, and thin runtimes that handle the required network plumbing, shielding INC programmers from low-level networking details. We evaluate our system using common INC applications from the literature.