ISBN: 9798400706448 (print)
The proceedings contain 3 papers. The topics discussed include: mapping parallel matrix multiplication in GotoBLAS2 to the AMD Versal ACAP for deep learning; exploring post quantum cryptography with quantum key distribution for sustainable mobile network architecture design; and energy efficiency: a lattice Boltzmann study.
ISBN: 9798350349658 (print)
The proceedings contain 35 papers. The topics discussed include: affordable HPC: leveraging small clusters for big data and graph computing; systematic literature review of VANET simulators: comparative analysis, technological advancements, and research challenges; parallel swarm propagation for neural networks; implementing censorship-resistant trusted email using blockchain technology; new parallel order maintenance data structure; an optimization method for national cryptography algorithm; interactive data visualization to optimize decision-making process; the influence of implementing enterprise resource planning (ERP), human capital management (HCM), and supply chain management (SCM) to enhance company’s performance effectiveness; and research on user identity authentication method based on edge computing.
ISBN: 9798400706486 (print)
The proceedings contain 5 papers. The topics discussed include: parallel and distributed frugal tracking of a quantile; defining the boundaries for endpoint congestion management in networks for high-performance computing; accelerating application bulk synchronous writes in HPC environments; flying base station channel capacity; and eGossip: optimizing resource utilization in gossip-based clusters through eBPF.
ISBN: 9798400706523 (print)
The proceedings contain 3 papers. The topics discussed include: ECO-LLM: LLM-based edge cloud optimization; toward using representation learning for cloud resource usage forecasting; and MPIrigen: MPI code generation through domain-specific language models.
ISBN: 9798331530044; 9798331530037 (print)
Metastable failures in distributed databases, characterized by self-sustaining feedback loops that cause significant performance degradation, have become increasingly prevalent with the rise of complex distributed systems [1]. One of the main sustaining feedback loops in these failures is the retry storm. Such failures are triggered by temporary changes in load, which lead to a cascade of retried requests that overwhelms the system even after the initial load spike has subsided [2]. We have leveraged queuing theory to propose an analytical method for modeling metastable failures due to retry storms [3]. Building on our previous work, this proposal outlines a systematic approach to mitigating these failures by eliminating retried requests in distributed transaction systems. We focus on existing concurrency control mechanisms in which retries of distributed transactions can occur frequently. Specifically, we focus on two-phase locking (2PL) under high-contention workloads, where many distributed transactions can abort due to deadlocks and be retried, causing metastable failures. We propose that, by preprocessing distributed transactions, we can reorder lock acquisition to avoid deadlocks and transaction retries under high-contention workloads. The behavior and correctness of this approach will be validated using the queuing model we developed.
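The abstract does not say how the preprocessing step reorders locks. As a rough illustration only, the sketch below uses the classic technique of acquiring every lock in one global (sorted) order, which rules out the circular waits behind deadlocks and therefore the aborts that would trigger retries. The names `LockTable`, `run_transaction`, and `run_demo` are hypothetical, and the single-process `threading` setup is an assumption standing in for a distributed lock manager.

```python
import threading


class LockTable:
    """Hypothetical in-memory lock table: one mutex per record key."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    def lock_for(self, key):
        # Create the per-key lock on first use; the guard makes this thread-safe.
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())


def run_transaction(table, keys, body):
    """Preprocess the key set, lock in sorted order, run body, then release.

    All locks are taken before any is released (the two phases of 2PL).
    Because every transaction uses the same global order, no cycle of
    waiting transactions can form.
    """
    ordered = sorted(set(keys))  # the assumed "preprocessing" step
    acquired = []
    try:
        for key in ordered:
            lock = table.lock_for(key)
            lock.acquire()
            acquired.append(lock)
        return body()
    finally:
        for lock in reversed(acquired):
            lock.release()


def run_demo():
    """Run opposing transfers that could deadlock under naive lock ordering."""
    table = LockTable()
    balances = {"a": 100, "b": 100}

    def transfer(src, dst, amount):
        def body():
            balances[src] -= amount
            balances[dst] += amount
        run_transaction(table, [src, dst], body)

    threads = []
    for i in range(50):
        src, dst = ("a", "b") if i % 2 == 0 else ("b", "a")
        threads.append(threading.Thread(target=transfer, args=(src, dst, 1)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return balances
```

With naive ordering, a transfer from `a` to `b` and a concurrent transfer from `b` to `a` can each hold one lock while waiting for the other. Sorting the key set first removes that possibility, at the cost of needing each transaction's key set before it starts.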
ISBN: 9798350387117; 9798350387124 (print)
Nonuniform grid refinement plays a fundamental role in simulating realistic flows with a multitude of length scales. We introduce the first GPU-optimized implementation of this technique in the context of the lattice Boltzmann method (LBM). Our approach focuses on enhancing GPU performance while minimizing memory access bottlenecks. We employ kernel fusion techniques to optimize memory access patterns, reduce synchronization overhead, and minimize kernel launch latencies. Additionally, our implementation ensures efficient memory management, resulting in lower memory requirements than baseline LBM implementations that were designed for distributed systems. By enabling grid refinement on a single GPU, our implementation allows simulations of unprecedented domain size (e.g., 1596 × 840 × 840) on a single A100 40 GB GPU. We validate our code against published experimental data. Our optimization improves the performance of the baseline algorithm by 1.3-2x. We also compare against current state-of-the-art solutions for grid-refinement LBM and show an order-of-magnitude speedup.
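To make the idea of fusion concrete, here is a minimal CPU sketch, not the authors' GPU kernels. It assumes a uniform, periodic one-dimensional D1Q3 lattice with BGK collision, and compares a two-pass update (collide, then stream) with a fused single pass that writes each post-collision population straight to its destination cell. The function names and the `tau` parameter are illustrative choices; on a GPU the analogous change merges two kernels into one and avoids storing the intermediate array.

```python
# D1Q3 lattice: rest, right-moving, and left-moving populations.
C = (0, 1, -1)
W = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)


def collide_cell(cell, tau):
    """BGK collision for one cell; returns the post-collision populations."""
    rho = sum(cell)
    u = (cell[1] - cell[2]) / rho
    out = []
    for c, w, f in zip(C, W, cell):
        cu = c * u
        feq = w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * u * u)
        out.append(f - (f - feq) / tau)
    return out


def step_two_pass(f, tau):
    """Unfused update: a full collision pass, then a separate streaming pass."""
    n = len(f)
    post = [collide_cell(cell, tau) for cell in f]  # intermediate array
    new = [[0.0] * 3 for _ in range(n)]
    for x in range(n):
        for i, c in enumerate(C):
            new[(x + c) % n][i] = post[x][i]  # periodic streaming
    return new


def step_fused(f, tau):
    """Fused update: collide a cell and push its populations in one pass."""
    n = len(f)
    new = [[0.0] * 3 for _ in range(n)]
    for x in range(n):
        post = collide_cell(f[x], tau)  # stays local, never stored globally
        for i, c in enumerate(C):
            new[(x + c) % n][i] = post[i]
    return new
```

Both functions perform the same arithmetic in the same order per cell, so they return identical populations. The fused version simply never materializes the full post-collision array.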
ISBN: 9798350387117; 9798350387124 (print)
Anomaly detection plays a critical role in microservices systems by enabling system administrators to promptly detect and respond to anomalies. However, existing anomaly detection systems often require centralizing log and trace data from diverse system components and rely on resource-intensive statistical methods or deep learning models for analysis. This approach impedes real-time anomaly detection and places a significant demand on computing resources. In this paper, we design a multi-agent-based, distributed anomaly detection architecture called MAAD to address these limitations. MAAD consists of a collection of agents that cooperate to identify abnormal behaviors in a distributed manner. Each agent is deployed alongside a single service and applies lightweight machine learning techniques to perform local anomaly detection based on its own logs, local context, and information extracted from its parent span service. To preserve the graph information in a microservices request, agents can communicate essential features with each other, taking into account the collective patterns learned from the prior services. We evaluate the effectiveness of MAAD on two microservices datasets, TrainTicket and MicroSS, and show that MAAD achieves high precision (up to 95.8%) and recall (up to 99.6%), outperforming state-of-the-art centralized anomaly detection approaches. Compared to centralized approaches, MAAD reduces the amount of data transferred before anomaly detection by approximately 88%, facilitating real-time anomaly detection. Furthermore, the lightweight nature of MAAD allows for rapid anomaly detection with minimal impact on microservices systems. Compared to DeepLog, MAAD reduces detection time by approximately 92% without using GPU accelerators.
The Order-Maintenance (OM) data structure maintains a fully ordered list of items and supports insertions, deletions, and comparisons. As a crucial data structure, OM is widely used in various app...
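The three operations named above can be sketched with a deliberately naive, sequential labeling scheme. This is not the structure the abstract refers to; it is only a way to see the interface. The class name `OrderList`, its method names, and the `GAP` constant are all hypothetical. Each item carries an integer label, a comparison is a label comparison, and an insert takes the midpoint between its neighbors' labels, relabeling the whole list when no gap is left.

```python
class OrderList:
    """Naive order-maintenance sketch based on integer labels."""

    GAP = 1 << 16  # spacing between labels after a full relabel

    def __init__(self):
        self._items = []  # items in list order (source of truth)
        self._label = {}  # item -> integer label

    def _relabel(self):
        # Spread labels evenly again; O(n), triggered only when a gap runs out.
        for i, item in enumerate(self._items):
            self._label[item] = (i + 1) * self.GAP

    def insert_after(self, prev, item):
        """Insert item directly after prev; prev=None inserts at the front."""
        if item in self._label:
            raise ValueError("item already present")
        idx = 0 if prev is None else self._items.index(prev) + 1
        lo = self._label[self._items[idx - 1]] if idx > 0 else 0
        if idx < len(self._items):
            hi = self._label[self._items[idx]]
        else:
            hi = lo + 2 * self.GAP
        self._items.insert(idx, item)
        if hi - lo < 2:
            self._relabel()  # no free label between the two neighbors
        else:
            self._label[item] = (lo + hi) // 2

    def delete(self, item):
        """Remove item from the list."""
        self._items.remove(item)
        del self._label[item]

    def precedes(self, x, y):
        """Return True if x comes before y in the list (O(1) label compare)."""
        return self._label[x] < self._label[y]
```

Comparisons here are constant time, but `insert_after` and `delete` scan a Python list and are linear. Practical OM structures use linked lists with more careful relabeling to bound the amortized update cost.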
ISBN: 9798350387117; 9798350387124 (print)
Boundary value problems involving elliptic PDEs such as the Laplace and the Helmholtz equations are ubiquitous in mathematical physics and engineering. Many such problems can be alternatively formulated as integral equations that are mathematically more tractable. However, an integral-equation formulation poses a significant computational challenge: solving the large dense linear systems that arise upon discretization. In cases where iterative methods converge rapidly, existing methods that draw on fast summation schemes such as the Fast Multipole Method are highly efficient and well established. More recently, linear-complexity direct solvers that sidestep convergence issues by directly computing an invertible factorization have been developed. However, their storage and computation costs are high, which limits their ability to solve large-scale problems in practice. In this work, we introduce a distributed-memory parallel algorithm based on an existing direct solver named "strong recursive skeletonization factorization" [1]. Specifically, we apply low-rank compression to certain off-diagonal matrix blocks in a way that minimizes computation and data movement. Compared to iterative algorithms, our method is particularly suitable for problems involving ill-conditioned matrices or multiple right-hand sides. Large-scale numerical experiments are presented to show the performance of our Julia implementation.