ISBN:
(print) 9781939133359
Recently, many applications have required the ability to perform dynamic graph analytical processing (GAP) tasks in real time on the datasets generated by relational OLTP. To meet the two key requirements of performance and freshness, this paper presents GART, an in-memory system that extends hybrid transactional/analytical processing (HTAP) systems to support GAP, resulting in hybrid transactional and graph analytical processing (HTGAP). GART fulfills two unique goals not encountered by HTAP systems. First, to flexibly adapt to rich workloads, GART proposes transparent data model conversion through graph extraction interfaces, which define rules for relational-graph mapping. Second, to ensure GAP performance, GART proposes an efficient dynamic graph storage with good locality that stems from key insights into HTGAP workloads, including (1) an efficient and mutable compressed sparse row (CSR) representation that guarantees the locality of edge scans, (2) a coarse-grained multi-version concurrency control (MVCC) scheme that reduces the temporal and spatial overhead of versioning, and (3) a flexible property storage for efficiently running different GAP workloads. Evaluations show that GART performs several orders of magnitude better than existing solutions in terms of freshness or performance. Meanwhile, for GAP workloads on the LDBC SNB dataset, GART outperforms the state-of-the-art general-purpose dynamic graph storage (i.e., LiveGraph) by up to 4.4x.
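The interplay of a mutable CSR and coarse-grained MVCC can be conveyed with a small sketch. The block below is a hypothetical Python model, not GART's implementation: per-vertex edge segments keep neighbor scans sequential, and a single epoch counter per ingest batch (rather than a version per edge) stands in for the coarse-grained versioning; all class and method names are invented for illustration.

```python
# Hypothetical sketch of a mutable CSR-like store with coarse-grained MVCC,
# loosely modeled on the ideas in the abstract above (not GART's actual code).

class EdgeBlock:
    """Contiguous per-vertex edge segment: appends preserve scan locality."""
    def __init__(self):
        self.dst = []           # destination vertex ids, append-only
        self.create_epoch = []  # epoch when each edge became visible
        self.delete_epoch = []  # epoch when each edge was deleted (None = live)

class MutableCSR:
    def __init__(self, num_vertices):
        self.blocks = [EdgeBlock() for _ in range(num_vertices)]
        self.epoch = 0          # one version per ingest batch, not per edge

    def apply_batch(self, inserts, deletes):
        """Install a batch of updates under a single new epoch (coarse MVCC)."""
        self.epoch += 1
        for src, dst in inserts:
            b = self.blocks[src]
            b.dst.append(dst)
            b.create_epoch.append(self.epoch)
            b.delete_epoch.append(None)
        for src, dst in deletes:
            b = self.blocks[src]
            for i in range(len(b.dst)):
                if b.dst[i] == dst and b.delete_epoch[i] is None:
                    b.delete_epoch[i] = self.epoch
                    break

    def neighbors(self, v, read_epoch):
        """Sequential scan of one block: a snapshot read at `read_epoch`."""
        b = self.blocks[v]
        return [b.dst[i] for i in range(len(b.dst))
                if b.create_epoch[i] <= read_epoch
                and (b.delete_epoch[i] is None or b.delete_epoch[i] > read_epoch)]
```

A reader that records `g.epoch` before new batches arrive can keep scanning that snapshot while writers append, which is the freshness/consistency trade the abstract describes.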
The ubiquity of networking infrastructure in modern life necessitates scrutiny into networking fundamentals to ensure the safety and security of that infrastructure. The formalization of concurrent algorithms, a corne...
In distributed stochastic optimization, where parallel and asynchronous methods are employed, we establish optimal time complexities under virtually any computation behavior of workers/devices/CPUs/GPUs, capturing pot...
ISBN:
(digital) 9798350355543
ISBN:
(print) 9798350355550
This paper addresses the challenges of optimizing task scheduling for a distributed, task-based execution model in OpenMP for cluster computing environments. Traditional OpenMP implementations are primarily designed for shared-memory parallelism and offer limited control over task scheduling. However, improved scheduling mechanisms are critical to achieving performance and portability in distributed and heterogeneous environments. OpenMP Cluster (OMPC) was introduced to overcome these limitations, extending OpenMP with the Heterogeneous Earliest Finish Time (HEFT) task scheduling algorithm tailored for large-scale systems. To improve scheduling and enable better system utilization, the runtime system must resolve challenges such as changes in application balance, the amount of parallelism, and varying communication. This work presents three key contributions: first, the refactoring of the OMPC runtime to unify task scheduling across devices and hosts; second, the optimization of the HEFT-based scheduling algorithm to ensure efficient task execution in distributed environments; and third, an extensive evaluation of Work Stealing and HEFT scheduling mechanisms on real-world clusters. While the HEFT implementation in OMPC is not fully optimized, this work provides a significant step toward improving distributed task scheduling in cluster computing, offering insights and incremental advancements that support the development of scalable and high-performance applications. Results show improvements of up to 24% in scheduling time while opening the door to further extensions of the scheduling methods.
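As context for the HEFT discussion, here is a minimal, self-contained sketch of the classic HEFT algorithm: rank tasks by upward rank, then place each task on the device with the earliest finish time. It is illustrative only, assumes a simple cost model with no insertion-based slot search, and is not the OMPC runtime's implementation; all names are hypothetical.

```python
# Minimal HEFT sketch (illustrative, not OMPC's implementation).
def heft(tasks, succ, cost, comm, devices):
    """tasks: list of ids; succ[t]: successor ids; cost[t][d]: run time of t
    on device d; comm[(t, s)]: transfer time if t and s land on different
    devices."""
    avg = {t: sum(cost[t][d] for d in devices) / len(devices) for t in tasks}

    rank = {}
    def upward_rank(t):  # longest path from t to an exit task
        if t not in rank:
            rank[t] = avg[t] + max((comm[(t, s)] + upward_rank(s)
                                    for s in succ[t]), default=0.0)
        return rank[t]

    device_free = {d: 0.0 for d in devices}   # next free time per device
    finish, placed = {}, {}
    for t in sorted(tasks, key=upward_rank, reverse=True):
        best = None
        for d in devices:
            # Data from predecessors on other devices pays a transfer cost.
            ready = max((finish[p] + (comm[(p, t)] if placed[p] != d else 0.0)
                         for p in tasks if t in succ[p] and p in finish),
                        default=0.0)
            eft = max(ready, device_free[d]) + cost[t][d]
            if best is None or eft < best[0]:
                best = (eft, d)
        finish[t], placed[t] = best
        device_free[best[1]] = best[0]
    return placed, finish

# Two tasks on two devices: t0 feeds t1 with transfer cost 3.
tasks = ["t0", "t1"]
succ = {"t0": ["t1"], "t1": []}
cost = {"t0": {"d0": 4, "d1": 6}, "t1": {"d0": 5, "d1": 2}}
comm = {("t0", "t1"): 3.0}
print(heft(tasks, succ, cost, comm, ["d0", "d1"]))
```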
A program is deterministic if multiple re-executions with the same inputs always lead to the same state. Even concurrent instances of a deterministic program should observe identical behavior, in real time, if ass...
ISBN:
(digital) 9798350369083
ISBN:
(print) 9798350369090
The work explores the demand for Big Data processing and delves into the functioning of large-scale data processing architectures, focusing on batch and real-time processing. The experiment conducted analyzes the execution time of the Apache Hadoop (AH), Apache Spark (AS), and Apache Flink (AF) tools. Results indicate that Spark outperformed Flink and Hadoop across all experiments, demonstrating notable speed advantages. In the first experiment with a 1 GB data source, Spark was 186% faster than Flink and 251% faster than Hadoop. Similarly, in the second experiment with a 3 GB data source, Spark surpassed both competitors, being 233% faster than Hadoop and 334% faster than Flink. Processing a 5 GB data source further highlighted Spark's superiority, with a 197% improvement over Hadoop and 316% over Flink. Despite the absence of parallelism and a distributed execution environment, the findings consistently show that Spark achieved superior performance in batch processing large datasets in a pseudo-distributed cluster setting.
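For readers unfamiliar with the workloads such comparisons time, a representative batch job is a word count over a large text file. The sketch below is a generic PySpark example with hypothetical paths and app name; the abstract does not specify the exact job used.

```python
# Illustrative PySpark word-count batch job of the kind such benchmarks time
# (the input/output paths and app name are hypothetical).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("batch-wordcount").getOrCreate()
lines = spark.sparkContext.textFile("hdfs:///data/input-1gb.txt")
counts = (lines.flatMap(lambda line: line.split())   # tokenize each line
               .map(lambda word: (word, 1))          # emit (word, 1) pairs
               .reduceByKey(lambda a, b: a + b))     # sum counts per word
counts.saveAsTextFile("hdfs:///data/wordcount-out")
spark.stop()
```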
ISBN:
(digital) 9798350364606
ISBN:
(print) 9798350364613
Breaking from the general run of Laplacian solvers that depend on algebraic primitives, we present the first GPU implementation of a message-passing-based solver. Our solver, GPU-LSolve, implements a randomized algorithm that simulates a queueing network in which some nodes act as sources that generate messages and one node acts as a sink that removes messages from the network. The steady state of this network provides a solution to the Laplacian system of equations. We show how the simplicity of the primitives of this algorithm can be leveraged in a GPU setting to provide an efficient implementation that can solve Laplacian systems on million-scale graphs. Our solver takes advantage of GPU parallelism through sorting and key-value reduction. We provide an extensive experimental evaluation on real datasets against several recently developed solvers. The results show that the presented solver remains competitive in terms of both memory footprint and execution time.
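The queueing-network idea can be conveyed with a toy, CPU-only Monte Carlo simulation. The sketch below is our own illustration, not GPU-LSolve: sources inject messages each round, every message steps to a uniformly random neighbor, the sink absorbs arrivals, and the time-averaged occupancy divided by degree is read off as an (unnormalized) solution estimate. Convergence and normalization details are deliberately glossed over.

```python
# Toy Monte Carlo sketch of the queueing-network intuition (not GPU-LSolve).
import random
from collections import Counter

def simulate(adj, sources, sink, rounds=6000, burn_in=2000):
    """adj: node -> list of neighbors; sources: node -> messages injected per
    round; returns time-averaged occupancy / degree per node."""
    queue = Counter()   # messages currently held at each node
    acc = Counter()     # occupancy accumulator after burn-in
    for t in range(rounds):
        for u, rate in sources.items():   # sources generate messages
            queue[u] += rate
        nxt = Counter()
        for u, n in queue.items():
            for _ in range(n):
                v = random.choice(adj[u])
                if v != sink:             # the sink absorbs arriving messages
                    nxt[v] += 1
        queue = nxt
        if t >= burn_in:
            for u, n in queue.items():
                acc[u] += n
    T = rounds - burn_in
    return {u: acc[u] / (T * len(adj[u])) for u in adj}

# Example: path graph 0-1-2 with a source at node 0 and the sink at node 2.
adj = {0: [1], 1: [0, 2], 2: [1]}
print(simulate(adj, sources={0: 5}, sink=2))
```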
Modern Machine Learning (ML) training on large-scale datasets is a very time-consuming workload. It relies on the optimization algorithm Stochastic Gradient Descent (SGD) due to its effectiveness, simplicity, and generalization performance (i.e., test performance on unseen data). Processor-centric architectures (e.g., CPUs, GPUs) commonly used for modern ML training workloads based on SGD are bottlenecked by data movement between the processor and memory units due to the poor data locality in accessing large training datasets. As a result, processor-centric architectures suffer from low performance and high energy consumption while executing ML training workloads. Processing-In-Memory (PIM) is a promising solution to alleviate the data movement bottleneck by placing the computation mechanisms inside or near memory. Several prior works propose PIM techniques to accelerate ML training; however, they either do not consider real-world PIM systems or evaluate algorithms that are not widely used in modern ML training. Our goal is to understand the capabilities and characteristics of popular distributed SGD algorithms on real-world PIM systems to accelerate data-intensive ML training workloads. To this end, we 1) implement several representative centralized parallel SGD algorithms, i.e., those based on a central node responsible for synchronization and orchestration, on the real-world general-purpose UPMEM PIM system, 2) rigorously evaluate these algorithms for ML training on large-scale datasets in terms of performance, accuracy, and scalability, 3) compare to conventional CPU and GPU baselines, and 4) discuss implications for future PIM hardware. We highlight the need for a shift to an algorithm-hardware codesign to enable decentralized parallel SGD algorithms in real-world PIM systems, which significantly reduces the communication cost and improves scalability. Our results demonstrate three major findings: 1) The general-purpose UPMEM PIM system can be a viable alternative…
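As a concrete reference point for "centralized parallel SGD" (a central node that averages worker gradients each step), here is a minimal NumPy sketch on a least-squares problem. It simulates the workers sequentially and does not model the UPMEM PIM offload, communication, or the paper's actual kernels; all names and hyperparameters are illustrative.

```python
# Hypothetical sketch of centralized synchronous parallel SGD on least squares.
import numpy as np

def parallel_sgd(X, y, workers=4, epochs=50, lr=0.1, batch=32, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    shards = np.array_split(rng.permutation(len(X)), workers)  # shard per worker
    for _ in range(epochs):
        grads = []
        for idx in shards:                       # each "worker": local gradient
            b = rng.choice(idx, size=min(batch, len(idx)), replace=False)
            err = X[b] @ w - y[b]
            grads.append(X[b].T @ err / len(b))
        w -= lr * np.mean(grads, axis=0)         # central node averages, updates
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))
y = X @ np.arange(1.0, 6.0)
print(parallel_sgd(X, y))   # should approach [1. 2. 3. 4. 5.]
```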
ISBN:
(digital) 9798350361612
ISBN:
(print) 9798350361629
Research shows that consumers want to know their appliances' energy consumption. Providing consumers with more information about their energy use, and giving them more control over it, can lead to choices that reduce energy consumption. This research proposes a sensor network of Smart Electricity Meters (SEMs) that uses a Data Distribution Service (DDS) publish/subscribe middleware (DPSM) for real-time, fine-grained sensing of electric loads. The SEMs are connected to household appliances to measure their power consumption and send it to a server for storage and data analysis. Our results indicate an average throughput of 302 bytes/second and an average latency of 25.44 ms. The proposed solution demonstrates promise for real-time monitoring of household appliances' power consumption in Smart Grid (SG) environments.
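To make the publish/subscribe data flow concrete, the sketch below shows the pattern in plain Python with an in-process broker. It is not the DDS (DPSM) API: real DDS middleware adds discovery, QoS policies, and network transport that this toy omits, and the topic name and sample format are invented.

```python
# Generic topic-based publish/subscribe sketch (not the DDS API itself).
import time
from collections import defaultdict

class Broker:
    def __init__(self):
        self.subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, sample):
        for cb in self.subscribers[topic]:     # deliver to every subscriber
            cb(sample)

broker = Broker()
# The server-side subscriber that would store samples for analysis.
broker.subscribe("power/kitchen", lambda s: print("server stored:", s))

# A meter publishing one power-consumption sample (values are illustrative).
broker.publish("power/kitchen", {"watts": 812.5, "ts": time.time()})
```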