Fast and scalable data transfer is crucial in today's decentralized data ecosystems and data-driven applications. Example use cases include transferring data from operational systems to consolidated data warehouse environments, or from relational database systems to data lakes for exploratory data analysis or ML model training. Traditional data transfer approaches rely on efficient point-to-point connectors or general middleware with generic intermediate data representations. Physical environments (e.g., on-premise, cloud, or consumer nodes) have also become increasingly heterogeneous. Existing work still struggles to achieve both fast and scalable data transfer and generality with respect to heterogeneous systems and environments. Hence, in this paper, we introduce a holistic data transfer framework. Our XDBC framework splits the data transfer pipeline into logical components and provides a wide variety of physical implementations for these components. This design allows seamless integration of different systems as well as automatic optimization of data transfer configurations according to workload and environment characteristics. Our evaluation shows that XDBC outperforms state-of-the-art generic data transfer tools by up to 5x, while being on par with specialized approaches.
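The abstract does not show XDBC's API, but the core idea of splitting the pipeline into logical components with interchangeable physical implementations, chosen per workload and environment, can be sketched. The sketch below is a hypothetical illustration in Python; all names and the toy selection rule are assumptions, not the paper's method.

```python
# Hypothetical sketch (not the XDBC API): a transfer pipeline described as
# logical components, each bound to one of several physical implementations,
# so a configuration can be chosen per environment. All names are illustrative.
from dataclasses import dataclass


@dataclass
class PipelineConfig:
    reader: str        # e.g. "postgres", "csv"
    serializer: str    # e.g. "arrow", "text"
    compressor: str    # e.g. "zstd", "none"
    transport: str     # e.g. "tcp", "rdma"
    writer: str        # e.g. "parquet", "warehouse"
    parallelism: int   # workers per component


def choose_config(network_gbps: float, cpu_cores: int) -> PipelineConfig:
    """Toy stand-in for workload/environment-aware optimization:
    compress only when the network, not the CPU, is the bottleneck."""
    compressor = "zstd" if network_gbps < 10 else "none"
    return PipelineConfig(
        reader="postgres",
        serializer="arrow",
        compressor=compressor,
        transport="tcp",
        writer="parquet",
        parallelism=min(cpu_cores, 8),
    )


print(choose_config(network_gbps=1.0, cpu_cores=16))
```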
The RAPIDS Memory Manager (RMM) is a package developed by NVIDIA that enables developers to customize GPU memory allocation. RMM enables the use of pool allocation, which can greatly improve performance. This paper proposes a systematic profiling and evaluation framework that leverages NVIDIA's RMM to optimize and understand the data-loading performance of the ***_csv operation in GPU-accelerated environments. It examines RMM's impact from multiple angles: measuring the execution time required to complete the operation, measuring the effect on memory consumption, and profiling the operation with and without RMM across various dataset sizes. The findings demonstrate that RMM can deliver a significant speedup of up to 24% by improving the memory management strategy of cuDF. Other time series data preprocessing operations improved by 14% overall when utilizing RMM. RMM can also improve the scalability of cuDF by utilizing managed memory to overcome GPU memory constraints, allowing cuDF to handle datasets that exceed GPU memory while maintaining roughly 10x faster execution than the CPU-based pandas DataFrame. The effect of RMM on GPU memory consumption is also highlighted, indicating a trade-off between faster execution and increased memory consumption.
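As a rough illustration of the setup the abstract describes, the sketch below enables an RMM pool allocator (with managed memory as an option for datasets larger than GPU memory) before timing a cuDF CSV load. It assumes a CUDA-enabled environment with rmm and cudf installed; the file path, pool size, and timing harness are illustrative, not the paper's benchmark.

```python
# Minimal sketch: configure RMM before a cuDF CSV load, then time the load.
# Assumes rmm and cudf are available; the dataset path is a placeholder.
import time

import cudf
import rmm

# Use a pooled device memory resource so repeated allocations during CSV
# parsing are served from a pre-reserved pool instead of individual cudaMalloc calls.
rmm.reinitialize(
    pool_allocator=True,      # enable pool allocation
    managed_memory=False,     # set True to allow datasets larger than GPU memory
    initial_pool_size=2**30,  # 1 GiB starting pool (tune to the workload)
)

start = time.perf_counter()
df = cudf.read_csv("timeseries.csv")  # illustrative dataset
elapsed = time.perf_counter() - start
print(f"rows={len(df)} load_time={elapsed:.3f}s")
```

Running the same load with and without the `rmm.reinitialize` call (and with `managed_memory=True` for oversized datasets) is one simple way to reproduce the kind of with/without comparison the paper reports.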