Existing block-based parallel file systems, which are deployed in the storage area network (SAN), blend metadata with data in underlying disks. Unfortunately, such a symmetric architecture is prone to system-level failures, as metadata on shared disks can be damaged by a malfunctioning client. In this paper, we present an asymmetric block-based parallel file system, Redbud, which isolates the metadata storage in the metadata server (MDS) access domain. Although centralized metadata management can effectively improve the reliability of the system, it faces challenges in providing high performance and availability. Towards this end, we introduce an embedded directory mechanism to exploit the disk bandwidth of the metadata storage; we also introduce adaptive layout operations to deliver high I/O throughput for various file access patterns. In addition, by taking the MDS's load into consideration, we propose an adaptive timeout algorithm that makes MDS failure detection adaptive to evolving workloads, improving system availability. Measurements of a wide range of workloads demonstrate the benefits of our design and show that Redbud achieves good scalability.
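A minimal sketch of the idea behind a load-adaptive failure-detection timeout follows; the scaling rule, constants, and names are illustrative assumptions, not Redbud's published algorithm:

    # Sketch: scale an MDS failure-detection timeout with observed load,
    # so a heavily loaded server is not declared dead prematurely.
    # All names, constants, and the scaling rule are assumptions.

    BASE_TIMEOUT_S = 2.0   # timeout under an idle MDS (assumed)
    MAX_TIMEOUT_S = 30.0   # cap so real failures are still detected promptly

    def adaptive_timeout(recent_rtts, load_factor):
        """recent_rtts: recent request round-trip times in seconds;
        load_factor: MDS load in [0, 1], e.g. queue depth / capacity."""
        if not recent_rtts:
            return BASE_TIMEOUT_S
        avg_rtt = sum(recent_rtts) / len(recent_rtts)
        # Stretch the timeout as the server gets busier.
        timeout = max(BASE_TIMEOUT_S, 4 * avg_rtt) * (1 + 3 * load_factor)
        return min(timeout, MAX_TIMEOUT_S)

    print(adaptive_timeout([0.5, 0.8, 0.6], load_factor=0.7))  # ~7.9 s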
Parallel file systems are serving more and more applications from various fields. Different applications have different I/O workload characteristics, which place diverse requirements on access to storage resources. However, parallel file systems often adopt a "one-size-fits-all" solution, which fails to meet specific application needs and hinders the full exploitation of potential performance. This paper presents a framework that enables dynamic file I/O path selection at fine granularity at runtime. The framework adopts a file handle-rich scheme that allows file systems to choose corresponding optimizations to serve I/O requests. Consistency control algorithms are proposed to ensure data consistency while changing optimizations at runtime. One case study on our prototype shows that choosing proper optimizations can improve the I/O performance for small files and large files by up to 40% and 64.4%, respectively. Another case study shows that the data prefetch performance for real-world application traces can be improved by up to 193% by selecting correct prefetch patterns. Simulations in a large-scale environment also show that our method is scalable and that both the memory consumption and the consistency control overhead are negligible.
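A minimal sketch of what handle-rich dispatch might look like, assuming hypothetical field and strategy names (the paper's actual handle layout is not shown here):

    # Sketch: a "rich" file handle carrying the optimization chosen for
    # the file, letting the file system dispatch each request to the
    # matching I/O path at runtime. Fields and strategies are invented.
    from dataclasses import dataclass

    @dataclass
    class RichHandle:
        inode: int
        strategy: str   # e.g. "small_file_pack" or "large_file_stripe"
        epoch: int      # bumped when the strategy changes, for consistency

    def serve_read(handle: RichHandle, offset: int, length: int) -> str:
        # Requests carrying a stale epoch would be revalidated and retried.
        if handle.strategy == "small_file_pack":
            return f"read {length}B via packed small-file path"
        return f"read {length}B via striped large-file path"

    h = RichHandle(inode=42, strategy="small_file_pack", epoch=1)
    print(serve_read(h, 0, 4096))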
The imbalanced I/O load on large parallel file systems affects the parallel I/O performance of high-performance computing (HPC) applications. One of the main reasons for I/O imbalances is the lack of a global view of system-wide resource consumption. While approaches to address the problem already exist, the diversity of HPC workloads combined with different file striping patterns prevents widespread adoption of these approaches. In addition, load-balancing techniques should be transparent to client applications. To address these issues, we propose Tarazu, an end-to-end control plane where clients transparently and adaptively write to a set of selected I/O servers to achieve balanced data placement. Our control plane leverages real-time load statistics for global data placement on distributed storage servers, while our design model employs trace-based optimization techniques to minimize latency for I/O load requests between clients and servers and to handle multiple striping patterns in files. We evaluate our proposed system on an experimental cluster for two common use cases: the synthetic I/O benchmark IOR and the scientific application I/O kernel HACC-I/O. We also use a discrete-time simulator with real HPC application traces from emerging workloads running on the Summit supercomputer to validate the effectiveness and scalability of Tarazu in large-scale storage environments. The results show improvements in load balancing and read performance of up to 33% and 43%, respectively, compared to the state-of-the-art.
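As a rough illustration of load-driven placement, the sketch below picks the least-loaded servers for a file's stripes; the load metric and selection rule are assumptions, not Tarazu's actual policy:

    # Sketch: choose the k least-loaded I/O servers for a new file's
    # stripes, based on real-time load statistics reported by servers.
    import heapq

    def place_stripes(server_load, stripe_count):
        """server_load: {server_id: pending bytes}; returns stripe targets."""
        return heapq.nsmallest(stripe_count, server_load, key=server_load.get)

    load = {"ost0": 900, "ost1": 120, "ost2": 430, "ost3": 50}
    print(place_stripes(load, stripe_count=2))  # ['ost3', 'ost1']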
The semantics of HPC storage systems are defined by the consistency models by which they abide. Storage consistency models have been less studied than their counterparts in memory systems, with the exception of the POSIX standard and its strict consistency model. The use of POSIX consistency imposes a performance penalty that becomes more significant as the scale of parallel file systems increases and the access time to storage devices, such as node-local solid-state devices, decreases. While some efforts have been made to adopt relaxed storage consistency models, these models are often defined informally and ambiguously, as by-products of a particular implementation. In this work, we establish a connection between memory consistency models and storage consistency models and revisit the key design choices of storage consistency models from a high-level perspective. Further, we propose a formal and unified framework for defining storage consistency models and a layered implementation that can be used to easily evaluate their relative performance for different I/O workloads. Finally, we conduct a comprehensive performance comparison of two relaxed consistency models on a range of commonly seen parallel I/O workloads, such as checkpoint/restart of scientific applications and random reads of deep learning applications. We demonstrate that for certain I/O scenarios, a weaker consistency model can significantly improve the I/O performance. For instance, for the small random reads typically found in deep learning applications, session consistency achieved a 5x improvement in I/O bandwidth over commit consistency, even at small scales.
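The difference can be made concrete with a toy model, sketched below under invented semantics (not the paper's formal framework): a session pins the snapshot visible when it opens, so writes published later stay invisible to it:

    # Toy sketch contrasting relaxed models: writes become visible only
    # once published (commit / session close), and a reader session sees
    # the snapshot current at the time it opened. API is invented.

    class Store:
        def __init__(self):
            self.snapshots = [{}]   # snapshots[v] = visible data at version v
            self.pending = {}       # writes not yet published

        def write(self, key, value):
            self.pending[key] = value

        def publish(self):          # commit() / session close() both end here
            snap = dict(self.snapshots[-1])
            snap.update(self.pending)
            self.snapshots.append(snap)
            self.pending.clear()

        def open_session(self):     # a session pins the snapshot at open time
            return self.snapshots[-1]

    store = Store()
    reader = store.open_session()   # session opened before the write
    store.write("ckpt/rank0", b"data")
    store.publish()
    print("old session sees:", reader.get("ckpt/rank0"))                 # None
    print("new session sees:", store.open_session().get("ckpt/rank0"))   # b'data'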
ISBN (Print): 9798400701559
Log-based anomaly detection has been extensively studied to help detect complex runtime anomalies in production systems. However, existing techniques exhibit several common issues. First, they rely heavily on expert-labeled logs to discern anomalous behavior patterns, but manually labeling enough log data to effectively train deep neural networks may take too long. Second, they rely on numeric model predictions over numeric vector inputs, which makes model decisions largely non-interpretable by humans and further rules out targeted error correction. In recent years, we have witnessed groundbreaking advancements in large language models (LLMs) such as ChatGPT. These models have proven their ability to retain context and formulate insightful responses over entire conversations. They also offer few-shot and in-context learning with reasoning ability. In light of these abilities, it is only natural to explore their applicability to understanding log content and classifying anomalies in parallel file system logs.
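As a rough sketch of the few-shot idea, the snippet below assembles a classification prompt from labeled example logs; the logs, labels, and prompt wording are invented, and the call to the model itself is omitted:

    # Sketch: build a few-shot prompt asking an LLM to classify a
    # parallel file system log line as normal or anomalous. Example
    # logs and labels are invented for illustration.

    FEW_SHOT = [
        ("OST0003 completed recovery in 12s", "normal"),
        ("LustreError: 11-0: MGC timed out, evicting client", "anomalous"),
    ]

    def build_prompt(log_line: str) -> str:
        parts = ["Classify each log line as 'normal' or 'anomalous'.\n"]
        for text, label in FEW_SHOT:
            parts.append(f"Log: {text}\nLabel: {label}\n")
        parts.append(f"Log: {log_line}\nLabel:")
        return "\n".join(parts)

    print(build_prompt("ldlm_lock_enqueue failed: rc = -107"))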
Driven by the increasing requirements of high-performance computing applications, supercomputers are prone to containing more and more computing nodes. Applications running on such a large-scale computing system are likely to spawn millions of parallel processes, which usually generate a burst of I/O requests, introducing a great challenge into the metadata management of the underlying parallel file system. The traditional method used to overcome such a challenge is adopting multiple metadata servers in a scale-out manner, which inevitably confronts serious network and consistency problems. This work instead pursues enhancing metadata performance in a scale-up manner. Specifically, we propose to improve the performance of each individual metadata server by employing a GPU to handle metadata requests in parallel. Our proposal designs a novel metadata server architecture, which employs the CPU to interact with file system clients while offloading the computing tasks about metadata onto the GPU. To take full advantage of the parallelism available in the GPU, we redesign the in-memory data structure for the namespace of the file system. The new data structure fits the memory architecture of the GPU well, and thus helps to exploit the large number of parallel threads within the GPU to serve bursty metadata requests concurrently. We implement a prototype based on BeeGFS and conduct extensive experiments to evaluate our proposal, and the experimental results demonstrate that our GPU-based solution outperforms the CPU-based scheme by more than 50% under typical metadata workloads. This superiority is strengthened further in highly concurrent scenarios, e.g., high-performance computing systems supporting millions of parallel threads.
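To give a flavor of a GPU-friendly namespace layout, the sketch below (in Python for brevity; a real implementation would target GPU memory) uses a flat open-addressing table with no pointers, which one thread per request could probe independently; the sizes and hash are assumptions, not the paper's design:

    # Sketch: flat, fixed-size open-addressing table mapping
    # (parent_inode, name) -> inode. A contiguous, pointer-free array
    # suits GPU memory, with one thread probing per metadata request.

    CAP = 1 << 10          # table slots (power of two for cheap masking)
    EMPTY = None
    table = [EMPTY] * CAP  # each slot: (parent_inode, name, inode)

    def slot_hash(parent, name):
        return hash((parent, name)) & (CAP - 1)

    def insert(parent, name, inode):
        i = slot_hash(parent, name)
        while table[i] is not EMPTY:       # linear probing on collision
            i = (i + 1) & (CAP - 1)
        table[i] = (parent, name, inode)

    def lookup(parent, name):
        i = slot_hash(parent, name)
        while table[i] is not EMPTY:
            p, n, ino = table[i]
            if p == parent and n == name:
                return ino
            i = (i + 1) & (CAP - 1)
        return None

    insert(1, "home", 2)
    insert(2, "user", 3)
    print(lookup(2, "user"))  # 3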
ISBN (Print): 9781510860162
For modern HPC systems, failures are treated as the norm rather than the exception. To avoid rerunning applications from scratch, checkpoint/restart techniques are employed to periodically checkpoint intermediate data to parallel file systems. To increase HPC checkpointing speed, distributed burst buffers (DBB) have been proposed to use node-local NVRAM to absorb the bursty checkpoint data. However, without proper coordination, DBB is prone to suffer from low resource utilization. To solve this problem, we propose an NVRAM-based burst buffer coordination system, named collaborative distributed burst buffer (CDBB). CDBB coordinates all the available burst buffers, based on their priorities and states, to help overburdened burst buffers and maximize resource utilization. We built a proof-of-concept prototype and tested CDBB at the Minnesota Supercomputing Institute. Compared with a traditional DBB system, CDBB can speed up checkpointing by up to 8.4x under medium and heavy workloads while introducing only negligible overhead.
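A minimal sketch of such coordination, under an assumed utilization threshold and a least-utilized-first rule rather than CDBB's actual priority scheme:

    # Sketch: redirect overflow checkpoint data from an overloaded
    # node-local burst buffer to the idlest peers. Threshold and
    # selection rule are assumptions.

    def choose_helpers(buffers, local_id, overflow_bytes, threshold=0.8):
        """buffers: {id: (used_bytes, capacity_bytes)}; returns helpers."""
        helpers = []
        for bid, (used, cap) in sorted(buffers.items(),
                                       key=lambda kv: kv[1][0] / kv[1][1]):
            if bid == local_id or used / cap >= threshold:
                continue  # skip self and already-busy buffers
            take = min(overflow_bytes, int(cap * threshold) - used)
            helpers.append((bid, take))
            overflow_bytes -= take
            if overflow_bytes <= 0:
                break
        return helpers

    bufs = {"n0": (95, 100), "n1": (10, 100), "n2": (50, 100)}
    print(choose_helpers(bufs, "n0", overflow_bytes=60))  # [('n1', 60)]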
Modern High-Performance Computing (HPC) environments face mounting challenges due to the shift from large to small file datasets, along with an increasing number of users and parallelized applications. As HPC systems rely on parallel file systems (PFS), such as Lustre, for data processing, performance bottlenecks stemming from Object Storage Target (OST) contention have become a significant concern. Existing solutions, such as LADS with its object-level scheduling approach, fall short in large-scale HPC environments due to their inability to effectively address metadata I/O bottlenecks and the growing number of I/O processes. This study highlights the pressing need for a comprehensive solution that tackles both OST contention and metadata I/O challenges in diverse HPC workloads. To address these challenges, we propose SwiftLoad, an object-level I/O scheduling framework that leverages a metadata catalog to enhance the performance and efficiency of parallel HPC utilities. The adoption of the metadata catalog mitigates the metadata I/O bottlenecks that commonly occur in HPC utilities, a challenge that is particularly pronounced in object-level I/O scheduling. SwiftLoad addresses OST contention and the uneven distribution of I/O processes across different OSTs through mathematical modeling and incorporates a Loader Configuration Module to regulate the number of I/O processes. Evaluated with two representative utilities, data deduplication profiling and data augmentation, SwiftLoad achieved performance improvements of up to 5.63x and 11.0x, respectively, on a production supercomputer.
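As an illustration of regulating loader processes, the sketch below splits a capped process budget across OSTs in proportion to the bytes each must serve; the proportional rule is an assumption, not SwiftLoad's exact model:

    # Sketch: assign a capped number of loader processes to OSTs in
    # proportion to the bytes each OST holds, so no OST is oversubscribed.

    def assign_loaders(ost_bytes, max_procs):
        total = sum(ost_bytes.values())
        plan = {ost: max(1, round(max_procs * b / total))
                for ost, b in ost_bytes.items()}
        # Trim if rounding overshot the global process budget.
        while sum(plan.values()) > max_procs:
            busiest = max(plan, key=plan.get)
            plan[busiest] -= 1
        return plan

    print(assign_loaders({"ost0": 600, "ost1": 300, "ost2": 100},
                         max_procs=8))  # {'ost0': 5, 'ost1': 2, 'ost2': 1}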
ExSeisDat is designed using the standard message passing interface (MPI) library for seismic data processing on high-performance supercomputing clusters. These clusters are generally designed for efficient execution of complex tasks, including large-size I/O. I/O performance degradation arises when multiple processes try to access data from parallel networked storage. These complications are caused by the restrictive protocols run by the parallel file system (PFS) controlling the disks, as well as by limited advancement in the storage hardware itself. Addressing them requires tuning specific configuration parameters to optimize I/O performance, which users focused on writing parallel applications commonly do not consider. Even when considered, the configuration parameters must be changed from case to case, adding further I/O performance degradation for a large SEG-Y format seismic data file scaling to petabytes. SEG-Y I/O and file sorting are two of the main operations of ExSeisDat. This paper proposes a technique to optimize these SEG-Y operations based on artificial neural networks (ANNs). The optimization involves auto-tuning the related configuration parameters, using I/O bandwidth predictions made by trained ANN models through a machine learning (ML) process. Furthermore, we discuss the impact of varying the hidden-layer node configuration of the ANNs on prediction accuracy, together with a statistical analysis of the auto-tuned bandwidth results. The results show overall bandwidth improvements of up to 108.8% and 237.4% in the combined SEG-Y I/O and file sorting test cases, respectively. This paper thus demonstrates a significant gain in SEG-Y seismic data bandwidth performance by auto-tuning the parameter settings at runtime using an ML approach.
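A minimal sketch of such an auto-tuning loop, with the trained ANN replaced by a stub predictor and an invented parameter space:

    # Sketch: auto-tune I/O parameters by predicting bandwidth for each
    # candidate setting with a trained model and keeping the best one.
    from itertools import product

    def predict_bandwidth(stripe_count, stripe_size_mb, procs):
        # Stand-in for the trained ANN; a real model would be loaded here.
        return stripe_count * stripe_size_mb * 0.5 + procs * 1.2

    def autotune():
        candidates = product([4, 8, 16],     # stripe counts (assumed)
                             [1, 4, 16],     # stripe sizes in MB (assumed)
                             [8, 16, 32])    # process counts (assumed)
        return max(candidates, key=lambda c: predict_bandwidth(*c))

    print(autotune())  # parameter triple with the highest predicted bandwidth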
Burst Buffer is widely used in supercomputer centers to bridge the performance gap between computational power and the I/O system. The primary role of a Burst Buffer is to temporarily absorb bursty I/O and reduce the heavy load on the parallel file system (PFS). However, the job resource manager on High-Performance Computing (HPC) systems prefers a dedicated Burst Buffer allocation approach, which eventually leaves the Burst Buffer resource severely underutilized. To improve the efficiency of using the expensive Burst Buffer resource, we analyze the I/O patterns on Burst Buffer in depth. We propose BBOS, a Burst Buffer over-subscription allocation method that improves Burst Buffer utilization by allowing each job to access the Burst Buffer only during its I/O phases, so that jobs can overlap each other. Furthermore, we develop a new I/O congestion-aware scheduler and a transparent data management system between the Burst Buffer and the PFS. Our approach also reduces the memory overhead and improves the data persistence of the data management system by adopting persistent memory. With the proposed approach, not only can Burst Buffer utilization be improved, but HPC applications can also achieve high I/O performance by exploiting the powerful Burst Buffer hardware capabilities. Experimental results show that BBOS can improve Burst Buffer utilization by up to 120% while guaranteeing more stable and higher checkpoint performance even under high I/O loads, compared to other state-of-the-art schedulers. In addition, our approach can improve the hit ratio of restart requests by up to 96.4% and provides up to 210% higher restart throughput on the Burst Buffer.
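A toy sketch of the over-subscription idea, admitting a job's I/O phase only while in-flight buffer usage stays under capacity; the admission rule is an assumption, not BBOS's actual scheduler:

    # Sketch: over-subscribe a burst buffer by granting jobs access only
    # during their I/O phases, so one job's compute phase overlaps
    # another job's I/O phase.

    def admit(active_io_bytes, request_bytes, bb_capacity):
        """Admit an I/O phase if in-flight burst-buffer usage stays under
        capacity; otherwise the job waits until an earlier phase drains."""
        return active_io_bytes + request_bytes <= bb_capacity

    in_flight = 60
    for job, need in [("jobA", 30), ("jobB", 50)]:
        if admit(in_flight, need, bb_capacity=100):
            in_flight += need
            print(job, "admitted; in-flight =", in_flight)
        else:
            print(job, "queued until an earlier I/O phase completes")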