With the gap between computing power and I/O performance growing ever wider on HPC systems, it is becoming crucial to optimize how applications perform I/O on storage resources. To achieve this, a good understanding o...
ISBN: (Print) 9798400705977
Quantum annealers like those from D-Wave Systems implement adiabatic quantum computing to solve optimization problems, but their analog nature and limited control functionalities present challenges to correcting or mitigating errors. As quantum computing advances towards applications, effective error suppression is an important research goal. We propose a new approach called replication-based mitigation (RBM), built on parallel quantum annealing. In RBM, physical qubits representing the same logical qubit are dispersed across different copies of the problem embedded in the hardware. This mitigates hardware biases, is compatible with the limited qubit connectivity of current annealers, and is suited to available noisy intermediate-scale quantum (NISQ) annealers. Our experimental analysis shows that RBM provides solution quality on par with previous methods while being compatible with a much wider range of hardware connectivity patterns. In comparisons against standard quantum annealing without error mitigation, RBM consistently improves the energies and ground state probabilities across parameterized problem sets.
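To make the replication idea concrete, the following is a minimal Python sketch under stated assumptions: a toy 4-spin Ising instance, hardware bias mocked as random perturbations of the local fields, and each "copy" solved by brute force rather than on an annealer. It illustrates dispersing and then aggregating replicated readouts; it is not the paper's embedding workflow or the D-Wave Ocean API.

```python
"""Illustrative sketch of replication-style bias mitigation (toy example,
not the paper's implementation): solve several biased copies of the same
Ising problem and majority-vote each logical spin across copies."""
import itertools
import random

def ising_energy(spins, h, J):
    """Energy of a spin assignment under local fields h and couplers J."""
    e = sum(h[i] * spins[i] for i in h)
    e += sum(Jij * spins[i] * spins[j] for (i, j), Jij in J.items())
    return e

def biased_solve(h, J, bias_scale=0.3, rng=random):
    """Mock one hardware copy: perturb the fields, then brute-force minimize."""
    h_noisy = {i: hi + rng.uniform(-bias_scale, bias_scale) for i, hi in h.items()}
    n = len(h)
    return min(itertools.product((-1, 1), repeat=n),
               key=lambda s: ising_energy(s, h_noisy, J))

def replicated_sample(h, J, copies=5, rng=random):
    """Solve several independent copies and majority-vote each logical spin."""
    samples = [biased_solve(h, J, rng=rng) for _ in range(copies)]
    n = len(h)
    voted = tuple(1 if sum(s[i] for s in samples) >= 0 else -1 for i in range(n))
    return voted, samples

if __name__ == "__main__":
    # Tiny 4-spin ring with a weak field (hypothetical problem instance).
    h = {0: 0.1, 1: 0.0, 2: -0.1, 3: 0.0}
    J = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}
    voted, _ = replicated_sample(h, J, copies=7)
    print("voted spins:", voted, "energy:", ising_energy(voted, h, J))
```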
ISBN: (Print) 9798400704970
We present a new on-device pipeline that efficiently summarizes lecture videos and provides relevant answers directly from a smartphone. We utilize widely accessible tools like OCR and Vosk speech-to-text, coupled with powerful large language models (LLMs), to identify crucial sentences and generate summaries. By harnessing the capabilities of LLMs and the computational power of mobile devices, we fine-tune and quantize BERT and GPT-2 to achieve efficient lecture video summarization and question answering on consumer-grade smartphones like the Pixel 8 Pro. Notably, this approach eliminates the need for cloud APIs, ensuring enhanced user privacy and minimal mobile data usage. https://***/shorts/zwGdONlKays
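As a rough illustration of the extractive stage only, the sketch below assumes the OCR output and the Vosk transcript have already been merged into one text string and substitutes a plain term-frequency heuristic for the fine-tuned, quantized BERT/GPT-2 models described above; the function names (split_sentences, summarize) are hypothetical.

```python
"""Toy extractive summarizer standing in for the model-based ranking stage."""
import re
from collections import Counter

def split_sentences(text):
    """Naive sentence splitter; adequate for an illustration."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def summarize(text, max_sentences=3):
    """Rank sentences by the frequency of their (lowercased) content words."""
    sentences = split_sentences(text)
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    def score(sentence):
        toks = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in toks) / (len(toks) or 1)
    ranked = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Keep the original lecture order so the summary still reads coherently.
    return [s for s in sentences if s in ranked]

if __name__ == "__main__":
    transcript = ("Today we cover gradient descent. Gradient descent updates "
                  "parameters along the negative gradient. The learning rate "
                  "controls the step size. See you next week.")
    print(summarize(transcript, max_sentences=2))
```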
ISBN: (Print) 9798400712869
Debugging in production cloud systems (or live debugging) is a critical yet challenging task for on-call developers due to the financial impact of cloud service downtime and the inherent complexity of cloud systems. Unfortunately, how debugging is performed and the unique challenges faced in the production cloud environment have not been investigated in detail. In this paper, we perform the first fine-grained, observational study of 93 real-world debugging experiences of production cloud failures in 15 widely adopted open-source distributed systems, including distributed storage systems, databases, computing frameworks, message passing systems, and container orchestration systems. We examine each debugging experience with a fine-grained lens and categorize over 1700 debugging steps across all incidents. Our study provides a detailed picture of how developers perform various diagnosis activities, including failure reproduction, anomaly analysis, program analysis, hypothesis formulation, information collection, and online experiments. Highlights of our study include: (1) Analyses of the taxonomies and distributions of both live debugging activities and the underlying reasons for hypothesis forking, which confirm the presence of expert debugging strategies in production cloud systems and offer insights to guide the training of novice developers and the development of tools that emulate expert behavior. (2) The identification of the primary challenge in anomaly detection (or observability) for end-to-end debugging: the collection of system-specific data (17.1% of data collected). In comparison, nearly all (96%) invariants utilized to detect anomalies are already present in existing monitoring tools. (3) The identification of the importance of online interventions (i.e., in-production experiments that alter system execution) for live debugging - they are performed as frequently as information collection - together with an investigation of different types of interventions and the challenges involved.
We examine the problem of smoothed online optimization, where a decision maker must sequentially choose points in a normed vector space to minimize the sum of per-round, non-convex hitting costs and the costs of switching decisions between rounds. The decision maker has access to a black-box oracle, such as a machine learning model, that provides untrusted and potentially inaccurate predictions of the optimal decision in each round. The goal of the decision maker is to exploit the predictions if they are accurate, while guaranteeing performance that is not much worse than the hindsight optimal sequence of decisions, even when predictions are inaccurate. We impose the standard assumption that hitting costs are globally alpha-polyhedral. We propose a novel algorithm, Adaptive Online Switching (AOS), and prove that, for a large set of feasible delta > 0, it is (1 + delta)-competitive if predictions are perfect, while also maintaining a uniformly bounded competitive ratio of 2^{Õ(1/(alpha delta))} even when predictions are adversarial. Further, we prove that this trade-off is necessary and nearly optimal, in the sense that any deterministic algorithm which is (1 + delta)-competitive if predictions are perfect must be at least 2^{Ω̃(1/(alpha delta))}-competitive when predictions are inaccurate. In fact, we observe a unique threshold-type behavior in this trade-off: if delta is not in the set of feasible options, then no algorithm is simultaneously (1 + delta)-competitive if predictions are perfect and zeta-competitive when predictions are inaccurate for any finite zeta. Furthermore, we show that memory is crucial in AOS by proving that any algorithm that does not use memory cannot benefit from predictions. We complement our theoretical results with a numerical study on a microgrid application.
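The consistency/robustness trade-off can be illustrated with a toy, memory-based switching rule. The sketch below is not the AOS algorithm itself; it assumes 1-D decisions, hitting costs |x - v_t|, switching costs |x_t - x_{t-1}|, and a greedy hitting-cost minimizer as the robust fallback.

```python
"""Toy prediction-aware switching for 1-D smoothed online optimization
(illustration of the consistency/robustness idea, not the paper's AOS)."""

def hitting(x, v):
    return abs(x - v)

def switching(x, prev):
    return abs(x - prev)

def run(vs, predictions, delta=0.5, x0=0.0):
    """Follow the predictions while their running cost stays within (1 + delta)
    of a simple robust baseline; otherwise fall back to the baseline's point."""
    cost_adv = cost_rob = cost_play = 0.0
    prev_adv = prev_rob = prev_play = x0
    for v, p in zip(vs, predictions):
        robust = v  # greedy baseline: minimize this round's hitting cost
        cost_adv += hitting(p, v) + switching(p, prev_adv)
        cost_rob += hitting(robust, v) + switching(robust, prev_rob)
        choice = p if cost_adv <= (1.0 + delta) * cost_rob else robust
        cost_play += hitting(choice, v) + switching(choice, prev_play)
        prev_adv, prev_rob, prev_play = p, robust, choice
    return cost_play

if __name__ == "__main__":
    vs = [1.0, 1.2, 0.9, 5.0, 5.1]
    good = vs                      # perfect predictions
    bad = [10.0] * len(vs)         # badly wrong predictions
    print("perfect advice:", run(vs, good), "bad advice:", run(vs, bad))
```

Note that the rule needs memory (the accumulated costs of both trajectories); a memoryless rule could not tell good advice from bad, which is the intuition behind the paper's claim that memory is essential for benefiting from predictions.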
In current discrete GPU systems, the penalty of data movement between host and device memory is inevitable, forcing many large-scale applications to include optimizations that amortize this cost. On systems like the A...
ISBN: (Print) 9798400701481
Deep-learning-based video analysis solutions have become indispensable components in today's intelligent sensing applications. In a networked camera system, an efficient way to analyze the captured videos is to extract the features for deep learning at local cameras or edge devices and then transmit the features to powerful processing hubs for further analysis. As there exists substantial redundancy among different feature maps from the same video frame, the feature maps could be compressed before transmission to save bandwidth. This paper introduces a new rate-distortion optimized framework for compressing the intermediate deep features from the key frames of a video. First, to reduce the redundancy among different features, a feature selection strategy is designed based on hierarchical clustering. The selected features are then quantized, repacked as videos, and further compressed using a standardized video encoder. Furthermore, the proposed framework incorporates rate-distortion models that are built for three representative computer vision tasks: image classification, image segmentation, and image retrieval. A corresponding rate-distortion optimization module is designed to enhance the performance of common computer vision tasks under rate constraints. Experimental results show that the proposed deep feature compression framework can boost the compression performance over the standard HEVC video encoder.
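The two stages that the abstract spells out most concretely, redundancy-aware feature selection and quantization, might look roughly as follows. The sketch assumes a (C, H, W) numpy feature tensor and correlation-based hierarchical clustering, and leaves the video repacking, HEVC encoding, and rate-distortion models out of scope; it is not the paper's exact pipeline.

```python
"""Sketch of cluster-based feature selection plus 8-bit quantization."""
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def select_representatives(features, n_keep):
    """Hierarchically cluster channels and keep one representative per cluster."""
    C = features.shape[0]
    flat = features.reshape(C, -1)
    Z = linkage(flat, method="average", metric="correlation")
    labels = fcluster(Z, t=n_keep, criterion="maxclust")
    keep = []
    for cluster_id in np.unique(labels):
        members = np.where(labels == cluster_id)[0]
        # Representative: the channel with the highest variance in its cluster.
        keep.append(members[np.argmax(flat[members].var(axis=1))])
    return sorted(keep)

def quantize_uint8(features):
    """Uniform 8-bit quantization; returns codes plus the range for dequantization."""
    lo, hi = float(features.min()), float(features.max())
    scale = (hi - lo) or 1.0
    codes = np.round((features - lo) / scale * 255).astype(np.uint8)
    return codes, (lo, hi)

if __name__ == "__main__":
    fmap = np.random.rand(64, 28, 28).astype(np.float32)  # hypothetical feature tensor
    kept = select_representatives(fmap, n_keep=16)
    codes, rng = quantize_uint8(fmap[kept])
    print(len(kept), "channels kept;", codes.shape, "quantized, value range", rng)
```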
In a series of related works developing an ensemble consistency testing approach for multiple popular global climate models (GCMs), one test scenario has repeatedly stood out. Why does the use of the Fused Multiply-Ad...
Information leaks are a significant problem in modern software systems. In recent years, information-theoretic concepts, such as Shannon entropy, have been applied to quantifying information leaks in programs. One recent approach is to use symbolic execution together with model counting constraint solvers in order to quantify information leakage. There are at least two reasons for unsoundness in quantifying information leakage with this approach: 1) symbolic execution may not be able to explore all execution paths, and 2) model counting constraint solvers may not be able to provide an exact count. We present a sound symbolic quantitative information flow analysis that bounds the information leakage both for the case where the program behavior is not fully explored and for the case where the model counting constraint solver is unable to provide a precise model count but does provide an upper and a lower bound. We implemented our approach as an extension to KLEE for computing sound bounds on information leakage in C programs.
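One simple way to see how such bounds can arise (not the paper's exact analysis) is the deterministic-program, uniform-secret setting: leakage equals the Shannon entropy of the induced output distribution, an upper bound treats every unexplored input as its own output class (splitting classes can only increase entropy), and a lower bound merges all unexplored inputs into the largest explored class (merging can only decrease it). The helper below assumes exact counts for explored paths; handling interval model counts is out of scope.

```python
"""Toy entropy-based leakage bounds with unexplored inputs (illustration only)."""
import math

def entropy(probs):
    """Shannon entropy in bits of a probability vector, ignoring zero entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def leakage_bounds(path_counts, unexplored, total):
    """path_counts: exact input counts per explored output class; total: #secrets."""
    assert sum(path_counts) + unexplored == total
    explored = [c / total for c in path_counts]
    # Upper bound: every unexplored input becomes its own output class.
    upper = entropy(explored) + (unexplored / total) * math.log2(total)
    # Lower bound: all unexplored inputs collapse into the largest explored class.
    merged = sorted(path_counts, reverse=True)
    merged[0] += unexplored
    lower = entropy([c / total for c in merged])
    return lower, upper

if __name__ == "__main__":
    # Hypothetical run: 3 explored paths covering 200 of 256 possible secrets.
    lo, hi = leakage_bounds([120, 60, 20], unexplored=56, total=256)
    print(f"leakage in [{lo:.3f}, {hi:.3f}] bits")
```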
ISBN: (Print) 9798400701962
The proceedings contain 8 papers. The topics discussed include: VE-match: video encoding matching-based model for cloud and edge computing instances; studying green video distribution as a whole; end-to-end optimizations for green streaming; audience aware streaming: new dynamics in OTT distribution; green video complexity analysis for efficient encoding in adaptive video streaming; energy efficiency improvements in software-based video encoding; video decoding energy reduction using temporal-domain filtering; and the analysis of DASH manifest optimizations.