Search results - Inner Mongolia University Library

Autonomous Execution for Multi-GPU Systems: Compiler Support

Proceedings of the SC '24 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis

Authors: Javid Baydamirli, Tal Ben Nun, Didem Unat (Koç University, Istanbul, Turkey; Lawrence Livermore National Laboratory, Livermore, California, USA)

ISBN: (print) 9798350355543

Recent trends in HPC systems increasingly emphasize accelerators, particularly GPUs, as autonomous execution units, shifting control of entire program execution to GPUs. Communication libraries enable devices to move data independently among one another, bringing forth latency improvements, and first-party GPU runtimes expose APIs for kernels to organize their execution. Despite the trends and advancements, current high-level frameworks and compilers lack support for constructs enabling this autonomous execution. In this work, we aim to bridge this gap with a compiler and provide a productive method for writing efficient GPU-first code. We design and develop a code generator that efficiently fuses and schedules persistent kernels, provides high-level abstractions over device resources, and enables GPU-initiated communication within Python code using NVSHMEM to realize autonomous multi-GPU execution. We compare our implementation to other accelerated Python compilers including CuPy, DaCe, and cuNumeric on 22 NPBench kernels. We additionally perform a scaling study of distributed 2D/3D Jacobi and observe a speedup of 6.1x and 30.8x over DaCe and cuNumeric, respectively, on 8 GPUs for the 3D case with a scaling efficiency of 98%.
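The abstract above quotes a scaling efficiency of 98% without defining it. As a hedged illustration only, the sketch below uses one common weak-scaling definition (single-GPU time divided by N-GPU time, with the per-GPU problem size held fixed). The function name and the timings are hypothetical and are not taken from the paper.

```python
def weak_scaling_efficiency(t_single: float, t_parallel: float) -> float:
    """Weak-scaling efficiency: T(1) / T(N) at a fixed per-GPU problem size.

    Assumption: this is one common definition; the abstract does not say
    which definition the authors used.
    """
    if t_single <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_single / t_parallel


# Hypothetical timings in seconds (illustrative only, not measured data).
eff = weak_scaling_efficiency(t_single=9.8, t_parallel=10.0)
print(f"weak-scaling efficiency: {eff:.0%}")
```

Under this definition, an efficiency close to 100% means the run time stays nearly constant as GPUs and total problem size grow together.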

Keywords: GPU-initiated communication

Origin-destination (OD) analysis based on big taxi trajectory data with XStar (Demo Paper)

29th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, SIGSPATIAL 2021

Authors: Li, Xiang; He, Yijun (East China Normal University, Shanghai, China; School of Geographic Sciences, East China Normal University, Shanghai, China)

ISBN: (print) 9781450386647

In this paper, we demonstrate how to conduct OD analysis based on big taxi trajectory data with XStar in an efficient manner. XStar, originally developed by the first author, is a standalone software system dedicated to trajectory-data users with little programming skill and affordable computing devices. Between its release in Jan. 2019 and May 2021, it was downloaded more than 4,000 times. © 2021 Owner/Author.

Keywords: Trajectories

Analysis of NVMe-SSD to Passthrough GPU Data Transfer in Virtualized Systems (2021)

17th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE)

Authors: Vediappan, Arunkumar; Mishra, Debadatta (Indian Inst Technol Kanpur, Kanpur, Uttar Pradesh, India)

ISBN: (print) 9781450383943

Non-volatile storage (NVM) technologies provide faster data access compared to traditional hard disk drives and can benefit applications executing on accelerators like general purpose graphics processing units (GPGPUs). Many contemporary GPU-friendly applications process huge volumes of data residing in secondary storage. Several research works propose techniques to optimize data transfer overheads between devices connected to the same bus, e.g., peer-to-peer data transfer between an NVMe-SSD and a GPU connected to a PCI bus. The applicability of these techniques, the extent of their benefit, and their associated costs in virtualized systems are the scope of this paper. In this paper, we present a comprehensive empirical analysis of different combinations of NVMe-SSD virtualization techniques and data transfer mechanisms between NVMe-SSDs and GPUs. Further, the impact of different data transfer parameters and a root-cause analysis of the resulting performance, in terms of data transfer throughput and CPU utilization for different combinations of techniques, are presented. Based on the empirical analysis, we provide insights to address several bottlenecks related to different GPU data transfer techniques in different virtualization setups and motivate an alternate design by extending the VirtIO framework for efficient peer-to-peer data transfer.

Keywords: Virtualization Performance measurement monitoring and analysis SSD GPU

Teaching an Intersectional Data Analysis on Affirmative Action (2023)

Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 2

Authors: Olivia Dias, Raechel Walker, Cynthia Breazeal (Massachusetts Institute of Technology, Cambridge, MA, USA)

ISBN: (print) 9781450394338

Algorithmic decision-making (ADM) systems can be used to perform a task as inconsequential as recommending a song on Spotify or to make a decision that is instrumental to someone's life, such as determining their candidacy for college. If an algorithm is trained on biased data, it can propagate prejudice. Thus, it is pertinent to find methods to decrease ADM bias. This paper presents a way to potentially mitigate ADM bias by teaching high school students an intersectional data analysis activity that incorporates the second pillar of the liberatory computing framework, critical consciousness. This activity is designed to enable high school students to understand the bias and history behind the college admission process, which allows students to develop a critical consciousness. Establishing a critical consciousness will diversify the computing field and the data incorporated into ADM systems by encouraging minoritized high school students to get a degree in computer science. The National Institute of Standards and Technology (NIST) suggests that diversifying the computing field has the potential to reduce bias in ADM systems. Thus, the activity is focused on students developing a critical consciousness. This paper discusses the preliminary findings from teaching a two-day computing activity to high school students.

Keywords: algorithmic-decision making; data science; liberatory computing framework; critical consciousness; data analysis; affirmative action

On the Selection of Relevant Hardware Events for Explaining Execution Time Behavior

11th Brazilian Symposium on Computing Systems Engineering (SBESC)

Authors: Andrade, Tadeu Nogueira C.; Lima, George; Cadena Lima, Veronica Maria; Abdeddaim, Yasmina; Grosjean, Liliana Cucu (Univ Fed Bahia, Comp Inst, Salvador, BA, Brazil; Univ Fed Bahia, Dept Stat, Salvador, BA, Brazil; Univ Gustave Eiffel, LIGM, CNRS, Inria Paris, Kopernic Res Grp, Paris, France; Inria Paris, Kopernic Res Grp, Paris, France)

ISBN: (print) 9781665443111

Estimating safe upper bounds on task execution times is required in the design of predictable real-time systems. When multi-core, instruction pipeline, branch prediction, or cache memory are in place, due to the considerable complexity static timing analysis faces, measurement-based timing analysis (MBTA) is a more tractable option. MBTA estimates upper bounds on execution times using data measured under the execution of representative scenarios. In this context, it is paramount to understand not only how the task execution time is affected during its execution but also what kind of interference the task is sensitive to. Events such as cache misses or pipeline stalls, for example, may lead to large variability in task execution times. Based on the fact that current platforms offer Performance Monitoring Units (PMUs) capable of counting hardware-level event occurrences, in this paper, we focus on the problem of selecting the events that have the most impact on task execution with the goal of enriching the collected information to better support MBTA. Unfortunately, PMUs usually have a limited number of monitoring registers, making them unable to monitor all events at once. Our approach describes how to carry out the event selection even under this limitation. Results from our experiments, considering 15 different programs running on a Raspberry Pi, indicate that five selected events can explain the execution behavior of the programs with reasonable accuracy.

Keywords: real-time systems; hardware events; measurement-based timing analysis; multi-core architectures

Understanding and Predicting Cross-Application I/O Interference in HPC Storage Systems

Proceedings of the SC '24 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis

Authors: Chris Egersdoerfer, Md. Hasanur Rashid, Dong Dai, Bo Fang, Tallent Nathan (Department of Computer and Information Sciences, University of Delaware; Pacific Northwest National Laboratory)

ISBN: (print) 9798350355543

On High Performance Computing (HPC) systems, where multiple concurrent workloads may read and write vast amounts of data stored through a shared network on storage servers, competition for I/O resources between workloads is inevitable. Previous work has thoroughly recognized the impact of such competition-introduced resource contention, highlighting its potential to impact the performance of individual applications significantly. However, no prior work on such an issue has investigated the quantitative impact of inter-application I/O contention on individual applications, impeding a more efficient resource provision strategy. In this work, we first exemplify the dynamics of I/O interference towards I/O patterns and system status. We then propose a framework for collecting fine-grained I/O traces from applications and concurrent server-side metrics and train a machine learning model to accurately predict the existence of I/O interference and its quantitative impacts. Our results show that it is feasible to learn the complex factors and relationships which cause applications to underperform in the presence of I/O interference. Additionally, we show that a trained model can accurately predict the impact of I/O interference on HPC applications with F1 scores exceeding 90% for both synthetic benchmarks and real-world applications.
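The abstract above reports F1 scores exceeding 90%. For readers unfamiliar with the metric, the sketch below computes the standard binary F1 score (the harmonic mean of precision and recall) from confusion-matrix counts. The function name and the example counts are hypothetical and do not come from the paper, which may also aggregate F1 differently across classes.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Binary F1 score from true positives, false positives, false negatives."""
    # Precision: fraction of predicted positives that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of actual positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (illustrative only, not results from the paper).
score = f1_score(tp=90, fp=10, fn=10)
print(f"F1 = {score:.2f}")
```

Because F1 ignores true negatives, it is a common choice when the interesting class (here, runs affected by interference) may be the minority.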

Barriers to Expertise in Citizen Science Games (2022)

CHI Conference on Human Factors in Computing Systems (CHI)

Authors: Miller, Josh Aaron; Cooper, Seth (Northeastern Univ, Boston, MA 02115, USA)

ISBN: (print) 9781450391573

Expertise-centric citizen science games (ECCSGs) can be powerful tools for crowdsourcing scientific knowledge production. However, to be effective these games must train their players on how to become experts, which is difficult in practice. In this study, we investigated the path to expertise and the barriers involved by interviewing players of three ECCSGs: Foldit, Eterna, and Eyewire. We then applied reflexive thematic analysis to generate themes of their experiences and produce a model of expertise and its barriers. We found expertise is constructed through a cycle of exploratory and social learning but prevented by instructional design issues. Moreover, exploration is slowed by a lack of polish to the game artifact, and social learning is disrupted by a lack of clear communication. Based on our analysis we make several recommendations for CSG developers, including: collaborating with professionals of required skill sets; providing social features and feedback systems; and improving scientific communication.

Keywords: citizen science games expertise game design thematic analysis

A Measurement Study of Wechat Mini-Apps

2021 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2021

Authors: Zhang, Yue; Turkistani, Bayan; Yang, Allen Yuqing; Zuo, Chaoshun; Lin, Zhiqiang (The Ohio State University, Columbus, United States)

ISBN: (print) 9781450380720

A new mobile computing paradigm, dubbed mini-app, has been growing rapidly over the past few years since being introduced by WeChat in 2017. In this paradigm, a host app allows its end-users to install and run mini-apps inside itself, enabling the host app to build an ecosystem around itself (much like Google Play and Apple AppStore), enrich the host's functionalities, and offer mobile users elevated convenience without leaving the host app. It has been reported that there are millions of mini-apps in WeChat. However, little information is known about these mini-apps at an aggregated level. In this paper, we present MiniCrawler, the first scalable and open-source WeChat mini-app crawler that has indexed over 1,333,308 mini-apps. It leverages a number of reverse engineering techniques to uncover the interfaces and APIs in WeChat for crawling the mini-apps. With the crawled mini-apps, we then measure the resource consumption, API usage, library usage, obfuscation rate, app categorization, and app ratings at an aggregated level in this work. © 2021 Owner/Author.

Keywords: Application programming interfaces (API)

IoT-Based Approaches for Monitoring the Particulate Matter and Its Impact on Health

IEEE Internet of Things Journal, 2021, vol. 8, no. 15, pp. 11983-12003

Authors: Divan, Mario J.; Sanchez-Reynoso, Maria Laura; Panebianco, Juan Esteban; Mendez, Mariano J. (Natl Univ La Pampa, Data Sci Res Grp, Econ Sch, RA-6300 Santa Rosa, Argentina; Natl Sci & Tech Res Council CONICET, Environm & Earth Sci Inst La Pampa, L6302EPA Santa Rosa, Argentina)

Scenario: Particulate matter (PM) comprises all particles (solid and liquid) suspended in the air. Depending on the kind and size of the particle, each one represents different kinds of risks for human health. The emergence of tiny, available, and accessible devices related to the Internet of Things (IoT) has allowed the implementation of different monitoring strategies. Objective: To identify and characterize the IoT-based real-time monitoring strategies that have implemented a measurement process to study the effect of PM on human health. Methodology: A wide analysis based on the systematic mapping study was performed on September 4, 2020. The Association for Computing Machinery (ACM), IEEE, ScienceDirect, SpringerLink, Scopus, and Wiley databases were considered in the exploration. Results: 48 articles addressing IoT-based PM measurement were obtained, published between 2010 and 2020 with growing interest. The main use of this technology is to increase the coverage and density of environmental monitoring stations due to the impact of PM on human health. Also, approaches to monitoring air quality and their potential effects on people's affections are described. Conclusions: Collaborative, people-aware, global proposals tend to get increasing interest. Only six (12.5%) articles incorporated some recommendation system based on PM measures. Accuracy and precision are the main concerns around low-cost sensors for measuring PM. Thus, the calibration process is highlighted in 64.44% of articles. The main challenges reside in a combination of uncertainties in PM measurement, health impacts, data quality, and the influence of environmental variables on all of them.

Keywords: Monitoring; Atmospheric measurements; Particle measurements; Sensors; Real-time systems; Temperature measurement; Systematics; Intelligent systems; measurement; monitoring; particulate matter (PM); systematic mapping study (SMS)

Measurement Virtualization Technologies for Intelligent Information and Measurement Systems

24th International Conference on Soft Computing and Measurements, SCM 2021

Author: Brusakova, Irina A. (Saint-Petersburg Electrotechnical University 'LETI', Saint Petersburg, Russia)

ISBN: (print) 9781665439749

The article presents theoretical and methodological approaches to the formation of an ensemble of intelligent measurements for the tasks of metrological analysis and synthesis using virtual measuring circuits. A comparative analysis of measurement virtualization technologies for various platform solutions is presented. The procedures for integrating an ensemble of intelligent measurements into virtual measuring instruments are considered. The concept of an information-measuring system as a multi-agent system is presented. The concept of an intelligent agent for the tasks of reconfiguration of the measuring circuit has been introduced. © 2021 IEEE.

Keywords: Multi agent systems
