Search Results - Inner Mongolia University Library

Search query: "Any field = 7th ACM/IEEE International Workshop on Extreme Scale Programming Models and Middleware, ESPM2 2022"

10 records in total; showing records 1-10


Proceedings of 2022 ACM/IEEE 7th International Workshop on Extreme Scale Programming Models and Middleware, ESPM2 2022, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis

7th ACM/IEEE International Workshop on Extreme Scale Programming Models and Middleware, ESPM2 2022

ISBN: (print) 9781665463393

The proceedings contain 3 papers. The topics discussed include: a selective nesting approach for the sparse multi-threaded Cholesky factorization; from merging frameworks to merging stars: experiences using HPX, Kokkos and SIMD types; and broad performance measurement support for asynchronous multi-tasking with APEX.


Broad Performance Measurement Support for Asynchronous Multi-Tasking with APEX

IEEE/ACM 7th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2)

Author: Huck, Kevin A. Affiliation: Univ Oregon, Oregon Adv Comp Inst Sci & Soc (OACISS), Eugene, OR 97403, USA

ISBN: (print) 9781665463393

APEX (Autonomic Performance Environment for eXascale) is a performance measurement library for distributed, asynchronous multitasking runtime systems. It provides support for both lightweight measurement and high concurrency. To support performance measurement in systems that employ user-level threading, APEX uses a dependency chain in addition to the call stack to produce traces and task dependency graphs. APEX also provides a runtime adaptation system based on the observed system performance. In this paper, we describe the evolution of APEX from its design for HPX to support an array of programming models and abstraction layers and describe some of the features that have evolved to help understand the asynchrony and high concurrency of asynchronous tasking models.

Keywords: asynchronous multitasking; performance measurement; HPX; OpenMP; Kokkos; GPU programming


Design of Data Management for Multi SPMD Workflow Programming Model

IEEE/ACM 4th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2)

Authors: Dufaud, Thomas; Tsuji, Miwako; Sato, Mitsuhisa. Affiliations: Univ Paris Sud, CEA, CNRS, Digiteo Labs, UVSQ, INRIA, Maison Simulat, USR 3441, Bat 565, F-91191 Gif Sur Yvette, France; UVSQ, EA 7432, LI PaRAD, 45 Ave Etats Unis, F-78035 Versailles, France; RIKEN R CCS, Ctr Computat Sci, Chuo Ku, 7-1-26 Minatojima Minami Machi, Kobe, Hyogo 6500047, Japan

ISBN: (print) 9781728101781

As the complexity of both algorithms and architectures increases, development of scientific software becomes a challenge. In order to exploit future architectures, we consider a Multi-SPMD workflow programming model. Data transfer between tasks during computation then depends heavily on the architecture and middleware used. In this study we design an adaptive system for data management in a parallel programming environment which can express two levels of parallelism. We show how the consideration of multiple strategies based on I/O and direct message passing can improve performance and fault tolerance in the YML-XMP environment. On a real application with a sufficiently large amount of local data, speedups ranging from 1.36 for a mixed strategy to 1.73 for a direct message passing method are obtained compared to our original design.

Keywords: Task analysis; Parallel processing; Distributed databases; Programming; Computational modeling; Computer architecture; Data models


Automatic Generation of High-Order Finite-Difference Code with Temporal Blocking for Extreme-Scale Many-Core Systems

IEEE/ACM 4th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2)

Authors: Tanaka, Hideyuki; Ishihara, Youhei; Sakamoto, Ryo; Nakamura, Takashi; Kimura, Yasuyuki; Nitadori, Keigo; Tsubouchi, Miyuki; Makino, Jun. Affiliations: ExaScaler Inc, Chiyoda Ku, 2-1 Ogawa Machi, Tokyo 1010052, Japan; Kyoto Univ, Yukawa Inst Theoret Phys, Sakyo Ku, Kyoto 6068502, Japan; RIKEN R CCS, Chuo Ku, 7-1-26 Minatojima Minami Machi, Kobe, Hyogo 6500047, Japan; PEZY Comp KK, Chiyoda Ku, 1-11 Ogawa Machi, Tokyo 1010052, Japan; Kobe Univ, Nada Ku, 1-1 Rokkodai Cho, Kobe, Hyogo 6578501, Japan

ISBN: (print) 9781728101781

In this paper we describe the basic idea, implementation and achieved performance of our DSL for stencil computation, Formura, on systems based on the PEZY-SC2 many-core processor. From a high-level description of the differential equation and a simple description of the finite-difference stencil, Formura generates the entire simulation code, with MPI parallelization with overlapped communication and calculation, advanced temporal blocking and parallelization for many-core processors. Achieved performance is 4.78 PF, or 21.5% of the theoretical peak performance, for an explicit scheme for compressive CFD with fourth-order accuracy in space and third-order accuracy in time. For a slightly modified implementation of the same scheme, efficiency was slightly lower (17.5%) but the actual calculation time per timestep was faster by 25%. Temporal blocking improved the performance by up to 70%. Even though the B/F number of the PEZY-SC2 is low, around 0.02, we have achieved efficiency comparable to that of highly optimized CFD codes on machines with much higher memory bandwidth, such as the K computer. We have demonstrated that automatic generation of code with temporal blocking is a quite effective way to make use of very large-scale machines with low memory bandwidth for large-scale CFD calculations.

Keywords: Bandwidth; Mathematical model; Finite difference methods; DSL; Computational modeling; Memory management; Upper bound


Message from the Workshop Chairs

International Workshop on Extreme Scale Programming Models and Middleware (ESPM2)


Workshop Organization

International Workshop on Extreme Scale Programming Models and Middleware (ESPM2)


Message from the Workshop Chairs

Proceedings of 2022 ACM/IEEE 7th International Workshop on Extreme Scale Programming Models and Middleware, ESPM2 2022, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis, 2022, p. IV

Authors: Subramoni, Hari; Panda, Dhabaleswar K.; Shafi, Aamir; Schulz, Karl W. Affiliations: The Ohio State University, United States; AMD


Broad Performance Measurement Support for Asynchronous Multi-Tasking with APEX

International Workshop on Extreme Scale Programming Models and Middleware (ESPM2)

Author: Kevin A. Huck. Affiliation: Oregon Advanced Computing Institute for Science and Society (OACISS), University of Oregon, Eugene, Oregon, USA

ISBN: (print) 9781665463409

Keywords: Measurement; Concurrent computing; Runtime; System performance; Graphics processing units; Programming; Multitasking


A Selective Nesting Approach for the Sparse Multi-threaded Cholesky Factorization

International Workshop on Extreme Scale Programming Models and Middleware (ESPM2)

Authors: Valentin Le Fèvre; Tetsuzo Usui; Marc Casas. Affiliations: Barcelona Supercomputing Center, Barcelona, Spain; Next Generation Technical Computing Unit, Fujitsu Limited, Kawasaki, Japan; Barcelona Supercomputing Center, Universitat Politècnica de Catalunya (UPC), Barcelona, Spain

ISBN: (print) 9781665463409

Sparse linear algebra routines are fundamental building blocks of a large variety of scientific applications. Direct solvers, which are methods for solving linear systems via the factorization of matrices into products of triangular matrices, are commonly used in many contexts. The Cholesky factorization is the fastest direct method for symmetric and positive definite matrices. This paper presents selective nesting, a method to determine the optimal task granularity for the parallel Cholesky factorization based on the structure of sparse matrices. We propose the Opt-D algorithm, which automatically and dynamically applies selective nesting. Opt-D leverages matrix sparsity to drive complex task-based parallel workloads in the context of direct solvers. We run an extensive evaluation campaign considering a heterogeneous set of 35 sparse matrices and a parallel machine featuring the A64FX processor. Opt-D delivers an average performance speedup of 1.75× with respect to the best state-of-the-art parallel methods to run direct solvers.

Keywords: Context; Linear systems; Symmetric matrices; Heuristic algorithms; Linear algebra; Programming; Parallel machines


From Merging Frameworks to Merging Stars: Experiences using HPX, Kokkos and SIMD Types

International Workshop on Extreme Scale Programming Models and Middleware (ESPM2)

Authors: Gregor Daiß; Srinivas Yadav Singanaboina; Patrick Diehl; Hartmut Kaiser; Dirk Pflüger. Affiliations: IPVS, University of Stuttgart, Stuttgart, 70174 Stuttgart, Germany; LSU Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803, U.S.A.; Department of Physics and Astronomy, Louisiana State University, Baton Rouge, LA 70803, U.S.A.

ISBN: (print) 9781665463409

Octo-Tiger, a large-scale 3D AMR code for the merger of stars, uses a combination of HPX, Kokkos and explicit SIMD types, aiming to achieve performance-portability for a broad range of heterogeneous hardware. However, on A64FX CPUs, we encountered several missing pieces, hindering performance by causing problems with the SIMD vectorization. Therefore, we add std::experimental::simd as an option to use in Octo-Tiger’s Kokkos kernels alongside Kokkos SIMD, and further add a new SVE (Scalable Vector Extensions) SIMD backend. Additionally, we amend missing SIMD implementations in the Kokkos kernels within Octo-Tiger’s hydro solver. We test our changes by running Octo-Tiger on three different CPUs: an A64FX, an Intel Icelake and an AMD EPYC CPU, evaluating SIMD speedup and node-level performance. We get a good SIMD speedup on the A64FX CPU, as well as noticeable speedups on the other two CPU platforms. However, we also experience a scaling issue on the EPYC CPU.

Keywords: Three-dimensional displays; Codes; Conferences; Merging; Stars; Programming; Hardware



Address: No. 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
