Search Results - Inner Mongolia University Library

Document type

361 conference papers
46 journal articles

Collection scope

407 electronic documents
0 print holdings

Subject classification

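351 papers: Engineering
- 296 papers: Software Engineering
- 287 papers: Computer Science and Technology...
- 13 papers: Electronic Science and Technology (may...
- 7 papers: Information and Communication Engineering
- 7 papers: Control Science and Engineering
- 4 papers: Mechanical Engineering
- 4 papers: Electrical Engineering
- 4 papers: Bioengineering
- 3 papers: Biomedical Engineering (may confer...
- 2 papers: Power Engineering and Engineering Thermo...
- 1 paper: Mechanics (may confer Engineering, Sci...
- 1 paper: Architecture
- 1 paper: Civil Engineering
- 1 paper: Chemical Engineering and Technology
- 1 paper: Nuclear Science and Technology
- 1 paper: Agricultural Engineering
- 1 paper: Environmental Science and Engineering (may...
61 papers: Science
- 55 papers: Mathematics
- 6 papers: Systems Science
- 4 papers: Biology
- 4 papers: Statistics (may confer Science, ...
- 3 papers: Chemistry
- 1 paper: Physics
17 papers: Management
- 12 papers: Management Science and Engineering (may...
- 9 papers: Business Administration
- 5 papers: Library, Information and Archives Manage...
4 papers: Education
- 4 papers: Education
3 papers: Economics
- 3 papers: Applied Economics
2 papers: Law
- 2 papers: Sociology
1 paper: Agriculture
- 1 paper: Crop Science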

Topic

72 papers: performance
49 papers: parallel process...
46 papers: parallel program...
43 papers: algorithms
40 papers: languages
34 papers: design
22 papers: gpu
21 papers: parallel algorit...
12 papers: experimentation
12 papers: measurement
10 papers: parallel computi...
9 papers: theory
8 papers: mpi
7 papers: parallelism
7 papers: graphics process...
7 papers: parallel
7 papers: openmp
7 papers: concurrency
6 papers: multicore
5 papers: reliability

Institution

7 papers: carnegie mellon ...
4 papers: univ wisconsin d...
4 papers: indiana univ blo...
4 papers: shanghai jiao to...
3 papers: univ of tokyo
3 papers: tsinghua univ de...
3 papers: univ chinese aca...
3 papers: massachusetts in...
3 papers: univ illinois ur...
3 papers: swiss fed inst t...
3 papers: mit csail united...
3 papers: tsinghua univ pe...
3 papers: univ utah sch co...
3 papers: rice univ housto...
3 papers: univ calif berke...
3 papers: univ texas austi...
2 papers: ist austria klos...
2 papers: fudan univ sch c...
2 papers: princeton univ d...
2 papers: georgetown univ ...

Author

8 papers: blelloch guy e.
7 papers: chen haibo
6 papers: hoefler torsten
6 papers: garland michael
6 papers: zhai jidong
6 papers: shun julian
5 papers: sun yihan
5 papers: tsigas philippas
4 papers: dhulipala laxman
4 papers: pingali keshav
4 papers: chen wenguang
4 papers: tan guangming
4 papers: wang haojie
4 papers: nikolopoulos dim...
4 papers: long guoping
4 papers: valero mateo
4 papers: mellor-crummey j...
4 papers: gu yan
4 papers: leiserson charle...
4 papers: kennedy ken

Language

380 papers: English
26 papers: Other
1 paper: Portuguese

Search query: "Any field = 16th ACM Symposium on Principles and Practice of Parallel Programming"

407 records in total; showing records 31-40

Provably Fast and Space-Efficient Parallel Biconnectivity

28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2023

Authors: Dong, Xiaojun; Wang, Letong; Gu, Yan; Sun, Yihan

Affiliations: UC Riverside

ISBN: (print) 9798400700156

Computing biconnected components (BCC) of a graph is a fundamental graph problem. The canonical parallel BCC algorithm is the Tarjan-Vishkin algorithm, which has O(n + m) optimal work and polylogarithmic span on a graph with n vertices and m edges. However, Tarjan-Vishkin is not widely used in practice. We believe the reason is the space-inefficiency (it uses O(m) extra space). In practice, existing parallel implementations are based on breadth-first search (BFS). Since BFS has span proportional to the diameter of the graph, existing parallel BCC implementations suffer from poor performance on large-diameter graphs and can be slower than the sequential algorithm on many real-world graphs. We propose the first parallel biconnectivity algorithm (FAST-BCC) that has optimal work, polylogarithmic span, and is space-efficient. Our algorithm creates a skeleton graph based on any spanning tree of the input graph. Then we use the connectivity information of the skeleton to compute the biconnectivity of the original input. We carefully analyze the correctness of our algorithm, which is highly non-trivial. We implemented FAST-BCC and compared it with existing implementations, including GBBS, Slota and Madduri's algorithm, and the sequential Hopcroft-Tarjan algorithm. We tested them on a 96-core machine on 27 graphs with varying edge distributions. FAST-BCC is the fastest on all graphs. On average (geometric means), FAST-BCC is 3.1× faster than the best existing baseline on each graph. © 2023 Owner/Author.

Keywords: Graphic methods

Merchandiser: Data Placement on Heterogeneous Memory for Task-Parallel HPC Applications with Load-Balance Awareness

28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2023

Authors: Xie, Zhen; Liu, Jie; Li, Jiajia; Li, Dong

Affiliations: University of California Argonne National Laboratory Merced United States University of California Merced United States North Carolina State University United States

ISBN: (print) 9798400700156

The emergence of heterogeneous memory (HM) provides a cost-effective and high-performance solution to memory-consuming HPC applications. Deciding the placement of data objects on HM is critical for high performance. We reveal a performance problem related to data placement on HM. The problem is manifested as load imbalance among tasks in task-parallel HPC applications. The root of the problem comes from being unaware of parallel-task semantics and an incorrect assumption that bringing frequently accessed pages to fast memory always leads to better performance. To address this problem, we introduce a load balance-aware page management system, named Merchandiser. Merchandiser introduces task semantics during memory profiling, rather than being application-agnostic. Using the limited task semantics, Merchandiser effectively sets up coordination among tasks on the usage of HM to finish all tasks fast instead of only considering any individual task. Merchandiser is highly automated to enable high usability. Evaluating with memory-consuming HPC applications, we show that Merchandiser reduces load imbalance and leads to an average of 17.1% and 15.4% (up to 26.0% and 23.2%) performance improvement, compared with a hardware-based solution and an industry-quality software-based solution. © 2023 ACM.

Keywords: Semantics

POSTER: Parallel Algorithms for Masked Sparse Matrix-Matrix Products

27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)

Authors: Milakovic, Srdan; Selvitopi, Oguz; Nisa, Israt; Budimlic, Zoran; Buluc, Aydin

Affiliations: Rice Univ Houston Houston TX USA; Lawrence Berkeley Nat Lab Berkeley Berkeley CA USA; AWS Palo Alto Palo Alto CA USA

ISBN: (print) 9781450392044

Computing the product of two sparse matrices (SpGEMM) is a fundamental operation in various combinatorial and graph algorithms as well as various bioinformatics and data analytics applications for computing inner-product similarities. For an important class of algorithms, only a subset of the output entries are needed, and the resulting operation is known as Masked SpGEMM since a subset of the output entries is considered to be "masked out". In this work, we investigate various novel algorithms and data structures for this rather challenging and important computation, and provide guidelines on how to design a fast Masked-SpGEMM for shared-memory architectures.

Keywords: Masked-SpGEMM; Sparse Matrix; GraphBLAS
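
The Masked SpGEMM operation described in this abstract can be illustrated with a small sequential sketch. It is not the authors' algorithm or data structure. The function name `masked_spgemm`, the dict-of-dicts matrix layout, and the convention that the mask lists the output positions to keep are all assumptions made for illustration.

```python
def masked_spgemm(a, b, mask):
    """Compute C = A @ B, keeping only the output positions listed in `mask`.

    a, b : sparse matrices stored as {row: {col: value}} (assumed layout)
    mask : set of (row, col) output positions to keep (assumed convention)
    """
    # Group the wanted columns by row so rows outside the mask are never touched.
    wanted = {}
    for i, j in mask:
        wanted.setdefault(i, set()).add(j)

    c = {}
    for i, cols in wanted.items():
        row_acc = {}
        for k, a_ik in a.get(i, {}).items():
            for j, b_kj in b.get(k, {}).items():
                if j in cols:  # skip products the mask would discard anyway
                    row_acc[j] = row_acc.get(j, 0) + a_ik * b_kj
        if row_acc:
            c[i] = row_acc
    return c


a = {0: {0: 1, 1: 2}, 1: {1: 3}}
b = {0: {0: 4}, 1: {0: 5, 1: 6}}
print(masked_spgemm(a, b, {(0, 0), (1, 1)}))  # {0: {0: 14}, 1: {1: 18}}
```

Applying the mask inside the inner loop, rather than filtering a full product afterwards, means that products landing outside the mask are never accumulated or stored.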

POSTER: Automatic Differentiation of Parallel Loops with Formal Methods

27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)

Authors: Huckelheim, Jan; Hascoet, Laurent

Affiliations: Argonne Natl Lab Lemont IL 60439 USA; Inria Sophia Antipolis Valbonne France

ISBN: (print) 9781450392044

The accompanying poster to this short paper presents a combination of reverse mode AD and formal methods to enable efficient differentiation of (or backpropagation through) shared-memory parallel code. Compared to the state of the art, our approach can more often avoid the need for atomic updates or private data copies during the parallel derivative computation, even in the presence of unstructured or data-dependent data access patterns. This is achieved by gathering information about the memory access patterns from the input program, which is assumed to be correctly parallelized. This information is then used to build a model of assertions in a theorem prover, which can be used to check the safety of shared memory accesses during the parallel derivative computation.

Keywords: Automatic Differentiation; OpenMP; Theorem Proving; Formal Methods; Data Flow Reversal

POSTER: LB-HM: Load Balance-Aware Data Placement on Heterogeneous Memory for Task-Parallel HPC Applications

27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)

Authors: Xie, Zhen; Liu, Jie; Ma, Sam; Li, Jiajia; Li, Dong

Affiliations: Univ Calif Merced CA USA; Coll William Mary Williamsburg VA USA

ISBN: (print) 9781450392044

The emergence of heterogeneous memory (HM) provides a cost-effective and high-performance solution to memory-consuming HPC applications. However, when using HM, migrating data objects wisely is critical for high performance. In this work, we introduce a load balance-aware page management system, named LB-HM. LB-HM introduces task semantics during memory profiling, rather than being application-agnostic. Evaluating with a set of memory-consuming HPC applications, we show that LB-HM reduces existing load imbalance and leads to an average of 17.1% and 15.4% (up to 26.0% and 23.2%) performance improvement, compared with a hardware-based solution and an industry-quality software-based solution on Optane-based HM.

Keywords: Semantics

POSTER: Towards OmpSs-2 and OpenACC Interoperation

27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)

Authors: Korakitis, Orestis; De Gonzalo, Simon Garcia; Guidotti, Nicolas; Barreto, Joao Pedro; Monteiro, Jose C.; Pena, Antonio J.

Affiliations: Barcelona Supercomputing Ctr Barcelona Spain; Univ Lisbon Inst Super Tecnico INESC ID Lisbon Portugal

ISBN: (print) 9781450392044

The increasing demand in HPC to utilize accelerators has motivated the development of pragma-based directives to target these devices. OmpSs-2 and OpenACC are both directive-based solutions that allow application programmers to utilize accelerators. The two leverage distinct types of parallelism: task parallelism and data parallelism, respectively. Non-trivial scientific applications can benefit from both types of available parallelism. However, the combination of pragma-based models is difficult to coordinate, as both assume full control and are unaware of each other at runtime. We propose an interoperation mechanism to enable novel composability across pragma-based programming models. We study and propose a clear separation of duties and implement our approach by augmenting the OmpSs-2 programming model, compiler and runtime to support OmpSs-2 + OpenACC programming.

Keywords: Programming Productivity; Data-flow Paradigm; Runtime Scheduling; Code Transformation; Parallelism; GPU

A Programming Model for GPU Load Balancing

28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2023

Authors: Osama, Muhammad; Porumbescu, Serban D.; Owens, John D.

Affiliations: University of California, Davis, CA, United States

ISBN: (print) 9798400700156

We propose a GPU fine-grained load-balancing abstraction that decouples load balancing from work processing and aims to support both static and dynamic schedules with a programmable interface to implement new load-balancing schedules. Prior to our work, the only way to unleash the GPU's potential on irregular problems has been to workload-balance through application-specific, tightly coupled load-balancing techniques. With our open-source framework for load-balancing, we hope to improve programmers' productivity when developing irregular-parallel algorithms on the GPU, and also improve the overall performance characteristics for such applications by allowing a quick path to experimentation with a variety of existing load-balancing techniques. Consequently, we also hope that by separating the concerns of load-balancing from work processing within our abstraction, managing and extending existing code to future architectures becomes easier. © 2023 Owner/Author.

Keywords: Graphics processing unit

POSTER: A Parallel Branch-and-Bound Algorithm with History-Based Domination

27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)

Authors: Gonggiatgul, Taspon; Shobaki, Ghassan; Muyan-Ozcelik, Pinar

Affiliations: Calif State Univ Sacramento CA USA

ISBN: (print) 9781450392044

In this paper, we describe a parallel Branch-and-Bound (B&B) algorithm with a history-based domination technique, and we apply it to the Sequential Ordering Problem (SOP). To the best of our knowledge, the proposed algorithm is the first parallel B&B algorithm that includes a history-based domination technique and is the first parallel B&B algorithm for solving the SOP using a pure B&B approach. The proposed algorithm takes a pool-based approach and employs a collection of novel techniques that we have developed to achieve effective parallel exploration of the solution space, including parallel history domination, history table memory management, and a thread restart technique. The proposed algorithm was experimentally evaluated using the SOPLIB and TSPLIB benchmarks. The results show that using ten threads with a time limit of one hour on the medium-difficulty instances, the proposed algorithm gives a geometric-mean speedup of 19.9 on SOPLIB and 10.23 on TSPLIB, with super-linear speedups up to 65x seen on 17 instances.

Keywords: parallel branch-and-bound; sequential ordering problem; combinatorial optimization; NP-complete problems

Parallel Block-Delayed Sequences

27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)

Authors: Westrick, Sam; Rainey, Mike; Anderson, Daniel; Blelloch, Guy E.

Affiliations: Carnegie Mellon Univ Pittsburgh PA 15213 USA

ISBN: (print) 9781450392044

Programming languages using functions on collections of values, such as map, reduce, scan and filter, have been used for over fifty years. Such collections have proven to be particularly useful in the context of parallelism because such functions are naturally parallel. However, if implemented naively they lead to the generation of temporary intermediate collections that can significantly increase memory usage and runtime. To avoid this pitfall, many approaches use "fusion" to combine operations and avoid temporary results. However, most of these approaches involve significant changes to a compiler and are limited to a small set of functions, such as maps and reduces. In this paper we present a library-based approach that fuses widely used operations such as scans, filters, and flattens. In conjunction with existing techniques, this covers most of the common operations on collections. Our approach is based on a novel technique which parallelizes over blocks, with streams within each block. We demonstrate the approach by implementing libraries targeting multicore parallelism in two languages: Parallel ML and C++, which have very different semantics and compilers. To help users understand when to use the approach, we define a cost semantics that indicates when fusion occurs and how it reduces memory allocations. We present experimental results for a dozen benchmarks that demonstrate significant reductions in both time and space. In most cases the approach generates code that is near optimal for the machines it is running on.

Keywords: parallel programming; fusion; collections; functional programming
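
The problem this abstract starts from, temporary intermediate collections created by naive map and filter, can be seen in a short sequential sketch. It only illustrates what fusion avoids. It does not implement the paper's block-delayed technique or any parallelism. The function names are hypothetical.

```python
def unfused_sum_even_squares(xs):
    """Naive pipeline: each stage materializes a full temporary list."""
    squared = [x * x for x in xs]               # temporary collection 1 (map)
    evens = [y for y in squared if y % 2 == 0]  # temporary collection 2 (filter)
    return sum(evens)                           # reduce


def fused_sum_even_squares(xs):
    """Fused pipeline: generators stream one element at a time, no temporaries."""
    squares = (x * x for x in xs)
    return sum(y for y in squares if y % 2 == 0)


print(unfused_sum_even_squares(range(10)))  # 120
print(fused_sum_even_squares(range(10)))    # 120
```

Both functions return the same result. The fused version never holds more than one intermediate element in memory, which is the saving that fusion aims for.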

Exploring the Use of WebAssembly in HPC

28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2023

Authors: Chadha, Mohak; Krueger, Nils; John, Jophin; Jindal, Anshul; Gerndt, Michael; Benedict, Shajulin

Affiliations: Computer Architecture and Parallel Systems, Technische Universität München, Germany; Department of Computer Science and Engg., Indian Institute of Information Technology Kottayam, Kerala, India

ISBN: (print) 9798400700156

Containerization approaches based on namespaces offered by the Linux kernel have seen an increasing popularity in the HPC community both as a means to isolate applications and as a format to package and distribute them. However, their adoption and usage in HPC systems face several challenges. These include difficulties in unprivileged running and building of scientific application container images directly on HPC resources, increasing heterogeneity of HPC architectures, and access to specialized networking libraries available only on HPC systems. These challenges of container-based HPC application development closely align with the several advantages that a new universal intermediate binary format called WebAssembly (Wasm) has to offer. These include a lightweight userspace isolation mechanism and portability across operating systems and processor architectures. In this paper, we explore the usage of Wasm as a distribution format for MPI-based HPC applications. To this end, we present MPIWasm, a novel Wasm embedder for MPI-based HPC applications that enables high-performance execution of Wasm code, has low overhead for MPI calls, and supports high-performance networking interconnects present on HPC systems. We evaluate the performance and overhead of MPIWasm on a production HPC system and AWS Graviton2 nodes using standardized HPC benchmarks. Results from our experiments demonstrate that MPIWasm delivers competitive native application performance across all scenarios. Moreover, we observe that Wasm binaries are 139.5x smaller on average as compared to the statically-linked binaries for the different standardized benchmarks. © 2023 ACM.

Keywords: Containers

41 pages in total

235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
