ISBN (Print): 9781450392044
The accompanying poster to this short paper presents a combination of reverse-mode AD and formal methods to enable efficient differentiation of (or backpropagation through) shared-memory parallel code. Compared to the state of the art, our approach can more often avoid the need for atomic updates or private data copies during the parallel derivative computation, even in the presence of unstructured or data-dependent data access patterns. This is achieved by gathering information about the memory access patterns from the input program, which is assumed to be correctly parallelized. This information is then used to build a model of assertions in a theorem prover, which can be used to check the safety of shared memory accesses during the parallel derivative computation.
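To make the problem concrete, here is a minimal OpenMP sketch (our illustration, not the authors' tool) of why reverse-mode differentiation of a parallel gather loop normally needs atomic adjoint updates, and of the cheaper variant that becomes legal once the index values are proven pairwise distinct. The array names and constant factor are hypothetical.

/* Sketch (not the authors' tool): reverse-mode AD of a parallel gather loop.
 * Forward pass: iteration i reads x[idx[i]] and writes y[i] privately.
 * Reverse pass: iteration i accumulates into x_adj[idx[i]], which races
 * whenever idx[] contains duplicates. */
#include <omp.h>

void forward(int n, const int *idx, const double *x, double *y) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        y[i] = 3.0 * x[idx[i]];              /* shared read, private write */
}

/* Conservative derivative: correct even if idx[] has repeated entries. */
void reverse_conservative(int n, const int *idx, const double *y_adj, double *x_adj) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        #pragma omp atomic
        x_adj[idx[i]] += 3.0 * y_adj[i];
    }
}

/* Cheaper derivative: legal only if a checker has proven the idx[i] are
 * pairwise distinct, so the adjoint updates cannot race. */
void reverse_proved_disjoint(int n, const int *idx, const double *y_adj, double *x_adj) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        x_adj[idx[i]] += 3.0 * y_adj[i];     /* no atomic, no private copies */
}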
High Performance Fortran (HPF) has emerged as a standard language for data parallel computing. However, a wide variety of scientific applications are best programmed by a combination of task and data parallelism. Therefore, a good model of task parallelism is important for the continued success of HPF for parallel programming. This paper presents a task parallelism model that is simple, elegant, and relatively easy to implement in an HPF environment. Task parallelism is exploited by mechanisms for dividing processors into subgroups and mapping computations and data onto processor subgroups. This model of task parallelism has been implemented in the Fx compiler at Carnegie Mellon University. The paper addresses the main issues in compiling integrated task and data parallel programs and reports on the use of this model for programming various flat and nested task structures. Performance results are presented for a set of programs spanning signal processing, image processing, computer vision and environment modeling. A variant of this task model is a newly approved extension of HPF, and this paper offers insight into the expressive power and ease of implementation of this extension.
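As a rough analogue of the subgroup idea, not the paper's HPF notation, the following C/MPI sketch splits the processor set into two subgroups and runs a different (internally data-parallel) task on each; the task names in the printouts are placeholders.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int color = (rank < size / 2) ? 0 : 1;     /* first half: task 0, second half: task 1 */
    MPI_Comm sub;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &sub);

    int subrank;
    MPI_Comm_rank(sub, &subrank);
    if (color == 0)
        printf("subgroup 0, rank %d: task A (e.g. an FFT stage)\n", subrank);
    else
        printf("subgroup 1, rank %d: task B (e.g. a filtering stage)\n", subrank);

    MPI_Comm_free(&sub);
    MPI_Finalize();
    return 0;
}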
Many of today's high-level parallel languages support dynamic, fine-grained parallelism. These languages allow the user to expose all the parallelism in the program, which is typically of a much higher degree than the number of processors. Hence an efficient scheduling algorithm is required to assign computations to processors at runtime. Besides having low overheads and good load balancing, it is important for the scheduling algorithm to minimize the space usage of the parallel program. This paper presents a scheduling algorithm that is provably space-efficient and time-efficient for nested parallel languages. In addition to proving the space and time bounds of the parallel schedule generated by the algorithm, we demonstrate that it is efficient in practice. We have implemented a runtime system that uses our algorithm to schedule parallel threads. The results of executing parallel programs on this system show that our scheduling algorithm significantly reduces memory usage compared to previous techniques, without compromising performance.
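The following small C program illustrates the general space issue the paper addresses, not its algorithm: it expands a balanced fork tree from a shared work pool and reports the peak number of simultaneously live tasks under breadth-first versus depth-first scheduling order.

#include <stdio.h>
#include <stdlib.h>

#define DEPTH 16   /* balanced fork tree: every task above this depth forks two children */

static size_t expand(int depth_first) {
    size_t cap = (size_t)1 << (DEPTH + 1);   /* enough room for every task ever created */
    int *buf = malloc(cap * sizeof(int));
    size_t head = 0, tail = 0, peak = 0;
    buf[tail++] = 0;                         /* root task at depth 0 */
    while (head < tail) {
        int d = depth_first ? buf[--tail]    /* LIFO: run the newest ready task */
                            : buf[head++];   /* FIFO: run the oldest ready task */
        if (d < DEPTH) {                     /* "fork": enqueue two child tasks */
            buf[tail++] = d + 1;
            buf[tail++] = d + 1;
        }
        if (tail - head > peak) peak = tail - head;
    }
    free(buf);
    return peak;
}

int main(void) {
    printf("peak live tasks, breadth-first order: %zu\n", expand(0));  /* about 2^DEPTH */
    printf("peak live tasks, depth-first order:   %zu\n", expand(1));  /* about DEPTH   */
    return 0;
}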
Scalable busy-wait synchronization algorithms are essential for achieving good parallel program performance on large scale multiprocessors. Such algorithms include mutual exclusion locks, reader-writer locks, and barrier synchronization. Unfortunately, scalable synchronization algorithms are particularly sensitive to the effects of multiprogramming: their performance degrades sharply when processors are shared among different applications, or even among processes of the same application. In this paper we describe the design and evaluation of scalable scheduler-conscious mutual exclusion locks, reader-writer locks, and barriers, and show that by sharing information across the kernel/application interface we can improve the performance of scheduler-oblivious implementations by more than an order of magnitude.
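For context, here is a minimal C11 sketch of a scalable busy-wait lock (a ticket lock with position-proportional back-off); it is not the paper's scheduler-conscious design, which additionally consults state shared with the kernel (for example, whether the current lock holder has been preempted) before continuing to spin.

#include <stdatomic.h>
#include <sched.h>

typedef struct {
    atomic_uint next_ticket;   /* zero-initialize both fields, e.g. static ticket_lock_t L; */
    atomic_uint now_serving;
} ticket_lock_t;

static void lock_acquire(ticket_lock_t *l) {
    unsigned my = atomic_fetch_add_explicit(&l->next_ticket, 1, memory_order_relaxed);
    for (;;) {
        unsigned cur = atomic_load_explicit(&l->now_serving, memory_order_acquire);
        if (cur == my)
            return;
        for (unsigned i = 0; i < my - cur; i++)
            sched_yield();     /* back off in proportion to our position in the queue */
    }
}

static void lock_release(ticket_lock_t *l) {
    atomic_fetch_add_explicit(&l->now_serving, 1, memory_order_release);
}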
ISBN (Print): 9781605583976
We introduce a non-blocking full/empty bit primitive, or NB-FEB for short, as a promising synchronization primitive for parallel programming on many-core architectures. We show that the NB-FEB primitive is universal, scalable and feasible. NB-FEB, together with registers, can solve the consensus problem for an arbitrary number of processes (universality). NB-FEB is combinable, namely its memory requests to the same memory location can be combined into only one memory request, which consequently mitigates performance degradation due to synchronization "hot spots" (scalability). Since NB-FEB is a variant of the original full/empty bit that always returns a value instead of waiting for a conditional flag, it is as feasible as the original full/empty bit, which has been implemented in many computer systems (feasibility).
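The sketch below conveys the flavor of a non-blocking full/empty word using C11 atomics; the operation names and bit packing are ours, not the paper's interface. The property illustrated is that a store to a full cell, or a load from an empty one, never waits: every operation returns the previous (value, flag) pair immediately.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef _Atomic uint64_t nb_feb_t;                 /* bit 0: full flag, bits 1..63: value */
typedef struct { uint64_t value; bool full; } feb_pair;

static feb_pair unpack(uint64_t w) { return (feb_pair){ w >> 1, (w & 1u) != 0 }; }

/* Store v and mark the word full, but only if it was empty.  Never waits:
 * it always returns the (value, flag) pair that was there before the call. */
static feb_pair feb_store_if_empty(nb_feb_t *c, uint64_t v) {
    uint64_t old = atomic_load_explicit(c, memory_order_acquire);
    for (;;) {
        if (old & 1u)                              /* already full: just report it */
            return unpack(old);
        uint64_t filled = (v << 1) | 1u;
        if (atomic_compare_exchange_weak_explicit(c, &old, filled,
                memory_order_acq_rel, memory_order_acquire))
            return unpack(old);                    /* we filled it; old says "empty" */
    }
}

/* Read the value and mark the word empty; also returns immediately. */
static feb_pair feb_load_and_clear(nb_feb_t *c) {
    uint64_t old = atomic_fetch_and_explicit(c, ~(uint64_t)1u, memory_order_acq_rel);
    return unpack(old);
}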
We have modified the C language to support a programming model based on a shared address space with physically distributed memory. With this model, users can write programs in which the nodes of a massively parallel processor can access remote memory without message passing. AC provides support for distributed arrays as well as pointers to distributed data. Simple array references and pointer dereferencing are sufficient to generate low-overhead remote reads and writes. We have implemented these ideas in a compiler based on the GNU C compiler and targeted at Cray Research's T3D. Initial performance measurements show that AC generates code for remote accesses which is considerably faster than that of the native compiler for structures up to about 16 words in size and virtually equivalent for larger transfers.
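AC's own extended syntax is not reproduced here. As a rough analogue, the same access pattern, in which every node owns one block of a distributed array and any node can read another node's element with a low-overhead remote get, looks like this when written against the OpenSHMEM one-sided API.

#include <shmem.h>
#include <stdio.h>

int main(void) {
    shmem_init();
    int me   = shmem_my_pe();
    int npes = shmem_n_pes();

    /* One block of the "distributed array" lives on every PE (symmetric heap). */
    double *block = shmem_malloc(1024 * sizeof(double));
    for (int i = 0; i < 1024; i++)
        block[i] = me * 1000.0 + i;
    shmem_barrier_all();

    /* What AC expresses as an ordinary array reference is an explicit remote get here. */
    int neighbor = (me + 1) % npes;
    double x = shmem_double_g(&block[7], neighbor);
    printf("PE %d read %.1f from PE %d\n", me, x, neighbor);

    shmem_barrier_all();
    shmem_free(block);
    shmem_finalize();
    return 0;
}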
ISBN (Print): 9781450332057
This paper proposes a novel SPMD programming model of OpenACC. Our model integrates the different granularities of parallelism, from vector-level parallelism to node-level parallelism, into a single, unified model based on OpenACC. It allows programmers to write programs for multiple accelerators using a uniform programming model, whether they are in shared or distributed memory systems. We implement a prototype of our model and evaluate its performance with a GPU-based supercomputer using three benchmark applications.
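For comparison, this is the conventional two-level MPI + OpenACC style that such a unified model aims to subsume in a single notation: MPI ranks provide node- and device-level parallelism, and an OpenACC parallel loop provides vector-level parallelism on each rank's accelerator. This is our sketch, not the paper's prototype.

#include <mpi.h>
#include <openacc.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1048576

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Bind each MPI rank to one accelerator on its node. */
    int ndev = acc_get_num_devices(acc_device_default);
    if (ndev > 0)
        acc_set_device_num(rank % ndev, acc_device_default);

    int chunk = N / size;
    double *x = malloc(chunk * sizeof(double));
    double local = 0.0, total = 0.0;
    for (int i = 0; i < chunk; i++)
        x[i] = 1.0;

    /* Node/device-level parallelism comes from the MPI ranks; vector-level
     * parallelism comes from the OpenACC loop on each rank's accelerator. */
    #pragma acc parallel loop reduction(+:local) copyin(x[0:chunk])
    for (int i = 0; i < chunk; i++)
        local += x[i];

    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("sum = %.1f (expected %d)\n", total, N);
    free(x);
    MPI_Finalize();
    return 0;
}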
With the increasing complexity of protocol and circuit designs, formal verification has become an important research area, and binary decision diagrams (BDDs) have been shown to be a powerful tool in formal verification. This paper presents a parallel algorithm for BDD construction targeted at shared memory multiprocessors and distributed shared memory systems. The algorithm focuses on improving memory access locality through specialized memory managers and partial breadth-first expansion, and on improving processor utilization through dynamic load balancing. The results on a shared memory system show speedups of over two on four processors and speedups of up to four on eight processors. The measured results clearly identify the main sources of bottlenecks and point out some interesting directions for further improvements.
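As a simplified, sequential illustration of the locality idea (not the paper's parallel algorithm), the sketch below keeps one hash-consing unique table per variable level, so that a breadth-first expansion touches nodes of a single level at a time; the specialized memory managers and dynamic load balancing are omitted.

#include <stdint.h>
#include <stdlib.h>

#define LEVELS  64
#define BUCKETS (1u << 12)

typedef struct node { int var; struct node *lo, *hi, *next; } node;

static node *table[LEVELS][BUCKETS];        /* one unique table per variable level */

static unsigned hash_pair(node *lo, node *hi) {
    uintptr_t a = (uintptr_t)lo, b = (uintptr_t)hi;
    return (unsigned)((a * 31 + b) & (BUCKETS - 1));
}

/* Hash-consing constructor: return the existing node for (var, lo, hi) if any. */
static node *mk(int var, node *lo, node *hi) {
    if (lo == hi)                           /* standard BDD reduction rule */
        return lo;
    unsigned h = hash_pair(lo, hi);
    for (node *p = table[var][h]; p; p = p->next)
        if (p->lo == lo && p->hi == hi)
            return p;
    node *n = malloc(sizeof *n);
    n->var = var; n->lo = lo; n->hi = hi;
    n->next = table[var][h];
    table[var][h] = n;
    return n;
}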
ISBN (Print): 9798400704352
Pure is a new programming model and runtime system explicitly designed to take advantage of shared memory within nodes, in the context of a mostly message-passing interface enhanced with the ability to use tasks to make use of idle cores. Pure leverages shared memory in two ways: (a) by allowing cores to steal work from each other while waiting on messages to arrive, and (b) by leveraging lock-free data structures in shared memory to achieve high-performance messaging and collective operations between the ranks within nodes. We use microbenchmarks to evaluate Pure's key messaging and collective features and also show application speedups of up to 2.1× on the CoMD molecular dynamics and miniAMR adaptive mesh refinement applications, scaling up to 4,096 cores.
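A rough C11 sketch of the first mechanism, using invented types rather than Pure's API: a rank waiting on an intra-node message claims chunks of an outstanding shared task from an atomic counter instead of spinning idle, and the sender publishes the message through a lock-free flag.

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {                 /* single-producer/single-consumer message slot */
    double payload;              /* written by the sender...                     */
    atomic_bool ready;           /* ...then published by setting the flag        */
} msg_slot;

typedef struct {                 /* a data-parallel task shared within the node  */
    atomic_size_t next;          /* next unclaimed chunk                         */
    size_t nchunks;
    void (*run_chunk)(size_t chunk);
} shared_task;

/* Receive that "steals" chunks of another rank's work while the message is in flight. */
static double recv_while_helping(msg_slot *s, shared_task *t) {
    while (!atomic_load_explicit(&s->ready, memory_order_acquire)) {
        size_t c = atomic_fetch_add_explicit(&t->next, 1, memory_order_relaxed);
        if (c < t->nchunks)
            t->run_chunk(c);     /* do useful work instead of idling */
    }
    return s->payload;
}

static void send_msg(msg_slot *s, double v) {
    s->payload = v;
    atomic_store_explicit(&s->ready, true, memory_order_release);
}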
We present a general data parallel formulation for highly irregular problems in High Performance Fortran (HPF). Our formulation consists of (1) a method for linearizing irregular data structures, (2) a data parallel implementation (in HPF) of graph partitioning algorithms applied to the linearized data structure, and (3) techniques for expressing irregular communication and nonuniform computations associated with the elements of linearized data structures. We demonstrate and evaluate our formulation on a parallel, hierarchical N-body method for the evaluation of potentials and forces of nonuniform particle distributions. Our experimental results demonstrate that efficient data parallel (HPF) implementations of highly nonuniform problems are feasible with the proper language/compiler/runtime support. Our data parallel N-body code provides a much needed 'benchmark' code for evaluating and improving HPF compilers.
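As an illustration of what "linearizing" an irregular structure means (shown in C rather than HPF, with hypothetical field names), an adaptive quadtree for an N-body computation can be flattened into parallel arrays indexed by node number, so that sweeps over the tree become loops over contiguous index ranges that a data-parallel compiler can distribute.

#include <stddef.h>
#include <stdlib.h>

#define MAX_NODES 100000

/* Flattened quadtree: per-node data lives in arrays indexed by node number.
 * Heap-allocate one instance; it is a few megabytes. */
typedef struct {
    size_t nnodes;
    double cx[MAX_NODES], cy[MAX_NODES];   /* per-node center of mass          */
    double mass[MAX_NODES];
    int    child[MAX_NODES][4];            /* child node indices, -1 if absent */
    int    level[MAX_NODES];               /* depth; nodes can be sorted by level */
} linear_quadtree;

/* With the tree in this form, a sweep is just a loop over a contiguous index
 * range, exactly the shape a data-parallel compiler can block-distribute. */
static void scale_masses(linear_quadtree *t, double f) {
    for (size_t i = 0; i < t->nnodes; i++)
        t->mass[i] *= f;
}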