Search Results - Inner Mongolia University Library


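Document type

329 records: conference papers
46 records: journal articles

Collection scope

375 records: electronic documents
0 titles: print holdings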

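Subject classification

325 records: Engineering
- 275 records: Software Engineering
- 267 records: Computer Science and Technology...
- 12 records: Electronic Science and Technology (may...
- 7 records: Information and Communication Engineering
- 4 records: Mechanical Engineering
- 4 records: Control Science and Engineering
- 4 records: Bioengineering
- 3 records: Biomedical Engineering (may confer...
- 1 record: Mechanics (may confer engineering, science...
- 1 record: Power Engineering and Engineering Thermo...
- 1 record: Electrical Engineering
- 1 record: Nuclear Science and Technology
- 1 record: Agricultural Engineering
- 1 record: Environmental Science and Engineering (may...
56 records: Science
- 52 records: Mathematics
- 5 records: Systems Science
- 4 records: Biology
- 4 records: Statistics (may confer science, ...
- 2 records: Chemistry
15 records: Management
- 11 records: Management Science and Engineering (may...
- 8 records: Business Administration
- 4 records: Library, Information and Archives Manage...
3 records: Economics
- 3 records: Applied Economics
2 records: Law
- 2 records: Sociology
1 record: Education
- 1 record: Education
1 record: Agronomy
- 1 record: Crop Science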

Topic

71 records: performance
49 records: parallel process...
42 records: algorithms
41 records: parallel program...
39 records: languages
34 records: design
21 records: GPU
20 records: parallel algorit...
12 records: experimentation
12 records: measurement
9 records: theory
8 records: MPI
8 records: parallel computi...
7 records: graphics process...
7 records: parallel
7 records: verification
7 records: concurrency
6 records: parallelism
6 records: OpenMP
5 records: reliability

Institution

7 records: Carnegie Mellon ...
4 records: Univ Wisconsin D...
4 records: Indiana Univ Blo...
3 records: Univ of Tokyo
3 records: Univ Chinese Aca...
3 records: Massachusetts In...
3 records: Univ Illinois Ur...
3 records: Swiss Fed Inst T...
3 records: MIT CSAIL United...
3 records: Shanghai Jiao To...
3 records: Tsinghua Univ Pe...
3 records: Univ Utah Sch Co...
3 records: Rice Univ Housto...
3 records: Univ Calif Berke...
2 records: IST Austria Klos...
2 records: Princeton Univ D...
2 records: Georgetown Univ ...
2 records: Shanghai Key Lab...
2 records: Univ of Wisconsi...
2 records: Tsinghua Univers...

Author

8 records: Blelloch, Guy E.
6 records: Hoefler, Torsten
6 records: Garland, Michael
6 records: Chen, Haibo
6 records: Shun, Julian
5 records: Sun, Yihan
5 records: Zhai, Jidong
5 records: Tsigas, Philippas
4 records: Dhulipala, Laxman
4 records: Tan, Guangming
4 records: Wang, Haojie
4 records: Nikolopoulos, Dim...
4 records: Long, Guoping
4 records: Valero, Mateo
4 records: Mellor-Crummey, J...
4 records: Gu, Yan
4 records: Kennedy, Ken
3 records: Taura, Kenjiro
3 records: Li, Jiajia
3 records: Yonezawa, Akinori

Language

349 records: English
26 records: other

Search condition: "Any field = 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming"

375 records in total; showing records 91-100


POSTER: Parallel Algorithms for Masked Sparse Matrix-Matrix Products

27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)

Authors: Milakovic, Srdan; Selvitopi, Oguz; Nisa, Israt; Budimlic, Zoran; Buluc, Aydin
Affiliations: Rice Univ Houston, Houston, TX, USA; Lawrence Berkeley Nat Lab Berkeley, Berkeley, CA, USA; AWS Palo Alto, Palo Alto, CA, USA

ISBN: (print) 9781450392044

Computing the product of two sparse matrices (SpGEMM) is a fundamental operation in various combinatorial and graph algorithms as well as various bioinformatics and data analytics applications for computing inner-product similarities. For an important class of algorithms, only a subset of the output entries are needed, and the resulting operation is known as Masked SpGEMM since a subset of the output entries is considered to be "masked out". In this work, we investigate various novel algorithms and data structures for this rather challenging and important computation, and provide guidelines on how to design a fast Masked-SpGEMM for shared-memory architectures.

Keywords: Masked-SpGEMM; Sparse Matrix; GraphBLAS
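To make the idea of a masked product concrete, here is a minimal Python sketch. It is not one of the algorithms from the poster: it assumes a dictionary-of-rows sparse format and a mask given as the set of output positions to keep, and it uses a plain row-wise expansion that skips products falling outside the mask. The function name `masked_spgemm` and the sample matrices are illustrative only.

```python
from collections import defaultdict

def masked_spgemm(a, b, mask):
    """Compute C = A @ B, keeping only the output positions listed in mask.

    a, b: sparse matrices stored as {row: {col: value}} dictionaries.
    mask: set of (row, col) output positions that are wanted (an assumption;
          the mask could equally list the positions to discard).
    """
    # Group the wanted columns by row so each output row is filtered cheaply.
    wanted = defaultdict(set)
    for i, j in mask:
        wanted[i].add(j)

    c = {}
    for i, cols in wanted.items():
        acc = {}
        # Row-wise expansion: C[i, :] = sum over k of A[i, k] * B[k, :]
        for k, a_ik in a.get(i, {}).items():
            for j, b_kj in b.get(k, {}).items():
                if j in cols:  # skip products the mask does not ask for
                    acc[j] = acc.get(j, 0) + a_ik * b_kj
        if acc:
            c[i] = acc
    return c

# Hypothetical 2x2 inputs; only the diagonal of the product is requested.
A = {0: {0: 1, 1: 2}, 1: {1: 3}}
B = {0: {0: 4, 1: 5}, 1: {0: 6, 1: 7}}
M = {(0, 0), (1, 1)}
C = masked_spgemm(A, B, M)
```

Rows that the mask never mentions are not visited at all, which is where the saving over an unmasked product comes from in this sketch.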


Pure: Evolving Message Passing To Better Leverage Shared Memory Within Nodes

29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP)

Authors: Psota, James; Solar-Lezama, Armando
Affiliation: MIT CSAIL, Cambridge, MA 02139, USA

ISBN: (print) 9798400704352

Pure is a new programming model and runtime system explicitly designed to take advantage of shared memory within nodes in the context of a mostly message-passing interface, enhanced with the ability to use tasks to make use of idle cores. Pure leverages shared memory in two ways: (a) by allowing cores to steal work from each other while waiting on messages to arrive, and (b) by leveraging *** lock-free data structures in shared memory to achieve high-performance messaging and collective operations between the ranks within nodes. We use microbenchmarks to evaluate Pure's key messaging and collective features and also show application speedups of up to 2.1× on the CoMD molecular dynamics and the miniAMR adaptive mesh *** applications scaling up to 4,096 cores.

Keywords: parallel programming models; distributed runtime systems; task-based parallelism; concurrent data structures; lock-free data structures


The Boat Hull Model: Adapting the Roofline Model to Enable Performance Prediction for Parallel Computing

17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Nugteren, Cedric; Corporaal, Henk
Affiliation: Eindhoven Univ Technol, NL-5600 MB Eindhoven, Netherlands

ISBN: (print) 9781450311601

Multi-core and many-core architectures have been major trends for the past six years and are expected to remain so for the coming decades. With these trends in parallel computing, it becomes increasingly difficult to decide on which architecture to run a given application. In this work, we use an algorithm classification to predict performance prior to algorithm implementation. For this purpose, we modify the roofline model to include class information. In this way, we enable architectural choice through performance prediction prior to the development of architecture-specific code. The new model, the boat hull model, is demonstrated using a GPU as a target architecture. We show for 6 example algorithms that performance is predicted accurately without requiring code to be available.

Keywords: Performance; parallel computing; performance prediction; many-core accelerators; the roofline model
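For readers unfamiliar with the roofline model that this paper adapts, the sketch below shows only the classic bound: attainable performance is the smaller of peak compute and memory bandwidth times operational intensity. The class-specific changes that turn it into the boat hull model are not described in the abstract and are not attempted here. The function name and the machine figures are hypothetical.

```python
def roofline_gflops(peak_gflops, bandwidth_gbs, intensity_flops_per_byte):
    """Attainable performance (GFLOP/s) under the classic roofline model."""
    return min(peak_gflops, bandwidth_gbs * intensity_flops_per_byte)

# Hypothetical machine: 1000 GFLOP/s peak compute, 100 GB/s memory bandwidth.
PEAK, BW = 1000.0, 100.0

# Low operational intensity: the memory system is the limit.
memory_bound = roofline_gflops(PEAK, BW, 0.5)

# High operational intensity: the compute peak is the limit.
compute_bound = roofline_gflops(PEAK, BW, 20.0)
```

With these assumed numbers the crossover sits at 10 FLOPs per byte; kernels below that intensity are bandwidth-limited in this simple model.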


Parallelizing dynamic programming through rank convergence

2014 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2014

Authors: Maleki, Saeed; Musuvathi, Madanlal; Mytkowicz, Todd
Affiliations: University of Illinois at Urbana-Champaign, United States; Microsoft Research, United States

ISBN: (print) 9781450326568

This paper proposes an efficient parallel algorithm for an important class of dynamic programming problems that includes Viterbi, Needleman-Wunsch, Smith-Waterman, and Longest Common Subsequence. In dynamic programming, the subproblems that do not depend on each other, and thus can be computed in parallel, form stages or wavefronts. The algorithm presented in this paper provides additional parallelism, allowing multiple stages to be computed in parallel despite dependences among them. The correctness and the performance of the algorithm rely on rank convergence properties of matrix multiplication in the tropical semiring, formed with plus as the multiplicative operation and max as the additive operation. This paper demonstrates the efficiency of the parallel algorithm by showing significant speedups on a variety of important dynamic programming problems. In particular, the parallel Viterbi decoder is up to 24× faster (with 64 processors) than a highly optimized commercial baseline. Copyright © 2014 ACM.

Keywords: Dynamic programming
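The tropical semiring mentioned in the abstract can be illustrated with a few lines of Python. This sketch only shows the algebra (max as addition, plus as multiplication) and checks that two stages can be combined before being applied to a score vector; it does not implement the paper's rank-convergence algorithm. The stage matrices `A1`, `A2` and the vector `s0` are made-up values.

```python
def maxplus_matvec(a, v):
    """Tropical matrix-vector product: result[i] = max_j (a[i][j] + v[j])."""
    return [max(a_ij + v_j for a_ij, v_j in zip(row, v)) for row in a]

def maxplus_matmul(a, b):
    """Tropical matrix product: c[i][j] = max_k (a[i][k] + b[k][j])."""
    cols = list(zip(*b))  # columns of b
    return [[max(x + y for x, y in zip(row, col)) for col in cols] for row in a]

# Two hypothetical stage matrices and an initial score vector.
A1 = [[0, 2], [1, 0]]
A2 = [[3, 0], [0, 1]]
s0 = [0, 0]

# Sequential evaluation: one stage after the other.
stage_by_stage = maxplus_matvec(A2, maxplus_matvec(A1, s0))

# Associativity lets the stage matrices be combined first.
combined_first = maxplus_matvec(maxplus_matmul(A2, A1), s0)
```

Because the tropical product is associative, both evaluation orders give the same scores, which is the property that makes regrouping stages possible in the first place.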


StreamScan: Fast Scan Algorithms for GPUs without Global Barrier Synchronization

18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Yan, Shengen; Long, Guoping; Zhang, Yunquan
Affiliations: Chinese Acad Sci, Inst Software, Lab Parallel Software & Computat Sci, Beijing 100864, Peoples R China; Chinese Acad Sci, State Key Lab Comp Sci, Beijing 100864, Peoples R China; Chinese Acad Sci, Grad Univ, Beijing 100864, Peoples R China

ISBN: (print) 9781450319225

Scan (also known as prefix sum) is a very useful primitive for various important parallel algorithms, such as sort, BFS, SpMV, compaction and so on. The current state-of-the-art GPU-based scan implementation consists of three consecutive Reduce-Scan-Scan phases. This approach requires at least two global barriers and 3N (N is the problem size) global memory accesses. In this paper we propose StreamScan, a novel approach to implement scan on GPUs with only one computation phase. The main idea is to restrict synchronization to only adjacent workgroups, thereby eliminating global barrier synchronization completely. The new approach requires only 2N global memory accesses and just one kernel invocation. On top of this we propose two important optimizations to further boost performance speedups, namely thread grouping to eliminate unnecessary local barriers, and register optimization to expand the on-chip problem size. We designed an auto-tuning framework to search the parameter space automatically to generate highly optimized codes for both AMD and Nvidia GPUs. We implemented our technique with OpenCL. Compared with previous fast scan implementations, experimental results not only show promising performance speedups, but also reveal dramatically different optimization tradeoffs between Nvidia and AMD GPU platforms.

Keywords: Scan; prefix-sum; OpenCL; CUDA; GPU; parallel algorithms
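The dependency structure described in the abstract, where each workgroup waits only on its immediate predecessor, can be mimicked sequentially. The Python sketch below is an illustration under that assumption and is not the authors' OpenCL implementation: each block scans its own elements, adds the running total handed over by the previous block, and hands its own total on. The name `chained_scan` and the block size are hypothetical.

```python
from itertools import accumulate

def chained_scan(data, block_size):
    """Inclusive prefix sum computed block by block.

    Each block depends only on the running total published by the block
    just before it, mimicking adjacent-workgroup synchronization.
    """
    out = []
    carry = 0  # running total published by the previous block
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        local = list(accumulate(block))        # scan inside the block
        out.extend(carry + x for x in local)   # add the predecessor's total
        carry += local[-1]                     # publish for the next block
    return out

values = [3, 1, 4, 1, 5, 9, 2, 6]
result = chained_scan(values, block_size=3)
```

In this sketch every input element is read once and every output element is written once, and no step needs to see all blocks at the same time.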


Transforming high-level data-parallel programs into vector operations

4th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 1993

Authors: Prins, Jan F.; Palmer, Daniel W.
Affiliation: Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599-3175, United States

ISBN: (print) 0897915895

Efficient parallel execution of a high-level data-parallel language based on nested sequences, higher order functions and generalized iterators can be realized in the vector model using a suitable representation of nested sequences and a small set of transformational rules to distribute iterators through the constructs of the language. © 1993 ACM.

Keywords: Metadata
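The abstract does not say which representation of nested sequences the paper uses. As a rough illustration of the general idea, the Python sketch below assumes a common one: a flat value vector plus a list of segment lengths, so that an operation over every inner element becomes a single pass over the flat vector. All function names here are hypothetical.

```python
def flatten(nested):
    """Split a nested sequence into a flat value vector and segment lengths."""
    values = [x for seg in nested for x in seg]
    lengths = [len(seg) for seg in nested]
    return values, lengths

def unflatten(values, lengths):
    """Rebuild the nested sequence from the flat representation."""
    out, pos = [], 0
    for n in lengths:
        out.append(values[pos:pos + n])
        pos += n
    return out

def nested_map(fn, nested):
    """Apply fn to every inner element with one pass over the flat vector."""
    values, lengths = flatten(nested)
    return unflatten([fn(x) for x in values], lengths)

squares = nested_map(lambda x: x * x, [[1, 2], [], [3, 4, 5]])
```

The nesting structure travels separately in `lengths`, so irregular or empty inner sequences do not affect the flat pass.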


Proceedings of the 18th International Symposium on Principles and Practice of Declarative Programming, PPDP 2016

18th International Symposium on Principles and Practice of Declarative Programming, PPDP 2016

ISBN: (print) 9781450341486

The proceedings contain 16 papers. The topics discussed include: description and evaluation of a generic design to integrate CLP and tabled execution; higher-order logic programming: an expressive language for representing qualitative preferences; a framework for easing the development of applications embedding answer set programming; proving inductive validity of constrained inequalities; analysis of access control policy updates through narrowing; strand spaces with choice via a process algebra semantics; reducing the overhead of assertion run-time checks via static analysis; and iterated process analysis over lattice-valued regular expressions.


Space and time efficient execution of parallel irregular computations

6th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Fu, C; Yang, T
Affiliation: Univ of California, Santa Barbara, United States

Solving problems of large sizes is an important goal for parallel machines with multiple CPU and memory resources. In this paper, issues of efficient execution of overhead-sensitive parallel irregular computation under memory constraints are addressed. The irregular parallelism is modeled by task dependence graphs with mixed granularities. The trade-off in achieving both time and space efficiency is investigated. The main difficulty of designing efficient run-time system support is caused by the use of fast communication primitives available on modern parallel architectures. A run-time active memory management scheme and new scheduling techniques are proposed to improve memory utilization while retaining good time efficiency, and a theoretical analysis of correctness and performance is provided. This work is implemented in the context of the RAPID system [5], which provides run-time support for parallelizing irregular code on distributed memory machines, and the effectiveness of the proposed techniques is verified on sparse Cholesky and LU factorization with partial pivoting. The experimental results on Cray-T3D show that solvable problem sizes can be increased substantially under limited memory capacities and that the loss of execution efficiency caused by the extra memory management overhead is reasonable.

Keywords: parallel processing systems


Typechecking Protocols with Mungo and StMungo

18th International Symposium on Principles and Practice of Declarative Programming (PPDP)

Authors: Kouzapas, Dimitrios; Dardha, Ornela; Perera, Roly; Gay, Simon J.
Affiliation: Univ Glasgow, Sch Comp Sci, Glasgow, Lanark, Scotland

ISBN: (print) 9781450341486

We report on two tools that extend Java with support for static type-checking of communication protocols. Our Mungo tool extends Java with typestate definitions, which allow classes to be associated with state machines defining permitted sequences of method calls. A complementary tool, StMungo, takes a communication protocol specified in the Scribble protocol description language, and generates a typestate specification for each endpoint, capturing the permitted sequences of messages along that channel. Endpoint implementations can be validated by Mungo against their typestate definitions and then compiled as usual with javac. We formalise Mungo's typestate inference system and demonstrate the Scribble, Mungo and StMungo toolchain via a typechecked SMTP client that can communicate with a real-world SMTP server.

Keywords: Session types; object-oriented programming; typestate; protocols; typestate inference
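To illustrate what a typestate is, the Python sketch below attaches a small state machine to a class and rejects method calls that the current state does not permit. Note the important difference: Mungo checks such protocols statically for Java, whereas this sketch checks them at run time, and it uses neither Mungo's nor Scribble's syntax. The `FileHandle` class and its two-state protocol are invented for the example.

```python
class ProtocolError(Exception):
    """Raised when a method is called in a state that does not permit it."""

class FileHandle:
    # Hypothetical typestate: state -> {permitted method: next state}
    TRANSITIONS = {
        "Closed": {"open": "Open"},
        "Open": {"read": "Open", "close": "Closed"},
    }

    def __init__(self):
        self.state = "Closed"

    def _step(self, method):
        """Advance the state machine, or fail if the call is not permitted."""
        allowed = self.TRANSITIONS[self.state]
        if method not in allowed:
            raise ProtocolError(f"{method}() not permitted in state {self.state}")
        self.state = allowed[method]

    def open(self):
        self._step("open")

    def read(self):
        self._step("read")
        return "data"

    def close(self):
        self._step("close")

# A permitted sequence of calls: open, read, close.
handle = FileHandle()
handle.open()
handle.read()
handle.close()

# Reading after close violates the protocol.
try:
    handle.read()
    violation_detected = False
except ProtocolError:
    violation_detected = True
```

A static checker would report the final `read()` before the program runs; the run-time version here only shows which call sequences the state machine accepts.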


POSTER: LB-HM: Load Balance-Aware Data Placement on Heterogeneous Memory for Task-Parallel HPC Applications

27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)

Authors: Xie, Zhen; Liu, Jie; Ma, Sam; Li, Jiajia; Li, Dong
Affiliations: Univ Calif Merced, CA, USA; Coll William Mary, Williamsburg, VA, USA

ISBN: (print) 9781450392044

The emergence of heterogeneous memory (HM) provides a cost-effective and high-performance solution to memory-consuming HPC applications. However, when using HM, migrating data objects wisely is critical for high performance. In this work, we introduce a load balance-aware page management system, named LB-HM. LB-HM introduces task semantics during memory profiling, rather than being application-agnostic. Evaluating with a set of memory-consuming HPC applications, we show that LB-HM reduces existing load imbalance and leads to an average of 17.1% and 15.4% (up to 26.0% and 23.2%) performance improvement, compared with a hardware-based solution and an industry-quality software-based solution on Optane-based HM.

Keywords: Semantics


Address: 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
