Search results - Inner Mongolia University Library

Search query: any field = "9th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming"

385 records in total; showing records 91-100

Initial study of multi-endpoint runtime for MPI+OpenMP hybrid programming model on multi-core systems

2014 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2014

Authors: Luo, Miao; Lu, Xiaoyi; Hamidouche, Khaled; Kandalla, Krishna; Panda, Dhabaleswar K. (Dept. of Computer Science and Engineering, Ohio State University, United States)

ISBN: (print) 9781450326568

State-of-the-art MPI libraries rely on locks to guarantee thread-safety. This discourages application developers from using multiple threads to perform MPI operations. In this paper, we propose a high performance, lock-free multi-endpoint MPI runtime, which can achieve up to 40% improvement for point-to-point operation and one representative collective operation with minimum or no modifications to the existing applications.

Keywords: Locks (fasteners)

An OpenACC-based unified programming model for multi-accelerator systems

20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015

Authors: Kim, Jungwon; Lee, Seyong; Vetter, Jeffrey S. (Oak Ridge National Laboratory, United States; Georgia Institute of Technology, United States)

ISBN: (print) 9781450332057

This paper proposes a novel SPMD programming model of OpenACC. Our model integrates the different granularities of parallelism from vector-level parallelism to node-level parallelism into a single, unified model based on OpenACC. It allows programmers to write programs for multiple accelerators using a uniform programming model whether they are in shared or distributed memory systems. We implement a prototype of our model and evaluate its performance with a GPU-based supercomputer using three benchmark applications.

Keywords: Supercomputers

Is Transactional Programming Actually Easier?

15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Rossbach, Christopher J.; Hofmann, Owen S.; Witchel, Emmett (Univ Texas Austin, Austin, TX 78712, USA)

ISBN: (print) 9781605587080

Chip multi-processors (CMPs) have become ubiquitous, while tools that ease concurrent programming have not. The promise of increased performance for all applications through ever more parallel hardware requires good tools for concurrent programming, especially for average programmers. Transactional memory (TM) has enjoyed recent interest as a tool that can help programmers program concurrently. The transactional memory (TM) research community is heavily invested in the claim that programming with transactional memory is easier than alternatives (like locks), but evidence for or against the veracity of this claim is scant. In this paper, we describe a user-study in which 237 undergraduate students in an operating systems course implement the same programs using coarse and fine-grain locks, monitors, and transactions. We surveyed the students after the assignment, and examined their code to determine the types and frequency of programming errors for each synchronization technique. Inexperienced programmers found baroque syntax a barrier to entry for transactional programming. On average, subjective evaluation showed that students found transactions harder to use than coarse-grain locks, but slightly easier to use than fine-grained locks. Detailed examination of synchronization errors in the students' code tells a rather different story. Overwhelmingly, the number and types of programming errors the students made was much lower for transactions than for locks. On a similar programming problem, over 70% of students made errors with fine-grained locking, while less than 10% made errors with transactions.

Keywords: Design; Performance; Transactional Memory; Optimistic Concurrency; Synchronization

Simplifying low-level GPU programming with GAS

26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2021

Authors: Yan, Da; Wang, Wei; Chu, Xiaowen (HKUST, Hong Kong; Hong Kong Baptist University, Hong Kong)

ISBN: (print) 9781450382946

Many low-level optimizations for NVIDIA GPU can only be implemented in native hardware assembly (SASS). However, programming in SASS is unproductive and not portable. To simplify low-level GPU programming, we present GAS (Gpu ASsembly), a PTX-like language that provides a stable instruction set across hardware architectures while giving programmers a low-level control of code execution. We demonstrate that GAS can be used with ease for low-level benchmarking and performance tuning in the context of Tensor Core HGEMM. © 2021 Owner/Author.

Keywords: Graphics processing unit

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP

24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2019

ISBN: (print) 9781450362252

The proceedings contain 58 papers. The topics discussed include: beyond human-level accuracy: computational challenges in deep learning; throughput-oriented GPU memory allocation; SEP-graph: finding shortest execution paths for graph processing under a hybrid framework on GPU; incremental flattening for nested data parallelism; modular transactions: bounding mixed races in space and time; processing transactions in a predefined order; data-flow/dependence profiling for structured transformations; lightweight hardware transactional memory profiling; provably and practically efficient granularity control; semantics-aware scheduling policies for synchronization determinism; and a round-efficient distributed betweenness centrality algorithm.

PPoPP 2014 - Proceedings of the 2014 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

2014 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2014

ISBN: (print) 9781450326568

The proceedings contain 43 papers. The topics discussed include: predator: predictive false sharing detection; concurrency testing using schedule bounding: an empirical study; trace driven dynamic deadlock detection and reproduction; efficient search for inputs causing high floating-point errors; portable, MPI-interoperable coarray Fortran; eliminating global interpreter locks in ruby through hardware transactional memory; leveraging hardware message passing for efficient thread synchronization; well-structured futures and cache locality; time-warp: lightweight abort minimization in transactional memory; beyond parallel programming with domain specific languages; a decomposition for in-place matrix transposition; in-place transposition of rectangular matrices on accelerators; and parallelizing dynamic programming through rank convergence.

Keywords: FORTRAN (programming language)

Petascale Computing with Accelerators

14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Kistler, Michael; Gunnels, John; Brokenshire, Daniel; Benton, Brad (IBM Corp, Austin, TX 78758, USA; IBM Corp, Yorktown Hts, NY 10598, USA)

ISBN: (print) 9781605583976

A trend is developing in high performance computing in which commodity processors are coupled to various types of computational accelerators. Such systems are commonly called hybrid systems. In this paper, we describe our experience developing an implementation of the Linpack benchmark for a petascale hybrid system, the LANL Roadrunner cluster built by IBM for Los Alamos National Laboratory. This system combines traditional x86-64 host processors with IBM PowerXCell (TM) 8i accelerator processors. The implementation of Linpack we developed was the first to achieve a performance result in excess of 1.0 PFLOPS, and made Roadrunner the #1 system on the Top500 list in June 2008. We describe the design and implementation of hybrid Linpack, including the special optimizations we developed for this hybrid architecture. We then present actual results for single node and multi-node executions. From this work, we conclude that it is possible to achieve high performance for certain applications on hybrid architectures when careful attention is given to efficient use of memory bandwidth, scheduling of data movement between the host and accelerator memories, and proper distribution of work between the host and accelerator processors.

Keywords: Algorithms; Performance; Design; Accelerators; hybrid programming models

Implementing Parallel and Concurrent Tree Structures

24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)

Authors: Sun, Yihan; Blelloch, Guy (Carnegie Mellon Univ, Pittsburgh, PA 15213, USA)

ISBN: (print) 9781450362252

As one of the most important data structures used in algorithm design and programming, balanced search trees are widely used in real-world applications for organizing data. Answering the challenges thrown up by modern large-volume and ever-changing data, it is important to consider parallelism, concurrency, and persistence. This tutorial will introduce techniques for supporting functionalities on trees, including various parallel algorithms, concurrency, multiversioning, etc. In particular, this tutorial will focus on an algorithmic framework for parallel balanced binary trees, which works for multiple balancing schemes, including AVL trees, red-black trees, weight-based trees, and treaps. This framework allows for theoretically-efficient algorithms. The corresponding implementation is available as a library, which demonstrates good performance both sequentially and in parallel in various use scenarios. This tutorial will focus on the following topics: 1) the algorithms and techniques used in the PAM library; 2) the interface of the library and a hands-on introduction to the download/installation of the library; 3) examples of applying the library to various applications; and 4) an introduction to other useful techniques for parallel tree structures and performance comparisons with PAM.

Keywords: balanced tree; augmented map; parallel; concurrent; library; PAM; ordered set; ordered map

Evaluating Graph Coloring on GPUs

16th ACM Symposium on Principles and Practice of Parallel Programming

Authors: Grosset, A. V. Pascal; Zhu, Peihong; Liu, Shusen; Venkatasubramanian, Suresh; Hall, Mary (Univ Utah, Sch Comp, Salt Lake City, UT 84112, USA)

ISBN: (print) 9781450301190

This paper evaluates features of graph coloring algorithms implemented on graphics processing units (GPUs), comparing coloring heuristics and thread decompositions. As compared to prior work on graph coloring for other parallel architectures, we find that the large number of cores and relatively high global memory bandwidth of a GPU lead to different strategies for the parallel implementation. Specifically, we find that a simple uniform block partitioning is very effective on GPUs and our parallel coloring heuristics lead to the same or fewer colors than prior approaches for distributed-memory cluster architecture. Our algorithm resolves many coloring conflicts across partitioned blocks on the GPU by iterating through the coloring process, before returning to the CPU to resolve remaining conflicts. With this approach we get as few colors as (if not fewer than) the best sequential graph coloring algorithm, and performance is close to the fastest sequential graph coloring algorithms, which have poor color quality.

Keywords: Algorithms; Performance; Graph coloring; parallel algorithm; GPU; CUDA

GENERATING PARALLEL CODE FROM OBJECT-ORIENTED MATHEMATICAL-MODELS

5th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: ANDERSSON, N; FRITZSON, P (Linkoping Univ, Linkoping, Sweden)

For a long time efficient use of parallel computers has been hindered by dependencies introduced in software through low-level implementation practice. In this paper we present a programming environment and language called Object-Math (Object oriented Mathematical language for scientific computing), which aims at eliminating this problem by allowing the user to represent mathematical equation-based models directly in the system. The system performs analysis of mathematical models to extract parallelism and automatically generates parallel code for numerical solution. In the context of industrial applications in mechanical analysis, we have so far primarily explored generation of parallel code for solving systems of ordinary differential equations (ODEs), in addition to preliminary work on generating code for solving partial differential equations. Two approaches to extracting parallelism have been implemented and evaluated: extracting parallelism at the equation system level and at the single equation level, respectively. We found that for several applications the corresponding systems of equations do not partition well into subsystems. This means that the equation system level approach is of restricted general applicability. Thus, we focused on the equation-level approach which yielded significant parallelism for ODE systems solution. For the bearing simulation applications we present here, the achieved speedup is however critically dependent on low communication latency of the parallel computer.

Keywords: parallel processing systems
