Search Results - Inner Mongolia University Library

Search criteria: "Any field = 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming"

368 records in total; records 41-50 are listed below

DOJ: Dynamically Parallelizing Object-Oriented Programs (2012)

17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Eom, Yong Hun; Yang, Stephen; Jenista, James C.; Demsky, Brian (Univ Calif Irvine, Irvine, CA 92717, USA)

ISBN: (print) 9781450311601

We present Dynamic Out-of-Order Java (DOJ), a dynamic parallelization approach. In DOJ, a developer annotates code blocks as tasks to decouple these blocks from the parent execution thread. The DOJ compiler then analyzes the code to generate heap examiners that ensure the parallel execution preserves the behavior of the original sequential program. Heap examiners dynamically extract heap dependences between code blocks and determine when it is safe to execute a code block. We have implemented DOJ and evaluated it on twelve benchmarks. We achieved an average compilation speedup of 31.15x over OoOJava and an average execution speedup of 12.73x over sequential versions of the benchmarks.

Keywords: Algorithms Performance parallel programming Dynamic Analysis Object-Oriented Analysis Heap Analysis parallelization

OpenCL as a Unified Programming Model for Heterogeneous CPU/GPU Clusters (2012)

17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Kim, Jungwon; Seo, Sangmin; Lee, Jun; Nah, Jeongho; Jo, Gangwon; Lee, Jaejin (Seoul Natl Univ, Sch Comp Sci & Engn, Ctr Manycore Programming, Seoul 151744, South Korea)

ISBN: (print) 9781450311601

In this paper, we propose an OpenCL framework for heterogeneous CPU/GPU clusters, and show that the framework achieves both high performance and ease of programming. The framework provides an illusion of a single system for the user. It allows the application to utilize multiple heterogeneous compute devices, such as multicore CPUs and GPUs, in a remote node as if they were in a local node. No communication API, such as the MPI library, is required in the application source. We implement the OpenCL framework and evaluate its performance on a heterogeneous CPU/GPU cluster that consists of one host node and nine compute nodes using eleven OpenCL benchmark applications.

Keywords: Algorithm Design Experimentation Languages Measurement Performance OpenCL Clusters Heterogeneous computing programming models

An Overview of Medusa: Simplified Graph Processing on GPUs (2012)

17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Zhong, Jianlong; He, Bingsheng (Nanyang Technol Univ, Singapore 639798, Singapore)

ISBN: (print) 9781450311601

Graphs are the de facto data structures for many applications, and efficient graph processing is a must for the application performance. GPUs have an order of magnitude higher computational power and memory bandwidth compared to CPUs and have been adopted to accelerate several common graph algorithms. However, it is difficult to write correct and efficient GPU programs and even more difficult for graph processing due to the irregularities of graph structures. To address those difficulties, we propose a programming framework named Medusa to simplify graph processing on GPUs. Medusa offers a small set of APIs, based on which developers can define their application logic by writing sequential code without awareness of GPU architectures. The Medusa runtime system automatically executes the developer-defined APIs in parallel on the GPU, with a series of graph-centric optimizations. This poster gives an overview of Medusa, and presents some preliminary results.

Keywords: Algorithms Performance GPGPU GPU programming Graph Processing Runtime Framework

Architectural Support for Cilk Computations on Many-core Architectures

14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Long, Guoping; Fan, Dongrui; Zhang, Junchao (Chinese Acad Sci, Inst Comp Technol, Key Lab Comp Syst & Architecture, Beijing 100864, Peoples R China)

Using GPU's to Accelerate Stencil-based Computation Kernels for the Development of Large Scale Scientific Applications on Heterogeneous Systems (2012)

17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Tao, Jian; Blazewicz, Marek; Brandt, Steven R. (Louisiana State Univ, Ctr Computat & Technol, Baton Rouge, LA 70803, USA; Poznan Supercomp & Networking Ctr, Applicat Dept, Poznan, Poland)

ISBN: (print) 9781450311601

We present CaCUDA, a GPGPU kernel abstraction and a parallel programming framework for developing highly efficient large scale scientific applications using stencil computations on hybrid CPU/GPU architectures. CaCUDA is built upon the Cactus computational toolkit, an open source problem solving environment designed for scientists and engineers. Due to the flexibility and extensibility of the Cactus toolkit, the addition of a GPGPU programming framework required no changes to the Cactus infrastructure, guaranteeing that existing features and modules will continue to work without modification. CaCUDA was tested and benchmarked using a 3D CFD code based on a finite difference discretization of Navier-Stokes equations.

Keywords: Algorithms Design Languages GPGPU programming Computational Framework HPC Stencil Computation

Scalable GPU Graph Traversal (2012)

17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Merrill, Duane; Garland, Michael; Grimshaw, Andrew (Univ Virginia, Charlottesville, VA 22903, USA; NVIDIA Corp, Santa Clara, CA, USA)

ISBN: (print) 9781450311601

Breadth-first search (BFS) is a core primitive for graph traversal and a basis for many higher-level graph analysis algorithms. It is also representative of a class of parallel computations whose memory accesses and work distribution are both irregular and data-dependent. Recent work has demonstrated the plausibility of GPU sparse graph traversal, but has tended to focus on asymptotically inefficient algorithms that perform poorly on graphs with non-trivial diameter. We present a BFS parallelization focused on fine-grained task management constructed from efficient prefix sum that achieves an asymptotically optimal O(|V|+|E|) work complexity. Our implementation delivers excellent performance on diverse graphs, achieving traversal rates in excess of 3.3 billion and 8.3 billion traversed edges per second using single and quad-GPU configurations, respectively. This level of performance is several times faster than state-of-the-art implementations on both CPU and GPU platforms.

Keywords: Algorithms performance Breadth-first search GPU graph algorithms parallel algorithms prefix sum graph traversal sparse graph

Automatic placement of communications in mesh-partitioning parallelization (1997)

6th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Author: Hascoet, L (INRIA Sophia-Antipolis, B.P. 93, 06902 Sophia-Antipolis, France)

ISBN: (print) 9780897919067

We present a tool for mesh-partitioning parallelization of numerical programs working iteratively on an unstructured mesh. This conventional method splits a mesh into sub-meshes, adding some overlap on the boundaries of the sub-meshes. The program is then run in SPMD mode on a parallel architecture with distributed memory. It is necessary to add calls to communication routines at a few carefully selected locations in the code. The tool presented here uses the data-dependence information to mechanize the placement of these synchronizations. Additionally, we see that there is not a unique solution for placing these synchronizations, and performance depends on this choice.

Keywords: parallel processing systems

Intra-Application Shared Cache Partitioning For Multithreaded Applications

15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Muralidhara, Sai Prashanth; Kandemir, Mahmut; Raghavan, Padma (Penn State Univ, University Pk, PA 16802, USA)

ISBN: (print) 9781605587080

In this paper, we address the problem of partitioning a shared cache when the executing threads belong to the same application.

Keywords: Cache Multicore parallel Applications

Extending a C-like Language for Portable SIMD Programming (2012)

17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Leissa, Roland; Hack, Sebastian; Wald, Ingo (Univ Saarland, Compiler Design Lab, Saarbrucken, Germany; Intel Corp, Visual Applicat Res, Santa Clara, CA 95051, USA)

ISBN: (print) 9781450311601

SIMD instructions have been common in CPUs for years now. Using these instructions effectively requires not only vectorization of code, but also modifications to the data layout. However, automatic vectorization techniques are often not powerful enough and suffer from restricted scope of applicability; hence, programmers often vectorize their programs manually by using intrinsics: compiler-known functions that directly expand to machine instructions. They significantly decrease programmer productivity by enforcing a very error-prone and hard-to-read assembly-like programming style. Furthermore, intrinsics are not portable because they are tied to a specific instruction set. In this paper, we show how a C-like language can be extended to allow for portable and efficient SIMD programming. Our extension puts the programmer in total control over where and how control-flow vectorization is triggered. We present a type system and a formal semantics of our extension and prove the soundness of the type system. Using our prototype implementation IVL that targets Intel's MIC architecture and SSE instruction set, we show that the generated code is roughly on par with handwritten intrinsic code.

Keywords: Languages Performance theory language theory parallel programming polymorphism semantics SIMD SIMT type system vectorization

Applying the Concurrent Collections Programming Model to Asynchronous Parallel Dense Linear Algebra

15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Chandramowlishwaran, Aparna; Knobe, Kathleen; Vuduc, Richard (Georgia Inst Technol, Atlanta, GA 30332, USA; Intel Corp, Santa Clara, CA 95051, USA)

ISBN: (print) 9781605587080

This poster is a case study on the application of a novel programming model, called Concurrent Collections (CnC), to the implementation of an asynchronous-parallel algorithm for computing the Cholesky factorization of dense matrices. In CnC, the programmer expresses her computation in terms of application-specific operations, partially-ordered by semantic scheduling constraints. We demonstrate the performance potential of CnC in this poster, by showing that our Cholesky implementation nearly matches or exceeds competing vendor-tuned codes and alternative programming models. We conclude that the CnC model is well-suited for expressing asynchronous-parallel algorithms on emerging multicore systems.

Keywords: Algorithms Performance

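Inner Mongolia University Library, 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region; postal code: 010021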
