This paper discusses modifications to algorithms for computing cubes within a parallel cubing unit, presenting several architectures for operand sizes ranging from 8 to 32 bits. The proposed method separates the cubing partial-product matrix into smaller elements and organizes these partial products into repeatable, manageable groups. Consequently, the overall partial-product matrix is substantially reduced compared to previous methods. An algorithmic analysis is also presented that demonstrates reductions in area and delay for several operand widths, together with implementations on Xilinx Virtex-5 FPGAs and in an IBM 65 nm ASIC standard-cell library.
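As a rough software illustration of the idea behind cubing partial-product matrices (not the paper's exact grouping), the sketch below forms x^3 by enumerating bit triples and folding the symmetric triples into weighted groups; the function name and the multiplicity-based grouping are illustrative assumptions.

```python
def cube_via_partial_products(x: int, n: int) -> int:
    """Compute x**3 by summing weighted bit-triple partial products.

    Mirrors (in software) how a cubing unit forms the partial-product
    matrix: each triple of operand bits (i, j, k) contributes
    x_i * x_j * x_k at column weight 2**(i + j + k).  Collapsing the
    symmetric triples i <= j <= k with multiplicities 1, 3, or 6 is a
    simple stand-in for the paper's repeatable partial-product groups.
    """
    bits = [(x >> i) & 1 for i in range(n)]
    total = 0
    for i in range(n):
        for j in range(i, n):
            for k in range(j, n):
                if not (bits[i] and bits[j] and bits[k]):
                    continue
                if i == j == k:
                    mult = 1   # x_i^3 = x_i for a single bit
                elif i == j or j == k:
                    mult = 3   # two equal indices: 3 orderings
                else:
                    mult = 6   # all distinct: 6 orderings
                total += mult << (i + j + k)
    return total

assert cube_via_partial_products(13, 8) == 13 ** 3
```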
ISBN (print): 9781581131857
In this paper we present fine-grained multithreaded algorithms and implementations for the Fast Fourier Transform (FFT) problem. The FFT problem has been formulated using two distinct approaches based on dataflow concepts. The first approach, referred to as the receiver-initiated algorithm, realizes the FFT iterations as a parent-child relationship while fully exploiting the underlying parallelism. The second approach, referred to as the sender-initiated algorithm, follows a dataflow model based on the producer-consumer style of programming and can be adapted to different architectural parameters for achieving high performance. The implementations of the proposed algorithms have been carried out on the EARTH (Efficient Architecture for Running THreads) platform. For both algorithms, we analyze the ratio of remote to local threads and study its impact on the experimental results. Our implementation results show that for certain block sizes, on a fixed problem size and machine size, the receiver-initiated approach performs better than the sender-initiated approach. For large numbers of processors, both algorithms perform well, yielding execution times of only 10 ms for an input of 16K data points on a 64-processor machine with each processor running at a 140 MHz clock speed.
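A minimal sketch of the parent-child (receiver-initiated) flavor, assuming a plain Python thread pool rather than EARTH threads: the parent splits each FFT stage and waits on its children, with a max_depth knob as a crude stand-in for tuning the remote/local thread ratio.

```python
import cmath
from concurrent.futures import ThreadPoolExecutor

def fft(x, pool, depth=0, max_depth=2):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two.

    Parent-child style: the parent hands the even/odd halves to child
    threads and combines their results.  Beyond max_depth the recursion
    stays local in the current thread.
    """
    n = len(x)
    if n == 1:
        return x
    if depth < max_depth:
        f_even = pool.submit(fft, x[0::2], pool, depth + 1, max_depth)
        f_odd = pool.submit(fft, x[1::2], pool, depth + 1, max_depth)
        even, odd = f_even.result(), f_odd.result()
    else:
        even = fft(x[0::2], pool, depth + 1, max_depth)
        odd = fft(x[1::2], pool, depth + 1, max_depth)
    tw = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return [even[k] + tw[k] for k in range(n // 2)] + \
           [even[k] - tw[k] for k in range(n // 2)]

# max_workers must cover all in-flight tasks, since parents block on
# their children: with max_depth=2 at most 6 tasks are submitted here.
with ThreadPoolExecutor(max_workers=8) as pool:
    print(fft([1, 1, 1, 1, 0, 0, 0, 0], pool))
```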
ISBN (print): 9781538681114
Layered (or turbo) decoding of Low-Density Parity-Check (LDPC) codes is a decoding schedule that facilitates partially parallel architectures for performing iterative algorithms based on belief propagation. It has, on one hand, reduced implementation complexity and memory overhead compared to fully parallel architectures and, on the other hand, higher convergence speed compared to both serial and parallel architectures. In this paper, we introduce a general form of shuffling of the parity-check matrix of quasi-cyclic LDPC (QC-LDPC) codes which can split the critical-path delay in layered decoding and therefore improve throughput by allowing higher clock rates. We also reveal a valuable property of Latin-square QC-LDPC codes which makes them a good candidate for the proposed shuffling method. As a result of that property, no special care in choosing offset values is required in the proposed generalized shuffling method.
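A toy analogue, not the paper's generalized shuffling: the sketch below expands a QC-LDPC base matrix of circulant shifts and applies per-block-column shift offsets, the kind of code-preserving rearrangement a layered decoder can exploit. The base matrix and offsets are made up for illustration.

```python
import numpy as np

def expand_qc(base, z):
    """Expand a QC-LDPC base matrix of circulant shifts into H.

    base[i][j] = -1 means an all-zero z-by-z block; a value s >= 0
    means the z-by-z identity cyclically shifted by s columns.
    """
    m, n = len(base), len(base[0])
    H = np.zeros((m * z, n * z), dtype=np.uint8)
    for i in range(m):
        for j in range(n):
            s = base[i][j]
            if s >= 0:
                H[i*z:(i+1)*z, j*z:(j+1)*z] = np.roll(
                    np.eye(z, dtype=np.uint8), s, axis=1)
    return H

def shuffle_columns(base, offsets):
    """Add a per-block-column offset to every non-zero shift (np.roll
    reduces mod z at expansion time).  Re-shifting a whole block column
    only relabels variables within that block, so the code is unchanged
    while the update order seen by a layered decoder is rearranged."""
    return [[(s + off) if s >= 0 else -1
             for s, off in zip(row, offsets)] for row in base]

base = [[0, 1, -1, 2],
        [2, -1, 1, 0]]
H = expand_qc(shuffle_columns(base, [0, 1, 1, 0]), z=4)
print(H.shape)  # (8, 16)
```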
ISBN (print): 9781467355247
Modern throughput processors such as GPUs achieve high performance and efficiency by exploiting data parallelism in application kernels expressed as threaded code. One drawback of this approach compared to conventional vector architectures is redundant execution of instructions that are common across multiple threads, resulting in energy inefficiency due to excess instruction dispatch, register file accesses, and memory operations. This paper proposes to alleviate these overheads while retaining the threaded programming model by automatically detecting scalar operations and factoring them out of the parallel code. We have developed a scalarizing compiler that employs convergence and variance analyses to statically identify values and instructions that are invariant across multiple threads. Our compiler algorithms are effective at identifying convergent execution even in programs with arbitrary control flow, identifying two-thirds of the opportunity captured by a dynamic oracle. The compile-time analysis leads to a reduction in instructions dispatched by 29%, register file reads and writes by 31%, memory address counts by 47%, and data access counts by 38%.
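A minimal sketch of a variance analysis, assuming a toy straight-line IR (the paper's analyses additionally handle arbitrary control flow via convergence analysis): values are variant if they transitively depend on the thread ID, and everything else is a scalarization candidate. The IR encoding and value names here are invented for illustration.

```python
def variance_analysis(ir, variant_seeds=("tid",)):
    """Forward pass: a result is variant if any operand is variant."""
    variant = set(variant_seeds)
    for dest, _op, operands in ir:
        if any(src in variant for src in operands):
            variant.add(dest)
    return variant

ir = [
    ("base",  "load", ["ptr"]),          # same value for every thread
    ("offs",  "mul",  ["tid", "four"]),  # depends on the thread ID
    ("addr",  "add",  ["base", "offs"]),
    ("scale", "mul",  ["base", "two"]),  # convergent: scalarizable
]
variant = variance_analysis(ir)
scalar = [dest for dest, _, _ in ir if dest not in variant]
print("scalarizable:", scalar)  # ['base', 'scale']
```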
Grid or mesh techniques are frequently used to approximate continuous entities that behave in a wave-like or fluid-like fashion. Partial Differential Equations (PDEs) are usually involved in the description of such entities or processes. Distributed parallel computation was used in various computer-cluster configurations to calculate PDE solutions for an electrostatic field. The study examined the efficacy of the selected architectures when using mesh techniques, and investigated the match between algorithm and architecture in achieving maximum computational performance. The developed architectures, algorithms, and findings are presented in the paper.
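As a concrete single-process illustration of the grid method (the paper distributes the mesh across cluster nodes), here is a Jacobi sweep for Laplace's equation, the PDE governing the electrostatic potential in a charge-free region; the grid size and boundary values are arbitrary.

```python
import numpy as np

def jacobi_laplace(n=64, iters=2000):
    """Solve Laplace's equation on an n-by-n mesh by Jacobi iteration.

    In the cluster setting each rank would own a strip of the grid and
    exchange boundary rows with its neighbors every sweep; here the
    whole grid lives in one process.
    """
    phi = np.zeros((n, n))
    phi[0, :] = 1.0  # boundary condition: top edge held at 1 V
    for _ in range(iters):
        # each interior point becomes the average of its 4 neighbors
        interior = 0.25 * (phi[:-2, 1:-1] + phi[2:, 1:-1] +
                           phi[1:-1, :-2] + phi[1:-1, 2:])
        phi[1:-1, 1:-1] = interior
    return phi

phi = jacobi_laplace()
print(round(float(phi[32, 32]), 4))  # center value approaches 0.25
```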
ISBN (print): 9781509035090
Dynamic parallelism on GPUs simplifies the programming of many classes of applications that generate parallelizable work not known prior to execution. However, modern GPU architectures do not support dynamic parallelism efficiently due to the high kernel-launch overhead, the limited number of simultaneous kernels, and the limited depth of dynamic calls a device can support. In this paper, we propose Kernel Launch Aggregation and Promotion (KLAP), a set of compiler techniques that improve the performance of kernels which use dynamic parallelism. Kernel launch aggregation fuses kernels launched by threads in the same warp, block, or kernel into a single aggregated kernel, thereby reducing the total number of kernels spawned and increasing the amount of work per kernel to improve occupancy. Kernel launch promotion enables early launch of child kernels to extract more parallelism between parents and children, and aggregates kernel launches across generations, mitigating the problem of limited depth. We implement our techniques in a real compiler and show that kernel launch aggregation obtains a geometric-mean speedup of 6.58x over regular dynamic parallelism. We also show that kernel launch promotion enables cases that were not originally possible, improving throughput by a geometric mean of 30.44x.
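A Python analogue of kernel launch aggregation, with illustrative names throughout (this is not KLAP's generated code): instead of one child launch per parent thread, parents record offsets into a shared buffer and a single aggregated launch processes all children.

```python
def child_work(item):
    return item * item

def naive_dynamic_parallelism(parent_items):
    # one "launch" per parent thread: on a GPU, each incurs the full
    # kernel-launch overhead and competes for limited launch slots
    results = []
    for item in parent_items:
        results.append([child_work(x) for x in range(item)])
    return results

def aggregated_launch(parent_items):
    # parents only deposit (offset, count) into a shared buffer; one
    # aggregated "kernel" then processes the concatenated work, which
    # is the occupancy win the aggregation transform targets
    offsets, flat = [], []
    for item in parent_items:
        offsets.append((len(flat), item))
        flat.extend(range(item))
    processed = [child_work(x) for x in flat]  # single launch
    return [processed[o:o + c] for o, c in offsets]

assert naive_dynamic_parallelism([3, 1, 4]) == aggregated_launch([3, 1, 4])
```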
ISBN (print): 9781595934529
Suppose we have a parallel or distributed system whose nodes have limited capacities, such as processing speed, bandwidth, memory, or disk space. How does the performance of the system depend on the amount of heterogeneity of its capacity distribution? We propose a general framework to quantify the worst-case effect of increasing heterogeneity in models of parallel systems. Given a cost function g(C,W) representing the system's performance as a function of its nodes' capacities C and workload W (such as the completion time of an optimum schedule of jobs W on machines C), we say that g has price of heterogeneity α when, for any workload, the cost cannot increase by more than a factor of α if node capacities become arbitrarily more heterogeneous. We give constant bounds on the price of heterogeneity of several well-known job scheduling and graph degree/diameter problems, indicating that increasing heterogeneity can never be much of a disadvantage. On the other hand, with the introduction of timing constraints such as release times or precedence constraints on the jobs, the dependence on node capacities becomes more complex, so that increasing heterogeneity may be quite detrimental.
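A toy instance of the framework, assuming greedy LPT list scheduling as a stand-in for the optimum schedule: with the same jobs and the same total capacity, making the speed vector more heterogeneous can raise the makespan, though per the paper's bounds only by a constant factor for this kind of problem.

```python
def makespan(speeds, jobs):
    """Cost g(C, W): makespan of a greedy LPT schedule of jobs (W) on
    uniformly related machines with the given speeds (C)."""
    loads = [0.0] * len(speeds)
    for w in sorted(jobs, reverse=True):  # longest processing time first
        i = min(range(len(speeds)),
                key=lambda i: (loads[i] + w) / speeds[i])
        loads[i] += w
    return max(load / s for load, s in zip(loads, speeds))

jobs = [7, 5, 4, 3, 3, 2, 1, 1]
print(makespan([1, 1, 1, 1], jobs))          # homogeneous capacities
print(makespan([2.5, 0.5, 0.5, 0.5], jobs))  # same total, more heterogeneous
```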