ISBN:
(print) 0818656026
This paper presents MUSTANG, a system for translating Fortran to single-assignment form in an effort to automatically extract parallelism. Specifically, a sequential Fortran source program is translated into IF1, a machine-independent dataflow graph description language that is the intermediate form for the SISAL language. During this translation, Parafrase 2 is used to detect opportunities for parallelization, which are then explicitly introduced into the IF1 program. The resulting IF1 program is then processed by the Optimizing SISAL Compiler, which produces parallel executables on multiple target platforms. Execution results for several Livermore Loops are presented and compared against Fortran and SISAL implementations on two different platforms. The results show that the translation is an efficient method for exploiting parallelism from sequential Fortran source code.
An important problem in graph embeddings and parallel computing is to embed a rectangular grid into other graphs. We present a novel, general combinatorial approach to (one-to-one) embedding rectangular grids into their ideal rectangular grids and optimal hypercubes. In contrast to earlier approaches of Aleliunas and Rosenberg, and Ellis, our approach is based on a special kind of doubly stochastic matrix. We prove that any rectangular grid can be embedded into its ideal rectangular grid with dilation equal to the ceiling of the compression ratio, which is both optimal up to a multiplicative constant and a substantial generalization of previous work. We also show that any rectangular grid can be embedded into its nearly ideal square grid with dilation at most 3.
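The dilation bound above can be illustrated with a small sketch. Here the compression ratio is taken as the factor by which the rows of the source grid must be folded to fit the target grid; this definition, and the function name, are illustrative assumptions rather than the paper's exact formulation.

```python
import math

def dilation_bound(src_rows, src_cols, dst_rows, dst_cols):
    """Illustrative upper bound on dilation for embedding a
    src_rows x src_cols grid into a dst_rows x dst_cols grid of at
    least the same size. The compression ratio is assumed to be the
    row-folding factor src_rows / dst_rows (an assumption for
    illustration; the paper's precise definition may differ)."""
    assert dst_rows * dst_cols >= src_rows * src_cols
    compression = src_rows / dst_rows
    return math.ceil(compression)

# Folding a 9x4 grid into a 6x6 grid compresses rows by 9/6 = 1.5,
# so the claimed dilation bound is ceil(1.5) = 2.
print(dilation_bound(9, 4, 6, 6))  # 2
```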
A parallel transputer-based emulator has been developed to evaluate the DDM, a highly parallel virtual shared-memory architecture. The emulator provides performance results for a hardware implementation of the DDM using a calibrated virtual clock. Unlike the virtual clock of a simulator, the emulator clock is bound to a fixed fraction of real time, so individual processors may time actions independently without the need for a global clock value. Each component of the emulator is artificially slowed down so that the balance of the speeds of all components reflects the balance of the expected hardware implementation. The calibrated emulator runs an order of magnitude faster than a simulator (the application program is executed directly and there is no overhead for the maintenance of event lists) and, more importantly, the emulator is inherently parallel. This results in a peak emulation speed of 27 million instructions per second when simulating a machine with 81 leaf nodes on a 121-node transputer system.
In this paper, we present a new scalable algorithm, called the Regular Schedule, for parallel evaluation of band linear recurrences (BLRs, i.e., mth-order linear recurrences for m ≥ 1). Its scalability and simplicity make it well suited for vector supercomputers and massively parallel computers. We describe our implementation of the Regular Schedule on two types of machines: the Convex C240 and the MasPar MP-2. The scalability of our scheduling techniques is demonstrated on both machines. On the Convex C240, programs containing BLRs implemented in C using the Regular Schedule show significant CPU-performance improvements over the same programs implemented using the highly optimized, coded-in-assembly BLAS routines [17].
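For reference, a band linear recurrence of order m is the sequential computation sketched below; this baseline is what the Regular Schedule parallelizes. The function and coefficient layout are illustrative, not the paper's notation.

```python
def eval_blr(c, a, m):
    """Sequentially evaluate an order-m band linear recurrence:
        x[i] = c[i] + sum_{j=1..m} a[i][j-1] * x[i-j]
    with terms x[i-j] for i-j < 0 dropped. Coefficient layout is
    an illustrative assumption."""
    n = len(c)
    x = []
    for i in range(n):
        s = c[i]
        for j in range(1, m + 1):
            if i - j >= 0:
                s += a[i][j - 1] * x[i - j]
        x.append(s)
    return x

# First-order example (m = 1): x[i] = 1 + 2 * x[i-1]
print(eval_blr([1, 1, 1, 1], [[2]] * 4, 1))  # [1, 3, 7, 15]
```

The loop-carried dependence on x[i-1..i-m] is exactly what makes naive vectorization fail and motivates a restructured schedule.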
Run-time data redistribution can affect algorithm performance in distributed-memory machines. Redistribution of data can be performed between algorithm phases when a different data decomposition is expected to deliver increased performance for a subsequent phase of computation. Additionally, data redistribution can occur at subprogram boundaries. Redistribution, however, represents increased program overhead, as algorithm computation is necessarily suspended while data are exchanged among processor memories. In this paper, we present a technique for data-processor mapping, applicable to data redistribution, that minimizes the total amount of data that must be communicated among processors. The mapping technique is architecture-independent and represents our initial work toward achieving efficient redistribution in distributed-memory machines.
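The quantity being minimized can be made concrete with a small sketch: given two owner mappings for a 1-D array, the communication volume is the number of elements whose owner changes. The mappings and function names below are illustrative assumptions, not the paper's technique.

```python
def block_owner(i, n, p):
    """Owner of element i under a block distribution of n elements
    over p processors (illustrative mapping)."""
    b = -(-n // p)  # ceil(n / p), the block size
    return i // b

def cyclic_owner(i, n, p):
    """Owner of element i under a cyclic distribution."""
    return i % p

def redistribution_volume(n, p, old_owner, new_owner):
    """Count elements that must cross processor memories when the
    decomposition changes; minimizing this count over candidate
    data-processor mappings is the stated goal (sketch only)."""
    return sum(1 for i in range(n)
               if old_owner(i, n, p) != new_owner(i, n, p))

# Block owners for n=8, p=2: 0 0 0 0 1 1 1 1; cyclic: 0 1 0 1 0 1 0 1.
# Owners differ at i = 1, 3, 4, 6, so 4 elements must move.
print(redistribution_volume(8, 2, block_owner, cyclic_owner))  # 4
```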
We present several techniques that we have used to optimize the performance of a message-passing C code for molecular dynamics on the CM-5. We describe our use of the CM-5 vector units and a parallel memory caching scheme that we have developed to speed up the code by more than 50%. A modification that decreases our communication time by 35% is also presented, along with a discussion of how we have been able to take advantage of the CM-5 hardware without significantly compromising code portability. We have been able to speed up our original code by a factor of ten, and we feel that our modifications may be useful in optimizing the performance of other message-passing C applications on the CM-5.
The exploitation of the inherent parallelism in applications written for shared-memory systems depends critically on the efficiency of the synchronization and data-exchange primitives provided by the hardware. This paper discusses and analyzes such primitives as they are implemented in the Scalable Coherent Interface (SCI). The SCI synchronization primitives are based on QOLB, a hardware primitive that shows much promise for reducing or eliminating the synchronization and access latencies of shared data. Introducing finer-grained programs in the absence of such latency reduction will have little or no benefit. In particular, we discuss how QOLB fits the underlying linked-list cache-coherence protocol of SCI. We also show how, for some important scenarios (critical sections and pairwise sharing), the QOLB primitives in SCI can greatly reduce data-communication latencies.
The KSR1 has a shared address space that spreads over physically distributed memory modules with various latencies. Performance therefore depends considerably on the program's locality of reference and on the effectiveness of the prefetch and post-store instructions. This paper analyzes the various memory-latency factors that stall the processor during program execution on a 32-processor system. A suitable model for evaluating these factors is developed for the execution of tiled do-loops with the slice strategy. The benchmark used is a sparse matrix solver. The limited size of the prefetch queue is shown to stall the processor for long periods, which reduces the benefit of prefetch considerably. The post-store operation is shown to have a high overhead; however, delaying the post-store operation improved performance considerably.
We are developing a massively parallel special-purpose computer system for astrophysical N-body simulations, GRAPE-4 (GRAvity-PipE 4). The GRAPE-4 system is designed to simulate the dynamics of classical particles that interact with each other gravitationally, using predictor-corrector methods. We have developed two application-specific LSIs for the GRAPE-4 system: the HARP (Hermite AcceleratoR Pipe) chip and the PROMEthEUS chip. The HARP chip calculates gravitational forces, and its performance exceeds 600 megaflops. The PROMEthEUS chip calculates predictors for the time integration. Using multi-chip module technology, we can integrate 1920 HARP chips into a single system. The GRAPE-4 system consists of 4 clusters, which are connected to a single host workstation. The peak speed of GRAPE-4 will exceed 1 teraflops even in the worst case, and will reach around 1.8 teraflops in the typical case.
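The per-pair computation that a GRAPE-style force pipeline evaluates in hardware is ordinary direct-summation gravity; a minimal software sketch is below. The softening parameter and function name are illustrative assumptions, and the HARP chip additionally pipelines the force derivative needed by the Hermite scheme, which this sketch omits.

```python
def accelerations(pos, mass, eps=1e-4):
    """Direct-summation gravitational accelerations (G = 1) with a
    softening length eps. Each (i, j) pair corresponds to one pass
    through a GRAPE-style hardware pipeline (illustrative sketch)."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + eps * eps
            inv_r3 = r2 ** -1.5
            for k in range(3):
                acc[i][k] += mass[j] * dx[k] * inv_r3
    return acc

# Two unit masses one unit apart: |a| is close to 1 (up to softening)
a = accelerations([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], [1.0, 1.0])
print(round(a[0][0], 3))  # 1.0
```

The O(N^2) pair loop is what the 1920 replicated pipelines parallelize in hardware.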
Present-day parallel computers often face the problems of large software overheads for process switching and interprocessor communication. These problems are addressed by the Multi-threaded Architecture (MTA), a multiprocessor model designed for efficient parallel execution of both numerical and non-numerical programs. We begin with a conventional processor and add what we believe to be the minimal external hardware necessary for efficient support of multithreaded programs. The presentation begins with the top-level architecture and the program execution model; the latter includes a description of activation frames and thread synchronization. This is followed by a detailed presentation of the processor. Major features of the MTA include the Register-Use Cache for exploiting temporal locality in multiple-register-set microprocessors, support for programs requiring non-determinism and speculation, and local function invocations that can utilize registers for parameter passing.