Search Results - Inner Mongolia University Library


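Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = subject classification number, U = all fields, Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016 (subject "library science" or "information science", author Fan Bingsi, years 1982 to 2016)
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016 (publication "Computer Applications and Software", any field containing C++ or Basic, subject not Visual, years 2011 to 2016)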
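
The field codes and operator rules above can be illustrated with a small helper that assembles an expression string. This is a minimal sketch under stated assumptions, not part of the library's system: the function names `term`, `combine`, and `group` are hypothetical, and the code only builds the text of a query. It does not contact the catalogue or validate field values.

```python
# Field codes listed in the help text above.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
# Operators must be uppercase and surrounded by single spaces.
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Return one FIELD=value clause, rejecting unknown field codes."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def combine(operator: str, *clauses: str) -> str:
    """Join clauses with an uppercase operator and a space on each side."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}")
    return f" {operator} ".join(clauses)


def group(expression: str) -> str:
    """Wrap an expression in parentheses so it is evaluated as a unit."""
    return f"({expression})"


# Rebuild Example 1 from the help text (hypothetical helper usage).
query = combine(
    "AND",
    group(combine("OR", term("K", "图书馆学"), term("K", "情报学"))),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(query)
```

Keeping the operator check in `combine` mirrors the stated rule that operators must be uppercase; a lowercase `and` is rejected rather than silently passed through.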


Search query: "Any field = 10th International Conference on Algorithms and Architectures for Parallel Processing"

2,811 records in total; showing records 1501-1510

Applied Parallel and Scientific Computing - 10th International Conference, PARA 2010, Revised Selected Papers

10th International Conference on Applied Parallel and Scientific Computing, PARA 2010

ISBN (print): 9783642281501

The proceedings contain 77 papers. The topics discussed include: on aggressive early deflation in parallel variants of the QR algorithm; a model for efficient onboard actualization of an instrumental cyclogram for the Mars MetNet mission on a public cloud infrastructure; distributed Java programs initial mapping based on extremal optimization; global asynchronous parallel program control for multicore processors; streaming model computation of the FDTD problem; numerical investigation of the cumulant expansion for Fourier path integrals; simulated annealing with coarse graining and distributed computing; high performance computing techniques for scaling image analysis workflows; parallel computation of bivariate polynomial resultants on graphics processing units; an interval version of the Crank-Nicolson method - the first approach; and an interval finite difference method of Crank-Nicolson type for solving the one-dimensional heat conduction equation with mixed boundary conditions.

Applied Parallel and Scientific Computing - 10th International Conference, PARA 2010, Revised Selected Papers

10th International Conference on Applied Parallel and Scientific Computing, PARA 2010

ISBN (print): 9783642281440

Study of Hierarchical N-Body Methods for Network-on-Chip Architectures

17th International Euro-Par Conference on Parallel Processing

Authors: Xu, Thomas Canhao; Liljeberg, Pasi; Tenhunen, Hannu

Affiliations: Turku Ctr Comp Sci, Joukahaisenkatu 3-5 B, Turku 20520, Finland; Univ Turku, Dept Informat Technol, Turku 20014, Finland

ISBN (digital): 9783642297403

ISBN (print): 9783642297403; 9783642297397

In this paper, we study two hierarchical N-Body methods for Network-on-Chip (NoC) architectures. Modern Chip Multiprocessor (CMP) designs are mainly based on a shared-bus communication architecture, which suffers from high communication delays as the number of cores increases. NoC-based architectures have therefore been proposed. The N-Body problem is a classical problem of approximating the motion of bodies. Two methods, namely Barnes-Hut (Barnes) and Fast Multipole (FMM), have been developed for fast simulation. The two algorithms have been implemented and studied on conventional computer systems and Graphics Processing Units (GPUs). However, the evaluation of N-Body methods on a NoC platform, a promising unconventional multicore architecture, has not been well addressed. We define a NoC model based on state-of-the-art systems. Evaluation results are presented using a cycle-accurate full-system simulator. Experiments show that Barnes scales better (53.7x for Barnes versus 36.6x for FMM with 64 processing elements) and requires less cache than FMM. However, we observe hot-spot traffic in Barnes. Our analysis and experimental results provide a guideline for studying N-Body methods on a NoC platform.

Keywords: Network-on-chip

Overcoming the scalability limitations of parallel star schema data warehouses

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Costa, João Pedro; Cecílio, José; Martins, Pedro; Furtado, Pedro

Affiliations: ISEC-Institute Polytechnic of Coimbra, Portugal; University of Coimbra, Portugal

ISBN (print): 9783642330773

Most Data Warehouses (DW) are stored in Relational Database Management Systems (RDBMS) using a star-schema model. While this model yields a trade-off between performance and storage requirements, huge data warehouses experience performance problems. Although parallel shared-nothing architectures improve on this matter through a divide-and-conquer approach, issues related to parallelizing join operations limit the amount of improvement, since they have implications for placement, the need to replicate data, and/or on-the-fly repartitioning. In this paper, we show how these limitations can be overcome by replacing the star schema with a universal relation approach for more efficient and scalable parallelization. We evaluate the proposed approach using the TPC-H benchmark to demonstrate that it provides both highly predictable response times and almost optimal speedup. © 2012 Springer-Verlag.

Keywords: Data warehouses

Performance evaluation of OpenMP and CUDA on multicore systems

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Yang, Chao-Tung; Chang, Tzu-Chieh; Huang, Kuan-Lung; Liu, Jung-Chun; Chang, Chih-Hung

Affiliations: Department of Computer Science, Tunghai University, Taichung City 40704, Taiwan; Department of Information Management, Hsiuping University of Science and Technology, Taichung City 41280, Taiwan

ISBN (print): 9783642330643

Nowadays, not only CPUs but also GPUs follow the trend toward multi-core processors. Parallel processing presents both an opportunity and a challenge. Explicit parallelization of software, by programmers or by compilers, is the key to enhancing performance on multi-core chips. In this paper, we first introduce some automatic parallelization tools based on OpenMP, which can save the time needed to rewrite code for parallel processing on multicore systems. Then we focus on ROSE and explore it in depth. We also implement an interface to reduce its complexity of use and apply some automatic parallelization for CUDA. © 2012 Springer-Verlag.

Keywords: Parallel programming

Transient Analysis of Large Linear Dynamic Networks on Hybrid GPU-Multicore Platforms

10th IEEE International New Circuits and Systems Conference (NEWCAS)

Authors: Liu, Xue-Xin; Tan, Sheldon X.-D.; Liu, Zao; Wang, Hai; Xu, Tailong

Affiliations: Univ Calif Riverside, Dept Elect Engn, Riverside, CA 92521, USA; UESTC, Sch Microelect & Solid State Elect, Chengdu 610054, Peoples R China; Anhui Univ, Sch Elect & Informat Engn, Hefei, Peoples R China

ISBN (print): 9781467308595

A new transient analysis method is proposed for general linear dynamic networks, such as on-chip power grid networks, using a hybrid GPU-based multicore platform. The new method, called ETBR-GPU, first performs sampling-like reduction on the original circuit matrices, where the frequency-domain responses at different frequency points can be calculated in parallel on a multicore CPU. After the reduction, the reduced circuit matrices, which are dense but well suited to the GPU's data-parallel computing, are simulated on the GPU. Such a reduction-based simulation technique is very amenable to parallelization on hybrid multicore and GPU platforms, where both coarse-grained task-level and fine-grained lightweight-thread-level parallelism can be exploited. The proposed method is very general, since it can analyze any linear network with complicated structures and macromodels, and it does not assume particular structural properties in order to build problem-specific preconditioners, as many iterative solvers do. Experiments show that the new method achieves a speedup of about one to two orders of magnitude over the general LU-based simulation method on some recently published IBM power grid benchmark circuits.

Keywords: circuit matrices Transient analysis dynamic network Multi-core processors Linear networks Platform frequency response Power grids Simulation models analog approach parallelization HYBRID Hybrid electric vehicles GRAPPER PICK UP Graphics Processing Unit

Performance measurement of parallel Vlasov code for space plasma on various scalar-type supercomputer systems

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Umeda, Takayuki; Fukazawa, Keiichiro

Affiliations: Solar-Terrestrial Environment Laboratory, Nagoya University, Nagoya 464-8601, Japan; Research Institute for Information Technology, Kyushu University, Fukuoka 812-8581, Japan

ISBN (print): 9783642330773

Computer simulations with the first-principle (kinetic) model are essential for studying multi-scale processes in space plasma. We develop numerical schemes for Vlasov simulations for practical use on currently existing supercomputer systems. The weak-scaling benchmark test shows that our parallel Vlasov code achieves high performance and high scalability. Currently, we use more than 1000 cores for parallel computations and apply the present parallel Vlasov code to various cross-scale processes in space plasma, such as a first-principle global simulation of solar-wind-magnetosphere interactions. © 2012 Springer-Verlag.

Keywords: Supercomputers

A new low latency parallel turbo decoder employing parallel phase decoding method

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Lee, Wen-Ta; Chang, Min-Sheng; Shen, Wei-Chieh

Affiliations: Institute of Computer and Communication, National Taipei University of Technology, Taipei, Taiwan

ISBN (print): 9783642330643

In this paper, a new parallel phase algorithm for a parallel turbo decoder is proposed. The traditional sliding-window turbo algorithm exchanges extrinsic information phase by phase, which induces long decoding latency. The proposed algorithm exchanges extrinsic information as soon as half the frame size has been calculated; thus, it not only eliminates (de-)interleaver delay but also saves storage space. To verify the proposed parallel phase turbo decoder, we used an FPGA to emulate the hardware architecture and designed the turbo decoder chip in a TSMC 0.18 μm 1P6M CMOS process. The gate count of the decoder chip is 128284. The chip size including I/O pads is 1.91 × 1.91 mm². The simulation results show that, compared to the traditional sliding-window method, the parallel phase turbo decoding method saves 51.23% to 58.13% of the decoding time across different code sizes, with 8 iterations at a 100 MHz working frequency. © 2012 Springer-Verlag.

Keywords: Iterative decoding

Password Recovery Using MPI and CUDA

19th International Conference on High Performance Computing (HiPC)

Authors: Apostal, David; Foerster, Kyle; Chatterjee, Amrita; Desell, Travis

Affiliations: Univ North Dakota, Dept Comp Sci, Grand Forks, ND 58201, USA; Univ North Dakota, Dept Elect Engn, Grand Forks, ND, USA

ISBN (print): 9781467323703; 9781467323727

Using passwords to verify a user's identity is the most widely deployed method for electronic authentication. When system administrators need to recover lost passwords or test accounts for easily guessable passwords, it can require millions of hash function and string comparison operations. These operations can be computationally expensive but are easily parallelizable because each password can be tested independently. Therefore, using high performance computing (HPC) can greatly reduce the time required to perform password recovery. Due to the high level of fine-grained parallelism of this type of problem, GPU computing using Compute Unified Device Architecture (CUDA) can be used to further improve performance. The scale of HPC can be further increased through the use of multiple GPUs, but this requires communication between the GPU devices and can reduce the overall performance due to increased communications latency. In this work, a well-established HPC framework, Message Passing Interface (MPI), was used to minimize the amount of latency and handle the communication between the devices. This allowed for a coarse-grained division of the problem using MPI, where each device applies a fine-grained division of the problem using CUDA to perform the actual calculations. This paper describes three dictionary-based password recovery algorithms that use both MPI and CUDA. In this approach, the hashed values of known words are computed and compared with hash values of unknown user passwords. The algorithms differed in GPU memory utilization and how the data was divided and distributed among the MPI nodes and GPU devices. A divided dictionary algorithm split the dictionary of potential passwords over the GPUs and copied the password database to each GPU. A divided password database algorithm split the password database and copied the potential passwords. A minimal memory algorithm split the password database and sequentially processed individual passwords on the GPUs. The div...

Keywords: application program interfaces; database management systems; graphics processing units; message passing; parallel architectures; security of data; storage management

Scheduling Architecture-Supported Regions in Parallel Programs

10th Nordic International Conference on Applied Parallel Computing - State of the Art in Scientific and Parallel Computing (PARA)

Authors: Tudruj, Marek; Masko, Lukasz

Affiliations: Polish Acad Sci, Inst Comp Sci, PL-01237 Warsaw, Poland

ISBN (print): 9783642281501; 9783642281518

Current multicore system technology enables the implementation of particular program functions, such as library operations, special function generation, and optimized data search, using dedicated computing units to increase overall program performance. A parallel system can be equipped with a set of such units to speed up the execution of applications that use such functionality. To properly model and schedule programs that use such functions running on dedicated hardware, a proper program representation must be introduced. The paper presents a special scheduling algorithm for programs represented as graphs, based on a modified ETF heuristic. The algorithm is meant for a modular architecture composed of many CMP modules interconnected by a global data communication network. The assumed architecture of dedicated CMP modules enables personalized, fully synchronous program execution, which uses communication on the fly to strongly reduce inter-core communication overheads.

Keywords: CMP architectures program execution control program scheduling data communication optimization

235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
