Search Results - Inner Mongolia University Library

Search query: "Any field = 12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012"

38 records in total; showing records 21-30

An optimal parallel prefix-sums algorithm on the memory machine models for GPUs

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Nakano, Koji. Affiliation: Department of Information Engineering, Hiroshima University, Kagamiyama 1-4-1, Higashi Hiroshima 739-8527, Japan

ISBN: (print) 9783642330773

The main contribution of this paper is to show optimal algorithms computing the sum and the prefix-sums on two memory machine models, the Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM). The DMM and the UMM are theoretical parallel computing models that capture the essence of the shared memory and the global memory of GPUs. These models have three parameters: the number p of threads, the width w of the memory, and the memory access latency l. We first show that the sum of n numbers can be computed in time units on the DMM and the UMM. We then go on to show that time units are necessary to compute the sum. Finally, we show an optimal parallel algorithm that computes the prefix-sums of n numbers in time units on the DMM and the UMM. © 2012 Springer-Verlag.

Keywords: parallel algorithms

High-performance matrix multiply on a massively multithreaded Fiteng1000 processor

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Liu, Jie; Chi, Lihua; Gong, Chunye; Xu, Han; Jiang, Jie; Yan, Yihui; Hu, Qingfeng. Affiliation: College of Computer Science, National University of Defense Technology, Changsha 410073, China

ISBN: (print) 9783642330643

Matrix multiplication is an essential building block of many linear algebra operations and applications. This paper presents parallel algorithms with a shared A or B matrix in memory for the special massively multithreaded Fiteng1000 processor. We discuss the implementations of parallel matrix multiplication algorithms on the multi-core processor with many threads. To gain better performance, it is important to choose the 2D thread spatial topography, the memory layer for the placement, and the sizes of the matrices. Parallel codes using C and assembly language under the OpenMP parallel programming environment are designed. Performance results on the Fiteng1000 processor show that the algorithms have good parallel performance and achieve near-peak performance. © 2012 Springer-Verlag.

Keywords: parallel algorithms

Fast parallel algorithms for blocked dense matrix multiplication on shared memory architectures

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Nimako, G.; Otoo, E.J.; Ohene-Kwofie, D. Affiliation: School of Computer Science, University of the Witwatersrand, Johannesburg, South Africa

ISBN: (print) 9783642330773

The current trend of multicore and Symmetric Multi-Processor (SMP) architectures underscores the need for parallelism in most scientific computations. Matrix-matrix multiplication is one of the fundamental computations in many algorithms for scientific and numerical analysis. Although a number of different algorithms (such as Cannon, PUMMA, SUMMA, etc.) have been proposed for the implementation of matrix-matrix multiplication on distributed memory architectures, matrix-matrix algorithms for multicore and SMP architectures have not been extensively studied. We present two types of algorithms, based largely on blocked dense matrices, for parallel matrix-matrix multiplication on shared memory systems. The first algorithm is based on blocked matrices, while the second algorithm uses blocked matrices with the MapReduce framework in shared memory. Our experimental results show that our blocked dense matrix approach outperforms the known existing implementations by up to 50%, while our MapReduce blocked matrix-matrix algorithm outperforms the existing matrix-matrix multiplication algorithm of the Phoenix shared memory MapReduce approach by about 40%. © 2012 Springer-Verlag.

Keywords: Matrix algebra

Parallel algorithm for nonlinear network optimization problems and real-time applications

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Lin, Shin-Yeu; Guo, Xian-Chang. Affiliation: Department of Electrical Engineering, Chang Gung University, Tao-Yuan, Taiwan

ISBN: (print) 9783642330773

In this paper, we propose a parallel algorithm to solve a class of nonlinear network optimization problems. The proposed parallel algorithm is a combination of successive quadratic programming and the dual method, which can achieve complete decomposition and make parallel computation possible. The proposed algorithm can be applied to solve nonlinear network optimization problems in the smart grid. We have tested the proposed parallel algorithm on numerous cases of power flow problems on the IEEE 30-bus system. The test results demonstrate that the proposed parallel algorithm can obtain accurate solutions. Additionally, neglecting the data communication time, the proposed parallel algorithm is, ideally, 13.1 times faster than the centralized Newton-Raphson method in solving the power flow problems of the IEEE 30-bus system. © 2012 Springer-Verlag.

Keywords: parallel algorithms

A note on developing optimal and scalable parallel two-list algorithms

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Chedid, Fouad B. Affiliations: College of Arts and Applied Sciences, Dhofar University, Oman; Department of Computer Science, Notre Dame University - Louaize, Lebanon

ISBN: (print) 9783642330643

We show that developing an optimal parallelization of the two-list algorithm is much easier than we once thought. All it takes is to observe that the steps of the search phase of the two-list algorithm are closely related to the steps of a merge procedure for merging two sorted lists, and we already know how to parallelize merge efficiently. Armed with this observation, we present an optimal and scalable parallel two-list algorithm that is easy to understand and analyze, while it achieves the best known range of processor-time tradeoffs for this problem. In particular, our algorithm based on a CREW PRAM model takes time $O(2^{n/2 - \alpha})$ using $2^{\alpha}$ processors, for $0 \le \alpha \le n/2 - 2\log n + 2$. © 2012 Springer-Verlag.

Keywords: parallel processing systems
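The merge-like search phase mentioned in the abstract above can be illustrated with a minimal sequential sketch. It assumes the two-list algorithm is applied to the subset-sum problem (the abstract does not name the problem), and the function names `subset_sums` and `two_list_subset_sum` are hypothetical. It does not reproduce the paper's parallel CREW PRAM algorithm; it only shows why the search phase walks two sorted lists much as a merge does.

```python
def subset_sums(items):
    """Return the sums of all subsets of `items` (2**len(items) values)."""
    sums = [0]
    for x in items:
        # Each existing sum either excludes x (kept) or includes x (appended).
        sums += [s + x for s in sums]
    return sums


def two_list_subset_sum(weights, target):
    """Sequential two-list search: is there a subset of `weights` summing to `target`?

    Assumption: the problem is subset-sum; the abstract does not say so.
    """
    half = len(weights) // 2
    # Generation phase: one list per half, sorted in opposite orders.
    a = sorted(subset_sums(weights[:half]))                 # ascending
    b = sorted(subset_sums(weights[half:]), reverse=True)   # descending

    # Search phase: like a merge, each step advances exactly one of two
    # cursors, so it takes at most len(a) + len(b) steps.
    i, j = 0, 0
    while i < len(a) and j < len(b):
        total = a[i] + b[j]
        if total == target:
            return True
        if total < target:
            i += 1   # need a larger sum: move forward in the ascending list
        else:
            j += 1   # need a smaller sum: move forward in the descending list
    return False
```

For example, `two_list_subset_sum([3, 34, 4, 12, 5, 2], 9)` finds the subset {4, 5}, while a target of 30 has no matching subset.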

Performance measurement of parallel Vlasov code for space plasma on various scalar-type supercomputer systems

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Umeda, Takayuki; Fukazawa, Keiichiro. Affiliations: Solar-Terrestrial Environment Laboratory, Nagoya University, Nagoya 464-8601, Japan; Research Institute for Information Technology, Kyushu University, Fukuoka 812-8581, Japan

ISBN: (print) 9783642330773

Computer simulations with the first-principle (kinetic) model are essential for studying multi-scale processes in space plasma. We develop numerical schemes for Vlasov simulations for practical use on currently existing supercomputer systems. The weak-scaling benchmark test shows that our parallel Vlasov code achieves high performance and high scalability. Currently, we use more than 1000 cores for parallel computations and apply the present parallel Vlasov code to various cross-scale processes in space plasma, such as a first-principle global simulation of solar-wind-magnetosphere interactions. © 2012 Springer-Verlag.

Keywords: Supercomputers

Overcoming the scalability limitations of parallel star schema data warehouses

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Costa, João Pedro; Cecílio, José; Martins, Pedro; Furtado, Pedro. Affiliations: ISEC-Institute Polytechnic of Coimbra, Portugal; University of Coimbra, Portugal

ISBN: (print) 9783642330773

Most Data Warehouses (DW) are stored in Relational Database Management Systems (RDBMS) using a star-schema model. While this model yields a trade-off between performance and storage requirements, huge data warehouses experience performance problems. Although parallel shared-nothing architectures improve on this matter through a divide-and-conquer approach, issues related to parallelizing join operations limit the amount of improvement, since they have implications concerning placement, the need to replicate data, and/or on-the-fly repartitioning. In this paper, we show how these limitations can be overcome by replacing the star schema with a universal relation approach for more efficient and scalable parallelization. We evaluate the proposed approach using the TPC-H benchmark to demonstrate that it provides both highly predictable response times and almost optimal speedup. © 2012 Springer-Verlag.

Keywords: Data warehouses

Performance evaluation of OpenMP and CUDA on multicore systems

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Yang, Chao-Tung; Chang, Tzu-Chieh; Huang, Kuan-Lung; Liu, Jung-Chun; Chang, Chih-Hung. Affiliations: Department of Computer Science, Tunghai University, Taichung City 40704, Taiwan; Department of Information Management, Hsiuping University of Science and Technology, Taichung City 41280, Taiwan

ISBN: (print) 9783642330643

Nowadays, not only CPUs but also GPUs follow the trend toward multi-core processors. Parallel processing presents both an opportunity and a challenge. Explicit parallelization of software, by programmers or by compilers, is the key to enhancing performance on multi-core chips. In this paper, we first introduce some of the automatic parallelization tools based on OpenMP, which could save the time needed to rewrite code for parallel processing on multicore systems. Then we focus on ROSE and explore it in depth. We also implement an interface to reduce its complexity of use and apply some automatic parallelization for CUDA. © 2012 Springer-Verlag.

Keywords: parallel programming

A verified library of algorithmic skeletons on evenly distributed arrays

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Bousdira, Wadoud; Loulergue, Frédéric; Tesson, Julien. Affiliations: LIFO, University of Orléans, France; Kochi University of Technology, Japan

ISBN: (print) 9783642330773

To make parallel programming as widespread as parallel architectures, more structured parallel programming paradigms are necessary. One possible approach is algorithmic skeletons. They can be seen as higher-order functions implemented in parallel. Algorithmic skeletons offer a simple interface to the programmer without all the details of parallel implementations, as they abstract the communications and the synchronisations of parallel activities. To write a parallel program, users have to combine and compose the skeletons. Orléans Skeleton Library (OSL) is an efficient meta-programmed C++ library of algorithmic skeletons that manipulate distributed arrays. A prototype implementation of OSL exists as a library written with the functional parallel language Bulk Synchronous Parallel ML (BSML). In this paper we are interested in verifying the correctness of a subset of this prototype implementation. To do so, we give a functional specification of a subset of OSL and we prove the correctness of the BSML implementation with respect to this functional specification, using the Coq proof assistant. To illustrate how the user could use these skeletons, we prove the correctness of two applications implemented with them. © 2012 Springer-Verlag.

Keywords: parallel programming
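The idea in the abstract above, that skeletons are higher-order functions composed over a distributed array, can be sketched roughly as follows. This is a sequential simulation only: the names `distribute`, `skel_map` and `skel_reduce` are hypothetical and are not the OSL or BSML API, and the "distributed array" is assumed to be a plain list of per-processor blocks.

```python
from functools import reduce


def distribute(data, parts):
    """Split `data` into `parts` blocks whose sizes differ by at most one."""
    q, r = divmod(len(data), parts)
    blocks, start = [], 0
    for i in range(parts):
        size = q + (1 if i < r else 0)   # the first r blocks get one extra element
        blocks.append(data[start:start + size])
        start += size
    return blocks


def skel_map(f, darray):
    """Map skeleton: apply `f` to every element, block by block.

    In a real library each block would be processed on its own processor.
    """
    return [[f(x) for x in block] for block in darray]


def skel_reduce(op, darray):
    """Reduce skeleton: combine all elements with an associative `op`.

    Each non-empty block is reduced locally, then the partial results are
    combined; this step stands in for the communication phase.
    """
    partials = [reduce(op, block) for block in darray if block]
    return reduce(op, partials)
```

A user program is then a composition of skeletons, for example a sum of squares: `skel_reduce(lambda a, b: a + b, skel_map(lambda x: x * x, distribute(list(range(1, 11)), 3)))`.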

An implementation of parallel 2-D FFT using Intel AVX instructions on multi-core processors

12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012

Authors: Takahashi, Daisuke. Affiliation: Faculty of Engineering, Information and Systems, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8573, Japan

ISBN: (print) 9783642330643

In this paper, we propose an implementation of a parallel two-dimensional fast Fourier transform (FFT) using Intel Advanced Vector Extensions (AVX) instructions on multi-core processors. The combination of vectorization and a block two-dimensional FFT algorithm is shown to effectively improve performance. We vectorized FFT kernels using the AVX instructions. Performance results of two-dimensional FFTs on multi-core processors are reported. We successfully achieved a performance of over 61 GFlops on an Intel Xeon E5-2670 (2.6 GHz, two CPUs, 16 cores) and over 24 GFlops on an Intel Core i7-3930K (3.2 GHz, one CPU, six cores) for a $2^{12} \times 2^{12}$-point FFT. © 2012 Springer-Verlag.

Keywords: Fast Fourier transforms
