ISBN:
(print) 9783319499567; 9783319499550
Task-based programming provides programmers with an intuitive abstraction to express parallelism, and runtimes with the flexibility to adapt the schedule and load balancing to the hardware. Although many profiling tools have been developed to understand these characteristics, the interplay between task scheduling and data reuse in the cache hierarchy has not been explored. These interactions are particularly intriguing due to the flexibility task-based runtimes have in scheduling tasks, which may allow them to improve cache behavior. This work presents StatTask, a novel statistical cache model that can predict cache behavior for arbitrary task schedules and cache sizes from a single execution, without programmer annotations. StatTask enables fast and accurate modeling of data locality in task-based applications for the first time. We demonstrate the potential of this new analysis for scheduling by examining applications from the BOTS benchmark suite and identifying several important opportunities for reuse-aware scheduling.
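The abstract does not detail StatTask's model, but statistical cache models of this kind are typically built on reuse (stack) distances. A minimal Python sketch of the underlying idea — predicting the hit ratio of an LRU cache of any size from reuse distances measured in a single trace — might look like this (the function names are illustrative, not StatTask's API):

```python
from collections import OrderedDict

def reuse_distances(trace):
    """For each access, count the distinct addresses touched since the
    previous access to the same address (inf on first touch)."""
    last_seen = OrderedDict()  # keys kept in recency order (most recent last)
    dists = []
    for addr in trace:
        if addr in last_seen:
            keys = list(last_seen)
            # distance = number of distinct addresses more recent than addr
            dists.append(len(keys) - 1 - keys.index(addr))
            last_seen.move_to_end(addr)
        else:
            dists.append(float("inf"))
            last_seen[addr] = True
    return dists

def lru_hit_ratio(trace, cache_lines):
    """A fully associative LRU cache of `cache_lines` entries hits exactly
    when the reuse distance is smaller than the cache size."""
    ds = reuse_distances(trace)
    hits = sum(1 for d in ds if d < cache_lines)
    return hits / len(ds)
```

Because the distances are computed once, the hit ratio for every cache size follows from the same single profiling run — which is the property the abstract highlights.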
ISBN:
(print) 9783319499567; 9783319499550
OpenACC has been in development for a few years now. The OpenACC 2.5 specification was recently made public, and there are several initiatives to develop full implementations of the standard to make use of accelerator capabilities. There is much to be done yet, but OpenACC for GPUs is currently reaching a good maturity level in various implementations of the standard, using CUDA and OpenCL as backends. Nvidia is investing in this effort and has released an OpenACC Toolkit, including the PGI compiler. There are, however, more developments out there. In this work, we analyze the different available OpenACC compilers that have been developed by companies and universities in recent years. We check their performance and maturity, keeping in mind that OpenACC is designed to be used without extensive knowledge of parallel programming. Our results show that the compilers are on their way to a reasonable maturity, presenting different strengths and weaknesses.
ISBN:
(print) 9783319495835; 9783319495828
Execution plans constitute the traditional interface between DBMS front-ends and back-ends; similar networks of interconnected operators are also found outside database systems. Tasks like adapting execution plans for distributed or heterogeneous runtime environments require a plan transformation mechanism that is simple enough to produce predictable results, yet general enough to express the advanced communication schemes required, for instance, in skew-resistant partitioning. In this paper, we describe the BobolangNG language, designed to express execution plans as well as their transformations. It is based on hierarchical models known from many environments but enhanced with a novel compile-time mechanism of component multiplication. Compared to approaches based on general graph rewriting, the plan transformation in BobolangNG is not iterative; therefore the consequences and limitations of the process are easier to understand, and developing distribution strategies and experimenting with distributed plans is easier and safer.
ISBN:
(print) 9783319495835; 9783319495828
After the emergence of the new High Efficiency Video Coding (HEVC) standard, several strategies have been followed to take advantage of the parallel features available in it. Many of the parallelization approaches in the literature target the decoder side, aiming at real-time decoding. However, the most complex part of the HEVC codec is the encoding side. In this paper, we perform a comparative analysis of two parallelization proposals: one based on tiles, employing shared-memory architectures, and the other based on Groups Of Pictures (GOPs), employing distributed shared-memory architectures. The results show that good speed-ups are obtained for the tile-based proposal, especially for high-resolution video sequences, but the scalability decreases for low-resolution video sequences. The GOP-based proposal outperforms the tile-based proposal as the number of processes increases, and this benefit grows when low-resolution video sequences are compressed.
ISBN:
(print) 9783319499567; 9783319499550
In this paper we present an approach to the parallel simulation of the electrical activity of the heart using the finite element method with the help of the FEniCS automated scientific computing framework. FEniCS allows scientific software development using near-mathematical notation and provides automatic parallelization on MPI clusters. We implemented the ten Tusscher-Panfilov (TP06) cell model of cardiac electrical activity. Scalability testing of the implementation was performed using up to 240 CPU cores, and a 95x speedup was achieved. We evaluated various combinations of the parallel Krylov linear solvers and preconditioners available in FEniCS. The best performance was provided by the conjugate gradient and biconjugate gradient stabilized solvers with the successive over-relaxation preconditioner. Since the FEniCS-based implementation of the TP06 model uses notation close to the mathematical one, it can be utilized by computational mathematicians, biophysicists, and other researchers without extensive parallel computing skills.
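The FEniCS code itself is not shown in the abstract. As an illustration of the solver class the authors found fastest, here is a minimal unpreconditioned conjugate gradient sketch in NumPy — a toy, not FEniCS's implementation, and without the SOR preconditioner the paper pairs it with:

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Minimal unpreconditioned CG for a symmetric positive definite A."""
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x          # initial residual
    p = r.copy()           # initial search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)   # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p   # conjugate direction update
        rs_old = rs_new
    return x
```

In a real FEniCS run the solver and preconditioner are selected by name rather than hand-coded, which is what makes comparing the combinations reported in the abstract inexpensive.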
ISBN:
(print) 9783319495835; 9783319495828
For particular real-world combinatorial optimization problems, e.g., the longest common subsequence problem (LCSSP) from bioinformatics, determining multiple optimal solutions (DMOS) is quite useful for experts. However, for large problem sizes this may be too time consuming, hence the resort to parallel computing. We address here the parallelization of an algorithm for DMOS for the LCSSP. Starting from the dynamic programming algorithm solving it, we derive a generic algorithm for DMOS (A-DMOS). Since the latter is a non-perfect DO-loop nest, we adopt a three-step approach: the first step transforms the A-DMOS into a perfect nest; the second chooses the granularity; and the third carries out a dependency analysis to determine the type of each loop, i.e., either parallel or serial. The practical performance of our approach is evaluated through experiments on benchmark inputs and random DNA sequences, targeting a parallel multicore machine.
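The A-DMOS algorithm itself is not given in the abstract, but the sequential idea it builds on — an LCS dynamic programming table followed by a backtrack that collects every optimal solution rather than just one — can be sketched as follows (a sequential toy, not the parallelized loop nest):

```python
def all_lcs(x, y):
    """Fill the standard LCS length table, then backtrack along every
    path that preserves the optimal length, collecting all distinct
    longest common subsequences."""
    m, n = len(x), len(y)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])

    def back(i, j):
        if i == 0 or j == 0:
            return {""}
        if x[i - 1] == y[j - 1]:
            return {s + x[i - 1] for s in back(i - 1, j - 1)}
        out = set()
        # follow every predecessor cell that carries the optimal length
        if L[i - 1][j] == L[i][j]:
            out |= back(i - 1, j)
        if L[i][j - 1] == L[i][j]:
            out |= back(i, j - 1)
        return out

    return sorted(back(m, n))
```

The table fill is the regular loop nest the paper parallelizes; the branching backtrack is what makes DMOS costlier than finding a single optimum.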
ISBN:
(print) 9783319495835; 9783319495828
The optimal directed acyclic graph (DAG) search problem consists of searching for a DAG with a minimum score, where the score of a DAG is defined on its structure. This problem is known to be NP-hard, and the state-of-the-art algorithm requires exponential time and space. It is thus not feasible to solve large instances using a single processor. Some parallel algorithms have therefore been developed to solve larger instances. A recently proposed parallel algorithm can solve an instance of 33 vertices, the largest solved size reported thus far. In the study presented in this paper, we developed a novel parallel algorithm designed specifically to operate on a parallel computer with a torus network. Our algorithm crucially exploits the torus network structure, thereby obtaining good scalability. Through computational experiments, we confirmed that a run of our proposed method using up to 20,736 cores showed a parallelization efficiency of 0.94 compared to a 1,296-core run. Finally, we successfully computed an optimal DAG structure for an instance of 36 vertices, which is the largest solved size reported in the literature.
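The abstract does not specify the score function or the search algorithm. As a toy illustration of why the search is exponential, here is a brute-force optimal-DAG search over node orderings and parent sets; `local_score` is a hypothetical user-supplied function giving the per-node contribution of a decomposable score, not the one used in the paper:

```python
from itertools import combinations, permutations

def best_dag(nodes, local_score):
    """Try every topological ordering and, for each node, every parent
    set drawn from its predecessors; keep the structure whose summed
    local scores are minimal.  Exponential in len(nodes)."""
    best_total, best_parents = float("inf"), None
    for order in permutations(nodes):
        total, parents = 0.0, {}
        for k, v in enumerate(order):
            preds = order[:k]
            # every subset of the predecessors is a candidate parent set
            cands = [(local_score(v, c), c)
                     for r in range(len(preds) + 1)
                     for c in combinations(preds, r)]
            s, ps = min(cands, key=lambda t: t[0])
            total += s
            parents[v] = ps
        if total < best_total:
            best_total, best_parents = total, parents
    return best_total, best_parents
```

Even this tiny sketch visits n! orderings times 2^k parent sets per node, which is why state-of-the-art exact solvers need exponential space and, for 30+ vertices, thousands of cores.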
ISBN:
(print) 9783319499567; 9783319499550
The rapid growth of supercomputer technologies has become a driver for the development of the natural sciences. Most discoveries in astronomy, in elementary particle physics, in the design of new materials, and in DNA research are connected with numerical simulation and with supercomputers. Supercomputer simulation has become an important tool for processing the great volume of observational and experimental data accumulated by mankind. Modern scientific challenges make work on computer systems and on scientific software design highly relevant. The architecture of future exascale systems is still being discussed; nevertheless, it is necessary to develop the algorithms and software for such systems right now. This means developing software that is capable of using tens to hundreds of thousands of processors and of transmitting and storing large volumes of data. In the present work, a technology for the development of such algorithms and software is proposed. As an example of its use, the software development process is considered for some problems of astrophysics.
ISBN:
(print) 9783319495835; 9783319495828
There is no doubt that data compression is very important in computer engineering. However, most lossless data compression and decompression algorithms are very hard to parallelize, because they use dictionaries that are updated sequentially. The main contribution of this paper is to present a new lossless data compression method that we call Light Loss-Less (LLL) compression. It is designed so that decompression can be highly parallelized and run very efficiently on the GPU. This makes sense for the many applications in which compressed data is read and decompressed many times, so that decompression is performed more frequently than compression. We show optimal sequential and parallel algorithms for LLL decompression and implement them to run on a Core i7-4790 CPU and a GeForce GTX 1080 GPU, respectively. To show the potential of the LLL compression method, we evaluated the running time using five images and compared it with the well-known compression methods LZW and LZSS. Our GPU implementation of LLL decompression runs 91.1-176 times faster than the CPU implementation. Moreover, our experiments show that LLL decompression on the GPU is 2.49-9.13 times faster than LZW decompression and 4.30-14.1 times faster than LZSS decompression, while their compression ratios are comparable.
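LLL itself is defined in the paper, not the abstract. The sequential dictionary coupling that the authors identify as the obstacle to parallelization can be seen in textbook LZW, sketched here: every emitted code extends the dictionary, so step k cannot be decoded until steps 1..k-1 have run.

```python
def lzw_compress(data):
    """Textbook LZW: the dictionary grows with every output code, so
    each step depends on all previous steps -- the sequential coupling
    that makes parallel decompression hard."""
    table = {bytes([i]): i for i in range(256)}
    w, out = b"", []
    for byte in data:
        wc = w + bytes([byte])
        if wc in table:
            w = wc
        else:
            out.append(table[w])
            table[wc] = len(table)  # dictionary mutated mid-stream
            w = bytes([byte])
    if w:
        out.append(table[w])
    return out

def lzw_decompress(codes):
    """Rebuilds the same dictionary entry by entry while decoding."""
    table = {i: bytes([i]) for i in range(256)}
    w = table[codes[0]]
    out = [w]
    for code in codes[1:]:
        # the code may reference the entry being built this very step
        entry = table[code] if code in table else w + w[:1]
        out.append(entry)
        table[len(table)] = w + entry[:1]
        w = entry
    return b"".join(out)
```

A decompression-friendly format like LLL is presumably designed so that this chain of dictionary dependencies is broken, letting many GPU threads decode independent pieces.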
ISBN:
(print) 9783319495835; 9783319495828
The High Efficiency Video Coding (HEVC) standard has nearly doubled the compression efficiency of prior standards. Nonetheless, this increase in coding efficiency involves a notably higher computing complexity that must be overcome in order to achieve real-time encoding. For this reason, this paper focuses on applying parallel processing techniques to the HEVC encoder with the aim of significantly reducing its computational cost without affecting the compression performance. First, we propose a coarse-grained slice-based parallelization technique executed on a multi-core CPU, and then, at a finer level of parallelism, a GPU-based motion estimation algorithm. Both techniques define a heterogeneous parallel coding architecture for HEVC. Results show that speed-ups of up to 4.06x can be obtained on a quad-core platform with low impact on coding performance.