In this work, we study one of the major problems in exploring the power of GPUs to accelerate video processing applications: countless frames have to be transferred back and forth between the CPU and GPU. We evaluate ...
ISBN (print): 9783319098722
The proceedings contain 68 papers. The special focus in this conference is on support tools and environments, performance prediction and evaluation, scheduling and load balancing, high performance architectures and compilers, parallel and distributed data management, grid, cluster and cloud computing, green high performance computing, distributed systems and algorithms, parallel and distributed programming, parallel numerical algorithms, multicore and manycore programming, theory and algorithms for parallel computation, and high performance networks and communication. The topics include: MPI trace compression using event flow graphs; customized scalable tracing with in-situ data analysis; performance measurement and analysis of transactional memory and speculative execution on IBM Blue Gene/Q; an open-source management framework for cloud applications; modeling and simulation of a dynamic task-based runtime system for heterogeneous multi-core architectures; modeling the impact of reduced memory bandwidth on HPC applications; finding the important basic blocks in multithreaded programs; optimization and trade-off analysis for time, energy and resource usage; performance prediction and evaluation of parallel applications in KVM, Xen, and VMware; per-task DRAM energy metering in multicore systems; characterizing the performance-energy tradeoff of small ARM cores in HPC computation; finding efficient queue setup using high-resolution simulations; a progressively pessimistic scheduler for software transactional memory; a queueing theory approach to Pareto optimal bags-of-tasks scheduling on clouds; a scheduling/placement approach for task-graphs on heterogeneous architecture; the energy-aware multi-organization scheduling problem; energy efficient scheduling of MapReduce jobs; and switchable scheduling for runtime adaptation of optimization.
ISBN (print): 9781479965946
Several computing systems that use decimal number calculations suffer from the accumulation and propagation of errors. Decimal numbers are represented using fixed-length floating point formats, and hence there will always be a truncation of extra fraction bits causing errors. Several solutions have been proposed for this problem. Among the proposed accurate calculation systems is the use of vectors of floating point numbers to represent decimal values with very high accuracy, known as the Multi-Number System (MN). Unfortunately, MN calculations are time consuming and are not suitable for real-time applications. Several special-purpose architectures have been proposed to speed up these calculations. In this work, the Single Instruction Multiple Data (SIMD) paradigm found in modern CPUs is exploited to accelerate the MN calculations. The basic arithmetic operation algorithms were modified to utilize the SIMD architecture, and a new Square representation of operands is proposed; this representation was introduced because the MN operations are sequential and iterative, so the SIMD parallel instructions cannot be applied directly. The proposed architecture reduces the execution time of division, the most time consuming operation, to 35% of the original MN execution time while preserving the same accuracy.
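As a rough illustration of the kind of arithmetic involved (not the paper's exact MN format or its Square operand layout), the sketch below implements Knuth's TwoSum error-free transformation, a standard building block of multi-component number representations, first in scalar form and then with SSE2 intrinsics. Since one multi-number addition is a sequential chain of such steps, the SIMD lanes here carry independent additions, which is exactly the constraint the paper's Square representation is said to address. Compile without -ffast-math so the compensated arithmetic is not reassociated away.

    #include <emmintrin.h>  // SSE2
    #include <cstdio>

    // Knuth's TwoSum: s + err == a + b exactly. A multi-number value is a
    // vector of such components; chaining TwoSums extends the precision.
    static void two_sum(double a, double b, double& s, double& err) {
        s = a + b;
        double bb = s - a;
        err = (a - (s - bb)) + (b - bb);
    }

    // SIMD variant: one MN addition is inherently sequential, so the
    // parallelism comes from running independent additions, one per SSE
    // lane (the data layout here is only illustrative).
    static void two_sum_pd(__m128d a, __m128d b, __m128d& s, __m128d& err) {
        s = _mm_add_pd(a, b);
        __m128d bb = _mm_sub_pd(s, a);
        err = _mm_add_pd(_mm_sub_pd(a, _mm_sub_pd(s, bb)),
                         _mm_sub_pd(b, bb));
    }

    int main() {
        double s, e;
        two_sum(1.0, 1e-30, s, e);          // e recovers the bits lost in s
        printf("scalar: s=%g err=%g\n", s, e);

        __m128d a = _mm_set_pd(1.0, 3.0);   // two independent additions
        __m128d b = _mm_set_pd(1e-30, 1e-30);
        __m128d vs, ve;
        two_sum_pd(a, b, vs, ve);
        double out[2]; _mm_storeu_pd(out, ve);
        printf("simd errs: %g %g\n", out[1], out[0]);
        return 0;
    }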
The current computational infrastructure at LHCb is designed for sequential execution. It is possible to make use of modern multi-core machines by using multi-threaded algorithms and running multiple instances in parallel, but there is no way to make efficient use of specialized massively parallel hardware, such as graphics processing units and the Intel Xeon Phi. We extend the current infrastructure with an out-of-process computational server able to gather data from multiple instances and process them in large batches.
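A minimal sketch of the batching pattern this abstract describes, with hypothetical names (Batcher, kBatch) and plain integers standing in for event data: several producer threads play the role of the independent processing instances, and a single consumer drains their submissions in large batches, the way an out-of-process accelerator server would.

    #include <condition_variable>
    #include <cstdio>
    #include <mutex>
    #include <thread>
    #include <vector>

    constexpr std::size_t kBatch = 256;   // hypothetical batch size

    class Batcher {
        std::mutex m_;
        std::condition_variable cv_;
        std::vector<int> buf_;            // stand-in for event data
        bool done_ = false;
    public:
        void submit(int ev) {             // called by each instance
            std::lock_guard<std::mutex> lk(m_);
            buf_.push_back(ev);
            if (buf_.size() >= kBatch) cv_.notify_one();
        }
        void close() {
            std::lock_guard<std::mutex> lk(m_);
            done_ = true;
            cv_.notify_one();
        }
        bool next_batch(std::vector<int>& out) {  // false when finished
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [&] { return buf_.size() >= kBatch || done_; });
            if (buf_.empty()) return false;
            out.swap(buf_);
            buf_.clear();
            return true;
        }
    };

    int main() {
        Batcher b;
        std::vector<std::thread> producers;
        for (int p = 0; p < 4; ++p)
            producers.emplace_back([&, p] {
                for (int i = 0; i < 1000; ++i) b.submit(p * 1000 + i);
            });

        std::thread consumer([&] {
            std::vector<int> batch;
            std::size_t total = 0;
            while (b.next_batch(batch)) total += batch.size();
            printf("processed %zu events in batches\n", total);
        });

        for (auto& t : producers) t.join();
        b.close();
        consumer.join();
        return 0;
    }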
ISBN (print): 9781479941209
Information processing is a very broad area in which many problems are computationally intensive and thus require parallelization and acceleration based on new technologies. The Xilinx Zynq-7000 All Programmable system-on-chip is a very suitable platform for coupling application-specific software and problem-targeted hardware on a single configurable microchip. The tutorial is dedicated to multi-level software/hardware co-design techniques and system architectures that combine general-purpose computers, multi-core application-specific processing, and accelerators in reconfigurable hardware, with emphasis on broad parallelism. Four projects from the scope of data processing, application informatics, parallel algorithms (mapped to hardware), and combinatorial search are briefly characterized and will be demonstrated as fully implemented, ready-to-test projects that include software and reconfigurable hardware linked with on-chip high-performance interfaces. Particular design examples, potential practical applications, experiments and comparisons will be demonstrated.
ISBN (print): 9781479942367
XML technology is extensively used for data exchange between applications on the web, and hence mining these documents becomes an important area of research. Since XML is so widely used on the web, efficient methods are required for knowledge discovery from the enormous collections of XML documents, along with advanced tools and technologies to handle this data at scale. A methodology is proposed for handling such large-scale XML data with the help of high-performance, low-cost computing: the GPU. This paper aims to parallelize the pre-processing stages of deserialization and sorting to make the dataset suitable for mining.
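The abstract gives no implementation details, so the following is only a plausible sketch of the sorting stage using the Thrust C++ library (compiled with nvcc): each XML record is reduced to an integer key, a hypothetical simplification of the deserialization step, and sort_by_key orders the record indices on the GPU.

    #include <thrust/copy.h>
    #include <thrust/device_vector.h>
    #include <thrust/sort.h>
    #include <cstdio>
    #include <vector>

    int main() {
        // keys[i] = sort key extracted from XML record i (hypothetical
        // scheme); idx[i] = index of the record in the collection.
        std::vector<unsigned> keys = {42, 7, 19, 3, 88, 7};
        std::vector<int> idx = {0, 1, 2, 3, 4, 5};

        thrust::device_vector<unsigned> d_keys(keys.begin(), keys.end());
        thrust::device_vector<int> d_idx(idx.begin(), idx.end());

        // Sort record indices by key on the GPU; Thrust selects a
        // parallel radix or merge sort under the hood.
        thrust::sort_by_key(d_keys.begin(), d_keys.end(), d_idx.begin());

        // Copy the permutation back; records can now be mined in order.
        std::vector<int> sorted(idx.size());
        thrust::copy(d_idx.begin(), d_idx.end(), sorted.begin());
        for (int i : sorted) printf("%d ", i);  // e.g. 3 1 5 2 0 4
        printf("\n");
        return 0;
    }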
The Barnes-Hut algorithm is a widely used approximation method for the N-body simulation problem. The irregular nature of this tree-walking code presents interesting challenges for its computation on parallel systems. ...
High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts to optimise HEP code on vector and RISC architectures have yielded limited results, and recent studies have shown that, on modern architectures, it achieves between 10% and 50% of peak performance. Although several successful attempts have been made to port selected codes to GPUs, no major HEP code suite has a "High Performance" implementation. With the LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP can no longer neglect the less-than-optimal performance of its code and has to try to make the best use of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the ROOT and Geant4 projects. The activity of the experiments is shared and coordinated via a Concurrency Forum, where experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on the development of a high-performance prototype for particle transport. Achieving a good concurrency level on the emerging parallel architectures without a complete redesign of the framework can only be done by parallelizing at event level, or with a much larger effort at track level. Apart from the shareable data structures, this typically implies a multiplication factor in memory consumption compared to the single-threaded version, together with sub-optimal handling of event-processing tails. Besides this, the low-level instruction pipelining of modern processors cannot be used efficiently to speed up the program. We have implemented a framework that allows scheduling vectors of particles to an arbitrary number of computing resources in a fine-grain parallel approach. The talk will review the current optimisation activities within the SFT group with a particular emphasis on the development perspectives towards a simulation framework able to profit best from t...
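A minimal sketch of the basket idea described above, under the assumption that "scheduling vectors of particles" means grouping particles into contiguous baskets and handing whole baskets to workers so that the per-particle inner loop is straight-line and SIMD/pipeline friendly; the Particle fields and the stepping formula are placeholders, not Geant-V code.

    #include <array>
    #include <cstdio>
    #include <thread>
    #include <vector>

    struct Particle { float x, v; };    // placeholder state
    using Basket = std::vector<Particle>;

    // One vectorizable step over a full basket (stand-in for transport):
    // contiguous data, no branches, so the compiler can pipeline/vectorize.
    void transport(Basket& b, float dt) {
        for (auto& p : b) p.x += p.v * dt;
    }

    int main() {
        // Fill a few baskets (a real scheduler groups by geometry volume).
        std::array<Basket, 4> baskets;
        for (auto& b : baskets)
            for (int i = 0; i < 1024; ++i)
                b.push_back({0.0f, 1.0f + 0.001f * i});

        // Schedule whole baskets onto an arbitrary number of workers.
        std::vector<std::thread> workers;
        for (auto& b : baskets)
            workers.emplace_back([&b] { transport(b, 0.1f); });
        for (auto& w : workers) w.join();

        printf("first particle moved to x=%f\n", baskets[0][0].x);
        return 0;
    }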
Image registration is a key step in image processing, as it is the process of locating the most accurate relative orientation among two or more images captured at the same or different times by distinguishable or indistin...
ISBN (digital): 9783319098739
ISBN (print): 9783319098739;9783319098722
Multi-core architectures comprising several GPUs have become mainstream in the field of High-Performance Computing. However, obtaining the maximum performance of such heterogeneous machines is challenging, as it requires carefully offloading computations and managing data movements between the different processing units. The most promising and successful approaches so far build on task-based runtimes that abstract the machine and rely on opportunistic scheduling algorithms. As a consequence, the problem shifts to choosing the task granularity and task graph structure and optimizing the scheduling strategies. Trying different combinations of these alternatives is itself a challenge: getting accurate measurements requires reserving the target system for the whole duration of the experiments, and observations are limited to the few systems at hand and may be difficult to generalize. In this article, we show how we crafted a coarse-grain hybrid simulation/emulation of StarPU, a dynamic runtime for hybrid architectures, on top of SimGrid, a versatile simulator for distributed systems. This approach makes it possible to obtain performance predictions accurate to within a few percent on classical dense linear algebra kernels in a matter of seconds, which allows both runtime and application designers to quickly decide which optimization to enable or whether it is worth investing in higher-end GPUs.
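For readers unfamiliar with StarPU, the sketch below submits a single vector-scaling task through the StarPU C API (usable from C++; field names follow the 1.x releases and may differ in other versions). The vector length N is the task-granularity knob whose tuning the article's simulation approach is meant to make cheap to explore.

    #include <starpu.h>
    #include <cstdio>
    #include <cstdlib>

    // CPU implementation of the codelet; adding a .cuda_funcs entry would
    // give the runtime a GPU variant to schedule opportunistically.
    static void scal_cpu(void* buffers[], void* cl_arg) {
        float factor = *(float*)cl_arg;
        float* v = (float*)STARPU_VECTOR_GET_PTR(buffers[0]);
        unsigned n = STARPU_VECTOR_GET_NX(buffers[0]);
        for (unsigned i = 0; i < n; ++i) v[i] *= factor;
    }

    static struct starpu_codelet cl;   // zero-initialized static

    int main() {
        starpu_init(NULL);
        cl.cpu_funcs[0] = scal_cpu;
        cl.nbuffers = 1;
        cl.modes[0] = STARPU_RW;

        const unsigned N = 1 << 20;    // task granularity: elements per task
        float* v = (float*)malloc(N * sizeof(float));
        for (unsigned i = 0; i < N; ++i) v[i] = 1.0f;

        starpu_data_handle_t h;
        starpu_vector_data_register(&h, STARPU_MAIN_RAM,
                                    (uintptr_t)v, N, sizeof(float));

        float factor = 3.0f;
        struct starpu_task* task = starpu_task_create();
        task->cl = &cl;
        task->handles[0] = h;
        task->cl_arg = &factor;
        task->cl_arg_size = sizeof(factor);
        starpu_task_submit(task);      // the runtime picks the resource
        starpu_task_wait_for_all();

        starpu_data_unregister(h);
        printf("v[0] = %f\n", v[0]);
        free(v);
        starpu_shutdown();
        return 0;
    }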