Search Results - Inner Mongolia University Library

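Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = discipline classification number, U = all fields, Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016 (subject term "library science" or "information science", author Fan Bingsi, years 1982-2016)
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016 (publication "Computer Applications and Software", any field containing C++ or Basic, subject term not Visual, years 2011-2016)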
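
The search rules above can be sketched as a small query builder. This is only an illustration: the helper names `term`, `combine`, and `group` are hypothetical and not part of the library's system; the field codes and the rule that operators are uppercase with one space on each side are taken from the help text.

```python
# Hypothetical helpers for composing advanced-search expressions.
# Field codes and operator rules follow the help text; nothing here
# talks to the catalog itself.

FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Return a single FIELD=value term, rejecting unknown field codes."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def combine(operator: str, *parts: str) -> str:
    """Join terms with an uppercase operator and one space on each side."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}: {operator!r}")
    return f" {operator} ".join(parts)


def group(expression: str) -> str:
    """Wrap a sub-expression in parentheses."""
    return f"({expression})"


# Rebuild Example 1 from the help text.
query = combine(
    "AND",
    group(combine("OR", term("K", "图书馆学"), term("K", "情报学"))),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(query)
```

Validating the operator against an uppercase set mirrors the rule that lowercase `and`/`or`/`not` are not accepted as operators.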

Search query: "Subject term=Parallel Programming"

6,631 records in total; showing records 1301-1310

Aggressive Pipelining of Irregular Applications on Reconfigurable Hardware

44th Annual International Symposium on Computer Architecture (ISCA)

Authors: Li, Zhaoshi; Liu, Leibo; Deng, Yangdong; Yin, Shouyi; Wang, Yao; Wei, Shaojun

Affiliations: Tsinghua Univ Natl Lab Informat Sci & Technol Beijing Peoples R China; Tsinghua Univ Sch Software Beijing Peoples R China

ISBN: (print) 9781450348928

CPU-FPGA heterogeneous platforms offer a promising solution for high-performance and energy-efficient computing systems by providing specialized accelerators with post-silicon reconfigurability. To unleash the power of FPGA, however, the programmability gap has to be filled so that applications specified in high-level programming languages can be efficiently mapped and scheduled on FPGA. The above problem is even more challenging for irregular applications, in which the execution dependency can only be determined at run time. Thus over-serialized accelerators are generated from existing works that rely on compile time analysis to schedule the computation. In this work, we propose a comprehensive software-hardware co-design framework, which captures parallelism in irregular applications and aggressively schedules pipelined execution on a reconfigurable platform. Based on an inherently parallel abstraction packaging parallelism for runtime schedule, our framework significantly differs from existing works that tend to schedule executions at compile time. An irregular application is formulated as a set of tasks with their dependencies specified as rules describing the conditions under which a subset of tasks can be executed concurrently. Then datapaths on FPGA will be generated by transforming applications in the formulation into task pipelines orchestrated by evaluating rules at runtime, which could exploit fine-grained pipeline parallelism as handcrafted accelerators do. An evaluation shows that this framework is able to produce datapaths with quality close to handcrafted designs. Experiments show that generated accelerators are dramatically more efficient than those created by current high-level synthesis tools. Meanwhile, accelerators generated for a set of irregular applications attain 0.5x to 1.9x performance compared to equivalent software implementations we selected on a server-grade 10-core processor, with the memory subsystem remaining as the bottlene

Keywords: FPGA; Hardware Accelerator; parallel programming

Transforming procedural code for streaming environments

25th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)

Authors: Brabec, Michal; Bednarek, David

Affiliations: Charles Univ Prague Dept Software Engn Prague Czech Republic

ISBN: (print) 9781509060580

Streaming environments and similar parallel platforms are widely used in image, signal, or general data processing as a means of achieving high performance. Unfortunately, they are often associated with specific programming languages and, thus, hardly accessible for non-experts. In this paper, we present a framework for transformation of a C# procedural code to a Hybrid Flow Graph - a novel intermediate code which employs the streaming paradigm and can be further converted into a streaming application. This approach will allow creating streaming applications or their parts using a widely known imperative language instead of an intricate language specific to streaming. In this paper, we focus on the transformation of control flow which represents the main difference between procedural code, driven by control flow constructs, and streaming environments, driven by data. Since the use of a streaming platform automatically enables parallelism and vectorization, we were able to demonstrate that the streaming applications generated by our method may outperform their original C# implementation.

Keywords: code transformation; intermediate code; parallel programming; vectorization; streaming systems

Configuring Concurrent Computation of Phylogenetic Partial Likelihoods: Accelerating Analyses Using the BEAGLE Library

17th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP)

Authors: Ayres, Daniel L.; Cummings, Michael P.

Affiliations: Univ Maryland Ctr Bioinformat & Computat Biol College Pk MD 20742 USA

ISBN: (digital) 9783319654829

ISBN: (print) 9783319654829;9783319654812

We describe our approach in augmenting the BEAGLE library for high-performance statistical phylogenetic inference to support concurrent computation of independent partial likelihoods arrays. Our solution involves identifying independent likelihood estimates in analyses of partitioned datasets and in proposed tree topologies, and configuring concurrent computation of these likelihoods via CUDA and OpenCL frameworks. We evaluate the effect of each increase in concurrency on throughput performance for our partial likelihoods kernel for a four-state nucleotide substitution model on a variety of parallel computing hardware, such as NVIDIA and AMD GPUs, and Intel multicore CPUs, observing up to 16-fold speedups over our previous implementation. Finally, we evaluate the effect of these gains on a domain application program, MrBayes. For a partitioned nucleotide-model analysis we observe an average speedup for the overall run time of 2.1-fold over our previous parallel implementation, and 10-fold over the native MrBayes with SSE.

Keywords: Bayes methods; Biology computing; Evolution (biology); Phylogeny; Maximum likelihood estimation; Multicore processing; parallel programming; High performance computing

Replicated Synchronization for Imperative BSP Programs

International Conference on Computational Science (ICCS)

Authors: Jakobsson, Arvid; Dabrowski, Frederic; Bousdira, Wadoud; Loulergue, Frederic; Hains, Gaetan

Affiliations: Huawei Technol France Res Ctr Paris France; Univ Orleans INSA Ctr Val Loire LIFO EA 4022 Orleans France; No Arizona Univ Sch Informat Comp & Cyber Syst Flagstaff AZ USA

The BSP model (Bulk Synchronous Parallel) simplifies the construction and evaluation of parallel algorithms, with its simplified synchronization structure and cost model. Nevertheless, imperative BSP programs can suffer from synchronization errors. Programs with textually aligned barriers are free from such errors, and this structure eases program comprehension. We propose a simplified formalization of barrier inference as data flow analysis, which verifies statically whether an imperative BSP program has replicated synchronization, which is a sufficient condition for textual barrier alignment. (C) 2017 The Authors. Published by Elsevier B.V.

Keywords: parallel programming; bulk synchronous parallelism; static analysis; barrier inference

Cache Friendly Parallelization of Neural Encoder-Decoder Models without Padding on Multi-core Architecture

31st IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPS)

Authors: Qiao, Yuchen; Hashimoto, Kazuma; Eriguchi, Akiko; Wang, Haxia; Wang, Dongsheng; Tsuruoka, Yoshimasa; Taura, Kenjiro

Affiliations: Tsinghua Univ Dept Comp Sci & Technol Beijing Peoples R China; Tsinghua Univ Res Inst Informat & Technol Beijing Peoples R China; Univ Tokyo Grad Sch Engn Tokyo Japan; Univ Tokyo Grad Sch Informat Sci & Technol Tokyo Japan

ISBN: (print) 9780769561493

Scaling up Artificial Intelligence (AI) algorithms for massive datasets to improve their performance is becoming crucial. In Machine Translation (MT), one of the most important research fields of AI, models based on Recurrent Neural Networks (RNN) show state-of-the-art performance in recent years, and many researchers keep working on improving RNN-based models to achieve better accuracy in translation tasks. Most implementations of Neural Machine Translation (NMT) models employ a padding strategy when processing a mini-batch to make all sentences in a mini-batch have the same length. This enables an efficient utilization of caches and GPU/SIMD parallelism but leads to a waste of computation time. In this paper, we implement and parallelize batch learning for a Sequence-to-Sequence (Seq2Seq) model, which is the most basic model of NMT, without using a padding strategy. More specifically, our approach forms vectors which represent the input words as well as the neural network's states at different time steps into matrices when it processes one sentence, and as a result, the approach makes better use of the cache and optimizes the process that adjusts weights and biases during the back-propagation phase. Our experimental evaluation shows that our implementation achieves better scalability on multi-core CPUs. We also discuss our approach's potential to be used in other implementations of RNN-based models.

Keywords: Neural Machine Translation; Cache Optimization; parallel programming

Program Verification Under Weak Memory Consistency Using Separation Logic

29th International Conference on Computer-Aided Verification (CAV)

Authors: Vafeiadis, Viktor

Affiliations: SWS MPI Saarbrucken Germany

ISBN: (print) 9783319633879;9783319633862

The semantics of concurrent programs is now defined by a weak memory model, determined either by the programming language (e.g., in the case of C/C++11 or Java) or by the hardware architecture (e.g., for assembly and legacy C code). Since most work in concurrent software verification has been developed prior to weak memory consistency, it is natural to ask how these models affect formal reasoning about concurrent programs. In this overview paper, we show that verification is indeed affected: for example, the standard Owicki-Gries method is unsound under weak memory. Further, based on concurrent separation logic, we develop a number of sound program logics for fragments of the C/C++11 memory model. We show that these logics are useful not only for verifying concurrent programs, but also for explaining the weak memory constructs of C/C++.

Keywords: parallel programming program verification Formal logic Memory Memory Semantics Verification Java Java programming language

MINIME-Validator: Validating Hardware with Synthetic Parallel Testcases

20th Conference and Exhibition on Design, Automation and Test in Europe (DATE)

Authors: Sen, Alper; Deniz, Etem; Kahne, Brian

Affiliations: Bogazici Univ Dept Comp Engn Istanbul Turkey; NXP Semicond Austin TX USA

ISBN: (print) 9783981537093

Programming of multicore architectures with a large number of cores is a huge burden on the programmer. Parallel patterns ease this burden by presenting the developer with a set of predefined programming patterns that implement best practices in parallel programming. Since the behavior of patterns is well-known and understood, they can also lower the burden for verification. In this work, we present a toolset, MINIME-Validator, for generating synthetic parallel testcases from a newly defined Parallel Pattern Markup Language (PPML) that uses the concept of parallel patterns. Our testcases mimic the behavior of real customer applications while being much smaller and can be used to generate traffic and validate e.g. inter-processor communication architectures. Experiments show that synthetic testcases can be used for finding representative hardware communication problems. To the best of our knowledge, this is the first time synthetic testcases using parallel programming patterns are used for hardware validation.

Keywords: Message systems; Multicore processing; Pipelines; Instruction sets; Hardware; parallel programming

Massively Parallel Sequence Alignment with BLAST Through Work Distribution Implemented Using PCJ Library

17th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP)

Authors: Nowicki, Marek; Bzhalava, Davit; Bala, Piotr

Affiliations: Nicolaus Copernicus Univ Fac Math & Comp Sci Chopina 12-18 PL-87100 Torun Poland; Karolinska Inst Dept Lab Med F46 S-14186 Stockholm Sweden; Univ Warsaw Interdisciplinary Ctr Math & Computat Modelling Pawinskiego 5a PL-02106 Warsaw Poland

ISBN: (digital) 9783319654829

ISBN: (print) 9783319654829;9783319654812

This article presents massively parallel execution of the BLAST algorithm on supercomputers and HPC clusters using thousands of processors. Our work is based on optimal splitting of the set of queries, run with the unmodified NCBI-BLAST package for sequence alignment. The work distribution and search management have been implemented in Java using the PCJ (Parallel Computing in Java) library. The PCJ-BLAST package is responsible for reading the sequence for comparison, splitting it up, and starting multiple NCBI-BLAST executables. We also investigated the problem of parallel I/O, and thanks to the PCJ library we deliver high-throughput execution of BLAST. The presented results show that using Java and the PCJ library we achieved very good performance and efficiency. As a result, we have significantly reduced the time required for sequence analysis. We have also shown that the PCJ library can be used as an efficient tool for fast development of scalable applications.

Keywords: Sequence alignment; NGS; Next Generation Sequencing; parallel programming; Java; BLAST; NCBI-BLAST; PCJ

Incremental caffeination of a terrestrial hydrological modeling framework using Fortran 2018 teams

2nd Annual Partitioned Global Address Space (PGAS) Applications Workshop (PAW)

Authors: Rouson, Damian; McCreight, James L.; Fanfarillo, Alessandro

Affiliations: Sourcery Inst Oakland CA 94612 USA; Natl Ctr Atmospher Res POB 3000 Boulder CO 80307 USA

ISBN: (print) 9781450351232

We present Fortran 2018 teams (grouped processes) running a parallel ensemble of simulations built from a pre-existing Message Passing Interface (MPI) application. A challenge arises around the Fortran standard's eschewing any direct reference to lower-level communication substrates, such as MPI, leaving any interoperability between Fortran's parallel programming model, Coarray Fortran (CAF), and the supporting substrate to the quality of the compiler implementation. Our approach introduces CAF incrementally, a process we term "caffeination." By letting CAF initiate execution and exposing the underlying MPI communicator to the original application code, we create a one-to-one correspondence between MPI group colors and Fortran teams. We apply our approach to the National Center for Atmospheric Research (NCAR)'s Weather Research and Forecasting Hydrological Model (WRF-Hydro). The newly caffeinated main program replaces batch job submission scripts and forms teams that each execute one ensemble member. To support this work, we developed the first compiler front-end and parallel runtime library support for teams. This paper describes the required modifications to a public GNU Compiler Collection (GCC) fork, an OpenCoarrays [1] application binary interface (ABI) branch, and a WRF-Hydro branch.

Keywords: coarray Fortran; computational hydrology; parallel programming

Comparison of Threading Programming Models

31st IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPS)

Authors: Salehian, Solmaz; Liu, Jiawen; Yan, Yonghong

Affiliations: Oakland Univ Dept Comp Sci & Engn Rochester MI 48309 USA

ISBN: (print) 9780769561493

In this paper, we provide a comparison of language features and runtime systems of commonly used threading parallel programming models for high performance computing, including OpenMP, Intel Cilk Plus, Intel TBB, OpenACC, Nvidia CUDA, OpenCL, C++11 and PThreads. We then report our performance comparison of OpenMP, Cilk Plus and C++11 for data and task parallelism on CPU using benchmarks. The results show that the performance varies with respect to factors such as runtime scheduling strategies, overhead of enabling parallelism and synchronization, load balancing and uniformity of task workload among threads in applications. Our study summarizes and categorizes the latest development of threading programming APIs for supporting existing and emerging computer architectures, and provides tables that compare all features of different APIs. It could be used as a guide for users to choose the APIs for their applications according to their features, interface and performance reported.

Keywords: threading; parallel programming; data parallelism; task parallelism; memory abstraction; synchronization; mutual exclusion

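Inner Mongolia University Library, 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region; postal code 010021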
