Search Results - Inner Mongolia University Library

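Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = institution (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = discipline classification number, U = all fields, Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". Operators must be uppercase, with one space on each side.

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016 (subject term "library science" or "information science", author Fan Bingsi, years 1982 to 2016)
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016 (publication "Computer Applications and Software", any field containing C++ or Basic, subject term not Visual, years 2011 to 2016)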
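
The search rules above can be illustrated with a small helper that assembles an expression string. This is only a sketch: the names `term` and `combine` are hypothetical and are not part of the library's system. It assumes nothing beyond the rules stated in the help text (known field codes, uppercase operators, one space on each side of an operator).

```python
# Hypothetical helpers for composing an advanced-search expression.
# The field codes are the ones listed in the help text.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Return a single FIELD=value term, e.g. K=Parallel programming."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def combine(operator: str, *parts: str, group: bool = False) -> str:
    """Join parts with an uppercase operator padded by one space on each side."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}")
    expression = f" {operator} ".join(parts)
    return f"({expression})" if group else expression


# Rebuild the first example from the help text.
subjects = combine("OR", term("K", "图书馆学"), term("K", "情报学"), group=True)
query = combine("AND", subjects, term("A", "范并思"), term("Y", "1982-2016"))
print(query)
```

Rejecting lowercase operators in `combine` mirrors the rule that operators must be written in uppercase; whether the search system itself rejects or silently ignores lowercase operators is not stated in the help text.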

Search condition: "Subject term = Parallel programming"

6,627 records in total; showing records 1871-1880

Parallel Independent FFT Implementation on Intel Processors and Xeon Phi for LTE and OFDM Systems

1st Nordic Circuits Systems Conference (NORCAS) - NORCHIP / International Symposium on System Chip (SoC) 2015

Authors: Khelifi, Mounir; Massicotte, Daniel; Savaria, Yvon
Affiliations: Univ Quebec Trois Rivieres Elect & Comp Engn Dept Trois Rivieres PQ Canada; Ecole Polytech Montreal Elect & Comp Engn Dept Grp Rech Elect Ind Lab Signaux & Syst Integres Montreal PQ Canada

ISBN: (print) 9781467365765

The Fast Fourier Transform (FFT) is a key element of wireless applications based on OFDM (Orthogonal Frequency Division Multiplexing) and is challenging to implement on multi-core/many-core processors. As an example, the Long Term Evolution (LTE) protocol establishes a requirement for processing, whereby many independent FFTs must be calculated within a limited time slot. By using the Intel Math Kernel Library (MKL) in our approach to the Xeon Phi, we managed to reduce the maximum execution time of many independent FFTs. We propose an implementation on multi-core/many-core processors using OpenMP (Open Multi-Processing) that reduces the mean latency to 124 µs in native mode, compared with 1300 µs with offload. This is a challenge for shared memory projects. This paper describes how this level of performance can be obtained with multi-core Intel i7 and Xeon processors and a many-core Xeon Phi. The best results were obtained with the Xeon Phi, which outperformed the Xeon Sandy-Bridge.

Keywords: LTE; OFDM; Fast Fourier Transform (FFT); parallel programming; multithread; parallel; multi-core; many-core; MKL


An Efficient Parallel Algorithm for Simpson Cumulative Integration on GPU

3rd International Symposium on Computing and Networking (CANDAR)

Authors: Swardiana, Wayan Aditya; Wirahman, Taufiq; Sadikin, Rifki
Affiliation: Indonesian Inst Sci Cibinong Sci Ctr Res Ctr Informat High Performance Comp Lab Jakarta Indonesia

ISBN: (print) 9781467397971

In this paper, we present an efficient parallel algorithm for calculating cumulative integration based on Simpson's rule. The proposed parallel algorithm exploits two Blelloch prefix sums. The first scan is used to calculate the even-index cumulative integration, while the second scan is used to calculate the odd-index cumulative integration. We implement the parallel algorithm on NVIDIA CUDA based GPUs. Performance of the proposed parallel algorithm is measured by calculating speedup. We also present the accuracy of the proposed algorithm. Based on the performance measurements, we conclude that the proposed parallel algorithm is 3 times faster than optimized CPU code.

Keywords: prefix sums; cumulative integration; graphics processing unit; NVIDIA CUDA; parallel processing; parallel programming
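
The two-scan idea in the abstract above can be sketched sequentially as follows. This is not the paper's implementation: `itertools.accumulate` stands in for a Blelloch scan, no GPU is involved, and the starting value for the odd-index chain (a three-point formula over the first interval) is an assumption, since the abstract does not say how that value is obtained. The function name `cumulative_simpson` is hypothetical.

```python
from itertools import accumulate


def cumulative_simpson(samples, h):
    """Cumulative integral of equally spaced samples using two prefix sums."""
    n = len(samples)
    if n < 3:
        raise ValueError("need at least three samples")
    f = samples
    result = [0.0] * n

    # Scan 1: Simpson panels [x0, x2], [x2, x4], ... give the even-index values.
    even_panels = [h / 3 * (f[i] + 4 * f[i + 1] + f[i + 2])
                   for i in range(0, n - 2, 2)]
    for m, total in enumerate(accumulate(even_panels), start=1):
        result[2 * m] = total

    # Assumed starting value for the odd chain: the integral over [x0, x1]
    # of the quadratic through the first three samples.
    first = h / 12 * (5 * f[0] + 8 * f[1] - f[2])
    result[1] = first

    # Scan 2: Simpson panels [x1, x3], [x3, x5], ... give the odd-index values.
    odd_panels = [h / 3 * (f[i] + 4 * f[i + 1] + f[i + 2])
                  for i in range(1, n - 2, 2)]
    for m, total in enumerate(accumulate(odd_panels), start=1):
        result[2 * m + 1] = first + total

    return result


# Usage: integrate f(x) = x^2 from 0 to each sample point.
h = 0.25
xs = [i * h for i in range(5)]
values = cumulative_simpson([x * x for x in xs], h)
```

Because each panel contribution depends only on three neighbouring samples, both lists of panels can be computed independently before the scans, which is what makes the approach suitable for a parallel prefix sum.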


Asymmetric Memory Fences: Optimizing Both Performance and Implementability

20th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)

Authors: Duan, Yuelu; Honarmand, Nima; Torrellas, Josep
Affiliation: Univ Illinois Urbana IL USA

ISBN: (print) 9781450328357

There have been several recent efforts to improve the performance of fences. The most aggressive designs allow post-fence accesses to retire and complete before the fence completes. Unfortunately, such designs present implementation difficulties due to their reliance on global state and structures. This paper's goal is to optimize both the performance and the implementability of fences. We start off with a design like the most aggressive ones but without the global state. We call it Weak Fence or wF. Since the concurrent execution of multiple wFs can deadlock, we combine wFs with a conventional fence (i.e., Strong Fence or sF) for the less performance-critical thread(s). We call the result an Asymmetric fence group. We also propose a taxonomy of Asymmetric fence groups under TSO. Compared to past aggressive fences, Asymmetric fence groups are both substantially easier to implement and higher in average performance. The two main designs presented (WS+ and W+) speed up workloads under TSO by an average of 13% and 21%, respectively, over conventional fences.

Keywords: Fences; Sequential Consistency; Synchronization; parallel programming; Shared-Memory Machines


An Efficient Data-Dependence Profiler for Sequential and Parallel Programs

29th IEEE International Parallel and Distributed Processing Symposium (IPDPS)

Authors: Li, Zhen; Jannesari, Ali; Wolf, Felix
Affiliations: German Res Sch Simulat Sci D-52062 Aachen Germany; Tech Univ Darmstadt D-64289 Darmstadt Germany

ISBN: (print) 9781479986484

Extracting data dependences from programs serves as the foundation of many program analysis and transformation methods, including automatic parallelization, runtime scheduling, and performance tuning. To obtain data dependences, more and more related tools are adopting profiling approaches because they can track dynamically allocated memory, pointers, and array indices. However, dependence profiling suffers from high runtime and space overhead. To lower the overhead, earlier dependence profiling techniques exploit features of the specific program analyses they are designed for. As a result, every program analysis tool in need of data-dependence information requires its own customized profiler. In this paper, we present an efficient and at the same time generic data-dependence profiler that can be used as a uniform basis for different dependence-based program analyses. Its lock-free parallel design reduces the runtime overhead to around 86x on average. Moreover, signature-based memory management adjusts space requirements to practical needs. Finally, to support analyses and tuning approaches for parallel programs such as communication pattern detection, our profiler produces detailed dependence records not only for sequential but also for multi-threaded code.

Keywords: data dependence; profiling; program analysis; parallelization; parallel programming


Parallel Collocation Solution of Constrained Optimal Control Problems

European Control Conference (ECC)

Author: Fabien, Brian C.
Affiliation: Univ Washington Dept Mech Engn Seattle WA 98195 USA

ISBN: (print) 9783952426937

This paper presents a parallel collocation algorithm for the solution of a two-point boundary value problem (BVP) that involves index-1 differential-algebraic equations (DAEs) and inequality constraints due to complementarity conditions. BVP-DAEs of this type arise from the indirect approach to the solution of optimal control problems that involve control variable inequality constraints. In the collocation method presented here, the differential and algebraic variables of the BVP-DAEs are approximated using piecewise polynomials on a mesh that may be nonuniform. A Newton interior point method is used to solve the collocation equations and maintain feasibility of the inequality constraints. The implementation of the algorithm involves parallel evaluation of the collocation equations, parallel evaluation of the system Jacobian, and parallel solution of a bordered almost block diagonal (BABD) system to obtain the Newton search direction. A numerical example shows that the parallel implementation provides significant speedup when compared to a sequential version of the algorithm, and when compared to a direct method.

Keywords: Optimal control; boundary value problem; collocation; parallel programming


The Concepts of HPC: The Formalization of Hierarchical Massively Parallel Computing

Proceedings 8th Romania Tier 2 Federation Grid, Cloud & High Performance Computing in Science (RO-LCG)

Author: Ferenc, Nagy-Egri Mate
Affiliation: Wigner RCP Inst Particle & Nucl Phys GPU Lab POB 49 H-1525 Budapest Hungary

ISBN: (print) 9786067370393

It is becoming clear that software of all kinds is growing in complexity. Production quality code plagued with bugs and security issues that are impossible to test for is becoming commonplace, and HPC is no exception. It is therefore necessary to grasp all means of ruling out faulty code and aiding programmers in expressing their intent. C++ is still the dominant language in HPC and, with its recent rapid development, a turning point is imminent when the gains of reformulating existing code will outweigh the costs. The current study is a roundtrip of accumulated changes in C++11, C++14 and the coming C++17 standard, and of new best practices, patterns and idioms that should make their way to the foundations of HPC software. Such drastic changes will result in faster and safer programs with decreased development time.

Keywords: C++; GPGPU; template metaprogramming; parallel programming


Dynamic Analysis to Support Program Development with the Textually Aligned Property for OpenSHMEM Collectives

2nd Workshop OpenSHMEM and Related Technologies

Authors: Knuepfer, Andreas; Hilbrich, Tobias; Protze, Joachim; Schuchart, Joseph
Affiliations: Tech Univ Dresden D-01062 Dresden Germany; Rhein Westfal TH Aachen JARA High Performance Comp D-52056 Aachen Germany

ISBN: (print) 9783319264288; 9783319264271

The development of correct high performance computing applications is challenged by software defects that result from parallel programming. We present an automatic tool that provides novel correctness capabilities for application developers of OpenSHMEM applications. These applications follow a Single Program Multiple Data (SPMD) model of parallel programming. A strict form of SPMD programming requires that certain types of operations are textually aligned, i.e., they need to be called from the same source code line in every process. This paper proposes and demonstrates run-time checks that assert such behavior for OpenSHMEM collective communication calls. The resulting tool helps to check program consistency in an automatic and scalable fashion. We introduce the types of checks that we cover and include strict checks that help application developers to detect deviations from expected program behavior. Further, we discuss how we can utilize a parallel tool infrastructure to achieve a scalable and maintainable implementation for these checks. Finally, we discuss an extension of our checks towards further types of OpenSHMEM operations.

Keywords: parallel programming
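
The textually aligned property described in the abstract above can be illustrated with a small offline check. This is a sketch under stated assumptions, not the tool from the paper: the real tool performs its checks at run time on OpenSHMEM calls, whereas here each process is assumed to have already produced a log of call sites. The log format `(collective name, source file, line number)` and the function name `first_misaligned_call` are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical log entry: (collective name, source file, line number).
CallSite = Tuple[str, str, int]


def first_misaligned_call(logs: List[List[CallSite]]) -> Optional[int]:
    """Return the position of the first collective call that is not textually
    aligned across all processes, or None if every call is aligned."""
    longest = max(len(log) for log in logs)
    for position in range(longest):
        # A process that made fewer calls contributes None at this position.
        sites = {log[position] if position < len(log) else None for log in logs}
        if len(sites) > 1:
            return position
    return None


aligned = [
    [("barrier", "main.c", 10), ("broadcast", "main.c", 20)],
    [("barrier", "main.c", 10), ("broadcast", "main.c", 20)],
]
misaligned = [
    [("barrier", "main.c", 10), ("broadcast", "main.c", 20)],
    [("barrier", "main.c", 10), ("broadcast", "main.c", 25)],
]
print(first_misaligned_call(aligned))     # None
print(first_misaligned_call(misaligned))  # 1
```

Comparing whole call sites rather than only the collective name is what distinguishes textual alignment from the weaker requirement that all processes call the same sequence of collectives.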


Realistic Task Parallelization of the H.264 Decoding Algorithm for Multiprocessors

2015 IEEE 17th International Conference on High Performance Computing and Communications (HPCC)

Authors: Lin, Xiaohao; Liu, Weichen; Xiao, Chunming; Dai, Jie; Luo, Xianlu; Zhang, Dan; Liu, Duo; Wu, Kaijie; Zhuge, Qingfeng; Sha, Edwin H. M.
Affiliation: Chongqing Univ Coll Comp Sci Chongqing Peoples R China

ISBN: (print) 9781479989379

In recent years, hardware technology has developed ahead of software technology. Companies lack software techniques that can fully utilize modern multi-core computing resources, mainly due to the difficulty of investigating the inherent parallelism within software. This problem exists in products ranging from energy-sensitive smartphones to performance-eager data centers. In this paper, we present a case study on the parallelization of the complex industry standard H.264 HDTV decoder application in multi-core systems. An optimal schedule of the tasks is obtained and implemented by a carefully defined software parallelization framework (SPF). The parallel software framework is proposed together with a set of rules to direct parallel software programming (PSPR). A pre-processing phase based on the rules is applied to the source code to make the SPF applicable. The task-level parallel version of the H.264 decoder is implemented and tested extensively on a workstation running Linux. Significant performance improvement is observed for a set of benchmarks composed of 720p videos. The SPF and the PSPR will together serve as a reference for future parallel software implementations and direct the development of automated tools.

Keywords: Decoding; Industries; parallel processing; parallel programming; Software; Software algorithms; Videos


Supporting Scientists in Re-engineering Sequential Programs to Parallel Using Model-driven Engineering

IEEE ACM 1st International Workshop on Software Engineering for High Performance Computing in Science (SE4HPCS)

Authors: Almorsy, Mohamed; Grundy, John
Affiliation: Swinburne Univ Technol Ctr Comp & Engn Software & Syst Hawthorn Vic Australia

ISBN: (print) 9781467370820

Developing complex computational-intensive and data-intensive scientific applications requires effective utilization of the computational power of the available computing platforms including grids, clouds, clusters, multicore and many-core processors, and graphical processing units (GPUs). However, scientists who need to leverage such platforms are usually not parallel or distributed programming experts. Thus, they face numerous challenges when implementing and porting their software-based experimental tools to such platforms. In this paper, we introduce a sequential-to-parallel engineering approach to help scientists in engineering their scientific applications. Our approach is based on capturing sequential program details, planned parallelization aspects, and program deployment details using a set of domain-specific visual languages (DSVLs). Then, using code generation, we generate the corresponding parallel program using necessary parallel and distributed programming models (MPI, OpenCL, or OpenMP). We summarize three case studies (matrix multiplication, N-Body simulation, and signal processing) to evaluate our approach.

Keywords: parallel programming; High-Performance Computing; Domain-specific Visual Languages


STAPL-RTS: An Application Driven Runtime System

29th ACM International Conference on Supercomputing (ICS)

Authors: Papadopoulos, Ioannis; Thomas, Nathan; Fidel, Adam; Amato, Nancy M.; Rauchwerger, Lawrence
Affiliation: Texas A&M Univ College Stn TX 77843 USA

ISBN: (print) 9781450335591

Modern HPC systems are growing in complexity, as they move towards deeper memory hierarchies and increasing use of computational heterogeneity via GPUs or other accelerators. When developing applications for these platforms, programmers are faced with two bad choices. On one hand, they can explicitly manage all machine resources, writing programs decorated with low level primitives from multiple APIs (e.g. Hybrid MPI / OpenMP applications). Though seemingly necessary for efficient execution, it is an inherently non-scalable way to write software. Without a separation of concerns, only small programs written by expert developers actually achieve this efficiency. Furthermore, the implementations are rigid, difficult to extend, and not portable. Alternatively, users can adopt higher level programming environments to abstract away these concerns. Extensibility and portability, however, often come at the cost of lost performance. The mapping of a user's application onto the system now occurs without the contextual information that was immediately available in the more coupled approach. In this paper, we describe a framework for the transfer of high level, application semantic knowledge into lower levels of the software stack at an appropriate level of abstraction. Using the stapl library, we demonstrate how this information guides important decisions in the runtime system (stapl-rts), such as multi-protocol communication coordination and request aggregation. Through examples, we show how generic programming idioms already known to C++ programmers are used to annotate calls and increase performance.

Keywords: parallel programming; Data flow; Runtime Systems; Application Driven Optimizations; Distributed Memory; Shared Memory; Remote Method Invocation



Address: 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
