ISBN:
(Print) 9781728181042
Modern multi-core servers are powerful enough to process multi-gigabit live packet streams on the network data plane. However, in most cases network programmers must build their applications from scratch, implementing both the interfaces towards the lower hardware level and the proper mechanisms for parallel programming. Data Stream Processing (DaSP) frameworks have recently emerged as promising approaches to overcome these issues and let programmers focus solely on the logic of the application to develop. However, DaSP platforms are generally not designed for the networking domain, in terms of both performance and functionality. In this paper, we selected the WindFlow DaSP framework and built suitable extensions to attach multiple (accelerated) packet data sources to it. We then implemented a simple monitoring application on top of WindFlow and carried out stress tests with synthetic and real traffic. The results show that performance scales linearly with the number of processing cores, so that the application was able to process the entire live data stream at rates up to nearly 20 Gbps.
ISBN:
(Print) 9781665404761
The article discusses a way to improve the efficiency of complex query execution in modern DBMSs. The method is based on the use of tree structures, key hash codes, and optimization based on partitioning. As an additional aspect of optimization, the parallel operation of the proposed method is described.
ISBN:
(Print) 9781450382984
Inspired by earlier work on Augur, Vate is a probabilistic programming language for the construction of JVM-based probabilistic models with an object-oriented interface. As a compiled language, it is able to examine the dependency graph of the model to produce optimised code that can be dynamically targeted to different platforms. Using Gibbs sampling, Metropolis-Hastings, and variable marginalisation, it can handle a range of model types and is able to efficiently infer values, estimate probabilities, and execute models.
ISBN:
(Print) 9783030869601; 9783030869595
Despite the widespread use of agent-based models in ecological modeling and several other areas, modelers have been concerned about how time-consuming these models are. This paper presents a strategy to parallelize an agent-based model of the spatial distribution of biological species, operating in a multi-stage synchronous distributed-memory mode, as a way to gain performance while reducing the need for synchronization. A multiprocessing implementation divides the environment (a rectangular grid corresponding to the study area) into stage-subsets, according to the number of defined or available processes. To ensure that there is no information loss, each stage-subset is extended with an overlapping section from each of its neighbouring stage-subsets. The effect of the size of this overlap on the quality of the simulations is studied. The results indicate that it is possible to establish an optimal trade-off between the level of redundancy and the synchronization frequency. The reported parallelization method was tested on a standalone multicore machine but should scale seamlessly to a computation cluster.
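The overlapping stage-subset scheme the abstract describes can be sketched roughly as follows; the row-wise split, grid size, process count, and overlap width here are illustrative assumptions, not parameters taken from the paper.

```python
def split_with_overlap(n_rows, n_parts, overlap):
    """Return (lo, hi) row ranges for each stage-subset of an n_rows-tall
    grid, each range extended by `overlap` rows into its neighbouring
    subsets so that boundary agents are simulated without information loss."""
    step = n_rows // n_parts
    bounds = []
    for i in range(n_parts):
        lo = max(0, i * step - overlap)            # extend into left/upper neighbour
        hi = min(n_rows, (i + 1) * step + overlap)  # extend into right/lower neighbour
        bounds.append((lo, hi))
    return bounds

# A 12-row study area split among 3 processes with a 2-row overlap:
print(split_with_overlap(12, 3, 2))  # [(0, 6), (2, 10), (6, 12)]
```

A larger `overlap` increases redundancy (rows simulated by two processes) but lets processes run longer between synchronizations, which is exactly the trade-off the paper studies.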
ISBN:
(Print) 9781665423694
Emphasis on static timing analysis (STA) has shifted from graph-based analysis (GBA) to path-based analysis (PBA) for reducing unwanted slack pessimism. However, it is extremely time-consuming for a PBA engine to analyze a large set of critical paths. Recent years have seen many parallel PBA applications, but most of them are limited to CPU parallelism and do not scale beyond a few threads. To overcome this challenge, we propose in this paper a high-performance graphics processing unit (GPU)-accelerated PBA framework that efficiently analyzes the timing of a generated critical path set. We represent the path set in three dimensions (timing test, critical path, and pin) to structure the granularity of parallelism, making it scalable to arbitrary problem sizes. In addition, we leverage task-based parallelism to decompose the PBA workload into dependent CPU-GPU tasks in which kernel computation and data processing overlap efficiently. Experimental results show that our framework, applied to an important PBA application, can speed up the state-of-the-art baseline by up to 10x on a million-gate design.
Many graph optimization problems, such as the Maximum Weighted Independent Set problem, are NP-hard. For large-scale graphs that have billions of edges or vertices, these problems are hard to compute directly even u...
ISBN:
(Print) 9781665435550
FPGA-based accelerators have received increasing attention recently. Nevertheless, the amount of resources available on even the most powerful FPGA is still not enough to speed up very large workloads. To achieve that, FPGAs need to be interconnected in a Multi-FPGA architecture. However, programming such an architecture is a challenging endeavor. This paper extends the OpenMP task-based computation offloading model to enable several FPGAs to work as a single Multi-FPGA architecture. Experimental results, for a set of OpenMP stencil applications running on a Multi-FPGA platform, show close-to-linear speedups as the number of FPGAs and IP-cores per FPGA increases.
ISBN:
(Print) 9781450384414
Coherence-induced cache misses are an important factor limiting the scalability of shared-memory parallel programs. Many coherence misses are avoidable, namely misses due to false sharing, when different threads write to different memory addresses contained within the same cache block, causing unnecessary invalidations. Past work has proposed numerous ways to mitigate false sharing, from coherence protocols optimized for certain sharing patterns to software tools for false-sharing detection and repair. Our work leverages approximate computing and store-value similarity in error-tolerant multi-threaded applications. We introduce a novel cache coherence protocol that implements an approximate store instruction and coherence states allowing some limited incoherence within approximatable shared data, mitigating both coherence misses and coherence traffic for various sharing patterns. For applications from the Phoenix and AxBench suites, we see dynamic energy improvements within the NoC and memory hierarchy of up to 50.1% and speedups of up to 37.3%, with low output error for approximate applications that exhibit false sharing.
ISBN:
(Print) 9781665404761
The article considers an algorithm for the optimal distribution of "objects" of arbitrary nature among "storages", where the meaning of both is determined by the subject area. Several subject areas for which the optimal distribution problem is relevant are reviewed. In particular, the authors consider the problem of accelerating the Join operation. In parallel processing of big data, the Join operation requires uniform distribution of data among the cluster processors: a parallel implementation of Join is effective only when the computational complexities of its execution over the individual database fragments differ minimally from each other. The optimality criterion should therefore ensure uniform distribution of the data. A detailed description of the heuristic optimal distribution algorithm is given, and objective functions for the problems under consideration are proposed. Experiments assessing the quality of the heuristic greedy distribution algorithm are described; they yield the dependence of the algorithm's execution time on the number of distributed objects, and of the distribution quality (the difference between the maximum and minimum storage capacity) on the number of storages and on the interval of object weights. The algorithm is quite simple and can easily be implemented in any programming language. Its running time, even for big data, is small, which allows it to be used effectively when preparing data for the parallel solution of problems with high computational complexity. The algorithm shows good results when distributing tables-operands across data warehouses: the largest storage capacity differs from the smallest by only a small amount.
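The abstract does not spell out the heuristic itself, but a common greedy scheme consistent with its description (place the heaviest remaining object into the currently least-loaded storage, then report the max-min capacity spread) can be sketched as follows; the function name and sample weights are illustrative assumptions.

```python
import heapq

def greedy_distribute(weights, n_storages):
    """Greedy heuristic: assign each object, heaviest first, to the
    currently least-loaded storage, keeping storage capacities nearly
    uniform. Returns the per-storage object lists and the spread
    (difference between the largest and smallest storage capacity)."""
    heap = [(0, i, []) for i in range(n_storages)]  # (load, storage id, objects)
    heapq.heapify(heap)
    for w in sorted(weights, reverse=True):
        load, i, objs = heapq.heappop(heap)  # least-loaded storage
        objs.append(w)
        heapq.heappush(heap, (load + w, i, objs))
    loads = sorted(load for load, _, _ in heap)
    return [objs for _, _, objs in heap], loads[-1] - loads[0]

# Six objects of total weight 24 distributed among 3 storages:
_, spread = greedy_distribute([7, 5, 4, 3, 3, 2], 3)
print(spread)  # 2 (capacities 9, 8, 7)
```

Each object is placed with one heap pop and push, so the running time is O(n log m) for n objects and m storages, matching the abstract's claim that the algorithm stays fast even on big data.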
ISBN:
(Print) 9781665411103
Graphics Processing Units (GPUs) have become a key technology for accelerating node performance in supercomputers, including the US Department of Energy's forthcoming exascale systems. Since the execution model for GPUs differs from that for conventional processors, applications need to be rewritten to exploit GPU parallelism. Performance tools are needed for such GPU-accelerated systems to help developers assess how well applications offload computation onto GPUs. In this paper, we describe extensions to Rice University's HPCToolkit performance tools that support measurement and analysis of Intel's DPC++ programming model for GPU-accelerated systems atop an implementation of the industry-standard OpenCL framework for heterogeneous parallelism on Intel GPUs. HPCToolkit supports three techniques for performance analysis of programs atop OpenCL on Intel GPUs. First, HPCToolkit supports profiling and tracing of OpenCL kernels. Second, HPCToolkit supports CPU-GPU blame shifting for OpenCL kernel executions, a profiling technique that can identify code that executes on one or more CPUs while GPUs are idle. Third, HPCToolkit supports fine-grained measurement, analysis, and attribution of performance metrics to OpenCL GPU kernels, including instruction counts, execution latency, and SIMD waste. The paper describes these capabilities and then illustrates their application in case studies with two applications that offload computations onto Intel GPUs.