ISBN:
(Print) 9798350326598; 9798350326581
Existing tiled manycore architectures propose to convert abundant silicon resources into general-purpose parallel processors with unmatched computational density and programmability. However, as we approach 100K cores in one chip, conventional manycore architectures struggle to navigate three key axes: scalability, programmability, and density. Many manycores sacrifice programmability for density, or scalability for programmability. In this paper, we explore HammerBlade, which simultaneously achieves scalability, programmability, and density. HammerBlade is a fully open-source RISC-V manycore architecture, which has been silicon-validated with a 2048-core ASIC implementation using a 14/16nm process. We evaluate the system using a suite of parallel benchmarks that captures a broad spectrum of computation and communication patterns.
ISBN:
(Print) 9798350364613; 9798350364606
New algorithms for embedding graphs have reduced the asymptotic complexity of finding low-dimensional representations. One-Hot Graph Encoder Embedding (GEE) uses a single, linear pass over edges and produces an embedding that converges asymptotically to the spectral embedding. The scaling and performance benefits of this approach have been limited by a serial implementation in an interpreted language. We refactor GEE into a parallel program in the Ligra graph engine that maps functions over the edges of the graph and uses lock-free atomic instructions to prevent data races. On a graph with 1.86 billion edges, this results in a 500 times speedup over the original implementation and a 17 times speedup over a just-in-time compiled version.
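The core GEE computation described above — a single linear pass over edges that accumulates class-normalized neighbor counts — can be sketched in a few lines. This is an illustrative NumPy sketch, not the paper's Ligra code; the function name and normalization details are assumptions based on the abstract's description.

```python
import numpy as np

def gee_embed(edges, labels, n_classes):
    """One-pass graph encoder embedding (illustrative sketch):
    each vertex's embedding row counts its neighbors per class,
    normalized by class size."""
    n = len(labels)
    class_size = np.bincount(labels, minlength=n_classes)
    Z = np.zeros((n, n_classes))
    for u, v in edges:  # single linear pass over the edge list
        Z[u, labels[v]] += 1.0 / class_size[labels[v]]
        Z[v, labels[u]] += 1.0 / class_size[labels[u]]
    return Z
```

In the parallel refactoring, this per-edge loop becomes a function mapped over edges by the graph engine, with the two accumulations performed via lock-free atomic adds so concurrent updates to the same row do not race.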
ISBN:
(Print) 9798350364613; 9798350364606
Modern Python programs in high-performance computing call into compiled libraries and kernels for performance-critical tasks. However, effectively parallelizing these finer-grained, and often dynamic, kernels across modern heterogeneous platforms remains a challenge. First, we perform an experimental study to examine the impact of Python's Global Interpreter Lock (GIL), and potential speedups under a GIL-less PEP 703 future, to guide runtime design. Using our optimized runtime, we explore scheduling tasks with constraints that require resources across multiple, potentially diverse, devices through the introduction of new programming abstractions and runtime mechanisms. We extend an existing Python tasking library, Parla, to augment its performance and add support for such multi-device tasks. Our experimental analysis, using task graphs from synthetic and real applications, shows at least a 3x (and up to 6x) performance improvement over its predecessor in scenarios with high GIL contention. When scheduling multi-GPU tasks, we observe a 4x reduction in per-task launching overhead compared to a multi-process system.
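The GIL contention the study measures arises when tasks are launched on threads: thread-based launching is cheap, but any pure-Python portion of a task serializes on the interpreter lock, while a compiled kernel can release it. A minimal sketch of thread-based task launching (illustrative only; Parla's actual runtime is far richer) looks like:

```python
from concurrent.futures import ThreadPoolExecutor

def kernel(n):
    # Stand-in for a task body. Pure-Python work like this holds the
    # GIL; a real compiled kernel would release it while computing.
    return sum(i * i for i in range(n))

def run_tasks(sizes, workers=4):
    # Thread-based launch: low per-task overhead, but CPU-bound Python
    # code contends on the GIL -- the effect the study quantifies.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(kernel, sizes))
```

Under a GIL-less (PEP 703) interpreter, the same thread-based launcher could run the pure-Python portions in parallel as well, which is why the authors study it to guide runtime design.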
ISBN:
(Print) 9783031695827; 9783031695834
Stream processing systems must often cope with workloads varying in content, format, size, and input rate. The high variability and unpredictability make statically fine-tuning them very challenging. Our work addresses this limitation by providing a new framework and run-time system to simplify implementing and assessing new self-adaptive algorithms and optimizations. We implement a prototype on top of MPI called MPR and show its functionality. We focus on horizontal scaling by supporting the addition and removal of processes during execution time. Experiments reveal that MPR can achieve performance similar to that of a handwritten static MPI application. We also assess MPR's adaptation capabilities, showing that it can readily re-configure itself, with the help of a self-adaptive algorithm, in response to workload variations.
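The horizontal-scaling behavior described above hinges on a self-adaptive policy that decides when to add or remove processes as the input rate varies. The abstract does not give MPR's algorithm, so the following is only a hypothetical sketch of such a feedback rule, with invented parameter names:

```python
import math

def rescale(input_rate, service_rate_per_proc,
            min_procs=1, max_procs=64, headroom=1.2):
    """Return the process count needed to keep up with the observed
    input rate, with headroom to absorb bursts (illustrative policy)."""
    needed = math.ceil(headroom * input_rate / service_rate_per_proc)
    return max(min_procs, min(max_procs, needed))
```

A runtime like the one described would evaluate such a rule periodically and then add or remove MPI processes to match, which is the re-configuration capability the experiments assess.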
ISBN:
(Print) 9798331528492; 9798331528485
Gantt charts are frequently used to explore execution traces of large-scale parallel programs. In these visualizations, each parallel processor is assigned a row showing the computation state of a processor at a particular time. Lines are drawn between rows to show communication between these processors. When drawn to align equivalent calls across rows, visual patterns can emerge reflecting communication behavior of the executing code. However, though these patterns have the same definition at any scale, they can be obscured by the density of rendered lines when displaying more than a few hundred processors. We seek to understand the effectiveness of various strategies for recognizing these patterns in Gantt charts. Specifically, we conduct a pre-registered user study comparing recognition of patterns when viewing all processors, a subset of processors, or a set of abstracted glyphs overlaid on the chart. We find that all strategies have limitations when scaling, motivating further designs. Our results further indicate that for simple patterns, the glyphs are more effective in general pattern recognition while the zoomed subsets provide nuance to specific characteristics, such as offsets, in patterns. These results suggest the development of a combined approach may be appropriate to enable pattern comprehension in large-scale Gantt charts.
ISBN:
(Print) 9798350356045; 9798350356038
Some of the fastest CUDA codes contain "benign" data races to boost their performance. However, such races can lead to unpredictable behavior and incorrect results on other hardware and compilers, making their elimination crucial for producing reliable and portable programs. This paper investigates the performance impact of removing data races from six high-end graph analytics codes. We identify and eliminate the races from these GPU programs by adding necessary synchronization and validating their correctness. We present our race-free codes and their original versions as an open-source suite. Comparing the performance of our new codes with their baseline counterparts on multiple inputs and GPUs, we observe that race-free implementations do not always incur a performance penalty. In fact, some race-free versions are faster, with our validated maximal independent set implementation achieving a 5-11% speedup. Our findings indicate that race-free code can reach comparable or even superior performance, supporting the adoption of best practices for parallel programming.
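The fix the paper applies — adding the synchronization a racy update is missing — is the same idea at any level of the stack. In the GPU codes this means atomics or barriers; here is a thread-level analogue in Python (illustrative only, not the paper's CUDA code), where a lock turns an unsynchronized read-modify-write into a race-free one:

```python
import threading

class SafeCounter:
    """Race-free shared counter: the lock plays the role of the
    synchronization added to eliminate a 'benign' data race."""
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def add(self, k):
        with self._lock:  # synchronized read-modify-write
            self.value += k

def count_parallel(n_threads=8, increments=10_000):
    c = SafeCounter()
    threads = [threading.Thread(
                   target=lambda: [c.add(1) for _ in range(increments)])
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return c.value  # always n_threads * increments with the lock held
```

Without the lock, the final count can silently come up short on some platforms — exactly the unpredictable behavior that motivates eliminating such races even when they appear benign on one hardware/compiler combination.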
ISBN:
(Print) 9798350364613; 9798350364606
We present two new assignments in the Peachy Parallel Assignments series of assignments for teaching parallel and distributed computing. Submitted assignments must have been successfully used previously and are selected for being easy for other instructors to adopt and for being "cool and inspirational" so that students spend time on them and talk about them with others. The first assignment in this paper familiarizes students with the RAFT library for performing GPU-accelerated computation, part of the RAPIDS AI ecosystem. Students use this library to accelerate a Radius Nearest Neighbor computation, finding all points within a given distance from a query point. In the second assignment, students parallelize a bird flocking simulation using OpenMP or OpenACC. It is a visual assignment which allows students to readily see the performance improvement.
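The Radius Nearest Neighbor computation in the first assignment has a simple serial baseline: compute the distance from every point to the query and keep the indices within the radius. A brute-force NumPy sketch (illustrative; the assignment's accelerated version uses RAFT on the GPU) might look like:

```python
import numpy as np

def radius_neighbors(points, query, radius):
    """Return indices of all points within `radius` of `query`
    (brute-force baseline for the radius nearest neighbor task)."""
    d = np.linalg.norm(points - query, axis=1)  # Euclidean distances
    return np.nonzero(d <= radius)[0]
```

This O(n) scan over all points is exactly the kind of data-parallel kernel that maps naturally onto a GPU, which is what makes it a good vehicle for introducing a library like RAFT.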
ISBN:
(Print) 9783031521850; 9783031521867
In this paper, the Numba, JAX, CuPy, PyTorch, and TensorFlow Python GPU-accelerated libraries were benchmarked using scientific numerical kernels on an NVIDIA V100 GPU. The benchmarks consisted of a simple Monte Carlo estimation, a particle interaction kernel, a stencil evolution of an array, and tensor operations. The benchmarking procedure included general memory consumption measurements, a statistical analysis of scalability with problem size to determine the best libraries for the benchmarks, and a productivity measurement using source lines of code (SLOC) as a metric. It was statistically determined that the Numba library outperforms the rest on the Monte Carlo, particle interaction, and stencil benchmarks. The deep learning libraries show better performance on tensor operations. The SLOC count was similar for all the libraries except for Numba, which presented a higher SLOC count, implying more time is needed for code development.
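The simplest of the benchmarked kernels, a Monte Carlo estimation, can be written in a few lines with any of these libraries. A NumPy version estimating pi (illustrative; the paper's exact kernels and problem sizes are not given in the abstract) is:

```python
import numpy as np

def monte_carlo_pi(n, seed=0):
    """Estimate pi by sampling n points in the unit square and
    counting the fraction inside the quarter circle."""
    rng = np.random.default_rng(seed)
    x, y = rng.random(n), rng.random(n)
    inside = x * x + y * y <= 1.0
    return 4.0 * np.mean(inside)
```

Kernels of this shape port almost line-for-line across Numba, JAX, CuPy, PyTorch, and TensorFlow, which is what makes them a useful basis for comparing both performance and SLOC-measured productivity.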
ISBN:
(Print) 9789819772315; 9789819772322
Hybrid tabular-textual question answering (HTQA) involves tapping into a mosaic of data sources, traditionally managed through LSTM-based step-by-step reasoning, which has been vulnerable to exposure bias and subsequent error accumulation. This paper introduces an innovative parallel program generation method, ConcurGen, aiming to transform this paradigm by simultaneously formulating comprehensive program constructs that seamlessly blend operations and values. This approach not only rectifies the inherent pitfalls of sequential methodologies but also infuses efficiency into the process. When subjected to rigorous evaluation on benchmarks like the ConvFinQA and MultiHiertt datasets, our methodology showcased significant superiority over prevalent models such as FinQANet and MT2Net. This was evidenced by enhancements in various performance metrics, effectively raising the bar for what's deemed state-of-the-art. Notably, beyond setting these commendable benchmarks, our method facilitates a striking acceleration in program creation, achieving speeds nearly 21 times faster. Additionally, a salient feature of our approach becomes evident when numerical reasoning steps escalate: unlike traditional models, our system sustains its robust performance, emphasizing its adaptability and resilience in complex scenarios.
ISBN:
(Print) 9798350381993; 9798350382006
The main objective of this work is to bring supercomputing and parallel processing closer to non-specialized audiences by building a Raspberry Pi cluster, called Clupiter, which emulates the operation of a supercomputer. It consists of eight Raspberry Pi devices interconnected so that they can run jobs in parallel. To make it easier to show how it works, a web application has been developed. It allows launching parallel applications and accessing a monitoring system to see the resource usage while these applications are running. The NAS Parallel Benchmarks (NPB) are used as demonstration applications. From this web application, a couple of educational videos can also be accessed. They deal, in a very informative way, with the concepts of supercomputing and parallel programming.