ISBN: (Print) 9798350371918; 9798350371901
Optimization of memristor-based ultrasonic transducers for mesoscopic characterization of biomaterials was presented at the IEEE 2022 ISAF. In parallel, the development of quaternionic Clifford-based transforms for improving deep learning algorithms in nondestructive testing (NDT) has recently been suggested. Practical implementation of this signal-processing tool was tested on time reversal based nonlinear elastic wave spectroscopy (TR-NEWS) experiments. For (2+1)D image data, processing with the quaternions H was suggested to detect anomalous scattering positions in media with hysteresis, while for (3+1)D image data an extension was necessary. The (3+1)D image processing was conducted using the Clifford algebra A_{3,1}, isomorphic to M_2(H), in which ultrasonic signals are represented in the biquaternionic space. The optimal weight function of paths associated with this biquaternionic signal processing was obtained by modifying the Echo State Network (ESN) method, and a comparison with TR-NEWS experiments was conducted on complex samples (biomaterials or NDT samples) with intrinsic hysteretic nonlinearities. The stability of the weight function of the ultrasonic (US) wave path in (3+1)D is checked by machine learning.
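For readers unfamiliar with the quaternionic representation, the core operation is the Hamilton product. The Python sketch below (numpy only; the packing of amplitude and (2+1)D coordinates into one quaternion is an illustrative choice, not the authors' exact encoding) shows the product applied to a single measurement sample.

```python
import numpy as np

def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) arrays."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

# Illustrative (hypothetical) encoding: signal amplitude plus the two
# spatial coordinates and time of a (2+1)D sample in one quaternion.
sample = np.array([0.8, 1.5, 2.0, 0.1])                  # (amplitude, x, y, t)
rotor = np.array([np.cos(0.3), np.sin(0.3), 0.0, 0.0])   # unit quaternion
print(hamilton_product(rotor, sample))
```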
ISBN: (Print) 9798350337662
Over the years, much research involving mobile computational entities has been performed. From modeling actual microscopic (and smaller) robots to modeling software processes on a network, many important problems have been studied in this context. Gathering is one such fundamental problem in this area. The problem of gathering k robots, initially placed arbitrarily on the nodes of an n-node graph, asks that these robots coordinate and communicate locally, as opposed to globally, to move around the graph, find each other, and settle down on a single node as fast as possible. A more difficult problem to solve is gathering with detection, where once the robots gather, they must subsequently realize that gathering has occurred and then terminate. In this paper, we propose a deterministic approach to solve gathering with detection on any arbitrary connected graph that is faster than existing deterministic solutions for even just gathering (without the requirement of detection) on arbitrary graphs. In contrast to earlier work on gathering, it leverages the fact that more robots are present in the system to achieve gathering with detection faster than previous papers that focused on just gathering. The state-of-the-art solution for deterministic gathering [Ta-Shma and Zwick, TALG, 2014] takes Õ(n^5 log ℓ) rounds, where ℓ is the smallest label among the robots and Õ hides a polylog factor. We design a deterministic algorithm for gathering with detection with the following trade-offs depending on how many robots are present: (i) when k >= n/2 + 1, the algorithm takes O(n^3) rounds; (ii) when k >= ⌈n/3⌉ + 1, the algorithm takes O(n^4 log n) rounds; and (iii) otherwise, the algorithm takes Õ(n^5) rounds. The algorithm is not required to know k, only n.
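The paper's deterministic algorithm is considerably more involved, but the problem statement itself can be simulated in a few lines. The sketch below is a toy illustration only: it gives the robots global knowledge of the graph and of each other's positions, which the local communication model of the paper explicitly forbids.

```python
from collections import deque

def bfs_dist(graph, src):
    """Shortest-path distances from src in an unweighted graph (adjacency dict)."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def gather(graph, positions):
    """Toy gathering: each round, every robot steps one hop toward the
    smallest occupied node (global knowledge -- not the paper's model)."""
    rounds = 0
    while len(set(positions)) > 1:
        target = min(positions)
        dist = bfs_dist(graph, target)
        positions = [
            p if p == target else
            min((v for v in graph[p] if dist[v] < dist[p]), default=p)
            for p in positions
        ]
        rounds += 1
    return rounds

# 6-node path graph; robots on nodes 0 and 5 meet on node 0 in 5 rounds.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
print(gather(path, [0, 5]))  # -> 5
```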
K-means clustering is a popular unsupervised machine learning method widely used in various applications, such as data mining, image processing, and social sciences. However, clustering can be computationally expensiv...
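For reference, the basic Lloyd iteration that work in this area accelerates is only a few lines of numpy (a textbook sketch, not this paper's variant):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        # Recompute each center as the mean of its assigned points.
        new = np.array([X[labels == j].mean(0) if (labels == j).any()
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
centers, labels = kmeans(X, 2)
print(centers)
```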
ISBN: (Print) 9781665481069
The GPU programming model is primarily aimed at the development of applications that run on one GPU. However, this limits the scalability of GPU code to the capabilities of a single GPU in terms of compute power and memory capacity. To scale GPU applications further, a great engineering effort is typically required: work and data must be divided over multiple GPUs by hand, possibly across multiple nodes, and data must be manually spilled from GPU memory to higher-level memories. We present Lightning: a framework that follows the common GPU programming paradigm but enables scaling to large problems with ease. Lightning supports multi-GPU execution of GPU kernels, even across multiple nodes, and seamlessly spills data to higher-level memories (main memory and disk). Existing CUDA kernels can easily be adapted for use in Lightning, with data-access annotations on these kernels allowing Lightning to infer their data requirements and the dependencies between subsequent kernel launches. Lightning efficiently distributes the work and data across GPUs and maximizes efficiency by overlapping scheduling, data movement, and kernel execution where possible. We present the design and implementation of Lightning, as well as experimental results on up to 32 GPUs for eight benchmarks and one real-world application. The evaluation shows excellent performance and scalability, such as a speedup of 57.2x over the CPU using Lightning with 16 GPUs over 4 nodes and 80 GB of data, far beyond the memory capacity of one GPU.
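Lightning's contribution is easiest to appreciate against the manual baseline it replaces. The sketch below shows the by-hand division of work and data across devices that the abstract describes; it uses CuPy (assumed installed, with at least n_gpus visible devices) and, unlike Lightning, runs the devices sequentially with no overlap of transfers and compute, and no spilling.

```python
import numpy as np
import cupy as cp  # assumption: CuPy installed, >= n_gpus GPUs visible

def scale_on_gpus(host, n_gpus, factor):
    """By-hand multi-GPU map: split the data, run on each device, gather back."""
    bounds = np.linspace(0, len(host), n_gpus + 1, dtype=int)
    out = np.empty_like(host)
    for i in range(n_gpus):
        lo, hi = bounds[i], bounds[i + 1]
        with cp.cuda.Device(i):                  # select GPU i
            d = cp.asarray(host[lo:hi])          # host -> device copy
            out[lo:hi] = cp.asnumpy(factor * d)  # compute, device -> host copy
    return out

data = np.arange(1 << 20, dtype=np.float32)
print(scale_on_gpus(data, 2, 3.0)[:4])  # [0. 3. 6. 9.]
```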
ISBN: (Print) 9781665497473
The present work investigates the modeling of pre-exascale input/output (I/O) workloads of Adaptive Mesh Refinement (AMR) simulations through a simple proxy application. We collect data from the AMReX Castro framework running on the Summit supercomputer, for a wide range of scales and mesh partitions, for the hydrodynamic Sedov case as a baseline to provide sufficient coverage for the formulated proxy model. The nonlinear analysis data production rates are quantified as a function of a set of input parameters, such as output frequency, grid size, number of levels, and the Courant-Friedrichs-Lewy (CFL) condition number, for each rank, mesh level, and simulation time step. Linear regression is then applied to formulate a simple analytical model that translates AMReX inputs into MACSio proxy I/O application parameters, resulting in a simple "kernel" approximation for data production at each time step. Results show that MACSio can simulate actual AMReX nonlinear "static" I/O workloads to a certain degree of confidence on the Summit supercomputer using the present methodology. The goal is to provide an initial level of understanding of AMR I/O workloads via lightweight proxy-application models to facilitate auto-tuned data-management strategies in anticipation of exascale systems.
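As a rough illustration of the modeling step, the following numpy sketch fits a linear model from run parameters to a data-production rate; the feature columns and numbers are made up, and the actual AMReX-to-MACSio parameter mapping in the paper differs.

```python
import numpy as np

# Hypothetical design matrix: one row per run, with columns such as
# [output frequency, grid size, number of AMR levels, CFL number].
X = np.array([[10,  64, 2, 0.5],
              [20, 128, 3, 0.5],
              [10, 128, 2, 0.9],
              [40, 256, 4, 0.7]], dtype=float)
y = np.array([1.2, 5.8, 4.1, 22.5])  # made-up production rates (GB/step)

# Fit y ~ X @ w + b by ordinary least squares.
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(row):
    """Predicted data-production rate for a new run configuration."""
    return np.append(row, 1.0) @ coef

print(predict([20, 128, 3, 0.7]))
```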
ISBN: (Print) 9798350364613; 9798350364606
There is a growing need, for example in machine learning and analytics, to decompose applications into smaller schedulable units. Such decomposition can improve performance, reduce energy consumption, and increase resource utilization. Unfortunately, enabling fine-grained parallelism comes with significant overheads and requires improvements at all layers of the programming stack. We consider the challenges of supporting fine-grained parallelism in the increasingly popular Python-based programming libraries. Specifically, we focus on Parsl, a Python library that is widely used to parallelize the execution of fine-grained Python functions. Parsl's Python-based runtime supports a maximum throughput of around 1,200 tasks per second, insufficient to meet modern application needs. We perform a comprehensive analysis of Parsl and identify areas that prohibit it from achieving higher throughput. We first profile Parsl's components and identify that, with fine-grained tasks, workers are often not saturated. We find that tasks spend a majority of their time in the components between the scheduler and the workers; however, we also learned that the scheduler is capable of submitting thousands of tasks per second. We then focused on developing new optimizations and implementing crucial components in C to improve throughput. Our new implementation increases Parsl's throughput six-fold.
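For context, a minimal Parsl program with deliberately fine-grained tasks looks like the sketch below (standard Parsl API; the paper's benchmarks use heavier-weight executors such as the HighThroughputExecutor rather than the local thread pool shown here).

```python
import parsl
from parsl import python_app
from parsl.config import Config
from parsl.executors.threads import ThreadPoolExecutor

# Minimal local configuration; real deployments use other executors.
parsl.load(Config(executors=[ThreadPoolExecutor(max_threads=4)]))

@python_app
def square(x):
    return x * x  # a deliberately tiny, fine-grained task

futures = [square(i) for i in range(1000)]  # submit 1,000 tasks
print(sum(f.result() for f in futures))     # block on and gather results
```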
ISBN: (Print) 9781665497473
We propose an efficient version of the PageRank algorithm for adjacency matrices that reduces the complexity by a factor of two. This method computes the A^T x operation on the transposed matrix A^T without having to explicitly normalize and transpose the matrix. We implement the method using standard row-major and column-major matrix storage formats. We perform experiments with parallel implementations in OpenMP, on synthetic data as well as on matrices extracted from large-scale graphs. The experiments are done on two different Intel processors from recent generations. The column-major storage format version of our method shows good scaling and outperforms standard PageRank in a majority of cases, even when the preprocessing burden of the latter is not taken into account.
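The idea of evaluating A^T x without forming the transpose can be seen in a pure-Python sketch: traverse the out-adjacency structure row by row and scatter each node's rank to its out-neighbours. This illustrates the access pattern only, not the paper's OpenMP implementation.

```python
import numpy as np

def pagerank(adj, d=0.85, tol=1e-12):
    """PageRank over an out-adjacency list. The A^T x product is computed by
    scattering each node's rank to its out-neighbours, so the transposed,
    column-normalized matrix is never materialized."""
    n = len(adj)
    x = np.full(n, 1.0 / n)
    while True:
        y = np.full(n, (1.0 - d) / n)
        for u, nbrs in enumerate(adj):
            if nbrs:
                share = d * x[u] / len(nbrs)
                for v in nbrs:
                    y[v] += share      # row-major traversal, scattered writes
            else:
                y += d * x[u] / n      # dangling node: spread rank uniformly
        if np.abs(y - x).sum() < tol:
            return y
        x = y

# Tiny 4-node graph given as out-adjacency lists.
print(pagerank([[1, 2], [2], [0], [2]]))
```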
ISBN: (Print) 9798350364613; 9798350364606
A performance-portable application can run on a variety of different hardware platforms, achieving an acceptable level of performance without requiring significant rewriting for each platform. Several performance-portable programming models are now suitable for high-performance scientific application development, including OpenMP and Kokkos. Chapel is a parallel programming language that supports the productive development of high-performance scientific applications and has recently added support for GPU architectures through native code generation. Using three mini-apps (BabelStream, miniBUDE, and TeaLeaf), we evaluate the Chapel language's performance portability across various CPU and GPU platforms. In our evaluation, we replicate and build on previous studies of performance portability using mini-apps, comparing Chapel against OpenMP, Kokkos, and the vendor programming models CUDA and HIP. We find that Chapel achieves performance portability comparable to OpenMP and Kokkos, and we identify several implementation issues that limit Chapel's performance portability on certain platforms.
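For context on what these mini-apps measure, BabelStream's kernels do almost no compute per byte moved; its triad kernel is shown below as a numpy sketch (not the Chapel port evaluated in the paper).

```python
import time
import numpy as np

N = 1 << 24                # ~16M float64 elements per array
b = np.random.rand(N)
c = np.random.rand(N)
scalar = 0.4

t0 = time.perf_counter()
a = b + scalar * c         # BabelStream's "triad" kernel
dt = time.perf_counter() - t0

bytes_moved = 3 * N * 8    # read b, read c, write a
print(f"triad bandwidth: {bytes_moved / dt / 1e9:.1f} GB/s")
```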
With the rapid growth of cloud computing, outsourcing databases to cloud servers is becoming increasingly popular. Query integrity authentication is an effective technique to obtain reliable query results from untrust...
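The abstract is truncated here, but a classic building block for query-integrity authentication in outsourced databases is a Merkle hash tree, whose published root digest lets a client verify that returned rows are untampered; whether this particular paper uses one is not stated. A minimal root computation:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Merkle root over a list of byte strings, duplicating the last
    node whenever a level has odd length."""
    level = [h(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

rows = [b"alice,30", b"bob,25", b"carol,41"]
print(merkle_root(rows).hex())  # digest a client could verify against
```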
ISBN: (Print) 9781665481069
Hypergraphs offer flexible and robust data representations for many applications, but methods that work directly on hypergraphs are not readily available and tend to be prohibitively expensive. Much of the current analysis of hypergraphs relies on first performing a graph expansion, either based on the nodes (clique expansion) or on the hyperedges (line graph), and then running standard graph analytics on the resulting representative graph. However, this approach suffers from massive space complexity and high computational cost as hypergraph size increases. Here, we present efficient parallel algorithms to accelerate and reduce the memory footprint of higher-order graph expansions of hypergraphs. Our results focus on the hyperedge-based s-line graph expansion, but the methods we develop work for higher-order clique expansions as well. To the best of our knowledge, ours is the first framework to enable hypergraph spectral analysis of a large dataset on a single shared-memory machine. Our methods enable the analysis of datasets from many domains that previous graph-expansion-based models are unable to handle. The proposed s-line graph computation algorithms are orders of magnitude faster than state-of-the-art sparse general matrix-matrix multiplication methods, and obtain approximately 2-31x speedup over a prior state-of-the-art heuristic-based algorithm for s-line graph computation.
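The s-line graph itself has a compact definition: each hyperedge becomes a node, and two hyperedges are joined when they share at least s vertices. The naive pairwise sketch below illustrates the definition; the paper's parallel algorithms are precisely about avoiding this quadratic set-intersection pass at scale.

```python
from itertools import combinations

def s_line_graph(hyperedges, s):
    """Naive s-line graph: one node per hyperedge, with an edge joining
    every pair of hyperedges that share at least s vertices."""
    sets = [frozenset(e) for e in hyperedges]
    return [(i, j) for i, j in combinations(range(len(sets)), 2)
            if len(sets[i] & sets[j]) >= s]

H = [{1, 2, 3}, {2, 3, 4}, {4, 5}, {1, 5, 6}]
print(s_line_graph(H, 1))  # overlap of >= 1 vertex
print(s_line_graph(H, 2))  # stricter: >= 2 shared vertices -> [(0, 1)]
```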