ISBN (Print): 9781665497473
To amortize the cost of MPI communications, distributed parallel HPC applications can overlap network communications with computations in the hope that this improves overall application performance. When using this technique, computations and communications run at the same time. But computation usually also performs data movements. Since data for computations and data for communications use the same memory system, memory contention may occur when computations are memory-bound and large messages are transmitted through the network at the same time. In this paper we propose a model to predict memory bandwidth for computations and for communications when they are executed side by side, according to data locality and taking contention into account. Building the model allowed us to better understand where the bottlenecks are located in the memory system and which strategies the memory system applies in case of contention. The model was evaluated on many platforms with different characteristics, and showed an average prediction error lower than 4%.
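The abstract does not reproduce the model's equations; the toy sketch below only illustrates the kind of prediction involved, using a hypothetical proportional-sharing rule and made-up bandwidth figures rather than the authors' model.

```python
# Toy illustration of bandwidth prediction under contention.
# The peak bandwidth, demands, and the proportional-sharing rule
# are illustrative assumptions, not the paper's model.

def predict_effective_bandwidth(compute_demand, comm_demand, peak_bw):
    """Return (compute_bw, comm_bw) in GB/s when both streams run side by side."""
    total_demand = compute_demand + comm_demand
    if total_demand <= peak_bw:
        # No contention: each stream gets what it asks for.
        return compute_demand, comm_demand
    # Contention: assume the memory system splits its peak bandwidth
    # in proportion to each stream's standalone demand.
    scale = peak_bw / total_demand
    return compute_demand * scale, comm_demand * scale

# Example: a 60 GB/s memory-bound kernel and a 25 GB/s network transfer
# sharing a socket whose measured peak is 70 GB/s.
print(predict_effective_bandwidth(60.0, 25.0, 70.0))
```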
ISBN (Print): 9798350311990
The growth of online media platforms, particularly multimedia-rich social networks such as YouTube, has resulted in a demand for efficient data collection and analysis techniques. One of the critical data elements for multimedia-rich social platforms is the video transcript, which is not readily available through social platforms. Traditional methods for transcript generation are time-consuming and are challenged by the vast amount of data. This study proposes a methodology that leverages parallel computing and the Python multiprocessing library to improve the speed of transcript collection from YouTube. The methodology utilizes YouTube's Transcript API to extract YouTube-generated transcripts and OpenAI's Whisper model to generate transcriptions for videos without native YouTube transcripts. Additionally, the Googletrans Translation API was used to translate transcriptions from non-English videos. The results showed a significant improvement in processing time and performance, enabling researchers to conduct various studies on a larger scope of YouTube data with ease. With parallel processing, the YouTube Transcript API showed a 2100.88% performance increase, the Whisper model showed a 29.45% improvement, and the Googletrans API showed a 738.46% increase compared to the sequential processing baseline using the same process. The total time consumption was reduced by 25.54%, from 105.64 hours to 78.66 hours. The methodology developed in this study is not limited to YouTube and can be applied to other social media platforms, making it a versatile solution for data collection and analysis.
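The abstract names the libraries but shows no code; a minimal sketch of the parallel collection step, assuming the `youtube_transcript_api` package and an illustrative worker-pool size, might look like the following (the video IDs and error handling are placeholders, not the study's pipeline).

```python
# Minimal sketch: fetch YouTube-generated transcripts in parallel.
# Requires: pip install youtube-transcript-api (classic get_transcript API)
from multiprocessing import Pool
from youtube_transcript_api import YouTubeTranscriptApi

def fetch_transcript(video_id):
    """Return (video_id, transcript text) or (video_id, None) on failure."""
    try:
        segments = YouTubeTranscriptApi.get_transcript(video_id)
        return video_id, " ".join(seg["text"] for seg in segments)
    except Exception:
        # Videos without a native transcript would be routed to Whisper instead.
        return video_id, None

if __name__ == "__main__":
    video_ids = ["dQw4w9WgXcQ", "9bZkp7q19f0"]   # illustrative IDs
    with Pool(processes=8) as pool:              # pool size is an assumption
        results = pool.map(fetch_transcript, video_ids)
    print(results)
```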
ISBN (Print): 9781665497473
Coarse-grained reconfigurable architecture (CGRA) is a promising platform for HPC systems in the post-Moore's era. A single-source programming model is essential for practical heterogeneous computing. However, we do not have a canonical programming model or a frontend compiler for CGRAs. Existing CGRAs are versatile with respect to their execution model, computational capability, and system structure, which magnifies the difficulty of orchestrating the compiler techniques. It consequently forces CGRA designers to develop a compiler from scratch that works only for their own architecture. Such an approach is outdated, given other successful accelerators like GPUs and FPGAs. This paper presents a new CGRA compiler framework in order to reduce the development effort for CGRA applications. OpenMP-annotated codes are fed into the proposed compiler, as recent OpenMP versions support device offloading to accelerators. This property improves the reusability of existing source code for HPC workloads. The design of the compiler is inspired by LLVM, the most widely used compiler framework, so the frontend is built to be architecture-independent. In this work, we demonstrate that the proposed compiler can handle different types of CGRAs without changing the source codes. In addition, we discuss the effect of architecture-independent optimization algorithms. We also provide an open-source implementation of the compiler framework at https://***/hal-lab-u-tokyo/CGRAOmp.
ISBN (Print): 9798350364613; 9798350364606
It has been a decade since the ACM/IEEE CS2013 Curriculum guidelines recommended that all CS students learn about parallel and distributed computing (PDC). But few textbooks for "core" CS courses, especially first-year courses, include coverage of PDC topics. To fill this gap, we have written free, online, beginner- and intermediate-level PDC textbooks, containing interactive C/C++ OpenMP, MPI, mpi4py, CUDA, and OpenACC code examples that students can run and modify directly in the browser. The books address a serious challenge to teaching PDC concepts, namely, easy access to the powerful hardware needed for observing patterns and scalability. This paper describes the content of these textbooks and the underlying infrastructure that makes them possible. We believe the described textbooks fill a critical gap in PDC education and will be very useful for the community.
ISBN (Print): 9798350364613; 9798350364606
Machine-learning (ML) algorithms are finding wide adoption across a rich spectrum of application domains with diverse requirements in terms of performance, power, and cost. These diverse requirements make it necessary to explore a large space of ML architectures and reexamine fundamental computational structures, a process of exploration that is very expensive. To get around the costly computations associated with large data sets and long training times, there have been increasing investments in specialized fixed-function hardware. However, this specialized hardware is expensive and hard to generalize across the spectrum of applications. For our experiments, we focus on a novel, highly parallel, superset ML architecture, and use it to test the capabilities of new coarse-grained FPGAs containing hundreds to thousands of DSP slices with dedicated local storage. These new coarse-grained architectures allow us to achieve ASIC-like clock rates and reductions in power while exploring novel and common ML architectures.
ISBN (Print): 9781665481069
Finding the biconnected components of a graph has a large number of applications in many other graph problems, including planarity testing, computing centrality metrics, finding the (weighted) vertex cover, coloring, and the like. Recent years saw the design of efficient algorithms for this problem across sequential and parallel computational models. However, current algorithms do not work in the setting where the underlying graph changes over time in a dynamic manner via the insertion or deletion of edges. Dynamic algorithms in the sequential setting that obtain the biconnected components of a graph upon insertion or deletion of a single edge have been known for over two decades, but parallel algorithms for this problem are not heavily studied. In this paper, we design shared-memory parallel algorithms that obtain the biconnected components of a graph subsequent to the insertion or deletion of a batch of edges. Our algorithms are hence capable of exploiting the parallelism afforded by a batch of updates. We implement our algorithms on an AMD EPYC 7742 CPU having 128 cores. Our experiments on a collection of 10 real-world graphs from multiple classes indicate that our algorithms outperform state-of-the-art parallel static algorithms.
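The batch-dynamic algorithms themselves are not given in the abstract; for orientation, the static baseline they are compared against amounts to recomputing biconnected components from scratch after every batch of updates, sketched below with `networkx` on a hypothetical graph and batch.

```python
# Static-recompute baseline: apply a batch of edge updates, then recompute
# biconnected components from scratch (the work a batch-dynamic algorithm
# tries to avoid). Requires: pip install networkx
import networkx as nx

def apply_batch_and_recompute(G, inserted_edges, deleted_edges):
    G.add_edges_from(inserted_edges)
    G.remove_edges_from(deleted_edges)
    return list(nx.biconnected_components(G))

G = nx.path_graph(5)                 # hypothetical starting graph: 0-1-2-3-4
batch_insert = [(0, 2), (2, 4)]      # hypothetical batch of edge insertions
print(apply_batch_and_recompute(G, batch_insert, []))
```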
Edge computing plays a pivotal role in IoT applications that require rapid and secure data processing. However, these applications are typically resource-demanding, and the resources available at the edge are often s...
ISBN (Print): 9798350302080
With the accelerating growth of Big Data, real-world graph processing applications now need to tackle graphs with billions of vertices and trillions of edges, thereby increasing the demand for effective solutions to application scalability. Unfortunately, current approaches to implementing these applications on modern HPC systems exhibit poor scale-out performance with increasing numbers of nodes. The scalability challenges for these applications are driven by large data sizes, synchronization overheads, and fine-grained communications with irregular data accesses and poor locality. This paper presents the scalability of a novel Actor-based programming system, which provides a lightweight runtime that supports fine-grained asynchronous execution and automatic message aggregation atop a Partitioned Global Address Space (PGAS) communication layer. Evaluations of the Jaccard Index and PageRank applications on the NERSC Perlmutter system demonstrate nearly perfect scaling up to 1,000 nodes and 64K cores (one-third of the 3,000 nodes targeted for Perlmutter). In addition, our Actor-based implementations of Jaccard Index and PageRank executed with parallel efficiencies of 85.7% and 63.4% for the largest run of 64K cores. This performance represents a 29.6x speedup relative to UPC and OpenSHMEM versions of PageRank.
ISBN (Print): 9798350387186; 9798350387179
Homomorphic encryption (HE) algorithms, particularly the Cheon-Kim-Kim-Song (CKKS) scheme, offer significant potential for secure computation on encrypted data, making them valuable for privacy-preserving machine learning. However, the high latency of large-integer operations in the CKKS algorithm hinders the processing of large datasets and complex computations. This paper proposes a novel strategy that combines lossless data compression techniques with the parallel processing power of graphics processing units to address these challenges. Our approach demonstrably reduces data size by 90% and achieves significant speedups of up to 100 times compared to conventional approaches. This method ensures data confidentiality while mitigating performance bottlenecks in CKKS-based computations, paving the way for more efficient and scalable HE applications.
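The abstract does not detail the compression pipeline; the sketch below only illustrates a host-side lossless-compression step on a stand-in ciphertext buffer using `zlib`, before such a buffer would be shipped to the GPU (the buffer contents and the resulting ratio are illustrative, not the paper's method or results).

```python
# Illustrative host-side step: losslessly compress a serialized ciphertext
# buffer before transferring it to the GPU for CKKS operations.
import zlib
import numpy as np

# Hypothetical stand-in for a serialized CKKS ciphertext (real ciphertexts
# are large arrays of big-integer polynomial coefficients).
ciphertext_bytes = np.zeros(1 << 20, dtype=np.uint64).tobytes()

compressed = zlib.compress(ciphertext_bytes, level=6)
ratio = 100 * (1 - len(compressed) / len(ciphertext_bytes))
print(f"compressed {len(ciphertext_bytes)} -> {len(compressed)} bytes "
      f"({ratio:.1f}% reduction)")
# The compressed buffer is what would be copied to device memory and handled
# by GPU kernels in a compression-plus-GPU approach of this kind.
```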
ISBN (Print): 9781665497473
We identify the graph data structure, frontiers, operators, an iterative loop structure, and convergence conditions as essential components of graph analytics systems based on the native-graph approach. Using these essential components, we propose an abstraction that captures all the significant programming models within graph analytics, such as bulk-synchronous, asynchronous, shared-memory, message-passing, and push vs. pull traversals. Finally, we demonstrate the power of our abstraction with an elegant modern C++ implementation of single-source shortest path and its required components.
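The authors' implementation is in modern C++ and is not reproduced in the abstract; the sketch below re-expresses the named components (graph, frontier, operator, iterative loop, convergence condition) in Python for a push-style single-source shortest path, purely as an illustration of the abstraction rather than the authors' API.

```python
# Frontier-based single-source shortest path, organized around the
# components named in the abstract: graph data structure, frontier,
# operator, iterative loop structure, convergence condition.
import math

def sssp(adjacency, source):
    """adjacency: dict mapping vertex -> list of (neighbor, weight)."""
    dist = {v: math.inf for v in adjacency}       # graph-wide distance state
    dist[source] = 0.0
    frontier = [source]                           # frontier component
    while frontier:                               # convergence: empty frontier
        next_frontier = []
        for u in frontier:                        # advance operator over the frontier
            for v, w in adjacency[u]:
                if dist[u] + w < dist[v]:         # push-style edge relaxation
                    dist[v] = dist[u] + w
                    next_frontier.append(v)
        frontier = next_frontier                  # iterative loop structure
    return dist

graph = {0: [(1, 4.0), (2, 1.0)], 1: [(3, 1.0)],
         2: [(1, 2.0), (3, 5.0)], 3: []}          # hypothetical weighted digraph
print(sssp(graph, 0))
```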