ISBN:
(Print) 9798350383782; 9798350383799
Multivariate time series (MTS) classification has been tackled using various methods, including reservoir computing (RC), which generates efficient vectorized representations such as the reservoir state (RS). RS shines when handling extensive classes or training sets but demands longer processing and substantial memory. Addressing this, we present the Parallel Reservoir Echo State Network (PR-ESN), an optimized parallel training and evaluation algorithm rooted in the ESN principle. It leverages both CPU shared-memory and distributed-memory parallel architectures to efficiently capture the reservoir state's optimal model-space representation, addressing computational challenges in MTS analysis. Distinguishing itself from previous works, PR-ESN combines distributed parallel processing at the network level with shared-memory multiprocessing at the node level. This results in reduced memory requirements and faster processing, making it a significant contribution to the field. Key features include PR-ESN's distributed training and evaluation, shared-memory parallelization, and MSR concatenation for comprehensive analysis of distributed model-space representations. Testing on real-world MTS and benchmark ECG data shows that PR-ESN-based classifiers achieve superior accuracy and faster processing times with optimal memory usage.
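For readers unfamiliar with reservoir-state representations, the sketch below shows, under simplifying assumptions, how a plain echo state network turns one multivariate series into a fixed-size state vector that a downstream classifier could use. The names make_reservoir, reservoir_states, leak, and spectral_radius are illustrative, not PR-ESN's API, and none of the distributed or shared-memory machinery is reproduced here.

```python
# Minimal single-node echo state network (ESN) sketch: turn a multivariate
# time series into a reservoir-state representation that a linear classifier
# can consume. Illustrative only; PR-ESN's parallel training is not shown.
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n_inputs, n_reservoir=200, spectral_radius=0.9, density=0.05):
    """Random input and recurrent weights, recurrent part rescaled by spectral radius."""
    W_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_inputs))
    W = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
    W[rng.random(W.shape) > density] = 0.0            # sparsify the reservoir
    W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
    return W_in, W

def reservoir_states(X, W_in, W, leak=0.3):
    """Run one MTS (time x variables) through the reservoir; return all states."""
    T, _ = X.shape
    h = np.zeros(W.shape[0])
    H = np.empty((T, W.shape[0]))
    for t in range(T):
        pre = np.tanh(W_in @ X[t] + W @ h)
        h = (1 - leak) * h + leak * pre               # leaky integration
        H[t] = h
    return H

# Example: represent one 3-variable series of length 100 by its last reservoir state.
X = rng.standard_normal((100, 3))
W_in, W = make_reservoir(n_inputs=3)
representation = reservoir_states(X, W_in, W)[-1]     # feed this to a classifier
```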
ISBN:
(Digital) 9798350352917
ISBN:
(Print) 9798350352924; 9798350352917
Large-scale data analytics, scientific simulation, and deep learning codes in HPC perform massive computations on data greatly exceeding the bounds of main memory. These out-of-core algorithms suffer from severe data movement penalties, programming complexity, and limited code reuse. To solve this, HPC sites have steadily increased DRAM capacity. However, this is not sustainable due to financial and environmental costs. A more elegant, low-cost, and portable solution is to expand memory to distributed multi-tiered storage. In this work, we propose MegaMmap: a software distributed shared memory (DSM) that enlarges effective memory capacity through intelligent tiered DRAM and storage management. MegaMmap provides workload-aware data organization, eviction, and prefetching policies to reduce DRAM consumption while ensuring speedy access to critical data. A variety of memory coherence optimizations are provided through an intuitive hinting system. Evaluations show that various workloads can be executed with a fraction of the DRAM while offering competitive performance.
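As a rough illustration of the out-of-core pattern MegaMmap targets, the sketch below streams a file-backed array through a small DRAM working set using numpy.memmap. The file name and chunk size are arbitrary assumptions; nothing here uses MegaMmap's DSM, hinting, or tiering interfaces.

```python
# Sketch of the out-of-core idea: keep a large array on storage and touch it
# through a memory mapping so only the pages in use are resident in DRAM.
import numpy as np

N = 50_000_000                      # ~400 MB of float64, more than we want resident
path = "big_vector.dat"             # hypothetical scratch file on a fast storage tier

# Create the file-backed array once; pages are written through to storage.
x = np.memmap(path, dtype=np.float64, mode="w+", shape=(N,))

# Stream over it in chunks so only a small working set is ever in DRAM; a DSM
# like MegaMmap would additionally prefetch/evict across nodes and tiers.
chunk = 1_000_000
total = 0.0
for start in range(0, N, chunk):
    block = x[start:start + chunk]
    block[:] = np.arange(start, start + block.size)   # touch/initialize these pages
    total += float(block.sum())
    x.flush()                                          # write back dirty pages so the OS can evict them

print(f"checksum over {N} elements with a ~{chunk * 8 / 1e6:.0f} MB working set:", total)
```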
ISBN:
(Digital) 9798350352917
ISBN:
(Print) 9798350352924; 9798350352917
Multiplying two sparse matrices (SpGEMM) is a common computational primitive used in many areas including graph algorithms, bioinformatics, algebraic multigrid solvers, and randomized sketching. Distributed-memory parallel algorithms for SpGEMM have mainly focused on sparsity-oblivious approaches that use 2D and 3D partitioning. Sparsity-aware 1D algorithms can theoretically reduce communication by not fetching nonzeros of the sparse matrices that do not participate in the multiplication. Here, we present a distributed-memory 1D SpGEMM algorithm and implementation. It uses MPI RDMA operations to mitigate the cost of packing/unpacking submatrices for communication, and it uses a block fetching strategy to avoid excessive fine-grained messaging. Our results show that our 1D implementation outperforms state-of-the-art 2D and 3D implementations within CombBLAS for many configurations, inputs, and use cases, while remaining conceptually simpler.
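A minimal, single-process sketch of the sparsity-aware 1D idea, assuming SciPy CSR matrices: each simulated rank owns a block of rows of A and "fetches" only the rows of B whose indices appear among its local nonzero columns. The RDMA block-fetching machinery of the actual implementation is not modeled.

```python
# Sparsity-aware 1D SpGEMM partitioning, simulated serially: a rank never
# touches rows of B that cannot contribute to its local rows of C = A @ B.
import numpy as np
import scipy.sparse as sp

n, nprocs = 1000, 4
A = sp.random(n, n, density=0.005, format="csr", random_state=1)
B = sp.random(n, n, density=0.005, format="csr", random_state=2)

rows_per = n // nprocs
C_blocks = []
for rank in range(nprocs):
    lo = rank * rows_per
    hi = (rank + 1) * rows_per if rank < nprocs - 1 else n
    A_local = A[lo:hi]                               # locally owned row block of A
    needed = np.unique(A_local.indices)              # rows of B that actually participate
    B_needed = B[needed]                             # would be RDMA gets from owner ranks
    # Remap A_local's column indices into the compacted set of fetched B rows.
    remap = -np.ones(n, dtype=np.int64)
    remap[needed] = np.arange(needed.size)
    A_compact = sp.csr_matrix(
        (A_local.data, remap[A_local.indices], A_local.indptr),
        shape=(A_local.shape[0], needed.size),
    )
    C_blocks.append(A_compact @ B_needed)            # local multiply of the row block

C = sp.vstack(C_blocks)
assert abs(C - A @ B).sum() < 1e-9                   # matches the direct product
```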
ISBN:
(Digital) 9798350352917
ISBN:
(Print) 9798350352924; 9798350352917
Deep Learning Recommendation Models (DLRMs) are pivotal in various sectors, yet they are hindered by the high memory demands of embedding tables and the significant communication overhead in distributed training environments. Traditional approaches, like Tensor-Train (TT) decomposition, although effective for compressing these tables, introduce substantial computational burdens. Furthermore, existing frameworks for distributed training are inadequate due to the excessive data exchange requirements. This paper proposes EcoRec, an advanced library designed to expedite the training of DLRMs through a synergistic integration of TT decomposition technology and distributed training. EcoRec introduces a novel computation pattern that eliminates redundancy in TT operations, alongside an efficient multiplication pathway, significantly reducing computation time. Additionally, it provides a unique micro-batching technique with sorted indices to decrease memory demands without additional computational costs. EcoRec also features a novel pipeline training system for embedding layers, ensuring balanced data distribution and enhanced communication efficiency. EcoRec, built on PyTorch and CUDA, has been evaluated on a 32-GPU cluster. The results show EcoRec significantly outperforms the existing ELRec system, achieving up to a 3.1x speedup and a 38.5% reduction in memory requirements. EcoRec marks a notable advancement in high-performance DLRM training.
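To make the TT-compressed embedding idea concrete, the sketch below reconstructs a single embedding row from three small TT cores in numpy. The mode sizes, ranks, and the function name tt_embedding_row are made up for illustration and do not reflect EcoRec's kernels, micro-batching, or pipeline.

```python
# Tensor-Train (TT) compressed embedding lookup sketch: a table of shape
# (I1*I2*I3, J1*J2*J3) is stored as three small cores, and one row is
# reconstructed on demand from one slice of each core.
import numpy as np

rng = np.random.default_rng(0)
I = (8, 10, 12)        # row-index factorization: vocabulary = 8*10*12 = 960
J = (4, 4, 4)          # embedding-dim factorization: dim = 64
R = (1, 16, 16, 1)     # TT ranks

# Core k has shape (R[k], I[k], J[k], R[k+1]); far fewer parameters than 960*64.
cores = [rng.standard_normal((R[k], I[k], J[k], R[k + 1])) * 0.1 for k in range(3)]

def tt_embedding_row(row):
    """Reconstruct one embedding row (length prod(J)) from the TT cores."""
    # Decompose the flat row index into per-mode indices (i1, i2, i3).
    idx = []
    for ik in reversed(I):
        idx.append(row % ik)
        row //= ik
    idx = idx[::-1]
    # Contract the selected slices, accumulating the embedding-dimension modes.
    out = cores[0][:, idx[0], :, :]                  # (1, J1, R1)
    for k in range(1, 3):
        slice_k = cores[k][:, idx[k], :, :]          # (Rk, Jk, Rk+1)
        out = np.einsum("ajb,bkc->ajkc", out, slice_k).reshape(1, -1, R[k + 1])
    return out.reshape(-1)                           # length J1*J2*J3 = 64

print(tt_embedding_row(123).shape)                   # (64,)
```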
ISBN:
(Digital) 9798350352917
ISBN:
(Print) 9798350352924; 9798350352917
Influence maximization (IM) is the problem of finding the k most influential nodes in a graph. We propose distributed-memory parallel algorithms for the two main kernels of a state-of-the-art implementation of one IM algorithm, influence maximization via martingales (IMM). The baseline relies on a bulk-synchronous parallel approach and uses replication to reduce communication and achieve approximate load balance, at the cost of synchronization and high memory requirements. By contrast, our method fully distributes the data, thereby improving memory scalability, and uses fine-grained asynchronous parallelism to improve network utilization, at the cost of doing more communication. We show our design and implementation can achieve up to 29.6x speedup over the MPI-based state of the art on synthetic and real-world network graphs. Moreover, ours is the first implementation that can run IMM to find influencers in the 'twitter' graph (41M nodes and 1.4B edges) in 200 seconds using 8K CPU cores of the NERSC Perlmutter supercomputer.
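The toy sketch below illustrates the reverse-influence-sampling core that IMM builds on: sample reverse-reachable (RR) sets under an independent-cascade model, then greedily pick the k nodes covering the most sets. It is serial and unoptimized; the paper's contribution lies in distributing and overlapping exactly these two kernels, which this sketch does not attempt.

```python
# Serial toy version of the two IMM kernels: RR-set sampling and greedy max cover.
import random
from collections import defaultdict

def sample_rr_set(n, in_neighbors, p=0.1):
    """One RR set: nodes that would influence a random root under live-edge sampling."""
    root = random.randrange(n)
    rr, frontier = {root}, [root]
    while frontier:
        v = frontier.pop()
        for u in in_neighbors.get(v, []):
            if u not in rr and random.random() < p:   # edge (u, v) is "live"
                rr.add(u)
                frontier.append(u)
    return rr

def imm_greedy(n, in_neighbors, k=2, num_rr=20000):
    rr_sets = [sample_rr_set(n, in_neighbors) for _ in range(num_rr)]
    covers = defaultdict(set)                         # node -> indices of RR sets it covers
    for i, rr in enumerate(rr_sets):
        for u in rr:
            covers[u].add(i)
    seeds, covered = [], set()
    for _ in range(k):                                # greedy max coverage
        best = max(covers, key=lambda u: len(covers[u] - covered))
        seeds.append(best)
        covered |= covers[best]
    return seeds

# Toy directed graph given as in-neighbor lists.
in_neighbors = {1: [0], 2: [0, 1], 3: [2], 4: [2, 3]}
print(imm_greedy(n=5, in_neighbors=in_neighbors, k=2))
```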
Large-scale Computational Fluid Dynamics (CFD) simulations are typical HPC applications that require both high memory bandwidth and large memory capacity. However, it is difficult to achieve high performance for such ...
In this paper, we study the partitioning of a context-aware shared memory data structure so that it can be implemented as a distributed data structure running on multiple machines. By context-aware data structures, we...
ISBN:
(Print) 9798350372977; 9798350372984
Spiking Neural Networks (SNNs) have recently been used as a computational model for applications such as deep learning, image recognition and machine learning. Similar to the biological brain, SNN neurons depend on the membrane level to fire an output. If the level exceeds a specified threshold, the neuron sends an output to activate the next neurons. This leads to an unbalanced workload among the neurons. The dynamically changing membrane level is stored inside a neuron. In hardware, this storage can be implemented as a register or on-chip memory, which determines the amount of consumed resources and, in turn, affects the network scalability. SNN accelerators have recently been implemented on UltraScale FPGA devices for high-performance purposes. On-chip memories on these devices are classified as distributed memory, Block RAMs (BRAMs) and Ultra RAMs (URAMs). In this paper, we explored the impact of using different on-chip memories to store the membrane level of SNN neurons. We implemented a parameterizable SpIking Neural networK (SINK) accelerator where the network capacity and weight width are parameters. SINK has the ability to run in four different modes based on the memory type. We ran SINK on a Zynq UltraScale+ ZCU104 FPGA device and measured the utilization of hardware resources (LUTs), registers, memory, power consumption and performance. The results show that URAM can be the best fit to store the membrane level, since it uses 30%, 11% and 2% fewer LUTs, registers and power, respectively, compared with BRAM and distributed memory.
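A software analogue of the membrane-level state discussed above, assuming a simple leaky integrate-and-fire model: each neuron accumulates weighted input spikes into a stored potential and fires when it crosses a threshold. In SINK this per-neuron state is what would be placed in distributed memory, BRAM, or URAM; here it is just a numpy array, and all constants are arbitrary.

```python
# Leaky integrate-and-fire layer sketch: the "membrane" array plays the role of
# the per-neuron register/BRAM/URAM storage discussed in the abstract.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, T = 64, 16, 100
W = rng.uniform(0, 0.2, size=(n_out, n_in))   # synaptic weights
threshold, leak = 1.0, 0.9

membrane = np.zeros(n_out)                    # stored membrane level per neuron
spike_counts = np.zeros(n_out, dtype=int)

for t in range(T):
    in_spikes = (rng.random(n_in) < 0.05).astype(float)   # random input spike train
    membrane = leak * membrane + W @ in_spikes             # integrate with leak
    fired = membrane >= threshold                          # threshold comparison
    spike_counts += fired
    membrane[fired] = 0.0                                  # reset neurons that fired

print("output spikes per neuron:", spike_counts)
```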
We develop a distributed-memory parallel algorithm for performing batch updates on streaming graphs, where vertices and edges are continuously added or removed. Our algorithm leverages distributed sparse matrices as t...
ISBN:
(Print) 9798350339864
A distributed persistent key-value store (KVS) plays an important role in today's storage infrastructure. The development of persistent memory (PM) and remote direct memory access (RDMA) makes it possible to build distributed persistent KVSs that provide fast data access. However, prior works focus on either PM-oriented or RDMA-oriented optimizations for key-value stores. We find these optimizations disallow a simple porting of an RDMA-enabled KVS to PM or vice versa. This paper proposes FastStore, a high-performance distributed persistent KVS that fully exploits RDMA features and PM-friendly optimizations. First, FastStore utilizes RDMA-enabled PM exposure to establish direct indexing at the client side to reduce RTTs for reading values. Meanwhile, PM exposure allows PM sharing among cluster nodes, which helps to mitigate attribute-value skewness. Then, FastStore designs a PM-friendly ownership-transferring log and a failure-atomic slotted-page allocator to achieve highly efficient PM management without PM leakage. Finally, FastStore proposes a volatile search key for its B+-tree indexing to reduce excessive PM accesses. We implement FastStore, and the evaluation shows that FastStore outperforms the state-of-the-art ordered KVS Sherman with 2.8x higher throughput and 71.5% fewer RTTs.
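The small simulation below is meant only to show why client-side direct indexing saves round trips: once the client caches an index entry it read directly from the server, a repeated GET needs a single remote read instead of an index lookup plus a value read. The classes RemoteKVS and Client are hypothetical stand-ins; no RDMA, PM, or FastStore code is involved.

```python
# Count "remote reads" with and without a locally cached index entry.
class RemoteKVS:
    """Stand-in for a server's persistent memory: an index plus a value log."""
    def __init__(self):
        self.value_log = []        # append-only "PM" region holding values
        self.index = {}            # key -> offset into value_log
        self.remote_reads = 0      # one-sided reads issued by the client

    def put(self, key, value):
        self.index[key] = len(self.value_log)
        self.value_log.append(value)

    def read_index_entry(self, key):       # costs one remote read
        self.remote_reads += 1
        return self.index.get(key)

    def read_value(self, offset):          # costs one remote read
        self.remote_reads += 1
        return self.value_log[offset]

class Client:
    """Caches index entries locally so repeated GETs need a single remote read."""
    def __init__(self, server):
        self.server = server
        self.index_cache = {}

    def get(self, key):
        if key not in self.index_cache:                       # cold miss: 2 remote reads
            self.index_cache[key] = self.server.read_index_entry(key)
        return self.server.read_value(self.index_cache[key])  # warm hit: 1 remote read

server = RemoteKVS()
server.put("user:42", b"alice")
client = Client(server)
client.get("user:42"); client.get("user:42")
print("remote reads for two GETs:", server.remote_reads)      # 3 instead of 4
```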