Distributed deep learning systems commonly use synchronous data parallelism to train models. However, communication overhead can be costly in distributed environments with limited communication bandwidth. To reduce co...
ISBN: (Print) 9798350395679; 9798350395662
Federated Learning (FL) is an emerging machine learning paradigm that enables the collaborative training of a shared global model across distributed clients while keeping the data decentralized. Recent works on designing systems for efficient FL have shown that utilizing serverless computing technologies, particularly Function-as-a-Service (FaaS) for FL, can enhance resource efficiency, reduce training costs, and alleviate the complex infrastructure management burden on data holders. However, current serverless FL systems still suffer from the presence of stragglers, i.e., slow clients that impede the collaborative training process. While strategies aimed at mitigating stragglers in these systems have been proposed, they overlook the diverse hardware resource configurations among FL clients. To this end, we present Apodotiko, a novel asynchronous training strategy designed for serverless FL. Our strategy incorporates a scoring mechanism that evaluates each client's hardware capacity and dataset size to intelligently prioritize and select clients for each training round, thereby minimizing the effects of stragglers on system performance. We comprehensively evaluate Apodotiko across diverse datasets, considering a mix of CPU and GPU clients, and compare its performance against five other FL training strategies. Results from our experiments demonstrate that Apodotiko outperforms other FL training strategies, achieving an average speedup of 2.75x and a maximum speedup of 7.03x. Furthermore, our strategy significantly reduces cold starts by a factor of four on average, demonstrating suitability in serverless environments.
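The scoring mechanism described above combines each client's hardware capacity and dataset size into a single priority for round selection. A minimal Python sketch of that idea follows; the linear weighting, the normalization by the fleet maximum, and the field names (flops, dataset_size) are illustrative assumptions, not Apodotiko's actual formula.

def score_client(flops, dataset_size, max_flops, max_size, w_compute=0.5, w_data=0.5):
    # Illustrative score: faster hardware and more local data both raise
    # a client's chance of being selected for the next round.
    return w_compute * (flops / max_flops) + w_data * (dataset_size / max_size)

def select_clients(clients, m):
    # Rank all registered clients by score and pick the m best for the round.
    max_flops = max(c["flops"] for c in clients)
    max_size = max(c["dataset_size"] for c in clients)
    ranked = sorted(
        clients,
        key=lambda c: score_client(c["flops"], c["dataset_size"], max_flops, max_size),
        reverse=True,
    )
    return ranked[:m]

clients = [
    {"id": "gpu-a", "flops": 30e12, "dataset_size": 5_000},
    {"id": "cpu-b", "flops": 1e12, "dataset_size": 20_000},
    {"id": "cpu-c", "flops": 0.5e12, "dataset_size": 2_000},
]
print([c["id"] for c in select_clients(clients, 2)])  # ['gpu-a', 'cpu-b']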
ISBN: (Print) 9781665481069
While hierarchically low-rank compression methods are now commonly available in both dense and sparse direct solvers, their use for the direct solution of coupled sparse/dense linear systems has been little investigated. The solution of such systems is nonetheless central to the simulation of many important physics problems, such as the propagation of acoustic waves around aircraft. Indeed, the heterogeneity of the jet flow created by the reactors often requires a Finite Element Method (FEM) discretization, leading to a sparse linear system, while the rest of the space may reasonably be assumed homogeneous and hence modeled with a Boundary Element Method (BEM) discretization, leading to a dense system. In an industrial context, these simulations are often run on modern multicore workstations with fully-featured linear solvers. Exploiting their low-rank compression techniques is thus very appealing for solving larger coupled sparse/dense systems (and hence obtaining a finer solution) on a given multicore workstation, and, ideally, doing so fast. The standard method for efficiently coupling sparse and dense direct solvers is to rely on the Schur complement functionality of the sparse direct solver. However, to the best of our knowledge, modern fully-featured sparse direct solvers offering this functionality return the Schur complement as a non-compressed matrix. In this paper, we study the opportunity to process larger systems in spite of this constraint. To that end, we propose two classes of algorithms, namely multi-solve and multi-factorization, which compose existing parallel sparse and dense methods on well-chosen submatrices. An experimental study conducted on a 24-core machine equipped with 128 GiB of RAM shows that these algorithms, implemented on top of state-of-the-art sparse and dense direct solvers, together with proper low-rank assembly schemes, can respectively process systems of 9 million and 2.5 million to
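The coupling baseline mentioned in the abstract, i.e. obtaining the Schur complement of the sparse (FEM) block from a sparse direct solver and handing the resulting dense complement to a dense solver, can be sketched in a few lines of Python with SciPy. Everything here is a toy stand-in: SciPy's splu and numpy.linalg.solve replace fully-featured sparse/dense solvers, the matrices are tiny and randomly coupled, and neither the low-rank assembly schemes nor the multi-solve/multi-factorization decompositions of the paper are reproduced.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Toy coupled system [[A_ss, A_sd], [A_ds, A_dd]] [x_s; x_d] = [b_s; b_d]:
# A_ss is sparse (FEM-like), A_dd is dense (BEM-like), off-diagonal blocks couple them.
n_s, n_d = 200, 50
rng = np.random.default_rng(0)
A_ss = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n_s, n_s), format="csc")
A_sd = 0.01 * rng.standard_normal((n_s, n_d))
A_ds = 0.01 * rng.standard_normal((n_d, n_s))
A_dd = np.eye(n_d) + 0.01 * rng.standard_normal((n_d, n_d))
b_s, b_d = np.ones(n_s), np.ones(n_d)

# Standard Schur complement coupling of a sparse and a dense direct solver.
lu = spla.splu(A_ss)                           # factorize the sparse block
S = A_dd - A_ds @ lu.solve(A_sd)               # dense, non-compressed Schur complement
x_d = np.linalg.solve(S, b_d - A_ds @ lu.solve(b_s))   # dense solve on the complement
x_s = lu.solve(b_s - A_sd @ x_d)               # back-substitute for the sparse unknowns

residual = np.linalg.norm(np.concatenate(
    [A_ss @ x_s + A_sd @ x_d - b_s, A_ds @ x_s + A_dd @ x_d - b_d]))
print("coupled residual:", residual)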
ISBN: (Print) 9781665481069
Finding patterns in large highly connected datasets is critical for value discovery in business development and scientific research. This work focuses on the problem of subgraph matching on streaming graphs, which provides utility in a myriad of real-world applications ranging from social network analysis to cybersecurity. Each application poses a different set of control parameters, including the restrictions for a match, type of data stream, and search granularity. The problem-driven design of existing subgraph matching systems makes them challenging to apply for different problem domains. This paper presents Mnemonic, a programmable system that provides a high-level API and democratizes the development of a wide variety of subgraph matching solutions. Importantly, Mnemonic also delivers key data management capabilities and optimizations to support real-time processing on long-running, high-velocity multi-relational graph streams. The experiments demonstrate the versatility of Mnemonic, as it outperforms several state-of-the-art systems by up to two orders of magnitude.
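Mnemonic's own API is not shown in the abstract, so the snippet below only illustrates the underlying problem it targets: incremental subgraph matching on an edge stream, here with a toy triangle pattern where each arriving edge is checked against previously seen edges to see whether it completes a match. This is a generic Python sketch, not Mnemonic's interface, data management layer, or optimizations.

from collections import defaultdict

def stream_triangle_matches(edge_stream):
    # Incremental matching of a triangle pattern on a streaming graph:
    # a new edge (u, v) completes a triangle with every common neighbor w.
    adj = defaultdict(set)
    for u, v in edge_stream:
        for w in adj[u] & adj[v]:
            yield tuple(sorted((u, v, w)))
        adj[u].add(v)
        adj[v].add(u)

edges = [(1, 2), (2, 3), (1, 3), (3, 4), (1, 4)]
print(list(stream_triangle_matches(edges)))  # [(1, 2, 3), (1, 3, 4)]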
ISBN: (Print) 9781665497473
Computer simulations of physical phenomena, such as heat transfer, often require the solution of linear equations. These linear equations occur in the form Ax = b, where A is a matrix, b is a vector, and x is the vector of unknowns. Iterative methods are the best adapted to solving large linear systems because they can be easily parallelized. This paper presents a variant of the multisplitting iterative method with convergence acceleration using a Krylov-based minimization method. It particularly focuses on improving the convergence speed of the method with an implementation based on the PETSc (Portable, Extensible Toolkit for Scientific Computation) library. This is achieved by reducing the need for synchronization (data exchange) during the minimization process and by adding a preconditioner before the multisplitting method. All experiments were performed over one or two sites of the Grid5000 platform, using up to 128 cores. The results for solving a 2D Laplacian problem with 1024^2 unknowns show speedups of up to 23x and 86x compared, respectively, to the algorithm in [8] and to the general multisplitting implementation.
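As a rough illustration of the multisplitting idea alone (without the Krylov-based minimization, preconditioning, or PETSc implementation described above), the Python sketch below applies a block-Jacobi-style multisplitting to a small 1D Laplacian: each block solves its own diagonal subsystem exactly while taking the coupling terms from the previous global iterate. The 1D operator, block layout, and iteration count are arbitrary choices for the example.

import numpy as np

def laplacian_1d(n):
    # 1D Laplacian stencil (-1, 2, -1); a dense stand-in for the sparse 2D operator.
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def multisplitting(A, b, blocks, iters=200):
    # Block-Jacobi-style multisplitting: every block solves its diagonal
    # subsystem exactly, using the previous global iterate for the coupling
    # terms; the block solutions are then merged into the next iterate.
    x = np.zeros(len(b))
    for _ in range(iters):
        x_new = x.copy()
        for idx in blocks:
            rest = np.setdiff1d(np.arange(len(b)), idx)
            rhs = b[idx] - A[np.ix_(idx, rest)] @ x[rest]
            x_new[idx] = np.linalg.solve(A[np.ix_(idx, idx)], rhs)
        x = x_new
    return x

n = 64
A, b = laplacian_1d(n), np.ones(n)
blocks = [np.arange(0, n // 2), np.arange(n // 2, n)]   # two subdomains
x = multisplitting(A, b, blocks)
print("residual norm:", np.linalg.norm(A @ x - b))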
Software development of high-performance graph algorithms is difficult on modern parallel computers. To simplify this task, we have designed and implemented a collection of C++ graph primitives, basic building blocks,...
ISBN: (Print) 9781665481069
Edge computing has drawn a lot of recent research interest because of the performance improvements obtained by offloading many workloads from the remote data center to nearby edge nodes. Nonetheless, one open challenge of this emerging paradigm lies in the potential security issues on edge nodes. This paper proposes a cooperative protocol, namely DEAN, equipped with a unique resource-efficient quorum-building mechanism to seamlessly adopt blockchain in an edge computing infrastructure, preventing data manipulation and allowing fair data sharing with quick recovery under the resource constraints of limited storage, computing, and network capacity. Specifically, DEAN leverages a parallel mechanism with three independent core components, effectively achieving low resource consumption while allowing secured parallel block processing on edge nodes. We have implemented a system prototype based on DEAN and experimentally verified its effectiveness in comparison with four popular blockchain implementations: Ethereum, Parity, IOTA, and Hyperledger Fabric. Experimental results show that the system prototype exhibits high resilience to arbitrary failures. Performance-wise, the DEAN-based blockchain implementation outperforms state-of-the-art blockchain systems with up to 88.6x higher throughput and 26x lower latency.
ISBN: (Digital) 9798350364606; (Print) 9798350364613
Data stores utilized in modern data-intensive applications are expected to demonstrate rapid read and write capabilities and robust fault tolerance. A Byzantine fault-tolerant database (BFT database) can execute transactions concurrently and tolerate arbitrary (Byzantine) faults. We identify cryptographic and communication processing as performance bottlenecks in the transaction processing of BFT databases. This paper presents a transaction reconstruction method that reconstructs a single transaction from multiple transactions to streamline cryptographic and communication processing. We evaluated the proposed method against Basil (a state-of-the-art BFT database) in experiments. In an environment where nodes are geographically centralized, the proposed method achieves up to approximately 2.5 times higher throughput and reduces latency by up to about 30% compared to vanilla Basil. In an environment where nodes are geographically distributed, it achieves up to approximately 50 times higher throughput and reduces latency by up to about 75% compared to vanilla Basil.
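The abstract states only that several transactions are reconstructed into a single one to streamline cryptographic and communication processing; the exact reconstruction rules are not given. Purely as an illustration of that general idea, the Python sketch below folds non-conflicting read/write sets into one batched transaction (which could then be signed and replicated once) and defers conflicting ones; the Txn structure and the conflict rule are assumptions, not the paper's method.

from dataclasses import dataclass, field

@dataclass
class Txn:
    reads: dict = field(default_factory=dict)    # key -> observed version
    writes: dict = field(default_factory=dict)   # key -> new value

def reconstruct(txns):
    # Fold non-conflicting transactions into one batch so a single signature
    # and a single replica round-trip cover all of them; conflicting ones
    # (touching a key the batch already writes) wait for the next batch.
    merged, deferred = Txn(), []
    for t in txns:
        if merged.writes.keys() & (t.reads.keys() | t.writes.keys()):
            deferred.append(t)
            continue
        merged.reads.update(t.reads)
        merged.writes.update(t.writes)
    return merged, deferred

batch = [
    Txn(reads={"a": 3}, writes={"b": 10}),
    Txn(reads={"c": 1}, writes={"d": 7}),
    Txn(reads={"b": 9}, writes={"e": 2}),   # reads a key the batch writes
]
merged, deferred = reconstruct(batch)
print(merged.writes, len(deferred))          # {'b': 10, 'd': 7} 1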
ISBN: (Print) 9798400704130
Data-parallel deep neural network (DNN) training systems deployed across nodes have been widely used in various domains, while system performance is often bottlenecked by the communication overhead among workers for synchronizing gradients. Top-k sparsification compression is the de facto approach to alleviating the communication bottleneck, which truncates the gradient to its largest k elements before sending it to other nodes. However, we observe that the traditional Top-k still has performance issues: i) the gradient at each layer of a DNN is typically represented as a tensor of multiple dimensions, and the largest k elements selected by the traditional Top-k are concentrated in only some of the dimensions, so training may miss many dimensions (we call this dimension missing), which leads to poor convergence performance; ii) the traditional Top-k performs the selection by globally sorting the gradient elements in each layer (we call this single global sorting), which leads to low GPU core parallelism and hence low training throughput. In this paper, we propose an all-dimension Top-k sparsification scheme, called ADTopk, which selects the largest k elements from all dimensions of the gradient tensor in each layer, meaning that each dimension must provide some elements, so as to avoid dimension missing. Further, ADTopk lets each dimension sort locally within its own elements, so all dimensions can perform multiple local sortings independently and in parallel, instead of a single global sorting over the entire gradient tensor in each layer. On top of ADTopk, we further propose an interleaving compression scheme and an efficient threshold estimation algorithm to enhance the performance of ADTopk. We build a sparsification compression data-parallel DNN training framework and implement a compression library containing state-of-the-art sparsification algorithms. Experiments on a local cluster and Alibaba Cloud show that compa
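The Python sketch below contrasts traditional global Top-k with an all-dimension selection in the spirit of ADTopk on a 2D gradient tensor: the budget k is split across rows so that every row contributes its own largest-magnitude entries via an independent local selection instead of one global sort. The even per-row budget and treating rows as the "dimensions" are simplifying assumptions, not ADTopk's exact scheme, interleaving, or threshold estimation.

import numpy as np

def global_topk(grad, k):
    # Traditional Top-k: keep the k largest-magnitude entries of the whole
    # flattened gradient tensor; everything else is zeroed out.
    flat = grad.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    out = np.zeros_like(flat)
    out[idx] = flat[idx]
    return out.reshape(grad.shape)

def all_dimension_topk(grad, k):
    # All-dimension selection: split k evenly across rows and run an
    # independent local selection per row, so no row is missed entirely.
    per_row = max(1, k // grad.shape[0])
    out = np.zeros_like(grad)
    for r in range(grad.shape[0]):
        idx = np.argpartition(np.abs(grad[r]), -per_row)[-per_row:]
        out[r, idx] = grad[r, idx]
    return out

rng = np.random.default_rng(0)
g = rng.standard_normal((8, 256))
g[0] *= 10.0                                 # one dominant row
kept_rows = lambda t: int((np.abs(t).sum(axis=1) > 0).sum())
print(kept_rows(global_topk(g, 64)))         # global Top-k collapses onto the dominant row
print(kept_rows(all_dimension_topk(g, 64)))  # all-dimension keeps entries in every row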