ISBN (print): 9781538682098
Private information retrieval (PIR) protocols allow a user to retrieve entries of a database without revealing the index of the desired item. Information-theoretic privacy can be achieved by using several servers together with specific retrieval algorithms. In this paper, we investigate the problem of PIR in erasure-coded distributed storage systems and construct PIR protocols with optimal computational complexity for the servers, reasonable communication complexity, and low storage overhead. The proposed constructions also enjoy the advantages of low encoding complexity and a low memory requirement for the user to store all possible queries. More specifically, we concentrate on the use of circulant permutation matrices or the zero matrix to construct PIR protocols for non-communicating servers.
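As background intuition, the sketch below shows the classic replication-based two-server XOR scheme: each server's query is individually uniform random, yet the XOR of the two answers recovers the desired record. This is not the circulant-permutation-matrix construction of the paper; the toy database, function names, and single-bit records are illustrative assumptions.

```python
# Toy 2-server information-theoretic PIR sketch (NOT the paper's
# circulant-matrix construction; just the classic XOR scheme for intuition).
# Assumptions: a database of n single-bit records replicated on two
# non-communicating servers; the user wants record i without revealing i.
import secrets

def make_queries(n, i):
    """Query for server 0 is a uniformly random bit mask; query for
    server 1 is the same mask with bit i flipped."""
    q0 = [secrets.randbits(1) for _ in range(n)]
    q1 = q0.copy()
    q1[i] ^= 1
    return q0, q1

def server_answer(db, query):
    """Each server XORs together the records selected by the query mask."""
    acc = 0
    for bit, selected in zip(db, query):
        if selected:
            acc ^= bit
    return acc

db = [1, 0, 1, 1, 0, 0, 1, 0]   # toy database
i = 5                            # desired index
q0, q1 = make_queries(len(db), i)
a0, a1 = server_answer(db, q0), server_answer(db, q1)
assert (a0 ^ a1) == db[i]        # XOR of the two answers recovers db[i]
```

Each query on its own is uniformly distributed, so neither server learns i; only their combination is informative, which is the core idea that multi-server PIR constructions refine.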
ISBN (print): 9781665435772
Social media, e-commerce, streaming video, e-mail, cloud documents, web pages, traffic flows, and network packets fill vast digital lakes, rivers, and oceans that we each navigate daily. This digital hyperspace is an amorphous flow of data supported by continuous streams that stretch standard concepts of type and dimension. The unstructured data of digital hyperspace can be elegantly represented, traversed, and transformed via the mathematics of hypergraphs, hypersparse matrices, and associative array algebra. This paper explores a novel mathematical concept, the semilink, that combines pairs of semirings to provide the essential operations for graph analytics, database operations, and machine learning. The GraphBLAS standard currently supports hypergraphs, hypersparse matrices, and the mathematics required for semilinks, and seamlessly performs graph, network, and matrix operations. With the addition of key-based indices (such as pointers to strings) and semilinks, GraphBLAS can become a richer associative array algebra and a plug-in replacement for spreadsheets, database tables, and data-centric operating systems, enhancing the navigation of unstructured data found in digital hyperspace.
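To make the semiring-driven associative-array idea concrete, the sketch below runs a sparse matrix product over a user-chosen (add, mul) semiring on dict-of-dict "associative arrays" with string keys. It illustrates only the single-semiring GraphBLAS pattern; the paper's semilink pairing of two semirings is not implemented, and the tiny graph and function name are illustrative assumptions.

```python
# Minimal sketch of an associative-array matrix product over a user-chosen
# semiring, in the spirit of GraphBLAS. The "semilink" pairing of two
# semirings is the paper's concept and is NOT implemented here.

def semiring_matmul(A, B, add, mul):
    """A, B: dict-of-dict sparse associative arrays {row: {col: value}}."""
    C = {}
    for i, row in A.items():
        for k, a in row.items():
            for j, b in B.get(k, {}).items():
                prev = C.setdefault(i, {}).get(j)
                C[i][j] = mul(a, b) if prev is None else add(prev, mul(a, b))
    return C

# Tiny weighted graph with string (key-based) indices.
A = {"gw": {"db": 1, "web": 2}, "db": {"cache": 3}, "web": {"cache": 1}}

# Plus-times semiring: ordinary (weighted) matrix multiply.
plus_times = semiring_matmul(A, A, add=lambda x, y: x + y, mul=lambda x, y: x * y)
# Min-plus semiring: one relaxation step of all-pairs shortest paths.
min_plus = semiring_matmul(A, A, add=min, mul=lambda x, y: x + y)

print(plus_times)   # {'gw': {'cache': 5}}
print(min_plus)     # {'gw': {'cache': 3}}
```

Swapping only the add/mul operators turns the same traversal into path counting, shortest paths, reachability, and similar analytics, which is why a semiring-parameterized algebra covers graph, database, and machine-learning workloads.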
We present a new strategy for automatically exploring the design space of key CUDA + MPI programs and providing design rules that discriminate slow from fast implementations. In such programs, the order of operations ...
Locally Checkable Labeling (LCL) problems are graph problems in which a solution is correct if it satisfies some given constraints in the local neighborhood of each node. Example problems in this class include maximal...
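To make the "locally checkable" definition concrete, the sketch below verifies a maximal-independent-set labeling by inspecting only each node's radius-1 neighborhood; the graph, labels, and function name are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of local checkability: a 0/1 labeling is a maximal
# independent set iff every node's constraint holds within its own
# radius-1 neighborhood, so validity can be checked purely locally.
def locally_valid_mis(adj, label):
    """adj: {node: set(neighbors)}, label: {node: 0 or 1}."""
    for v, nbrs in adj.items():
        if label[v] == 1 and any(label[u] == 1 for u in nbrs):
            return False          # independence violated at v
        if label[v] == 0 and all(label[u] == 0 for u in nbrs):
            return False          # maximality violated at v
    return True

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(locally_valid_mis(adj, {0: 1, 1: 0, 2: 0, 3: 1}))   # True
```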
ISBN (digital): 9781665462723
ISBN (print): 9781665462723
Bulk bitwise operations, i.e., bitwise operations on large bit vectors, are prevalent in a wide range of important application domains, including databases, graph processing, genome analysis, cryptography, and hyper-dimensional computing. In conventional systems, the performance and energy efficiency of bulk bitwise operations are bottlenecked by data movement between the compute units (e.g., CPUs and GPUs) and the memory hierarchy. In-flash processing (i.e., processing data inside NAND flash chips) has a high potential to accelerate bulk bitwise operations by fundamentally reducing data movement through the entire memory hierarchy, especially when the processed data does not fit into main memory. We identify two key limitations of the state-of-the-art in-flash processing technique for bulk bitwise operations: (i) it falls short of maximally exploiting the bit-level parallelism of bulk bitwise operations that could be enabled by leveraging the unique cell-array architecture and operating principles of NAND flash memory; (ii) it is unreliable because it is not designed to take into account the highly error-prone nature of NAND flash memory. We propose Flash-Cosmos (Flash Computation with One-Shot Multi-Operand Sensing), a new in-flash processing technique that significantly increases the performance and energy efficiency of bulk bitwise operations while providing high reliability. Flash-Cosmos introduces two key mechanisms that can be easily supported in modern NAND flash chips: (i) Multi-Wordline Sensing (MWS), which enables bulk bitwise operations on a large number of operands (tens of operands) with a single sensing operation, and (ii) Enhanced SLC-mode Programming (ESP), which enables reliable computation inside NAND flash memory. We demonstrate the feasibility of performing bulk bitwise operations with high reliability in Flash-Cosmos by testing 160 real 3D NAND flash chips. Our evaluation shows that Flash-Cosmos improves average performance and energy efficiency by...
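The sketch below is a pure-software analogy of the workload Flash-Cosmos targets: combining many long bit vectors (e.g., predicate bitmaps in a bitmap-index scan) with a single multi-operand AND/OR. It models nothing about NAND cells or sensing; Python integers stand in for bit vectors, and the vector length and operand count are illustrative assumptions.

```python
# Software analogy of multi-operand bulk bitwise AND/OR. In a conventional
# system every operand is streamed through the CPU and combined pairwise,
# so data movement dominates; in-flash processing would return only the
# final combined vector across the flash interface.
from functools import reduce
import random

BITS = 1 << 15                                   # toy bit-vector length
bitmaps = [random.getrandbits(BITS) for _ in range(16)]   # 16 operands

# Conjunctive predicate: rows matching ALL predicates (bitmap intersection).
and_result = reduce(lambda x, y: x & y, bitmaps)
# Disjunctive predicate: rows matching ANY predicate (bitmap union).
or_result = reduce(lambda x, y: x | y, bitmaps)

# Sanity check: every row in the intersection is also in the union.
assert and_result & or_result == and_result
```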
ISBN (print): 9781450387095
Deterministic databases offer several benefits: they ensure serializable execution while avoiding concurrency-control-related aborts, and they scale well in distributed environments. Today, most deterministic database designs use partitioning to scale up and avoid contention. However, partitioning requires significant programmer effort, leads to poor performance under skewed workloads, and incurs unnecessary overheads in certain uncontended workloads. We present the design of Caracal, a novel shared-memory, deterministic database that performs well under both skew and contention. Our deterministic scheme batches transactions in epochs and executes the transactions in an epoch in a predetermined order. Our scheme enables reducing contention by batching concurrency control operations. It also allows analyzing the transactions in the epoch to determine contended keys accurately. Certain transactions can then be split into independent contended and uncontended pieces and run deterministically and in parallel, further reducing contention. Based on these ideas, we present two novel optimizations, batch append and split-on-demand, for managing contention. With these optimizations, Caracal scales well and outperforms existing deterministic schemes in most workloads by 1.9x to 9.7x.
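The sketch below illustrates only the epoch-based skeleton described in the abstract: transactions collected in an epoch get a fixed order, the batch is analyzed to find contended keys, and execution follows the predetermined order. Caracal's actual batch-append and split-on-demand mechanisms are far more involved; the data layout and function names here are illustrative assumptions.

```python
# Toy sketch of epoch-based deterministic execution (not Caracal itself).
from collections import Counter

def run_epoch(txns, store):
    """txns: list of (name, writes) where writes is {key: value}."""
    # Phase 1: deterministic order = arrival order within the epoch.
    ordered = list(enumerate(txns))
    # Phase 2: analyze the whole batch to find contended keys.
    counts = Counter(k for _, (_, writes) in ordered for k in writes)
    contended = {k for k, c in counts.items() if c > 1}
    # Phase 3: execute in the predetermined order. A real system would run
    # uncontended pieces in parallel and serialize only contended pieces.
    for seq, (name, writes) in ordered:
        for k, v in writes.items():
            store[k] = v
    return contended

store = {}
hot = run_epoch([("t0", {"x": 1, "a": 7}), ("t1", {"x": 2}), ("t2", {"b": 3})], store)
print(store, hot)   # key "x" is contended; its final value comes from t1 (later in the order)
```

Because every replica applies the same epoch in the same order, the outcome is deterministic without runtime concurrency-control aborts, which is the property the optimizations build on.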
ISBN (digital): 9798350395662
ISBN (print): 9798350395679
Federated Learning (FL) is an emerging machine learning paradigm that enables the collaborative training of a shared global model across distributed clients while keeping the data decentralized. Recent works on designing systems for efficient FL have shown that utilizing serverless computing technologies, particularly Function-as-a-Service (FaaS) for FL, can enhance resource efficiency, reduce training costs, and alleviate the complex infrastructure management burden on data holders. However, current serverless FL systems still suffer from the presence of stragglers, i.e., slow clients that impede the collaborative training process. While strategies aimed at mitigating stragglers in these systems have been proposed, they overlook the diverse hardware resource configurations among FL clients. To address this, we present Apodotiko, a novel asynchronous training strategy designed for serverless FL. Our strategy incorporates a scoring mechanism that evaluates each client's hardware capacity and dataset size to intelligently prioritize and select clients for each training round, thereby minimizing the effects of stragglers on system performance. We comprehensively evaluate Apodotiko across diverse datasets, considering a mix of CPU and GPU clients, and compare its performance against five other FL training strategies. Results from our experiments demonstrate that Apodotiko outperforms the other FL training strategies, achieving an average speedup of 2.75x and a maximum speedup of 7.03x. Furthermore, our strategy significantly reduces cold starts by a factor of four on average, demonstrating its suitability for serverless environments.
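The sketch below shows one way a score-based client selection could look. The abstract only states that Apodotiko scores clients by hardware capacity and dataset size; the concrete formula (throughput multiplied by sample count) and the weighted sampling step are assumptions for illustration, not the paper's mechanism.

```python
# Illustrative sketch of score-based client selection for asynchronous
# serverless FL (scoring formula and field names are assumed, not from the paper).
import random

def score(client):
    # Faster hardware and more data => higher score (assumed combination).
    return client["throughput"] * client["num_samples"]

def select_clients(clients, k):
    """Pick k clients per round, weighted by score, so stragglers are chosen
    less often but not starved. Sampling with replacement keeps the sketch short."""
    weights = [score(c) for c in clients]
    return random.choices(clients, weights=weights, k=k)

clients = [
    {"id": "gpu-0", "throughput": 8.0, "num_samples": 5000},
    {"id": "cpu-0", "throughput": 1.0, "num_samples": 4000},
    {"id": "cpu-1", "throughput": 0.5, "num_samples": 800},   # likely straggler
]
print([c["id"] for c in select_clients(clients, k=2)])
```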
ISBN (digital): 9798350376388
ISBN (print): 9798350376395
The exponential growth of the training dataset and the size of the large language model (LLM) significantly outpaces the incremental memory capacity increase in graphics processing units (GPUs). Thousands of GPUs are needed to handle state-of-the-art models, which requires building an expensive AI GPU cluster that is out of reach for most researchers. This not only makes model training more costly but also increases its environmental impact. To improve the efficiency and scalability of existing infrastructure to handle increasingly demanding training tasks, Microsoft released DeepSpeed, an open-source optimization library for PyTorch that can easily be integrated into an existing training flow with minimal code changes. This paper presents a comprehensive third-party evaluation of DeepSpeed for training a GPT-2-like LLM on mainstream GPU clusters that are more accessible to everyone. The evaluation includes memory usage analysis and bandwidth characterization, in addition to the achieved model size and the attained compute throughput, to help compare horizontal and vertical scaling. First, we examine DeepSpeed ZeRO in single- and dual-node training against the popular distributed training libraries: PyTorch Distributed Data Parallel (DDP) with data parallelism and Megatron-LM with data and model parallelism. While DDP achieves higher throughput due to less communication, the model size is limited to a single GPU's memory capacity. In single-node training, Megatron-LM can fit a 4x larger model than DDP, while ZeRO can handle a model with 0.8x-1.2x the size of the Megatron-LM model. Both Megatron-LM and ZeRO are reasonably competitive in terms of throughput. However, in dual-node training, Megatron-LM sees a significant drop in throughput due to excessive inter-node communication, achieving only 25%-30% of the throughput offered by ZeRO. Secondly, we evaluate ZeRO-Offload to consolidate multi-node training into a single node. With CPU offloading, ZeRO-Offload...
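For readers who want to see what "integrated with minimal code changes" looks like, the sketch below enables ZeRO (and CPU offloading of optimizer state, the ZeRO-Offload setting) for an existing PyTorch model through DeepSpeed's documented config schema. The stand-in model, batch size, and learning rate are assumptions; verify the config keys against your DeepSpeed version.

```python
# Minimal sketch of enabling DeepSpeed ZeRO / ZeRO-Offload for a PyTorch model.
import deepspeed
import torch

model = torch.nn.Linear(1024, 1024)   # stand-in for a GPT-2-like model

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {
        "stage": 2,                              # partition optimizer state + gradients
        "offload_optimizer": {"device": "cpu"},  # ZeRO-Offload: push optimizer state to CPU
    },
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
# Training then uses engine(inputs), engine.backward(loss), engine.step().
```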
ISBN (print): 9781728186139
Spatial computing devices have been shown to significantly accelerate stencil computations, but have so far relied on unrolling the iterative dimension of a single stencil operation to increase temporal locality. This work considers the general case of mapping directed acyclic graphs of heterogeneous stencil computations to spatial computing systems, assuming large input programs without an iterative component. StencilFlow maximizes temporal locality and ensures deadlock freedom in this setting, providing end-to-end analysis and mapping from a high-level program description to distributed hardware. We evaluate our generated architectures on a Stratix 10 FPGA testbed, yielding 1.31 TOp/s and 4.18 TOp/s in single-device and multi-device configurations, respectively, demonstrating the highest performance recorded for stencil programs on FPGAs to date. We then leverage the framework to study a complex stencil program from a production weather simulation application. Our work enables productively targeting distributed spatial computing systems with large stencil programs, and offers insight into the architecture characteristics required for their efficient execution in practice.
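To illustrate the class of input programs (a DAG of heterogeneous stencils with no time iteration), the sketch below evaluates a tiny stencil DAG with NumPy. It says nothing about buffer sizing, deadlock analysis, or FPGA mapping, which are the paper's contributions; the operators, grid size, and composition are illustrative assumptions.

```python
# Toy DAG of heterogeneous stencil operators evaluated with NumPy
# (illustrates the program structure StencilFlow maps to hardware).
import numpy as np

def lap(a):
    """5-point Laplacian stencil (interior points only)."""
    out = np.zeros_like(a)
    out[1:-1, 1:-1] = (a[:-2, 1:-1] + a[2:, 1:-1] +
                       a[1:-1, :-2] + a[1:-1, 2:] - 4 * a[1:-1, 1:-1])
    return out

def avg_x(a):
    """2-point averaging stencil along the first axis."""
    out = np.copy(a)
    out[1:, :] = 0.5 * (a[1:, :] + a[:-1, :])
    return out

field = np.random.rand(64, 64)
# DAG: two independent stencils feed a downstream combining stencil;
# there is no iterative (time) dimension to unroll.
u = lap(field)
v = avg_x(field)
result = lap(u + v)
print(result.shape)   # (64, 64)
```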