ISBN (Print): 9783031453281
The proceedings contain 38 papers. The special focus in this conference is on Automated Technology for Verification and Analysis. The topics include: Model Checking Strategies from Synthesis over Finite Traces; Reactive Synthesis of Smart Contract Control Flows; Synthesis of Distributed Protocols by Enumeration Modulo Isomorphisms; Controller Synthesis for Reactive Systems with Communication Delay by Formula Translation; Statistical Approach to Efficient and Deterministic Schedule Synthesis for Cyber-Physical Systems; Compositional High-Quality Synthesis; Learning Provably Stabilizing Neural Controllers for Discrete-Time Stochastic Systems; An Automata-Theoretic Approach to Synthesizing Binarized Neural Networks; Syntactic vs Semantic Linear Abstraction and Refinement of Neural Networks; Learning Nonlinear Hybrid Automata from Input–Output Time-Series Data; Using Counterexamples to Improve Robustness Verification in Neural Networks; A Novel Family of Finite Automata for Recognizing and Learning ω-Regular Languages; On the Containment Problem for Deterministic Multicounter Machine Models; Parallel and Incremental Verification of Hybrid Automata with Ray and Verse; An Automata-Theoretic Characterization of Weighted First-Order Logic; Graph-Based Reductions for Parametric and Weighted MDPs; Scenario Approach for Parametric Markov Models; Fast Verified SCCs for Probabilistic Model Checking.
ISBN (Print): 9798350376388
The exponential growth of training datasets and large language model (LLM) sizes significantly outpaces the incremental increase in GPU memory capacity. Thousands of GPUs are needed to handle state-of-the-art models, which requires building an expensive AI GPU cluster that is out of reach for most researchers. This not only makes training more costly but also increases its environmental impact. To improve the efficiency and scalability of existing infrastructure for increasingly demanding training tasks, Microsoft released DeepSpeed, an open-source optimization library for PyTorch that can be integrated into an existing training flow with minimal code changes. This paper presents a comprehensive third-party evaluation of DeepSpeed for training a GPT-2-like LLM on mainstream GPU clusters that are more accessible to everyone. The evaluation includes memory usage analysis and bandwidth characterization, in addition to the achieved model size and the attained compute throughput, to help compare horizontal and vertical scaling. First, we examine DeepSpeed ZeRO in single- and dual-node training against two popular distributed training libraries: PyTorch Distributed Data-Parallel (DDP) with data parallelism, and Megatron-LM with data and model parallelism. While DDP achieves higher throughput due to less communication, the model size is limited to a single GPU's memory capacity. In single-node training, Megatron-LM can fit a 4x larger model than DDP, while ZeRO can handle a model 0.8x-1.2x the size of Megatron-LM's. Both Megatron-LM and ZeRO are reasonably competitive in terms of throughput. However, in dual-node training, Megatron-LM sees a significant drop in throughput due to excessive inter-node communication, achieving only 25%-30% of the throughput offered by ZeRO. Secondly, we evaluate ZeRO-Offload to consolidate multi-node training into a single node. With CPU offloading, ZeRO-Offloa...
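As a rough illustration of the "minimal code changes" integration the abstract refers to, the sketch below wraps a toy PyTorch model with DeepSpeed; the stand-in model and the ds_config.json file (which would select a ZeRO stage and batch sizes) are assumptions, not the paper's actual setup.

```python
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # toy stand-in for a GPT-2-like model

# deepspeed.initialize returns an engine that manages ZeRO partitioning,
# gradient reduction, and the optimizer step behind a familiar API.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config="ds_config.json",  # hypothetical config selecting a ZeRO stage
)

for _ in range(10):
    x = torch.randn(8, 1024, device=engine.device)
    loss = engine(x).pow(2).mean()  # toy loss for illustration
    engine.backward(loss)           # replaces loss.backward()
    engine.step()                   # replaces optimizer.step()
```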
ISBN (Digital): 9781665462723
ISBN (Print): 9781665462723
The performance and energy costs of coordinating and performing data movement have led to proposals adding compute units and/or specialized access units to the memory hierarchy. However, current on-chip offload models are restricted to fixed compute and access pattern types, which limits software-driven optimizations and the applicability of such an offload interface to heterogeneous accelerator resources. This paper presents a computation offload interface for multi-core systems augmented with distributed on-chip accelerators. With energy efficiency as the primary goal, we define mechanisms to identify offload partitioning, create a low-overhead execution model to sequence these fine-grained operations, and evaluate a set of workloads to identify the complexity needed to achieve distributed near-data execution. We demonstrate that our model and interface, combining features of dataflow in parallel with near-data processing engines, can be profitably applied to memory hierarchies augmented with either specialized compute substrates or lightweight near-memory cores. We differentiate the benefits stemming from each of: elevating data access semantics, near-data computation, inter-accelerator coordination, and compute/access logic specialization. Experimental results indicate a geometric mean (energy efficiency improvement; speedup; data movement reduction) of (3.3; 1.59; 2.4)x, (2.46; 1.43; 3.5)x, and (1.46; 1.65; 1.48)x compared to an out-of-order processor, a monolithic accelerator with centralized accesses, and a monolithic accelerator with decentralized accesses, respectively. Evaluating both lightweight-core and CGRA fabric implementations highlights model flexibility and quantifies the benefits of compute specialization for energy efficiency and speedup at 1.23x and 1.43x, respectively.
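As a purely conceptual illustration of separating access patterns from compute in an offload descriptor (the paper's interface is a hardware-level mechanism; every name below is hypothetical), the access descriptor names the data to touch and the compute descriptor names the operation to run near it:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class AccessOp:
    base: int    # starting index in "memory"
    stride: int  # distance between consecutive elements
    count: int   # number of elements to touch

@dataclass
class ComputeOp:
    fn: Callable[[int], int]  # operation bound to a near-data unit

def offload(memory: List[int], access: AccessOp, compute: ComputeOp) -> List[int]:
    """Sequence access + compute near the data instead of streaming it to the core."""
    last = access.base + access.stride * access.count
    return [compute.fn(memory[i]) for i in range(access.base, last, access.stride)]

# Example: scale every fourth element without moving the array to the host core.
result = offload(list(range(64)), AccessOp(0, 4, 16), ComputeOp(lambda v: v * 2))
```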
ISBN (Print): 9781450391993
Monitoring and analyzing a wide range of I/O activities in an HPC cluster is important for maintaining mission-critical performance in a large-scale, multi-user, parallel storage system. Center-wide I/O traces can provide both high-level information and fine-grained activities per application or per user running in the system. Studying such large-scale traces can provide helpful insights into the system: they can be used to develop methods for making predictive decisions, adjusting scheduling policies, or informing the design of next-generation systems. However, sharing real-world I/O traces to expedite such research efforts raises two concerns: i) sharing the traces is expensive due to their large size, and ii) privacy. We address these issues by building an end-to-end machine learning (ML) workflow that can generate I/O traces for large-scale HPC applications. We leverage ML-based feature selection and generative models for I/O trace generation. The generative models are trained on I/O traces collected by the Darshan I/O characterization tool over a period of one year. We present a two-step generation process consisting of two deep-learning models, called the feature generator and the trace generator. The combination of the two-step generative models provides robustness by reducing the bias of the model and accounting for the stochastic nature of the I/O traces across different runs of an application. We evaluate the performance of the generative models and show that the two-step model can generate time-series I/O traces with less than 20% root mean square error.
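A minimal sketch of the two-step idea, under assumed shapes and architectures (the paper's actual feature and trace generators are not specified here): a feature generator samples application-level features from noise, and a trace generator conditions on them to emit a time series.

```python
import torch
import torch.nn as nn

class FeatureGenerator(nn.Module):
    """Step 1: sample application-level I/O features from noise."""
    def __init__(self, noise_dim=16, feat_dim=8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(noise_dim, 64), nn.ReLU(),
                                 nn.Linear(64, feat_dim))
    def forward(self, z):
        return self.net(z)

class TraceGenerator(nn.Module):
    """Step 2: expand the features into a time-series I/O trace."""
    def __init__(self, feat_dim=8, steps=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                                 nn.Linear(256, steps))
    def forward(self, feats):
        return self.net(feats)

feat_gen, trace_gen = FeatureGenerator(), TraceGenerator()
z = torch.randn(4, 16)           # noise for 4 synthetic application runs
traces = trace_gen(feat_gen(z))  # (4, 128): one generated trace per run
```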
ISBN (Print): 9781450392495
Remote Direct Memory Access (RDMA) hardware has bridged the gap between network and main-memory speed, and thus invalidated the common assumption that the network is the bottleneck in distributed data processing systems. However, high-speed networks do not provide "plug-and-play" performance (e.g., using IP-over-InfiniBand) and require a careful co-design of system and application logic. As a result, system designers need to rethink the architecture of their data management systems to benefit from RDMA acceleration. In this paper, we focus on the acceleration of stream processing engines (SPEs), which is challenged by real-time constraints and state consistency guarantees. To this end, we propose Slash, a novel stream processing engine that uses high-speed networks and RDMA to efficiently execute distributed streaming computations. Slash embraces a processing model suited for RDMA acceleration and scales out by omitting the expensive data re-partitioning that scale-out SPEs demand. While scale-out SPEs rely on data re-partitioning to execute a query over many nodes, Slash uses RDMA to share mutable state among nodes. Overall, Slash achieves a throughput improvement of up to two orders of magnitude over existing systems deployed on an InfiniBand network. Furthermore, it is up to a factor of 22 faster than a self-developed solution that relies on RDMA-based data re-partitioning to scale out query processing.
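A toy contrast (purely illustrative, not Slash's implementation) between the two scale-out strategies the abstract compares: with shared mutable state, a record can be processed on any node without first being shuffled to the key's owner.

```python
state: dict = {}  # stands in for node-local vs RDMA-reachable shared state

def process_with_repartition(records, num_nodes):
    # Classic scale-out SPE: each key is shuffled to its owning node first,
    # then updated locally; the hash-routing models the network shuffle.
    for key, val in records:
        owner = hash(key) % num_nodes
        state[(owner, key)] = state.get((owner, key), 0) + val

def process_with_shared_state(records):
    # Slash-style: any node updates shared mutable state in place
    # (over RDMA in the real system), skipping the shuffle entirely.
    for key, val in records:
        state[key] = state.get(key, 0) + val
```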
The proteomics data analysis pipeline based on the shotgun method requires efficient data processing methods. The parallel algorithm of mass spectrometry database search faces the problems of rapidly expanding databas...
ISBN (Digital): 9781665451574
ISBN (Print): 9781665451574
This paper presents a dense linear algebra library for distributed memory systems called OMPC PLASMA. It leverages the OpenMP Cluster (OMPC) programming model to enable the execution of the PLASMA library using task parallelism on a distributed cluster architecture. The OpenMP Cluster model is used to define task regions that are then distributed across the cluster nodes by the OMPC runtime, which automatically manages task scheduling, communication between nodes, and fault tolerance. The OMPC PLASMA library modifies various PLASMA functions to distribute the matrix across the nodes and perform the computation using each node's threads. Experimental results show that OMPC PLASMA achieves speedups of 4.00x with 4 worker nodes, 7.00x with 8 worker nodes, and 12.00x with 16 worker nodes over the original single-node implementation. A 3.00x speedup is achieved when comparing OMPC PLASMA to ScaLAPACK with 4 worker nodes and a 90k x 90k matrix.
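A quick back-of-the-envelope check (not from the paper) of the parallel efficiency these reported speedups imply:

```python
# Reported speedups over the single-node PLASMA run, and the parallel
# efficiency (speedup / nodes) they imply.
for nodes, speedup in [(4, 4.00), (8, 7.00), (16, 12.00)]:
    print(f"{nodes:2d} worker nodes: {speedup:5.2f}x -> {speedup / nodes:.0%} efficiency")
# 4 nodes: 100%, 8 nodes: 88%, 16 nodes: 75% -- efficiency degrades
# gradually as communication and scheduling overheads grow.
```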
ISBN (Digital): 9798350364606
ISBN (Print): 9798350364613
This experimental work examines data movement in molecular dynamics (MD) workflows, comparing the Dynamic and Asynchronous Data Streamliner (DYAD) middleware with traditional, industry-standard I/O systems such as XFS and Lustre. DYAD moves MD simulation frames to analytics processes, providing enhanced flexibility and efficiency for dynamic data transfers and in situ analytics. At the same time, traditional I/O storage systems provide durability and scalability for high-performance computing (HPC) systems. The study integrates MD workflows with common simulation codes, facilitating immediate capture and transfer of MD frames to a staging area. It explores various molecular models, from simple to complex, assessing data management performance and scalability. Different producer-consumer pairs, molecular models, and data transaction frequencies enable testing across small- to large-scale HPC scenarios, from single-node configurations to large, distributed environments. The findings reveal that adaptive mechanisms for minimizing synchronization, direct network communication between producer and consumer processes, and optimizations of both data movement and synchronization are crucial for performance and scalability in MD workflows.
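As a rough sketch of the producer-consumer staging pattern described above (not DYAD's actual API; all names here are hypothetical), a bounded in-memory queue can stand in for the staging area between the simulation and the analytics process:

```python
import queue
import threading

staging = queue.Queue(maxsize=64)  # bounded staging area between the two sides

def producer(n_frames):
    """Simulation side: push each new MD frame as soon as it is produced."""
    for i in range(n_frames):
        staging.put({"step": i, "coords": [0.0, 0.0, 0.0]})  # stand-in frame
    staging.put(None)  # end-of-stream marker

def consumer():
    """Analytics side: consume frames in situ, never touching the file system."""
    while (frame := staging.get()) is not None:
        pass  # per-frame analysis would run here

t = threading.Thread(target=producer, args=(100,))
t.start()
consumer()
t.join()
```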
ISBN (Digital): 9798350349658
ISBN (Print): 9798350349665
The InterPlanetary File System (IPFS) is on its way to becoming the backbone of the next generation of the web. However, it suffers from several performance bottlenecks, particularly on the content retrieval path, which are often difficult to debug. This is because content retrieval involves multiple peers on the decentralized network, and the issue could lie anywhere in the network. Traditional debugging tools are insufficient to help web developers who face the challenge of slow-loading websites and a detrimental user experience. This limits the adoption and future scalability of IPFS. In this paper, we aim to gain valuable insights into how content retrieval requests propagate within the IPFS network and to identify potential performance bottlenecks that could lead to opportunities for improvement. We propose a custom tracing framework that generates and manages traces for crucial events that take place on each peer during content retrieval. The framework leverages event semantics to build a timeline of each protocol involved in the retrieval, helping developers pinpoint problems. Additionally, it is resilient to malicious behaviors of the peers in the decentralized environment. We have implemented this framework on top of an existing IPFS implementation written in Java called Nabu. Our evaluation shows that the framework can identify network delays and issues with each peer involved in content retrieval requests at very low overhead.
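To illustrate the general shape of event-based retrieval tracing (the event names and fields below are assumptions, not Nabu's actual schema), a tracer can record timestamped events per peer and protocol and sort them into per-protocol timelines:

```python
import time
from dataclasses import dataclass, field

@dataclass
class TraceEvent:
    peer: str
    protocol: str  # e.g. "dht" or "bitswap" in IPFS terms
    name: str      # e.g. "lookup_started", "block_received"
    ts: float = field(default_factory=time.monotonic)

class Tracer:
    def __init__(self):
        self.events = []

    def record(self, peer, protocol, name):
        self.events.append(TraceEvent(peer, protocol, name))

    def timeline(self, protocol):
        """Time-ordered view of one protocol, for pinpointing slow hops."""
        return sorted((e for e in self.events if e.protocol == protocol),
                      key=lambda e: e.ts)

tracer = Tracer()
tracer.record("peerA", "dht", "lookup_started")
tracer.record("peerA", "bitswap", "want_sent")
tracer.record("peerB", "bitswap", "block_received")
print([e.name for e in tracer.timeline("bitswap")])
```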
ISBN (Digital): 9798350369199
ISBN (Print): 9798350369205
With the emergence of social networks, online platforms dedicated to different use cases, and sensor networks, large-scale graph community detection has become a steady field of research with real-world applications. Community detection algorithms have numerous practical applications, particularly due to their scalability with data size. Nonetheless, a notable drawback of community detection algorithms is their computational intensity [2], resulting in decreasing performance as data size increases. For this reason, new frameworks must be developed that employ distributed systems such as Apache Hadoop and Apache Spark, which can seamlessly handle large-scale graphs. In this paper, we propose a novel framework for community detection algorithms, namely K-Cliques, Louvain, and Fast Greedy, developed using Apache Spark GraphFrames. We test their performance and scalability on two real-world datasets. The experimental results prove the feasibility of developing graph mining algorithms using Apache Spark GraphFrames.
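As a minimal sketch of the framework style involved (the paper's K-Cliques, Louvain, and Fast Greedy are custom implementations; here GraphFrames' built-in Label Propagation stands in, and the package coordinates are an assumption):

```python
from pyspark.sql import SparkSession
from graphframes import GraphFrame

spark = (SparkSession.builder
         .appName("community-detection")
         # GraphFrames ships as an external Spark package; the exact
         # coordinates depend on the Spark version and are assumed here.
         .config("spark.jars.packages",
                 "graphframes:graphframes:0.8.3-spark3.5-s_2.12")
         .getOrCreate())

vertices = spark.createDataFrame([("a",), ("b",), ("c",)], ["id"])
edges = spark.createDataFrame([("a", "b"), ("b", "c")], ["src", "dst"])

g = GraphFrame(vertices, edges)
communities = g.labelPropagation(maxIter=5)  # assigns a community "label" per vertex
communities.show()
```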