Search Results - Inner Mongolia University Library

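Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = discipline classification number, U = all fields, Y = year (year of publication, year of degree, year of standard release)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". Operators must be uppercase, with one space on each side.

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016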
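
The rules above can be illustrated with a minimal Python sketch that assembles the first example expression. The helper names `term`, `join` and `group` are hypothetical and are not part of the library's system; the sketch only assumes what the help text states: the listed field codes, and uppercase operators with one space on each side.

```python
# Field codes and operators as listed in the help text above.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Build a single FIELD=value term (hypothetical helper)."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field}")
    return f"{field}={value}"


def join(operator: str, *parts: str) -> str:
    """Join terms with an uppercase operator padded by one space on each side."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}: {operator}")
    return f" {operator} ".join(parts)


def group(expression: str) -> str:
    """Wrap a sub-expression in parentheses."""
    return f"({expression})"


# Rebuild the first example from the help text.
subject = group(join("OR", term("K", "图书馆学"), term("K", "情报学")))
query = join("AND", subject, term("A", "范并思"), term("Y", "1982-2016"))
print(query)
```

Rejecting lowercase operators in `join` mirrors the stated rule that operators must be uppercase; how the real search form treats other input is not documented here.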


Refine search results

Document type

10,908 conference papers
230 journal articles
173 books
3 theses

Collection scope

11,314 electronic documents
1 print holding

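Discipline classification

6,677 items: Engineering
- 6,107 items: Computer Science and Technology...
- 2,914 items: Software Engineering
- 1,181 items: Electrical Engineering
- 1,067 items: Information and Communication Engineering
- 482 items: Electronic Science and Technology (may...
- 358 items: Control Science and Engineering
- 176 items: Instrument Science and Technology
- 152 items: Mechanical Engineering
- 117 items: Power Engineering and Engineering Thermo...
- 113 items: Biomedical Engineering (may confer...
- 101 items: Bioengineering
- 87 items: Optical Engineering
- 86 items: Architecture
- 68 items: Civil Engineering
- 68 items: Cyberspace Security
- 65 items: Chemical Engineering and Technology
- 61 items: Materials Science and Engineering (may...
- 58 items: Safety Science and Engineering
1,468 items: Science
- 976 items: Mathematics
- 311 items: Physics
- 156 items: Systems Science
- 140 items: Statistics (may confer science, ...
- 136 items: Biology
- 83 items: Chemistry
897 items: Management
- 647 items: Management Science and Engineering (may...
- 306 items: Library, Information and Archives Manage...
- 289 items: Business Administration
145 items: Medicine
- 120 items: Clinical Medicine
- 62 items: Basic Medicine (may confer medicine...
62 items: Economics
- 62 items: Applied Economics
56 items: Law
34 items: Agriculture
26 items: Education
18 items: Literature
11 items: Military Science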

Subject

1,463 items: parallel process...
584 items: computer archite...
553 items: distributed comp...
534 items: application soft...
529 items: computational mo...
528 items: concurrent compu...
342 items: hardware
319 items: graphics process...
308 items: scalability
307 items: parallel program...
305 items: graphics process...
304 items: computer science
285 items: runtime
263 items: big data
260 items: optimization
249 items: parallel process...
226 items: throughput
225 items: processor schedu...
214 items: resource managem...
209 items: bandwidth

Institution

31 items: national laborat...
23 items: science and tech...
21 items: department of co...
19 items: national laborat...
18 items: university of ch...
17 items: school of comput...
17 items: institute of com...
16 items: school of comput...
15 items: college of compu...
15 items: georgia inst tec...
15 items: barcelona superc...
15 items: intel corporatio...
14 items: school of comput...
14 items: chinese acad sci...
14 items: department of co...
13 items: tsinghua univ de...
13 items: ibm thomas j. wa...
13 items: cent s univ sch ...
13 items: college of compu...
13 items: ohio state univ ...

Author

21 items: jack dongarra
16 items: badia rosa m.
16 items: liu jie
15 items: wang guojun
15 items: yijie wang
15 items: a. choudhary
14 items: anon
14 items: kurt rothermel
13 items: jun wang
13 items: koldehofe boris
13 items: mencagli gabriel...
13 items: prodan radu
13 items: wang yijie
12 items: fahringer thomas
12 items: dongsheng li
12 items: zomaya albert y.
12 items: navaux philippe ...
12 items: yong dou
11 items: fernandes luiz g...
11 items: guangwen yang

Language

11,185 items: English
95 items: Other
35 items: Chinese
2 items: Russian
2 items: Turkish
1 item: German

Search condition: "Any field=International Conference on Parallel and Distributed Processing Techniques and Applications"

11,314 records in total; records 591-600 are shown below.

Accelerating MPI AllReduce Communication with Efficient GPU-Based Compression Schemes on Modern GPU Clusters

39th International Conference on High Performance Computing, ISC High Performance 2024

Authors: Zhou, Qinghua; Ramesh, Bharath; Shafi, Aamir; Abduljabbar, Mustafa; Subramoni, Hari; Panda, Dhabaleswar K.
Affiliations: Department of Computer Science and Engineering The Ohio State University United States

ISBN: (print) 9783982633602

With the increasing scale of High-Performance Computing (HPC) and Deep Learning (DL) applications through GPU adaptation, the seamless communication of data stored on GPUs has become a critical factor in enhancing overall application performance. AllReduce is a communication collective operation that is commonly used in HPC applications and distributed DL training, especially Data parallelism. Data parallelism is a common strategy where parallel GPUs are used to process the partitioned training dataset using a replica of the DL model. However, AllReduce operation for large GPU data still performs poorly due to the limited interconnect bandwidth between the GPU nodes. Some strategies of Gradient Quantization or Sparse AllReduce modifying the Stochastic Gradient Descent (SGD) algorithms may not support different training scenarios. Recent research shows integrating GPU-based compression into MPI libraries is efficient to achieve faster data transmission. In this paper, we propose optimized Recursive-Doubling and Ring AllReduce algorithms that encompass efficient collective-level GPU-based compression schemes in a state-of-the-art GPU-Aware MPI library. At the microbenchmark level, the proposed Recursive-Doubling and Ring algorithms with compression support achieve benefits of up to 75.3% and 85.5% respectively compared to the baseline, and 24.8% and 66.1% respectively compared to naive point-to-point compression on modern GPU clusters. For distributed DL training with PyTorch-DDP, these two approaches yield up to 32.3% and 35.7% faster training than the baseline, while maintaining similar accuracy. © 2024 Research Paper Proceedings of the ISC High Performance 2024. All rights reserved.

Keywords: Graphics processing unit

Data Management Model to Program Irregular Compute Kernels on FPGA: Application to Heterogeneous Distributed System

27th International European Conference on Parallel and Distributed Computing (Euro-Par)

Authors: Lenormand, Erwan; Goubier, Thierry; Cudennec, Loic; Charles, Henri-Pierre
Affiliations: Univ Paris Saclay LIST CEA F-91191 Gif Sur Yvette France; DGA Maitrise Informat BP 7 F-35998 Rennes France; Univ Grenoble Alpes LIST CEA F-38000 Grenoble France

ISBN: (print) 9783031061561; 9783031061554

This paper presents a data management model targeting heterogeneous distributed systems integrating reconfigurable accelerators. The purpose of this model is to reduce the complexity of developing applications with multidimensional sparse data structures. It relies on a shared memory paradigm, which is convenient for parallel programming of irregular applications. The distributed data, sliced in chunks, are managed by a Software-distributed Shared Memory (S-DSM). The integration of reconfigurable accelerators in this S-DSM, by breaking the master-slave model, allows devices to initiate access to chunks in order to accept data-dependent accesses. We use chunk partitioning of multidimensional sparse data structures, such as sparse matrices and unstructured meshes, to access them as a continuous data stream. This model enables to regularize memory accesses of irregular applications, to avoid the transfer of unnecessary data by providing fine-grained data access, and to efficiently hide data access latencies by implicitly overlaying the transferred data flow with the processed data flow. We have used two case studies to validate the proposed data management model: General Sparse Matrix-Matrix Multiplication (SpGEMM) and Shallow Water Equations (SWE) over an unstructured mesh. The results obtained show that the proposed model efficiently hides the data access latencies by reaching computation speeds close to those of an ideal case (i.e. without latency).

Keywords: Distributed shared memory; Field programmable gate array; Irregular application

24th International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation, SAMOS 2024

24th International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation, SAMOS 2024

ISBN: (print) 9783031783791

The proceedings contain 38 papers. The special focus in this conference is on Embedded Computer Systems: Architectures, Modeling, and Simulation. The topics include: QCEDA: Using Quantum Computers for EDA; Real-Time Linux on RISC-V: Long-Term Performance Analysis of PREEMPT_RT Patches; RV-VP2: Unlocking the Potential of RISC-V Packed-SIMD for Embedded Processing; A Novel System Simulation Framework for HBM2 FPGA Platforms; ONNX-To-Hardware Design Flow for Adaptive Neural-Network Inference on FPGAs; Efficient Post-training Augmentation for Adaptive Inference in Heterogeneous and Distributed IoT Environments; Pooling On-the-Go for NoC-Based Convolutional Neural Network Accelerator; Vitamin-V: Serverless Cloud Computing Porting on RISC-V; Design and Implementation of an Open Source OpenGL SC 2.0.1 Installable Client Driver and Offline Compiler; Plan Your Defense: A Comparative Analysis of Leakage Detection Methods on RISC-V Cores; iVault: Architectural Code Concealing Techniques to Protect Cryptographic Keys; I2DS: FPGA-Based Deep Learning Industrial Intrusion Detection System; ACRA: A Cutting-Edge Analytics Platform for Advanced Real-Time Corruption Risk Assessment and Investigation Prioritization; Post Quantum Cryptography Research Lines in the Italian Center for Security and Rights in Cyberspace; Advancing Future 5G/B5G Systems: The Int5Gent Approach; RISC-V Accelerators, Enablement and Applications for Automotive and Smart Home in the ISOLDE Project; PMDI: An AI-Enabled Ecosystem for Cooperative Urban Mobility; Open Source Software Randomisation Framework for Probabilistic WCET Prediction on Multicore CPUs, GPUs and Accelerators; A Hypervisor Based Platform for the Development and Verification of Reliable Software Applications.


Towards On-the-fly Self-Adaptation of Stream Parallel Patterns

29th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)

Authors: Vogel, Adriano; Mencagli, Gabriele; Griebler, Dalvan; Danelutto, Marco; Fernandes, Luiz Gustavo
Affiliations: Pontifical Catholic Univ Rio Grande do Sul PUCRS Sch Technol Porto Alegre RS Brazil; Tres de Maio Fac SETREM Lab Adv Res Cloud Comp LARCC Tres De Maio Brazil; Univ Pisa UNIPI Comp Sci Dept Pisa Italy

ISBN: (print) 9781665414555

Stream processing applications compute streams of data and provide insightful results in a timely manner, where parallel computing is necessary for accelerating the application executions. Considering that these applications are becoming increasingly dynamic and long-running, a potential solution is to apply dynamic runtime changes. However, it is challenging for humans to continuously monitor and manually self-optimize the executions. In this paper, we propose self-adaptiveness of the parallel patterns used, enabling flexible on-the-fly adaptations. The proposed solution is evaluated with an existing programming framework and running experiments with a synthetic and a real-world application. The results show that the proposed solution is able to dynamically self-adapt to the most suitable parallel pattern configuration and achieve performance competitive with the best static cases. The feasibility of the proposed solution encourages future optimizations and other applicabilities.

Keywords: Parallel Computing; Parallel Patterns; Parallelism Abstractions; Self adaptive systems; Stream processing

A method for modelling and executing customized pipelines in serverless computing

2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023

Authors: Cinaglia, Pietro; Cannataro, Mario
Affiliations: Magna Graecia University of Catanzaro Department of Health Sciences Italy; Magna Graecia University of Catanzaro Data Analytics Research Center Department of Medical and Surgical Sciences Italy

ISBN: (print) 9798350337488

Serverless computing is an emerging cloud service for executing distributed applications on cloud architecture. The possibility of performing functions without the need to manage any type of infrastructure has led to this methodology being widely adopted in several fields, e.g., data processing and above all parallel computing. The processing of large-scale genomic data needs many computational resources, making it highly time-consuming. Therefore, the need for higher computing capabilities has translated into the increasing use of this technology. In this paper, we present a method for modelling and executing customized pipelines in serverless computing. We applied it to the transcript-level expression analysis of samples from RNA sequencing (RNA-seq), focusing on the most computationally expensive step: the mapping of reads to a reference genome. Our method has been implemented as an Amazon Web Services (AWS) Lambda function that is deployed within our own serverless architecture. The parallel instances invoked in AWS Lambda incur negligible latencies because they are managed by the provider; therefore, the average computational times are similar among experiments on similar samples. We observed a relevant advantage in running time, measuring an improvement of up to 79.84% and 90.10% on the concurrent analysis of 10 samples, compared to the local environments having the following specifications: CPU 3.8 GHz 8 vcores and CPU 3.8 GHz 16 vcores, respectively. © 2023 IEEE.

Keywords: AWS bioinformatics parallel analysis pipeline serverless

A systematic mapping study of leveraging Intelligent Systems applications in human capital development: In the example of developing professional communicative competence of economics students in English

7th International Conference on Future Networks and Distributed Systems, ICFNDS 2023

Authors: Jamalova, Gulnora; Aymatova, Farida; Khasanova, Dilbar; Shamsematova, Barno
Affiliations: Department For The Coordination Of Joint Educational Programs Tashkent State University Of Economics Islam Karimov street 49 Tashkent city 100066 Uzbekistan; Foreign Languages Department International Islamic Academy Of Uzbekistan Uzbekistan

ISBN: (print) 9798400709036

Developing professional communicative competence is vital for economics students, preparing them for effective communication in their future careers. Despite its significance, a research gap exists regarding the use of intelligent systems to enhance this competence in economics students. To fill this gap, this systematic mapping study investigates intelligent systems applications in human capital development, focusing on professional communicative competence development for economics students in English. The study involves a rigorous literature search and screening process, revealing diverse approaches and techniques to harness intelligent systems for human capital development. By synthesizing and analyzing these findings, this paper systematically maps the current landscape of intelligent systems' use in developing professional communicative competence among economics students. The results indicate that intelligent systems applications, including natural language processing, machine learning, and artificial intelligence, offer substantial potential to improve human capital development programs. They provide personalized, adaptive, and interactive learning experiences. Additionally, the study underscores the importance of incorporating pedagogical and instructional design principles when developing intelligent systems applications for human capital development. In summary, this systematic mapping study offers insights and recommendations for researchers, educators, and practitioners interested in leveraging intelligent systems for human capital development, especially in cultivating professional communicative competence among economics students in English. Moreover, it lays the groundwork for future empirical studies assessing the effectiveness of intelligent systems applications in this context. © 2023 ACM.

Keywords: Intelligent systems

High-Performance Spatial Data Compression for Scientific Applications

28th International Conference on Parallel and Distributed Computing (Euro-Par)

Authors: Kriemann, Ronald; Ltaief, Hatem; Minh Bau Luong; Perez, Francisco E. Hernandez; Im, Hong G.; Keyes, David
Affiliations: Max Planck Inst Math Sci Leipzig Germany; KAUST Extreme Comp Res Ctr Thuwal Saudi Arabia; KAUST Clean Combust Res Ctr Thuwal Saudi Arabia

ISBN: (print) 9783031125973; 9783031125966

We implement an efficient data compression algorithm that reduces the memory footprint of spatial datasets generated during scientific simulations. Regularly storing these datasets is typically needed for checkpoint/restart or for post-processing purposes. Our lossy compression approach, codenamed HLRcompress (https://***. de/rok/HLRcompress), combines a hierarchical low-rank approximation technique with binary compression. This novel hybrid method is agnostic to the particular domain of application. We study the impact of HLRcompress on accuracy using synthetic datasets to demonstrate the software capabilities, including robustness and versatility. We assess different algebraic compression methods and report performance results on various parallel architectures. We then integrate it into a workflow of a direct numerical simulation solver for turbulent combustion on distributed-memory systems. We compress the generated snapshots during time integration using accuracy thresholds for each individual chemical species, without degrading the practical accuracy of the overall pressure and temperature. Finally, we compare against state-of-the-art compression software. Our implementation achieves on average greater than 100-fold compression of the original size of the datasets.

Keywords: Algebraic/Binary Compression; Scientific Datasets; Hierarchical Matrices

indexPDT: A High Scalable Distributed Classification Approach with Novel Cache Structure for Geo-location

25th IEEE International Conferences on High Performance Computing and Communications, 9th International Conference on Data Science and Systems, 21st IEEE International Conference on Smart City and 9th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Applications, HPCC/DSS/SmartCity/DependSys 2023

Authors: Sun, Zhijie; Li, Jing; Xie, Jun; Zheng, Binfan; Zeng, Li; Zhao, Rongqian
Affiliations: Huawei Technologies Co. Ltd. Shenzhen China

ISBN: (print) 9798350330014

Geo-location, also known as measurement report (MR) location, is a technique to determine the geographic location of user equipment (UE) and the behaviour attribute of telephone traffic based on wireless signals measured by the mobile communication network. The geographic location information can help to support network performance monitoring and evaluation. Considering accuracy and cost, we mainly adopt a hybrid location scheme combined with feature matching location and Weighted Centroid Correction Location (WCCL). As for feature matching location, over 20 billion samples gathered from tens of thousands of cells are updated daily. Due to the vast data scale, feature analysis encounters a severe performance bottleneck. To address this problem, we design the indexed parallel decision tree (indexPDT) operator and integrate it into WindTensor, a self-innovated distributed machine learning (ML) engine. indexPDT is a classifier unit of the random forest (RF) algorithm with a novel cache structure. It performs structured cache processing on the dataset's meta-information, which can accompany the splitting of nodes. The cache structure can be quickly converted into statistical information to help find the optimal splitting point, effectively reducing memory usage and improving performance. Under the public datasets testing on 5 nodes, the mean speedup ratios are 86x and 3x compared with SparkML and XGBoost, respectively. In the Geo-location scenario, for a single cell, the speedup ratios are 82x and 4x compared with SparkML and XGBoost, respectively. © 2023 IEEE.

Keywords: Location

GRAPHGUESS: Approximate Graph Processing System with Adaptive Correction

28th International Conference on Parallel and Distributed Computing (Euro-Par)

Authors: Ramezani, Morteza; Kandemir, Mahmut T.; Sivasubramaniam, Anand
Affiliations: Penn State Univ State Coll PA 16801 USA

ISBN: (print) 9783031125973; 9783031125966

Graph-based data structures have drawn great attention in recent years. The large and rapidly growing trend on developing graph processing systems focuses mostly on improving the performance by preprocessing the input graph and modifying its layout. These systems usually take several hours to days to complete processing a single graph on high-end machines, let alone the overhead of pre-processing which most of the time can be dominant. Yet for most graph applications the exact answer is not always crucial, and providing a rough estimate of the final result is adequate. Approximate computing is introduced to trade off accuracy of results for computation or energy savings that could not be achieved by conventional techniques alone. In this work, we design, implement and evaluate GraphGuess, inspired from the domain of approximate graph theory and extend it to a general, practical graph processing system. GraphGuess is essentially an approximate graph processing technique with adaptive correction, which can be implemented on top of any graph processing system. We build a vertex-centric processing system based on GraphGuess, where it allows the user to trade off accuracy for better performance. Our experimental studies show that using GraphGuess can significantly reduce the processing time for large scale graphs while maintaining high accuracy.

Keywords: Graph Processing; Approximate Computing

On the Dynamics of Non-IID Data in Federated Learning and High-Performance Computing

Euromicro Conference on Parallel, Distributed and Network-Based Processing

Authors: Daniela Annunziata; Marzia Canzaniello; Diletta Chiaro; Stefano Izzo; Martina Savoia; Francesco Piccialli
Affiliations: Department of Mathematics and Applications “R. Caccioppoli” University of Naples Federico II Naples Italy

This paper investigates the symbiosis of Federated Learning (FL) and High-Performance Computing (HPC) architectures, unraveling challenges introduced by the intricate interplay of heterogeneity and non-Independently and Identically Distributed (non-IID) data. By leveraging the Flower framework, our research delves into the nuanced implications of FL in diverse HPC environments. We provide a comprehensive exploration of the heterogeneity within contemporary HPC architectures, spanning node organizations, memory hierarchies, and specialized accelerators, emphasizing adaptability to this complexity. Methodologically, we simulate a FL scenario within our research laboratory, leveraging Flower to orchestrate collaborative model training across heterogeneous nodes. The experiments involve variations in the Dirichlet beta parameter, offering insights into the effects of non-IID data. Results encompass communication efficiency, energy efficiency, and global model accuracy, providing a holistic understanding of the performances across diverse HPC infrastructures. This research contributes to the ongoing discourse on efficient and scalable algorithms, providing insights for collaborative learning in the era of diverse HPC architectures.



Address: No. 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
