Search Results - Inner Mongolia University Library

AFSD: Adaptive Feature Space Distillation for Distributed Deep Learning

IEEE ACCESS, 2022, vol. 10, pp. 84569-84578

Authors: Khaleghian, Salman; Ullah, Habib; Johnsen, Einar Broch; Andersen, Anders; Marinoni, Andrea. Affiliations: UiT Arctic Univ Norway, Fac Sci & Technol, N-9019 Tromso, Norway; Norwegian Univ Life Sci NMBU, Fac Sci & Technol, N-1430 As, Norway; Univ Oslo, Dept Informat, N-0315 Oslo, Norway

We propose a novel and adaptive feature space distillation method (AFSD) to reduce the communication overhead among distributed computers. The proposed method improves the Codistillation process by supporting longer update intervals. AFSD performs knowledge distillation across the models infrequently and gives the models flexibility to explore diverse variations during training. We perform knowledge distillation by sharing the feature space rather than the outputs only. Therefore, we also propose a new loss function for the Codistillation technique in AFSD. Using the feature space leads to more efficient knowledge transfer between models with longer update intervals. In our method, the models can achieve the same accuracy as Allreduce and Codistillation with fewer epochs.

Keywords: Convolutional neural networks; Feature extraction; Computational modeling; Deep learning; Adaptation models; Data models; Concurrent computing; Knowledge management; Distributed computing; distributed deep learning; convolutional neural networks; knowledge distillation; codistillation
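
The abstract above describes distilling knowledge through the feature space rather than through outputs alone, but it does not state the loss function. As a minimal sketch only, and not the paper's actual loss, the following Python combines a standard cross-entropy term with a mean-squared distance between a model's features and a peer model's features. The names `distillation_loss` and `feature_distance` and the weight `alpha` are hypothetical.

```python
import math


def softmax(logits):
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    # Negative log-probability assigned to the true class.
    return -math.log(softmax(logits)[label])


def feature_distance(features, peer_features):
    # Mean squared error between two feature vectors of equal length.
    n = len(features)
    return sum((a - b) ** 2 for a, b in zip(features, peer_features)) / n


def distillation_loss(logits, label, features, peer_features, alpha=0.5):
    # Assumed form: task loss plus a weighted feature-matching term.
    # alpha is a hypothetical weight, not a value from the paper.
    return cross_entropy(logits, label) + alpha * feature_distance(
        features, peer_features
    )
```

When the local and peer features agree, the second term vanishes and only the task loss remains; the further the features drift apart between exchanges, the larger the penalty.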

Distributed-Swarm: A Real-Time Pattern Detection Model Based on Density Clustering

IEEE ACCESS, 2022, vol. 10, pp. 59832-59842

Authors: Qian, Tiao; Sun, Shiming; Shan, Xin; Wei, Xueyun; Tai, Chunliang; Liu, Chao. Affiliations: NARI Grp Corp, State Grid Elect Power Res Inst, Nanjing 211106, Peoples R China; NARI Technol Dev Ltd Co, Nanjing 211106, Peoples R China

The advancement of power technology and the improvement of people's living standards promote the expansion of the power grid scale and the sharp rise in electricity consumption. In the power system, due to the use of various sensors, we can collect a large amount of power data (e.g., the spatial-temporal information of electric vehicle charging). Usually, such spatial-temporal data is generated in the form of a data stream. The analysis and mining of such data can be widely applied in power equipment condition monitoring and maintenance, user equipment anomaly warning, urban power grid analysis, and other scenarios. Among them, the pattern detection of power data plays a key role in power data analysis. Since power data such as the spatial-temporal information of electric vehicle charging is time-sensitive, it is crucial to perform real-time pattern mining in real-time monitoring systems. However, state-of-the-art pattern detection methods are built on batch mode. Extending such works directly to an online environment tends to result in (1) expensive network cost, (2) high processing latency, and (3) low-accuracy results. In this paper, we propose a framework for frequent motion pattern detection of power data in a real-time distributed environment. Through the softmax differentiation function, the power data is filtered to reduce the workload and improve the performance of the framework. At the same time, we propose the concept of a historical state matrix to solve the problem that the nodes of each physical partition in a distributed environment cannot perceive each other. Extensive experiments are conducted on a real dataset, and the experimental results show that our pattern detection is about 70% faster than baseline methods, which demonstrates the advantage of our approach over available solutions in the literature.

Keywords: Power data; real-time processing; pattern detection; distributed computing

Logical Synchrony and the Bittide Mechanism

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2024, vol. 35, no. 11, pp. 1936-1948

Authors: Lall, Sanjay; Cascaval, Calin; Izzard, Martin; Spalink, Tammo. Affiliation: Google, Mountain View, CA 94043, USA

We introduce logical synchrony, a framework that allows distributed computing to be coordinated as tightly as in synchronous systems without the distribution of a global clock or any reference to universal time. We develop a model of events called a logical synchrony network, in which nodes correspond to processors and every node has an associated local clock which generates the events. We construct a measure of logical latency and develop its properties. A further model, called a multiclock network, is then analyzed and shown to be a refinement of the logical synchrony network. We present the bittide mechanism as an instantiation of multiclock networks, and discuss the clock control mechanism that ensures that buffers do not overflow or underflow. Finally we give conditions under which a logical synchrony network has an equivalent synchronous realization.

Keywords: Computer networks; distributed computing

HyperQueue: Efficient and ergonomic task graphs on HPC clusters

SOFTWAREX, 2024, vol. 27

Authors: Beranek, Jakub; Bohm, Ada; Palermo, Gianluca; Martinovic, Jan; Jansik, Branislav. Affiliations: VSB Tech Univ Ostrava, IT4innovat, Ostrava, Czech Republic; Politecn Milano, DEIB, Milan, Italy

Task graphs are a popular method for defining complex scientific simulations and experiments that run on distributed and HPC (high-performance computing) clusters, because they allow their authors to focus on the problem domain instead of low-level communication between nodes, and also enable quick prototyping. However, executing task graphs on HPC clusters can be problematic in the presence of allocation managers like PBS or Slurm, which are not designed for executing a large number of potentially short-lived tasks with dependencies. To make task graph execution on HPC clusters more efficient and ergonomic, we have created HYPERQUEUE, an open-source task graph execution runtime tailored for HPC use cases. It enables the execution of large task graphs on top of an allocation manager by aggregating tasks into a smaller number of PBS/Slurm allocations, and it dynamically balances tasks among all available nodes. It can also automatically submit allocations on behalf of the user, and it supports arbitrary task resource requirements and heterogeneous HPC clusters. It is trivial to deploy and does not require elevated privileges.

Keywords: Distributed computing; Task scheduling; High performance computing; Job manager

System-wide IoT design and programming: Patterns for decentralised collective processes

INTERNET OF THINGS, 2025, vol. 29

Author: Casadei, Roberto. Affiliation: Univ Bologna, Dept Comp Sci & Engn, Alma Mater Studiorum, Via Univ 50, I-47521 Cesena, Italy

The Internet of Things promotes a view of large-scale deployments of devices able to compute, communicate, and interact with their surrounding environment. In this context, one significant challenge revolves around designing and programming collective processes, i.e., durable activities involving the collaboration of large groups of devices. Examples of collective processes include distributed sensing, collective decision-making, collective movement/transport, and adaptive maintenance of system-level structures. To address the issues involved in developing such kinds of system-wide behaviours, research has proposed multiple approaches, abstractions, and algorithmic solutions. In particular, the approach of aggregate processes has emerged as a promising formal technique for programming collective processes from a macro-level perspective while supporting decentralisation, abstraction, and resilience. To characterise (i) previous work on aggregate processes and (ii) the usages and applications that this technique may foster, and (iii) to draw general design insights in the realm of collective computing, this article provides a characterisation of common problems and solutions based on aggregate processes. What results is a catalogue of design patterns for decentralised collective processes. Specifically, we provide a taxonomy of patterns, describe each pattern in a schematic form, and discuss the implications for the design of collective processes for the Internet of Things and related scenarios.

Keywords: Internet of Things; Collective intelligence; Distributed computing; Decentralised systems; Macro-programming; Design patterns

Toward a Universal Cryptographic Accelerator

COMPUTER, 2025, vol. 58, no. 1, pp. 105-108

Authors: Devadas, Srinivas; Sanchez, Daniel. Affiliation: MIT, Elect Engn & Comp Sci, Cambridge, MA 02139, USA

Cyberphysical systems have disseminated devices that can be untrustworthy or compromised. Nevertheless, the privacy and integrity of computation and data can be guaranteed through cryptographic protocols. We address the computational burden posed by cryptography, and argue for a synergistic approach of designing programmable hardware accelerators for cryptography, followed by tailoring cryptographic protocols to this hardware.

Keywords: Market Research; Artificial Intelligence; Computational Modeling; Cloud computing; Resource Management; Dynamic Scheduling; Decision Making; Processor Scheduling; Distributed computing; Cyber Physical Systems; Resource Management; Computational Resources; Internet Of Things; Development Of Services; Application Programming Interface; Functional Clusters; Network Resources; Cyber Physical Systems; Storage Resources; Edge Devices; Information Technology Industry; Service Components; Containerized Microservices; Perspective Of Migrants; Artificial Intelligence; Machine Learning; Deep Reinforcement Learning; Roadside Units; ML Models

An open science grid implementation of the steady state genetic algorithm for crystal structure prediction

JOURNAL OF COMPUTATIONAL SCIENCE, 2024, vol. 82

Authors: Varela, Kristal N.; Pagola, Gabriel I.; Lund, Albert M.; Ferraro, Marta B.; Orendt, Anita M.; Facelli, Julio C. Affiliations: Univ Buenos Aires, Fac Ciencias Exactas & Nat, Dept Fis, Buenos Aires, Argentina; Inst Fis Buenos Aires IFIBA, Buenos Aires, Argentina; Univ Utah, Ctr High Performance Comp, Salt Lake City, UT, USA; Univ Utah, Dept Biomed Informat, Salt Lake City, UT 84108, USA

In this paper we report the implementation and testing of algorithmic changes made to MGAC, a crystal structure prediction system, to make it scalable and able to take advantage of significant distributed resources such as the Open Science Grid (OSG). The changes include the adoption of a steady state Genetic Algorithm (GA), the adoption of a more general definition of the GA genome that eliminates the need to search individually for each of the 230 possible space groups, and the use of Density Functional Theory with dispersion correction (DFT-D), as implemented in Quantum Espresso (QE), to calculate crystal energies. The performance of this implementation of MGAC, which in the following we label MGAC-QE-OSG, is demonstrated for two test cases, methanol and ethanol. In both cases MGAC-QE-OSG can find the experimental structures of these compounds.

Keywords: Genetic algorithms; Crystal structure prediction; Open science grid; Distributed computing
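
The abstract above names a steady state genetic algorithm without giving its operators. As a rough illustration under assumptions, the Python sketch below shows the general steady state idea on a toy real-valued genome: one child is produced per step and replaces the worst individual only if it is fitter, so there is no generation-wide barrier. The tournament selection, uniform crossover, Gaussian mutation, and all parameter values are hypothetical choices, not MGAC's.

```python
import random


def steady_state_ga(fitness, genome_length, population_size=20, steps=200, seed=0):
    rng = random.Random(seed)
    # Random initial population of real-valued genomes (toy encoding).
    population = [
        [rng.uniform(-5.0, 5.0) for _ in range(genome_length)]
        for _ in range(population_size)
    ]
    scores = [fitness(genome) for genome in population]

    def tournament():
        # Pick two individuals at random and keep the fitter one.
        i = rng.randrange(population_size)
        j = rng.randrange(population_size)
        return population[i] if scores[i] >= scores[j] else population[j]

    for _ in range(steps):
        parent_a, parent_b = tournament(), tournament()
        # Uniform crossover followed by small Gaussian mutation.
        child = [
            (a if rng.random() < 0.5 else b) + rng.gauss(0.0, 0.1)
            for a, b in zip(parent_a, parent_b)
        ]
        child_score = fitness(child)
        # Steady state step: replace only the current worst, and only if improved.
        worst = min(range(population_size), key=scores.__getitem__)
        if child_score > scores[worst]:
            population[worst] = child
            scores[worst] = child_score

    best = max(range(population_size), key=scores.__getitem__)
    return population[best], scores[best]
```

Because each child is evaluated independently of the others, evaluations of this kind can in principle be farmed out to loosely coupled workers, which is the property that makes the steady state variant attractive on grid resources.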

Analysis of workflow schedulers in simulated distributed environments

JOURNAL OF SUPERCOMPUTING, 2022, vol. 78, no. 13, pp. 15154-15180

Authors: Beranek, Jakub; Bohm, Stanislav; Cima, Vojtech. Affiliation: VSB Tech Univ Ostrava, IT4Innovat, Ostrava, Czech Republic

Task graphs provide a simple way to describe scientific workflows (sets of tasks with dependencies) that can be executed on both HPC clusters and in the cloud. An important aspect of executing such graphs is the scheduling algorithm used. Many scheduling heuristics have been proposed in existing works; nevertheless, they are often tested in oversimplified environments. We provide an extensible simulation environment designed for prototyping and benchmarking task schedulers, which contains implementations of various scheduling algorithms and is open-sourced, in order to be fully reproducible. We use this environment to perform a comprehensive analysis of workflow scheduling algorithms with a focus on quantifying the effect of scheduling challenges that have so far been mostly neglected, such as delays between scheduler invocations or partially unknown task durations. Our results indicate that network models used by many previous works might produce results that are off by an order of magnitude in comparison to a more realistic model. Additionally, we show that certain implementation details of scheduling algorithms which are often neglected can have a large effect on the scheduler's performance, and they should thus be described in great detail to enable proper evaluation.

Keywords: Distributed computing; DAG scheduling; Task Scheduling; Network models
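
To make the notion of simulating a task-graph scheduler concrete, here is a deliberately oversimplified Python sketch of the kind the abstract above warns about: a greedy list scheduler on a fixed number of workers with known task durations, no network model, and no scheduler invocation delay. It is not the paper's simulator; the function name `simulate_list_schedule` and its input format are assumptions made for illustration.

```python
import heapq


def simulate_list_schedule(durations, deps, workers):
    """Return the makespan of a greedy list schedule.

    durations: {task: run time}; deps: {task: [prerequisite tasks]}.
    Assumes at least one worker and an acyclic dependency graph.
    """
    remaining = {task: len(deps.get(task, [])) for task in durations}
    children = {task: [] for task in durations}
    for task, prerequisites in deps.items():
        for prerequisite in prerequisites:
            children[prerequisite].append(task)

    ready = sorted(task for task, count in remaining.items() if count == 0)
    running = []  # min-heap of (finish_time, task)
    now = 0.0
    free = workers

    while ready or running:
        # Start ready tasks while workers are free (first come, first served).
        while ready and free > 0:
            task = ready.pop(0)
            heapq.heappush(running, (now + durations[task], task))
            free -= 1
        # Jump to the next task completion and release its worker.
        now, task = heapq.heappop(running)
        free += 1
        for child in children[task]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)
    return now


# Diamond-like graph: c needs a and b, d needs c.
durations = {"a": 2, "b": 3, "c": 1, "d": 2}
deps = {"c": ["a", "b"], "d": ["c"]}
print(simulate_list_schedule(durations, deps, workers=2))  # 6.0
print(simulate_list_schedule(durations, deps, workers=1))  # 8.0
```

A more realistic simulator would add exactly the effects this sketch omits, such as data transfer costs and imprecise duration estimates, which is where the abstract reports the largest discrepancies.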

Capacity Optimization of Large Intelligent Surface With Hardware Impairment Based on Meta-Deep Learning

IEEE ACCESS, 2024, vol. 12, pp. 69359-69370

Authors: Mao, Yifan; Xiao, Xiaoyu; Hu, Zhirun. Affiliation: Univ Manchester, Dept Elect & Elect Engn, Manchester M13 9PL, Lancs, England

This work proposes a sub-optimal method based on a two-layer structured meta-deep reinforcement learning (MDRL) approach to address the hardware impairment (HWI) optimization issue in large intelligent surface (LIS) systems. This method, designed for distributed LIS systems with reflection matrices, effectively enhances the system capacity and performance despite HWIs. Building upon existing techniques of dividing large-area LIS systems into multiple small-area subsystems, the simulated results demonstrate that sub-optimal LIS performance can be achieved with fewer samples in diverse dynamic wireless environments. This innovative approach enhances the adaptability of distributed LIS systems and offers an effective HWI management strategy, paving the way for future LIS system optimization.

Keywords: Reflection; Vectors; Array signal processing; Optimization; Wireless communication; Training; Distributed computing; Reinforcement learning; Hardware; Large intelligent surface; distributed system; hardware impairment; reflection matrix design; fewer samples; meta-deep reinforcement learning

Flexible Distributed Matrix Multiplication

IEEE TRANSACTIONS ON INFORMATION THEORY, 2022, vol. 68, no. 11, pp. 7500-7514

Authors: Li, Weiqi; Chen, Zhen; Wang, Zhiying; Jafar, Syed A.; Jafarkhani, Hamid. Affiliation: Univ Calif Irvine, Ctr Pervas Commun & Comp CPCC, Irvine, CA 92697, USA

The distributed matrix multiplication problem with an unknown number of stragglers is considered, where the goal is to efficiently and flexibly obtain the product of two massive matrices by distributing the computation across N servers. There are up to N - R stragglers but the exact number is not known a priori. Motivated by reducing the computation load of each server, a flexible solution is proposed to fully utilize the computation capability of available servers. The computing task for each server is separated into several subtasks, constructed based on Entangled Polynomial codes by Yu et al. The final results can be obtained from either a larger number of servers with a smaller amount of computation completed per server or a smaller number of servers with a larger amount of computation completed per server. The required finite field size of the proposed solution is less than 2N. Moreover, the optimal design parameters such as the partitioning of the input matrices are discussed. Our constructions can also be generalized to other settings such as batch distributed matrix multiplication and secure distributed matrix multiplication.

Keywords: distributed computing; matrix multiplication; flexible coded computing; stragglers; cost optimization
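
The straggler setting in the abstract above can be illustrated with a much simpler scheme than the paper's. The Python sketch below is not the flexible construction and not Entangled Polynomial codes: it is a basic polynomial-style code over exact rationals, in which A is split into R row blocks, server x computes (sum over j of A_j x^j) B, and the product is recovered from any R responses by solving a Vandermonde system. All function names and the choice of evaluation points are assumptions for illustration.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def encode(blocks, x):
    # Evaluate the matrix polynomial sum_j blocks[j] * x**j at the point x.
    rows, cols = len(blocks[0]), len(blocks[0][0])
    return [
        [sum(blk[r][c] * x ** j for j, blk in enumerate(blocks)) for c in range(cols)]
        for r in range(rows)
    ]


def solve_vandermonde(points, values):
    # Gauss-Jordan elimination over exact fractions; points must be distinct.
    n = len(points)
    m = [[Fraction(x) ** j for j in range(n)] + [Fraction(v)]
         for x, v in zip(points, values)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def decode(points, results, num_blocks):
    # Interpolate each matrix entry to recover the coefficients A_j B,
    # then stack the recovered row blocks in order.
    rows, cols = len(results[0]), len(results[0][0])
    blocks = [[[None] * cols for _ in range(rows)] for _ in range(num_blocks)]
    for r in range(rows):
        for c in range(cols):
            coeffs = solve_vandermonde(points, [res[r][c] for res in results])
            for j in range(num_blocks):
                blocks[j][r][c] = coeffs[j]
    return [row for blk in blocks for row in blk]


# N = 4 servers, R = 2 row blocks: any 2 responses suffice.
A = [[1, 2], [3, 4], [5, 6], [7, 8]]
B = [[1, 0], [2, 1]]
blocks = [A[:2], A[2:]]
server_results = {x: matmul(encode(blocks, x), B) for x in (1, 2, 3, 4)}
survivors = [2, 4]  # servers 1 and 3 are treated as stragglers
recovered = decode(survivors, [server_results[x] for x in survivors], 2)
print(recovered == matmul(A, B))  # True
```

In this fixed scheme every server does the same amount of work regardless of how many respond; the abstract's contribution is precisely to relax that, letting more responding servers each finish less computation.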
