ISBN (digital): 9798331524937
ISBN (print): 9798331524944
Synthetic Aperture Radar (SAR) tomography is an advanced technique for monitoring deformations of the Earth’s surface. However, the computational complexity of SAR tomography algorithms often restricts their application to large-scale datasets. To address this issue, we introduce a multi-level parallel implementation of a single scatterer detection algorithm specifically designed to exploit the capabilities of modern heterogeneous High-Performance Computing (HPC) systems. By efficiently distributing the computational workload at different levels across multiple processing units, our parallel approach significantly reduces processing time, facilitating the analysis of extensive SAR datasets. We assess the performance of our parallel implementation using real-world SAR data, showcasing its effectiveness in enhancing both the efficiency and scalability of SAR tomography. Our work contributes to advancing remote sensing techniques and offers valuable insights into the application of HPC for large-scale environmental monitoring.
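As a rough, hedged illustration of the multi-level idea (coarse-grained distribution across compute nodes plus fine-grained threading within each node), the C sketch below splits a pixel grid across MPI ranks and parallelizes the per-pixel detection loop with OpenMP. The grid size and the `detect_single_scatterer` test are placeholders, not the paper's actual detector.

```c
/* Hedged sketch: multi-level parallelism for per-pixel scatterer detection.
 * Level 1: MPI ranks each take a contiguous block of rows (coarse grain).
 * Level 2: OpenMP threads share the pixels inside that block (fine grain).
 * detect_single_scatterer() is a placeholder for the real detection test. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define ROWS 4096
#define COLS 4096

/* Placeholder detector: returns 1 if the pixel passes the detection test. */
static int detect_single_scatterer(int row, int col) {
    return ((row * 31 + col * 17) % 97) == 0;   /* stand-in for the real test */
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Coarse-grained split: each rank owns a contiguous stripe of rows. */
    int rows_per_rank = (ROWS + size - 1) / size;
    int row_begin = rank * rows_per_rank;
    int row_end = row_begin + rows_per_rank;
    if (row_end > ROWS) row_end = ROWS;

    long local_hits = 0;
    /* Fine-grained split: threads divide the pixels of the stripe. */
    #pragma omp parallel for reduction(+:local_hits) schedule(static)
    for (int r = row_begin; r < row_end; ++r)
        for (int c = 0; c < COLS; ++c)
            local_hits += detect_single_scatterer(r, c);

    long total_hits = 0;
    MPI_Reduce(&local_hits, &total_hits, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("detected scatterers: %ld\n", total_hits);

    MPI_Finalize();
    return 0;
}
```

A program of this shape would be built with an MPI compiler wrapper and OpenMP enabled, for example `mpicc -fopenmp`.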
Real-time applications are increasing in their complexity of control and computational demands. Parallel and distributed systems provide cost-efficient computing power and a higher degree of fault tolerance, which make them attractive platforms for the next generation of real-time systems. As real-time systems move from uniprocessor systems to parallel and distributed systems, their design becomes more complex and new techniques are required. This dissertation provides new approaches for solving two closely related problems in designing parallel and distributed real-time systems: dynamic scheduling of tasks with precedence relations and communication support for wormhole-routed networks. The task scheduling algorithm combines task graph partitioning, least-laxity-first scheduling and branch-and-bound task allocation techniques to provide the required real-time performance. Performance analysis shows that the algorithm can efficiently schedule precedence-constrained tasks with low scheduling overhead. The parameters that affect performance and hardware cost are studied to provide system designers with the means for fine-tuning the algorithm for different system configurations. The flow control scheme of a direct network manages the network resources and directly relates to system performance. Several flow control schemes are developed to support real-time communication on wormhole networks. The schemes differ in their priority mapping, priority adjustment, arbitration and message dropping strategies. A priority mapping scheme encodes the timing property of a message into a priority, which can be represented in a small number of digits. As the timing property of a message changes, a priority adjustment method modifies the priority to reflect the current status of the message. An arbitration function decides how to allocate bandwidth. Messages that miss their deadlines and lose their value are removed from the network by a message dropping mechanism.
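As a hedged, stand-alone illustration of the least-laxity-first rule named above (not the dissertation's implementation), the C sketch below dispatches the ready task with the smallest laxity, computed as deadline minus current time minus remaining execution time. The task parameters are made up for the example.

```c
/* Hedged sketch of least-laxity-first (LLF) task selection.
 * laxity = deadline - now - remaining execution time; the ready task
 * with the smallest laxity is dispatched next. Values are illustrative. */
#include <stdio.h>

typedef struct {
    const char *name;
    double deadline;   /* absolute deadline */
    double remaining;  /* remaining execution time */
} Task;

/* Return the index of the ready task with minimum laxity, or -1 if none. */
static int pick_llf(const Task *tasks, int n, double now) {
    int best = -1;
    double best_laxity = 0.0;
    for (int i = 0; i < n; ++i) {
        if (tasks[i].remaining <= 0.0) continue;           /* already finished */
        double laxity = tasks[i].deadline - now - tasks[i].remaining;
        if (best == -1 || laxity < best_laxity) {
            best = i;
            best_laxity = laxity;
        }
    }
    return best;
}

int main(void) {
    Task tasks[] = {
        {"T1", 10.0, 4.0},   /* laxity 6 at t = 0 */
        {"T2",  8.0, 3.0},   /* laxity 5 at t = 0: dispatched first */
        {"T3", 15.0, 6.0},   /* laxity 9 at t = 0 */
    };
    double now = 0.0;
    int next = pick_llf(tasks, 3, now);
    if (next >= 0)
        printf("dispatch %s (laxity %.1f)\n", tasks[next].name,
               tasks[next].deadline - now - tasks[next].remaining);
    return 0;
}
```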
The integration of the Internet of Things (IoT) into various industries has led to an exponential increase in the volume of data generated, posing significant challenges for compliance monitoring. Traditional complian...
ISBN (print): 9780769547909
Embedded real-time applications increasingly present high computation requirements, which need to be completed within specific deadlines. However, these applications exhibit highly variable load patterns, depending on the data set being processed at a given instant. The current trend toward parallel processing in the embedded domain provides higher processing power; however, it does not address the variability in the processing pattern. Dimensioning each device for its worst-case scenario implies lower average utilization and leaves processing capacity in the overall system available but unusable. A solution to this problem is to extend the parallel execution of the applications, allowing networked nodes to distribute the workload to neighbour nodes in peak situations. In this context, this paper proposes a framework to develop parallel and distributed real-time embedded applications, transparently using OpenMP and the Message Passing Interface (MPI), within a programming model based on OpenMP. The paper also devises an integrated timing model, which enables structured reasoning about the timing behaviour of these hybrid architectures.
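A minimal C sketch of the hybrid pattern described above, assuming a simple threshold-based offload decision: work is processed locally with an OpenMP loop, and when the local backlog exceeds a threshold the excess is shipped to a neighbour rank over MPI. The threshold, message tag, and work function are illustrative placeholders, not the proposed framework.

```c
/* Hedged sketch: OpenMP for local parallelism, MPI to offload excess work
 * to a neighbour node during a load peak. Constants are illustrative. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define N 100000
#define OFFLOAD_THRESHOLD 60000   /* backlog above which we offload */
#define TAG_WORK 1

static double process_item(int i) { return i * 0.5; }   /* placeholder work */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        int local_count = N;
        int offload = 0;
        if (size > 1 && local_count > OFFLOAD_THRESHOLD)
            offload = local_count - OFFLOAD_THRESHOLD;   /* peak: ship excess */
        local_count -= offload;

        /* Tell the neighbour how many items it should take (0 if none). */
        if (size > 1)
            MPI_Send(&offload, 1, MPI_INT, 1, TAG_WORK, MPI_COMM_WORLD);

        double sum = 0.0;
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < local_count; ++i)
            sum += process_item(i);
        printf("rank 0 processed %d items locally (checksum %.1f)\n",
               local_count, sum);
    } else if (rank == 1) {
        int count = 0;
        MPI_Recv(&count, 1, MPI_INT, 0, TAG_WORK, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        double sum = 0.0;
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < count; ++i)
            sum += process_item(i);
        printf("rank 1 processed %d offloaded items (checksum %.1f)\n",
               count, sum);
    }

    MPI_Finalize();
    return 0;
}
```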
ISBN (digital): 9791188428137
ISBN (print): 9798331507602
This investigation proposes a unique Dynamic Bio-Information Image Recognition System (DBIRS) built for the real-time monitoring of cancer cell apoptosis induced by available medical therapies. Utilizing advances in microscopy, AI-based image recognition, and parallel computing across distributed edge devices, the proposed system substantially boosts the precision and efficiency of cellular investigation. By deploying multiple AI modules adapted to distinct cellular states, the system provides automatic, high-accuracy recognition of morphological changes in cancer cells, thereby reducing the reliance on manual procedures. The approach offers remote monitoring and fast data processing via cloud integration, permitting continuous observation and timely intervention. The system's effectiveness is demonstrated through trials on K-562 cells, where it attained a recognition accuracy of 97.41%. These outcomes underline the potential of the proposed method to refine cytotoxicity assays, delivering a robust tool for increasing the efficacy and safety of immune cell treatments in clinical and research settings.
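As a hedged sketch of the "multiple AI modules adapted to distinct cellular states" idea, the C snippet below scores one frame with a placeholder recognizer per state and reports the highest-scoring state. The scoring function and state names stand in for the actual trained models and do not reflect the system's real pipeline.

```c
/* Hedged sketch: one specialized recognizer per cellular state scores the
 * frame; the highest-confidence state wins. Scores are placeholders. */
#include <stdio.h>

#define NUM_STATES 3
#define FRAME_LEN  64

static const char *state_names[NUM_STATES] = {
    "viable", "early-apoptotic", "late-apoptotic"
};

/* Placeholder per-state recognizer; a real module would run a trained model. */
static double score_state(int state, const unsigned char *frame, int len) {
    double s = 0.0;
    for (int i = 0; i < len; ++i)
        s += frame[i] * (state + 1);
    return s / (len * 255.0 * NUM_STATES);
}

int main(void) {
    unsigned char frame[FRAME_LEN];
    for (int i = 0; i < FRAME_LEN; ++i)
        frame[i] = (unsigned char)(i * 4);   /* synthetic frame data */

    int best = 0;
    double best_score = score_state(0, frame, FRAME_LEN);
    for (int s = 1; s < NUM_STATES; ++s) {
        double v = score_state(s, frame, FRAME_LEN);
        if (v > best_score) { best_score = v; best = s; }
    }
    printf("predicted state: %s (score %.2f)\n", state_names[best], best_score);
    return 0;
}
```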
Today's large-scale parallel workflows are often processed on heterogeneous distributed computing platforms. From an economic perspective, computing resource providers should minimize cost while offering high service quality. It is well recognized that energy consumption accounts for a large part of a computing system's total cost, and that timeliness and reliability are two important service indicators. This work studies the problem of scheduling a parallel workflow so as to minimize system energy consumption under response time and reliability constraints. We first formulate this problem mathematically as a non-linear mixed-integer programming problem. Since this problem is hard to solve directly, we present several highly efficient heuristic solutions. Specifically, we first develop an algorithm that minimizes the schedule length while meeting the reliability requirement, on top of which we propose a processor-merging algorithm and a slack-time reclamation algorithm using a dynamic voltage and frequency scaling (DVFS) technique to reduce energy consumption. The processor-merging algorithm tries to turn off energy-inefficient processors so that energy consumption can be minimized. The DVFS technique is applied to scale down the processor frequency at both the processor and task levels to reduce energy consumption. Experimental results on two real-life workflows and extensive synthetic parallel workflows demonstrate the effectiveness of these algorithms.
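A minimal sketch of the slack-time reclamation idea, assuming a cubic dynamic-power model: when a task has slack before its latest allowed finish time, DVFS stretches its execution into that slack by lowering the frequency. The numbers and the power model are illustrative assumptions, not the paper's.

```c
/* Hedged sketch of slack-time reclamation with DVFS. Execution is stretched
 * over (exec + slack) by lowering the frequency; with dynamic power roughly
 * proportional to f^3 and time proportional to 1/f, dynamic energy scales
 * roughly as f^2. Numbers are illustrative. */
#include <stdio.h>

int main(void) {
    double f_max = 2.0e9;       /* maximum frequency (Hz) */
    double exec_at_fmax = 4.0;  /* execution time at f_max (s) */
    double slack = 2.0;         /* slack before latest finish time (s) */

    /* Stretch execution over exec + slack by lowering the frequency. */
    double f_new = f_max * exec_at_fmax / (exec_at_fmax + slack);

    /* Rough dynamic-energy comparison against running flat out at f_max. */
    double energy_ratio = (f_new / f_max) * (f_new / f_max);

    printf("scaled frequency: %.2f GHz\n", f_new / 1e9);
    printf("dynamic energy vs. running at f_max: %.1f%%\n",
           energy_ratio * 100.0);
    return 0;
}
```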
Most computer-based systems have hard real-time constraints. Schedulers in complex systems must be designed to manage a set of applications developed and deployed independently. In this paper, we study an open real-time environment architecture for distributed systems in which real-time applications may run concurrently with non-real-time applications. The architecture uses a two-level scheduling scheme: each application is assigned a sporadic server to schedule the processes in the application, and all sporadic servers are then scheduled by a system-wide fixed-priority scheduler. Under the proposed open environment architecture, every hard real-time application is guaranteed its reserved CPU utilization and can therefore meet all of its deadlines, and this guarantee is independent of the behavior of all other applications in the same system. We present schedulability analysis methods for systems with and without shared memory.
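As a hedged illustration of the two-level scheme, the C sketch below performs one plausible server-level admission check: each application reserves a sporadic server with a budget and period, and a new server is admitted only if the total reserved utilization stays within the Liu-Layland bound for fixed-priority scheduling. This is an assumption made for illustration, not necessarily the paper's exact schedulability analysis.

```c
/* Hedged sketch: admission check for application-level sporadic servers
 * under a system-wide fixed-priority scheduler, using the Liu-Layland
 * utilization bound as one plausible test. Link with -lm for pow(). */
#include <math.h>
#include <stdio.h>

typedef struct {
    double budget;  /* server budget Q (reserved execution per period) */
    double period;  /* server replenishment period P */
} Server;

/* Admit the candidate only if total utilization fits the n-server bound. */
static int admit(const Server *existing, int n, Server candidate) {
    double u = candidate.budget / candidate.period;
    for (int i = 0; i < n; ++i)
        u += existing[i].budget / existing[i].period;
    double bound = (n + 1) * (pow(2.0, 1.0 / (n + 1)) - 1.0);  /* Liu-Layland */
    return u <= bound;
}

int main(void) {
    Server servers[] = { {2.0, 10.0}, {5.0, 20.0} };   /* reserved U = 0.45 */
    Server candidate = {3.0, 15.0};                    /* requests U = 0.20 */
    printf("admit new application server: %s\n",
           admit(servers, 2, candidate) ? "yes" : "no");
    return 0;
}
```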
ISBN (digital): 9798331509859
ISBN (print): 9798331509866
Over the past few years, large language models (LLMs) have evolved to enable a wide range of applications, from natural language understanding to real-time conversational agents. However, deploying LLMs in production presents significant challenges, especially with regard to the low-latency responses required for real-time interaction. This work investigates multi-node inference architectures for optimized deployment using open-source frameworks, with a focus on scalability, flexibility, and cost-effectiveness. We investigate methods such as microbatching, tensor and pipeline parallelism, and sophisticated load balancing that effectively distribute inference workloads across multiple nodes. We conduct extensive evaluations using popular open-source tools such as Kubernetes, Ray, and Envoy to benchmark the performance of these architectures in terms of latency, throughput, and resource utilization under diverse workloads. We also analyze the trade-offs between model replication and model partitioning, offering guidance on the most appropriate configuration for various deployment scenarios. Our results show that a well-orchestrated multi-node setup can greatly reduce inference latency while preserving high throughput, enabling the deployment of sophisticated LLMs in latency-sensitive applications. The paper provides a detailed analysis of multi-node inference strategies and their integration into open-source ecosystems, serving as a guide for practitioners seeking to deploy LLMs at scale. In summary, this work underlines how distributed architectures can overcome inherent limitations of single-node deployments and are crucial for achieving more efficient and responsive AI-driven services.
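As a hedged sketch of one load-balancing strategy in the family discussed above, the C snippet below routes each incoming inference request to the replica with the fewest outstanding requests. Replica names and counters are illustrative; a production deployment would typically delegate this decision to a proxy or serving layer such as Envoy or Ray Serve.

```c
/* Hedged sketch: least-outstanding-requests routing across LLM replicas.
 * Each request goes to the replica with the fewest in-flight requests. */
#include <stdio.h>

#define NUM_REPLICAS 3

typedef struct {
    const char *name;
    int outstanding;   /* requests currently in flight on this replica */
} Replica;

/* Pick the replica with the fewest outstanding requests. */
static int pick_replica(const Replica *r, int n) {
    int best = 0;
    for (int i = 1; i < n; ++i)
        if (r[i].outstanding < r[best].outstanding)
            best = i;
    return best;
}

int main(void) {
    Replica replicas[NUM_REPLICAS] = {
        {"llm-node-0", 4}, {"llm-node-1", 2}, {"llm-node-2", 7}
    };
    for (int req = 0; req < 5; ++req) {
        int i = pick_replica(replicas, NUM_REPLICAS);
        replicas[i].outstanding++;   /* dispatch the request to that replica */
        printf("request %d -> %s\n", req, replicas[i].name);
    }
    return 0;
}
```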
With the rapid expansion of the Internet of Things (IoT), the shift from cloud computing to Mobile Edge Computing (MEC) has become necessary to address the low-latency requirements of real-time applications. Verifiabl...