ISBN: (print) 9798350304831
The computational requirements of artificial intelligence workloads are growing exponentially. In addition, more and more compute is moving towards the edge due to latency or localization constraints. At the same time, Dennard scaling has ended and Moore's law is winding down. These trends have created an opportunity for specialized accelerators, including field-programmable gate arrays (FPGAs), but the poor support and usability of today's tools prevent FPGAs from being deployed at scale for deep neural network (DNN) inference applications. In this work, we propose an organic compiler, DOSA, that drastically lowers the barrier for deploying FPGAs. DOSA builds on the operation set architecture concept and integrates the DNN accelerator components generated by existing DNN-to-FPGA frameworks to produce an overall efficient solution. DOSA starts from DNNs represented in the community-standard ONNX format and automatically implements model and data parallelism, based on the performance targets and resource footprints provided by the user. Deploying a DNN using DOSA on 9 FPGAs exhibits a speedup of up to 52 times compared to a CPU and 18 times compared to a GPU.
With the increasing demand for personalized customization services, the discrete manufacturing workshop, as the parts-processing unit of a manufacturing system, is expected to adapt more quickly and flexibly to environmental changes and to handle production tasks dynamically according to resource conditions. At the same time, distributed artificial intelligence systems (e.g., multiagent manufacturing systems and holonic manufacturing systems) are considered an important approach for developing industrial applications that address the complexity, uncertainty, and dynamics of the modern manufacturing environment. However, a lack of universality and the difficulty of deployment have restricted the use of distributed artificial intelligence at actual industrial sites. To address this issue, this paper proposes a new concept, the agent computing node, to enable the realization of multiagent manufacturing systems. An adaptation layer, an information development layer, and an intelligent analysis layer are investigated to standardize the configuration mode of the agent computing node. The paper also presents the combination of the agent computing node with radio frequency identification-based dynamic recognition technology for the workpiece machining process, and considers a practical approach to multiagent manufacturing systems that supports the deployment of dynamic scheduling and plug-and-play. A laboratory discrete manufacturing workshop system is used as a case study to demonstrate the feasibility of this approach. In addition, a verification in industry is carried out, and the result demonstrates the universality of the approach.
Since the 1970s, most airlines have incorporated computerized support for managing disruptions during flight schedule execution. However, existing platforms for airline disruption management (ADM) employ monolithic system design methods that rely on the creation of specific rules and requirements through explicit optimization routines before a system that meets the specifications is designed. Thus, current ADM platforms cannot readily accommodate the additional system complexity that results from introducing new capabilities, such as unmanned aerial systems, operations, and infrastructure. To this end, historical data on airline scheduling and operations recovery are used to develop a system of artificial neural networks (ANNs), which describe a predictive transfer function model (PTFM) for promptly estimating the recovery impact of disruption resolutions at separate phases of flight schedule execution during ADM. Furthermore, this paper provides a modular approach for assessing and executing the PTFM by employing a parallel ensemble method to develop generative routines that amalgamate the system of ANNs. Our modular approach ensures that current industry standards for tardiness in flight schedule execution during ADM are satisfied, while accurately estimating appropriate time-based performance metrics for the separate phases of flight schedule execution.
DNNs of increasing computational complexity have achieved unprecedented successes in various areas such as machine vision and natural language processing (NLP); for example, recent advanced Transformer models have billions of parameters. However, as large-scale DNNs significantly exceed a GPU's physical memory limit, they cannot be trained by conventional methods such as data parallelism. Pipeline parallelism, which partitions a large DNN into small subnets and trains them on different GPUs, is a plausible solution. Unfortunately, the layer partitioning and memory management in existing pipeline-parallel systems are fixed during training, making them easily impeded by out-of-memory errors and GPU under-utilization. These drawbacks are amplified when performing neural architecture search (NAS), as in the Evolved Transformer, where different Transformer architectures need to be trained repeatedly. vPipe is the first system that transparently provides dynamic layer partitioning and memory management for pipeline parallelism. vPipe makes two unique contributions: (1) an online algorithm for searching for a near-optimal layer partitioning and memory management plan, and (2) a live layer migration protocol for re-balancing the layer distribution across a training pipeline. vPipe improved the training throughput of two notable baselines (Pipedream and GPipe) by 61.4-463.4 percent and 24.8-291.3 percent on various large DNNs and training settings.
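As a rough illustration of the layer-partitioning problem mentioned in the abstract above, the following sketch splits an ordered list of per-layer costs into contiguous pipeline stages so that the most expensive stage is as cheap as possible. This is a textbook static dynamic program, not vPipe's online algorithm; the function name `partition_layers` and the cost values are hypothetical, and memory limits and layer migration are not modeled.

```python
from typing import List, Tuple


def partition_layers(costs: List[float], num_stages: int) -> Tuple[float, List[List[int]]]:
    """Split layers (kept in order) into contiguous stages.

    Returns the smallest achievable bottleneck (max stage cost) and the
    layer indices assigned to each stage. Illustrative only.
    """
    n = len(costs)
    if not 1 <= num_stages <= n:
        raise ValueError("num_stages must be between 1 and the number of layers")

    # prefix[i] = total cost of layers 0..i-1
    prefix = [0.0]
    for c in costs:
        prefix.append(prefix[-1] + c)

    inf = float("inf")
    # best[k][i]: minimal bottleneck when the first i layers form k stages
    best = [[inf] * (n + 1) for _ in range(num_stages + 1)]
    # cut[k][i]: start index of the last stage in that optimal split
    cut = [[0] * (n + 1) for _ in range(num_stages + 1)]
    best[0][0] = 0.0

    for k in range(1, num_stages + 1):
        for i in range(k, n + 1):
            for j in range(k - 1, i):
                bottleneck = max(best[k - 1][j], prefix[i] - prefix[j])
                if bottleneck < best[k][i]:
                    best[k][i] = bottleneck
                    cut[k][i] = j

    # Walk the cut table backwards to recover the stages.
    stages: List[List[int]] = []
    i = n
    for k in range(num_stages, 0, -1):
        j = cut[k][i]
        stages.append(list(range(j, i)))
        i = j
    stages.reverse()
    return best[num_stages][n], stages


if __name__ == "__main__":
    # Hypothetical per-layer costs (e.g. milliseconds per forward/backward pass).
    bottleneck, stages = partition_layers([4, 2, 3, 1, 5, 2], num_stages=3)
    print(bottleneck, stages)
```

A fixed plan like this is computed once before training; the abstract's point is that such a plan can become unbalanced or run out of memory as conditions change, which motivates re-partitioning during training.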
ISBN: (print) 9798350331325
This study presents an innovative aggregation scheme for model-agnostic, local, heterogeneous data models within the domain of Federated Learning. The proposed approach imposes minimal constraints on local models, only necessitating local model parameters and distances from local data centroids for a particular query. These requirements facilitate the design of privacy-preserving learning systems. We introduce a system architecture based on federated interpolation to operationalize the proposed scheme. The accuracy of our proposed scheme is evaluated using two distinct real-world datasets. We compare our results to the extreme case of a single-client scenario having complete access to all data points. Our findings indicate that, on average, federated interpolation maintains robust accuracy, experiencing a slight reduction of less than 10% compared to the single-client model with full data access.
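The abstract above states that aggregation needs only the local models and the distance from each client's data centroid to the query, but it does not give the combination rule. The sketch below assumes one plausible rule, inverse-distance weighting of the local predictions, purely for illustration; the function `federated_interpolate`, the `eps` smoothing term, and the toy models are all hypothetical and should not be read as the paper's method.

```python
import math
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], float]


def federated_interpolate(
    query: Sequence[float],
    models: List[Model],
    centroids: List[Sequence[float]],
    eps: float = 1e-9,
) -> float:
    """Combine local predictions, weighting each client by 1 / distance.

    Assumption: clients whose data centroid is closer to the query are
    trusted more. Only predictions and distances are combined here; no
    raw client data is used.
    """
    if len(models) != len(centroids) or not models:
        raise ValueError("need one centroid per model and at least one model")

    # Euclidean distance from the query to each client's data centroid.
    distances = [math.dist(query, c) for c in centroids]
    # eps avoids division by zero when the query sits exactly on a centroid.
    weights = [1.0 / (d + eps) for d in distances]
    total = sum(weights)
    return sum(w * m(query) for w, m in zip(weights, models)) / total


if __name__ == "__main__":
    # Two toy clients with constant local models and centroids at -1 and +1.
    models = [lambda x: 2.0, lambda x: 4.0]
    centroids = [(-1.0,), (1.0,)]
    print(federated_interpolate((0.0,), models, centroids))
```

With this rule, a query halfway between the two centroids receives the plain average of the two local predictions, while a query near one centroid is dominated by that client's model.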
Conventional machine learning techniques are conducted in a centralized manner. Recently, the massive volume of generated wireless data, privacy concerns, and the increasing computing capabilities of wireless end-devices have led to the emergence of a promising decentralized solution, termed Wireless Federated Learning (WFL). In this first part of a two-part letter, we present the application of WFL in the sixth generation of wireless networks (6G), which is envisioned to be an integrated communication and computing platform. After analyzing the key concepts of WFL, we discuss the core challenges of WFL imposed by the wireless (or mobile communication) environment. Finally, we shed light on the future directions of WFL, aiming at a constructive integration of FL into future wireless networks.
The Agents, Interaction and Complexity research group at the University of Southampton has a long track record of research in multiagent systems (MAS). We have made substantial scientific contributions across learning in MAS, game-theoretic techniques for coordinating agent systems, and formal methods for representation and reasoning. We highlight key results achieved by the group and elaborate on recent work and open research challenges in developing trustworthy autonomous systems and deploying human-centred AI systems that aim to support societal good.
ISBN: (print) 9783031298561; 9783031298578
Surgical planning is a key step in the management of operating theaters, which are in ever greater demand; given the multiplicity and complexity of the human and material components involved, it must cope with the various disturbances that hinder the normal course of surgical activity. The subject is widely discussed in the literature, and several solutions and applications have been developed, but they remain largely incompatible with the realities of the surgical process. For this reason, we propose daily surgical planning carried out with a multi-agent system (MAS) based on distributed artificial intelligence (DAI). We describe some basic architectural entities of MAS in relation to surgical planning, before presenting their application to a real case in the orthopedic surgery department B4 of the CHU Hassan II of Fez, Morocco. The objective of this work is to develop a daily, dynamic, real-time surgical program that responds to the various possible and frequent disturbances affecting the operating theater process.
ISBN: (print) 9798350310900
Federated Learning (FL) has emerged as a privacy-preserving distributed learning framework that enables IoT devices to collaboratively train machine learning models by sharing model parameters. However, the inefficiency caused by frequent parameter transmissions significantly reduces FL performance. Existing acceleration algorithms for speeding up FL training fall into two main types, local update and parameter compression, which consider the trade-offs between communication and computation and between communication and precision, respectively. Jointly considering these two trade-offs and adaptively balancing their impacts on convergence remain unresolved. To solve this problem, we propose an efficient adaptive federated optimization (EAFO) algorithm to improve the efficiency of FL in resource-constrained IoT environments. EAFO minimizes the learning error by jointly considering two variables: the local update and parameter compression. It enables FL to adaptively adjust these two variables and balance the trade-offs among computation, communication, and precision. The experimental results demonstrate the effectiveness of the proposed EAFO algorithm, which achieves higher accuracy faster than state-of-the-art algorithms.
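The two knobs named in the abstract above, local update and parameter compression, can be made concrete with a small sketch. It assumes plain SGD for the local steps and top-k sparsification as the compression scheme; both are common choices picked here for illustration, and the names `local_sgd`, `top_k_sparsify`, and `client_update` are hypothetical. The adaptive tuning of the two knobs that EAFO performs is not shown.

```python
from typing import Callable, List

Vector = List[float]


def local_sgd(weights: Vector, grad_fn: Callable[[Vector], Vector],
              lr: float, steps: int) -> Vector:
    """Run `steps` plain SGD updates locally before communicating.

    More local steps mean fewer communication rounds but more computation.
    """
    w = list(weights)
    for _ in range(steps):
        g = grad_fn(w)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


def top_k_sparsify(update: Vector, k: int) -> Vector:
    """Keep the k largest-magnitude entries and zero out the rest.

    Smaller k means less data to transmit but a less precise update.
    """
    if k <= 0:
        return [0.0] * len(update)
    order = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(update)]


def client_update(global_weights: Vector, grad_fn: Callable[[Vector], Vector],
                  lr: float, steps: int, k: int) -> Vector:
    """One client round: local training, then a compressed weight delta."""
    local = local_sgd(global_weights, grad_fn, lr, steps)
    delta = [l - g for l, g in zip(local, global_weights)]
    return top_k_sparsify(delta, k)


if __name__ == "__main__":
    # Toy objective 0.5 * ||w||^2, whose gradient is w itself.
    print(client_update([1.0, 4.0], lambda w: list(w), lr=0.5, steps=1, k=1))
```

In this sketch `steps` and `k` are fixed by the caller; an adaptive scheme would adjust them from round to round based on the observed communication, computation, and precision costs.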
The proliferation of edge computing technologies has boosted the development of new applications for a plethora of edge devices. However, many applications face privacy issues and bandwidth limitations. To address these limitations, we propose a collaborative learning framework on the edges, named CLONE, which is driven by real-world data sets collected from a large electric vehicle (EV) company and from a grocery store in a shopping mall. We identify two application scenarios for CLONE: CLONE in the training stage (CLONE_training) and CLONE in the inference stage (CLONE_inference). For CLONE_training, we choose the failure prediction of EV batteries and associated components as the first use case. For CLONE_inference, customer tracking in a grocery store is selected as a second case study. In this work, the goal of CLONE is to support real-time training and inference for connected vehicles and marketing intelligence services. Our experimental results on the EV data show that CLONE is able to reduce model training time without sacrificing algorithm performance. Furthermore, the experimental results on the video data from the grocery store reveal that CLONE is a useful approach for solving the multitarget multicamera tracking problem in a collaborative fashion.