ISBN:
(Print) 9798350364613; 9798350364606
Many real-world deep learning applications are sensitive to model size and latency, especially when inference is performed on low-power devices. In many scenarios, DNNs can be optimized to meet latency and energy budgets without sacrificing much accuracy. NetAdapt, a recently developed tool, tackles this issue by optimizing the network through the removal of layers, guided by practical latency measurements on the target device. In this work, we improve two stages of NetAdapt's algorithm. First, we enhance the candidate-model search procedure by improving the mechanism that interpolates latency measurements for configurations not available in the look-up table. Second, and more importantly in terms of novelty, we study the potential benefits of integrating no-training NAS techniques into the hardware-aware optimization algorithm to skip the fine-tuning and evaluation stages with minimal accuracy loss. The experimental evaluation shows that the applied optimizations can accelerate NetAdapt's algorithm by up to 37x for a large network and 4x for a small one.
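To make the first improvement concrete: when a candidate configuration's channel count is missing from the on-device latency look-up table, its latency must be estimated from the nearest measured entries. The sketch below is a minimal, hypothetical version of such interpolation; the function name, LUT shape, and values are illustrative and not NetAdapt's actual API.

```python
import bisect

def interpolate_latency(lut, channels):
    """Estimate the latency of a channel count absent from the LUT.

    lut: dict mapping measured channel counts -> latency (ms),
         sampled on a grid by on-device measurement.
    """
    keys = sorted(lut)
    if channels <= keys[0]:
        return lut[keys[0]]          # clamp below the measured range
    if channels >= keys[-1]:
        return lut[keys[-1]]         # clamp above the measured range
    i = bisect.bisect_left(keys, channels)
    if keys[i] == channels:
        return lut[channels]         # exact measurement available
    lo, hi = keys[i - 1], keys[i]
    t = (channels - lo) / (hi - lo)  # linear blend between neighbours
    return lut[lo] + t * (lut[hi] - lut[lo])

lut = {8: 1.0, 16: 1.8, 32: 3.5}     # latency in ms, measured on-device
print(interpolate_latency(lut, 24))  # midway between the 16- and 32-channel entries
```

A real table would also be indexed by layer and input resolution, but the clamping and linear blending shown here carry over directly.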
This article describes the possibility of parallel computing in networks-on-chip (NoCs). The NoC is considered as an MPSoC, a multiprocessor system whose processors are combined into a single network. The prob...
ISBN:
(Print) 9789819708000; 9789819708017
In this paper, we focus on the cloud-edge collaborative network, where a task is decomposed into a set of functions that can be offloaded to different computing nodes, which is referred to as Function Computation Offloading (FCO). One of the most important problems in FCO is scheduling the functions on computing nodes to achieve low latency and high reliability. We formulate FCO scheduling in the cloud-edge collaborative network as a mixed-integer nonlinear program whose objective is to minimise the end-to-end delay of a task while satisfying the latency and reliability constraints. To solve the problem, we propose an efficient mechanism to decide the redundancy of each function according to the reliability requirements. Then, we deploy the non-redundant functions on the computing nodes. Finally, we present a Reinforcement Learning (RL) approach to learn the scheduling policy for the redundant functions and further reduce the end-to-end delay of the task. Simulation results show that our proposed algorithm can reduce tasks' completion time by about 13-26% with fewer iterations compared with other alternatives.
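As a hedged illustration of the redundancy-decision step (the paper's actual mechanism is not detailed in the abstract): if a function fails independently on each node with probability 1 - p, the smallest replica count k satisfying 1 - (1 - p)^k >= R meets a reliability requirement R. A minimal sketch, with all names assumed:

```python
def replicas_needed(node_reliability, required_reliability):
    """Smallest k such that at least one of k independent replicas
    succeeds with probability >= required_reliability."""
    p, R = node_reliability, required_reliability
    k, all_fail = 1, 1.0 - p          # probability that every replica fails
    while 1.0 - all_fail < R:
        k += 1
        all_fail *= 1.0 - p           # one more independent replica
    return k

# One replica succeeds 90% of the time; three are needed for 99.9%.
print(replicas_needed(0.9, 0.999))   # -> 3
```

The scheduler would then place these k replicas on distinct nodes before the RL policy orders their execution.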
ISBN:
(Print) 9798350391558; 9798350379990
The rapid evolution of the IIoT (Industrial Internet of Things) has brought about numerous security concerns, among which is the looming threat of False Data Injection (FDI) attacks. To address these attacks, this study introduces a novel approach called MLBT-FDIA-IIoT (False Data Injection Attack detection in IIoT using parallel Physics-Informed Neural Networks with the Giza Pyramid Construction Optimization algorithm). This method makes use of real-time sensor data for attack detection. The data is preprocessed using Distributed Set-Membership Fusion Filtering (DSMFF) to remove noise and then fed into a neural network for classification. Specifically, parallel Physics-Informed Neural Networks (PPINN) are used to distinguish between normal operations and False Data Injection Attacks (FDIAs). However, PPINN lacks an optimization method for accurate detection. To address this, the study proposes the Giza Pyramid Construction Optimization algorithm (GPCOA), which optimizes the PPINN classifier to detect attacks with greater precision. The proposed MLBT-FDIA-IIoT method is implemented in MATLAB and evaluated on various metrics such as accuracy, recall, and precision. The results demonstrate significant improvements compared to existing techniques such as MLT-FDI-IIoT, FDIA-FDAS-IIoT, and DCDD-IIoT-FDIA.
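The reported metrics have precise definitions. As a generic illustration (not the paper's code), accuracy, precision, and recall for a binary FDIA detector, with label 1 denoting an attack, follow from the confusion counts:

```python
def detection_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary attack labels (1 = FDIA)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),   # all correct / all samples
        "precision": tp / (tp + fp),           # flagged attacks that are real
        "recall": tp / (tp + fn),              # real attacks that were flagged
    }

print(detection_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

High recall matters most for FDI detection, since a missed injection (a false negative) lets corrupted data reach the control loop.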
Authors:
Liu, Kangkang; Chen, Ningjiang
Guangxi Univ, Coll Comp & Elect Informat, Nanning, Peoples R China
Guangxi Univ, Educ Dept Guangxi Zhuang Autonomous Reg, Key Lab Parallel Distributed & Intelligent Comp, Nanning, Peoples R China
ISBN:
(Print) 9798350349184; 9798350349191
With their increasing performance, deep convolutional neural networks have been widely used in many computer vision tasks. However, a large convolutional neural network model requires substantial memory and computing resources, which makes it difficult to meet the low-latency and reliability requirements of edge computing when the model is deployed locally on resource-limited devices. Quantization is a model compression technique that can effectively reduce model size, computation cost, and inference delay, but quantization noise decreases the accuracy of the quantized model. To address this precision loss, this paper proposes a post-training quantization method based on scale optimization. By reducing the influence of redundant parameters on the quantization parameters, the scale factor is optimized to reduce the quantization error, thereby improving the accuracy of the quantized model, reducing inference delay, and improving the reliability of edge applications. The experimental results show that, under different quantization strategies and bit widths, the proposed method improves the accuracy of the quantized model, with the best quantized model gaining 1.36% in absolute accuracy. This clear improvement facilitates the application of deep neural networks in edge environments.
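A minimal sketch of the general idea, not the paper's method: symmetric post-training quantization where the scale factor is chosen by searching over clipping ratios to minimize reconstruction error, so that rare outlier weights do not inflate the quantization step for everything else. All names and candidate ratios below are illustrative:

```python
def quantize(weights, n_bits=8, clip_ratio=1.0):
    """Symmetric uniform quantization; clip_ratio < 1 shrinks the scale
    so outlier weights saturate instead of widening the step size."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = clip_ratio * max(abs(w) for w in weights) / qmax
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return [v * scale for v in q], scale

def mse(a, b):
    """Mean squared reconstruction error between two weight lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Pick the scale (via the clip ratio) that best reconstructs the weights.
w = [0.12, -0.2, 0.15, 0.05, -0.08, 0.9]   # one large outlier
best_err, best_ratio = min(
    (mse(w, quantize(w, 4, r)[0]), r) for r in (1.0, 0.75, 0.5, 0.25)
)
```

Real post-training quantization methods optimize per-layer or per-channel scales against calibration activations rather than raw weight MSE, but the search structure is the same.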
The active millimeter-wave scanner plays an increasingly pivotal role in public safety by employing a non-contact method to detect contraband concealed beneath human clothing. However, millimeter-wave images encounter...
ISBN:
(Print) 9798350391954
The proceedings contain 53 papers. The topics discussed include: cloud-enabled blood bank management for an efficient healthcare system; a face forgery video detection model based on knowledge distillation; design of a sharing system based on privacy-preserving personal data; optimizing software evolution: navigating the landscape through concept location; DGBot: a DeGlobalizing graph transformer model for bot detection; radio number of the Cartesian product of stars and middle graph of cycles; image denoising based on Swin transformer residual Conv U-Net; a social network analysis of user-organized community on digital music platform; differential game and simulation of supply chain joint promotion considering spillover effect; and analysis of subjective evaluation of AI speech synthesis emotional expressiveness.
ISBN:
(Print) 9798350386066; 9798350386059
Containers are widely deployed in clouds. There are two common container architectures: operating-system-level (OS-level) containers and virtual-machine-level (VM-level) containers; typical examples are runc and Kata. It is well known that VM-level containers provide better isolation than OS-level containers, but at a higher overhead. Although there are quantitative analyses of the performance gap between these two container architectures, they rarely discuss the gap under constrained resources provisioned to containers. Since high-density deployment of containers is in demand in the cloud, each container is provisioned with limited resources specified by the cgroup mechanism. In this paper, we provide an in-depth analysis of the storage and network (two key aspects) performance differences between runc and Kata under varying resource constraints. We identify configuration implications that are crucial to performance and find that some of them are not exposed by the Kata interfaces. Based on that, we propose a profiling tool that automatically offers configuration suggestions for optimizing container performance. Our evaluation shows that the auto-generated configuration can improve the performance of MySQL by up to 107% in the TPCC benchmark compared with the default Kata setup.
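For concreteness, the per-container limits referred to above are set through cgroup-v2 interface files. The sketch below renders the two most common values; it is a generic illustration of the mechanism, not the paper's profiling tool, and the function name and numbers are assumptions:

```python
def cgroup_v2_limits(cpus, mem_mib, period_us=100_000):
    """Render cgroup-v2 limit values for a container's resource budget.

    cpu.max is "<quota> <period>": the cgroup may consume `quota`
    microseconds of CPU time per `period` microseconds, i.e. `cpus` cores.
    memory.max is a hard memory ceiling in bytes.
    """
    return {
        "cpu.max": f"{int(cpus * period_us)} {period_us}",
        "memory.max": str(mem_mib * 1024 * 1024),
    }

# A container capped at 1.5 CPUs and 512 MiB of memory:
print(cgroup_v2_limits(1.5, 512))
# -> {'cpu.max': '150000 100000', 'memory.max': '536870912'}
```

A container runtime writes such values into the container's cgroup directory (e.g. under /sys/fs/cgroup/), which is why the same nominal budget can behave differently in runc and Kata.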
The paper details an Ethereum blockchain platform for smart grid energy trading, employing smart contracts and security measures like access control. It separates front-end and back-end, supporting secure integration ...
A scalable bandwidth-adaptive on-chip storage network architecture is proposed to address the severe data conflict and low bus parallelism in existing multi-level storage, Crossbar, and NoC architectures in edge accel...