Mobile edge computing brings artificial-intelligence computation close to terminals; to reduce latency and save energy, Deep Neural Networks (DNNs) are partitioned so that part of each inference task is offloaded to the edge for execution. Most existing studies assume that tasks are of the same type or that servers have identical computing resources. In practice, Mobile Devices (MDs) and Edge Servers (ESs) are heterogeneous in both type and computing resources, which makes it challenging to find the optimal partition point for each DNN and offload it to an appropriate ES. To fill this gap, we propose a partitioning-and-offloading scheme for heterogeneous task-server systems that reduces the overall latency and energy consumption of DNN inference. The scheme has four steps. First, it establishes a partitioning and task-offloading model for adaptive DNN models. Second, to reduce the solution space, it designs a Partition Point Retain (PPR) algorithm. Third, an Optimal Partition Point (OPP) algorithm finds the minimum-cost partition point for each ES corresponding to each MD. Finally, based on these partition points, the DNN task of each MD is offloaded to complete the scheme. Simulations show that the proposed scheme reduces the total cost by 77.9% and 59.9% on average compared with Only-Local and Only-Server, respectively, in a heterogeneous edge computing environment.
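The core of such a scheme can be illustrated with a minimal sketch of the partition-point search: run the first k layers on the mobile device, transmit the intermediate output, and run the rest on the edge server, choosing the k that minimizes total cost. All names and the simple latency-only cost model below are illustrative assumptions, not the paper's actual formulation (which also weighs energy and per-ES heterogeneity).

```python
def optimal_partition_point(md_lat, es_lat, out_mb, bw_mbps):
    """Hypothetical sketch: pick the split index k minimizing
    device time for layers [0, k) + transfer time of the data sent
    at the split + server time for layers [k, n).

    md_lat[i], es_lat[i]: latency (s) of layer i on the MD / the ES.
    out_mb[k]: megabytes transmitted when splitting after layer k
               (out_mb[0] = raw input size; out_mb[n] = 0, fully local).
    bw_mbps: uplink bandwidth in megabits per second.
    """
    n = len(md_lat)
    cost, k = min(
        (sum(md_lat[:k]) + out_mb[k] * 8 / bw_mbps + sum(es_lat[k:]), k)
        for k in range(n + 1)  # k = 0: fully offloaded; k = n: fully local
    )
    return k, cost

# Toy example: a fast server behind a slow 8 Mbps link. The best split
# sits right after the layer whose output is smallest.
k, cost = optimal_partition_point(
    md_lat=[10, 10, 10], es_lat=[1, 1, 1],
    out_mb=[100, 1, 1, 0], bw_mbps=8,
)
```

With these numbers the search keeps layer 0 local (its small output is cheap to ship) and offloads the rest, beating both the fully local (30 s) and fully offloaded (103 s) extremes.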
ISBN:
(Print) 9789819608041; 9789819608058
Deep Neural Networks (DNNs) have emerged as the preferred solution for Internet of Things (IoT) applications owing to their remarkable performance. However, the inherent complexity of DNNs poses significant challenges for IoT devices constrained by limited computational power and battery life. To handle demanding inference tasks, edge computing is leveraged, enabling collaborative DNN inference between IoT devices and edge servers. However, existing research rarely focuses simultaneously on the power consumption of IoT devices, the latency of collaborative inference, and the cost of edge servers. Moreover, current work seldom considers the deployment of multiple DNN applications on IoT devices, a critical factor for adapting to increasingly complex edge-end collaborative environments. This research optimizes the inference power consumption of multiple DNN applications deployed on IoT devices in larger-scale edge-end collaboration environments, under constraints on maximum end-to-end latency and edge-server cost. To address this problem, we propose the Greedy Genetic Algorithm, which combines a greedy strategy with a Genetic Algorithm. Extensive experiments demonstrate that the proposed method achieves lower inference power consumption with fewer iterations than existing solutions.
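The general shape of a greedy-seeded genetic algorithm can be sketched as follows. A chromosome assigns one offloading decision per application; one greedy individual (each app picks its individually cheapest decision) seeds the initial population, and standard elitist selection, one-point crossover, and mutation evolve it. Everything here (function names, the additive cost model that treats apps as independent) is an illustrative assumption; the paper's fitness would couple applications through shared latency and server-cost constraints.

```python
import random

def greedy_ga(per_app_cost, pop_size=20, gens=100, mut_rate=0.1, seed=0):
    """Hypothetical sketch: minimize total cost over per-app decisions.

    per_app_cost[i][j]: cost of running app i under decision j
    (e.g. power at a given partition/offloading choice).
    """
    rng = random.Random(seed)
    n, m = len(per_app_cost), len(per_app_cost[0])
    cost = lambda ch: sum(per_app_cost[i][ch[i]] for i in range(n))

    # Greedy seed: each app independently picks its cheapest decision.
    greedy = [min(range(m), key=lambda j, i=i: per_app_cost[i][j])
              for i in range(n)]
    pop = [greedy] + [[rng.randrange(m) for _ in range(n)]
                      for _ in range(pop_size - 1)]

    for _ in range(gens):
        pop.sort(key=cost)              # elitist selection
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)   # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < mut_rate: # point mutation
                child[rng.randrange(n)] = rng.randrange(m)
            children.append(child)
        pop = elite + children

    return min(pop, key=cost)

best = greedy_ga([[5, 1], [2, 4], [3, 3]])  # 3 apps, 2 decisions each
```

Because this toy cost is additive over independent apps, the greedy seed is already optimal and the GA simply preserves it; the evolutionary search earns its keep once the constraints couple the applications.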