The advent of the computing power network (CPN) has opened up vast opportunities for machine learning inference, yet the challenge of reducing the high operational cost caused by intensive computation and the sheer volume of inference tasks cannot be overlooked. Scheduling inference tasks to mitigate operational cost involves several challenges, such as migrating tasks under unpredictable CPN status, making time-coupled decisions for resource provisioning, and selecting computing sites based on dynamic electricity prices. To address these issues, we introduce CPN-Inference, a novel and flexible inference framework built upon CPN. Specifically, we formulate a time-varying integer programming problem that aims to minimize long-term cost, comprising switching cost, operational cost, communication cost, queuing cost, and accuracy loss. We also propose a group of polynomial-time online algorithms that support the formulated problem by solving carefully constructed subproblems based on inputs predicted via online learning. Furthermore, we prove the competitive ratio of our algorithms, characterizing the performance gap between our approach and the offline optimum. A testbed is constructed to evaluate inference performance on real devices. Our comprehensive evaluations, based on datasets from real systems, demonstrate that our algorithms outperform multiple alternatives, achieving an average cost reduction of 35%.
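The abstract does not state the formulation itself; as a purely illustrative sketch, a long-term objective combining the five cost components named above might take the following form, where $x_t$ is a hypothetical integer scheduling decision for time slot $t$ and each $C$ term is a placeholder for the corresponding cost:

\[
\min_{\{x_t \in \mathcal{X}\}_{t=1}^{T}} \; \sum_{t=1}^{T} \Big[ C^{\mathrm{sw}}(x_t, x_{t-1}) + C^{\mathrm{op}}_t(x_t) + C^{\mathrm{cm}}_t(x_t) + C^{\mathrm{q}}_t(x_t) + C^{\mathrm{acc}}_t(x_t) \Big]
\]

Here $\mathcal{X} \subset \mathbb{Z}^n$ denotes an assumed integer feasible set, and the superscripts label the switching, operational, communication, queuing, and accuracy-loss terms. The switching term couples consecutive time slots, which is what makes the decisions time-coupled and motivates an online algorithm with a competitive-ratio guarantee against the offline optimum.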