Data-intensive, batch-processing scientific workflows are a typical application model in the era of big data. A reasonable scheduling method can improve the resource utilization rate and reduce the rental cost on the pre...
ISBN: 9798350397390 (print)
Tensor algebra, the main component of several popular machine learning techniques, benefits from modern accelerators due to the massive parallelism and data reuse available. To achieve these benefits, however, optimizing the dataflow is crucial: prior works showed that 19x energy savings are possible by tuning the dataflow. This optimization is challenging because: (1) the optimization space for modern chip architectures with several levels of memory and multiple levels of spatial processing is vast, and (2) distinct tensor computations follow different memory access and reuse patterns. In this manuscript, we algebraically analyze the possible reuse when executing tensor workloads on an accelerator. Based on our analysis, we develop several principles that significantly reduce the dataflow optimization space even for modern, complex chip architectures. Moreover, these principles are transferable to various tensor workloads with different memory access patterns. Compared to prior work, our techniques can find dataflows for typical tensor workloads up to 800x faster and with up to 1.9x better energy-delay products.
Existing scheduling strategies divide execution units at too coarse a granularity and cannot flexibly use attack feedback information, so they lack security to some extent. Therefore, this paper pr...
In recent years, Kubernetes (K8s) has become a dominant resource management and scheduling system in the cloud. In practical scenarios, short-running cloud workloads are usually scheduled through different scheduling algorithms provided by Kubernetes. For example, artificial intelligence (AI) workloads are scheduled through different Volcano scheduling algorithms, such as GANG_MRP, GANG_LRP, and GANG_BRA. One key challenge is that the choice of scheduling algorithm has a considerable impact on job performance. However, selecting the optimal algorithm takes a prohibitively long time, because applying one algorithm to a single job may take a few minutes to complete. This creates an urgent need for a simulator that can quickly evaluate the performance impact of different algorithms while also considering scheduling-related factors such as cluster resources, job structures, and scheduler configurations. In this paper, we design and implement a Kubernetes simulator called K8sSim, which incorporates typical Kubernetes and Volcano scheduling algorithms for both generic and AI workloads and provides an accurate simulation of their scheduling process in real clusters. We use real cluster traces from Alibaba to evaluate the effectiveness of K8sSim, and the evaluation results show that (i) compared to the real cluster, K8sSim can accurately evaluate the performance of different scheduling algorithms with similar CloseRate (a novel metric we define to intuitively show the simulation accuracy), and (ii) it can also quickly obtain the scheduling results of different scheduling algorithms, accelerating scheduling by an average of 38.56x.
The Internet of Things (IoT) is rapidly evolving, and this has supported the adoption of a new computing paradigm that moves processing power to the network's edge. The job must be assigned to the computer nodes, ...
In this paper, we consider the on-line single machine scheduling problem with release dates and submodular rejection penalties. We are given a single machine and a sequence of jobs that arrive on-line and must be imme...
Flex-route feeder transit is a promising way to improve the flexibility of first/last mile service. In this paper, we propose an optimal scheduling strategy which can provide immediate feedback to enhance the attracti...
In the current world of big data analytics and huge processing requests from client-server systems, the service requirements placed on computing cores have been growing exponentially. In this paper, we propose a novel i...
Given the trade-off between computation time and optimisation in task scheduling, it is challenging for a single scheduling method to achieve optimal scheduling while minimizing the very worst computat...
In the Integrated Space and Onboard Network, transmission demands arrive randomly. In addition, the routing algorithm calculates the end-to-end paths based on the average occupied bandwidth over a period of time, causing microb...