ISBN: (print) 9781509036820
Matrix computation is at the core of many machine learning and graph algorithm workloads. In the traditional single-node era, numerical analysis platforms such as R and Matlab provided a matrix programming model natively. As data volumes grow in the Big Data era, there is increasing demand to integrate large-scale matrix computation seamlessly into distributed data-parallel computing systems. A variety of matrix computation libraries have therefore been implemented on distributed computing platforms such as MPI, Hadoop and Spark. However, the same matrix-based algorithm can perform quite differently on different platforms, and it is very challenging for data scientists to choose the platform, or combination of platforms, that gives the best performance for a given algorithm workflow. To solve this problem, in this paper we put forward a time-cost-based scheduling framework that automatically specifies the best platforms for the matrix operations and schedules the execution workflow. We have implemented a system prototype that uses R as the user language and MPI, R and Spark as the backend computing platforms. The experimental results show that our time-cost-based model is accurate, with an average error rate of less than 10%. Moreover, the scheduling framework built on it achieves efficient performance in applications.
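As a rough illustration of what time-cost-based platform selection might look like, the sketch below chooses one platform per operation in a linear workflow with a small dynamic program. It is not the paper's model: the per-operation cost estimates, the flat `transfer_cost` charged for each platform switch, and the names `PLATFORMS` and `schedule_chain` are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Backends named in the abstract; everything else here is hypothetical.
PLATFORMS = ["R", "MPI", "Spark"]


def schedule_chain(
    op_costs: List[Dict[str, float]], transfer_cost: float
) -> Tuple[float, List[str]]:
    """Pick one platform per operation in a linear workflow, minimizing
    estimated run time plus a flat penalty for each platform switch."""
    # best[p] = (total cost, plan) of the cheapest plan that ends on platform p
    best = {p: (op_costs[0][p], [p]) for p in PLATFORMS}
    for costs in op_costs[1:]:
        nxt = {}
        for p in PLATFORMS:
            candidates = []
            for q in PLATFORMS:
                prev_cost, prev_plan = best[q]
                # Assumed: moving data between platforms costs a fixed amount.
                switch = 0.0 if q == p else transfer_cost
                candidates.append((prev_cost + switch + costs[p], prev_plan + [p]))
            nxt[p] = min(candidates, key=lambda c: c[0])
        best = nxt
    return min(best.values(), key=lambda c: c[0])


# Made-up time estimates (seconds) for a three-operation workflow.
estimates = [
    {"R": 1.0, "MPI": 5.0, "Spark": 8.0},
    {"R": 50.0, "MPI": 4.0, "Spark": 6.0},
    {"R": 40.0, "MPI": 3.0, "Spark": 9.0},
]
total, plan = schedule_chain(estimates, transfer_cost=2.0)
print(total, plan)
```

With these invented numbers, the cheapest plan runs the first operation in R and then switches to MPI, because the single transfer penalty is smaller than the time saved on the later operations. A real cost model would presumably estimate each operation's time from matrix dimensions and cluster configuration instead of using a fixed table.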
ISBN: (print) 9781479980062
Recent emerging applications from a wide range of scientific domains often require a very large number of loosely coupled tasks to be processed efficiently. To support such applications effectively, all the available resources from different types of computing platforms, such as supercomputers, grids, and clouds, need to be utilized. However, exploiting heterogeneous resources from these platforms for multiple loosely coupled many-task applications is challenging, since the performance of an application can vary significantly depending on which platform is used to run it and which applications co-run with it on the same node. In this paper, we analyze the platform and co-runner affinities of many-task applications in distributed computing platforms. We perform a comprehensive experimental study using four different platforms and five many-task applications. We then present a two-level scheduling algorithm, which in the first level distributes the resources of the different platforms to each application based on platform affinity, and in the second level maps the tasks of the applications to computing nodes based on the co-runner affinity for each platform. Finally, we evaluate the performance of our scheduling algorithm using a trace-based simulator. Our simulation results demonstrate that our scheduling algorithm can improve performance by up to 30.0% compared to a baseline scheduling algorithm.
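The two-level idea can be sketched in a few lines, under heavy simplification. The code below is not the paper's algorithm: the proportional split in level one, the greedy pairing in level two, the positive numeric affinity scores, and the function names `allocate_by_platform_affinity` and `pair_co_runners` are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def allocate_by_platform_affinity(
    nodes: Dict[str, int], affinity: Dict[str, Dict[str, float]]
) -> Dict[str, Dict[str, int]]:
    """Level 1 (assumed policy): split each platform's nodes among the
    applications in proportion to their platform-affinity scores."""
    shares = {}
    for platform, count in nodes.items():
        scores = {app: aff[platform] for app, aff in affinity.items()}
        total = sum(scores.values())  # assumed to be positive
        alloc = {app: int(count * s / total) for app, s in scores.items()}
        # Nodes lost to rounding down go to the highest-affinity applications.
        leftover = count - sum(alloc.values())
        for app in sorted(scores, key=scores.get, reverse=True)[:leftover]:
            alloc[app] += 1
        shares[platform] = alloc
    return shares


def pair_co_runners(
    apps: List[str], co_affinity: Dict[Tuple[str, str], float]
) -> List[Tuple[str, str]]:
    """Level 2 (assumed policy): greedily pick pairs of applications to
    share a node, taking the highest co-runner affinity first."""
    used, result = set(), []
    for a, b in sorted(co_affinity, key=co_affinity.get, reverse=True):
        if a in apps and b in apps and a not in used and b not in used:
            result.append((a, b))
            used.update((a, b))
    return result


# Made-up scores for two applications on one platform.
print(allocate_by_platform_affinity(
    {"cloud": 10}, {"A": {"cloud": 3.0}, "B": {"cloud": 1.0}}
))
print(pair_co_runners(
    ["A", "B", "C", "D"],
    {("A", "B"): 0.9, ("A", "C"): 0.5, ("C", "D"): 0.7, ("B", "D"): 0.2},
))
```

Separating the two levels keeps the platform-wide decision (how many nodes each application receives) independent of the per-node decision (which applications share a node), which mirrors the structure described in the abstract even though the concrete policies here are invented.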