In this paper, we construct forecasting models (multivariate long short-term memory recurrent neural networks and multiple linear regression) for predicting the resource usage of four MapReduce applications and of applications executed within the Spark framework. We evaluate the impact of sample size on prediction accuracy. We also propose a phase modelling approach for read/write-intensive applications. Our results show that models based on long short-term memory (LSTM) recurrent neural networks are more accurate than multiple linear regression models, and that the intensity with which a resource is used is closely related to the prediction accuracy of the forecasting models. We investigated the hyperparameter tuning of such models and showed that a randomly initialised, shallow, well-tuned network may outperform deeper models that use stacked autoencoder initialisation. Furthermore, multivariate LSTM models are more sensitive to sample size than multiple linear regression models. We show that an LSTM model trained on one machine may be used to predict the resource usage on another machine. (C) 2020 Elsevier B.V. All rights reserved.
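As an illustration of the multiple linear regression baseline mentioned in the abstract, the following is a minimal sketch of how a multivariate resource-usage series could be framed as a supervised forecasting problem and fitted by least squares. It is not the implementation used in the paper: the abstract does not state the features, the lag length, or the fitting procedure, so the function names (`make_windows`, `fit_mlr`, `predict`), the two-metric series (CPU and I/O), the lag of one step, and the small ridge term are all assumptions made for this example, and the data are synthetic.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[Sequence[float]], lag: int,
                 target_col: int) -> Tuple[List[List[float]], List[float]]:
    """Build (features, target) pairs from a multivariate series.

    Each feature row is an intercept term followed by the previous
    `lag` observations of every metric; the target is the next value
    of the metric in column `target_col`.
    """
    X, y = [], []
    for t in range(lag, len(series)):
        row = [1.0]  # intercept
        for past in series[t - lag:t]:
            row.extend(past)
        X.append(row)
        y.append(series[t][target_col])
    return X, y


def solve_linear_system(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system; use more samples or fewer lags")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_mlr(X: List[List[float]], y: List[float],
            ridge: float = 1e-8) -> List[float]:
    """Least-squares fit via the normal equations (X^T X + ridge*I) w = X^T y.

    The tiny ridge term is an assumption of this sketch, added only to
    keep the system numerically solvable.
    """
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    for i in range(p):
        XtX[i][i] += ridge
    Xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    return solve_linear_system(XtX, Xty)


def predict(w: Sequence[float], row: Sequence[float]) -> float:
    """Apply the fitted coefficients to one feature row."""
    return sum(wi * xi for wi, xi in zip(w, row))


# Synthetic (cpu, io) series: cpu depends linearly on the previous step.
series = [(0.2, 0.0)]
for t in range(1, 60):
    prev_cpu, prev_io = series[-1]
    cpu = 0.5 * prev_cpu + 0.3 * prev_io + 0.1
    io = ((t * 7) % 11) / 11
    series.append((cpu, io))

X, y = make_windows(series, lag=1, target_col=0)
w = fit_mlr(X, y)  # coefficients: [intercept, cpu lag, io lag]
next_cpu = predict(w, [1.0, *series[-1]])
print(w, next_cpu)
```

Repeating the fit on shorter prefixes of `series` is one simple way to probe how sample size affects the error of such a baseline; an LSTM model would consume the same windows, but with longer lags and a nonlinear mapping in place of the coefficient vector.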