The support vector machine (SVM) is a powerful statistical learning tool for binary classification. In real applications, streaming data, which arrive in batches and have unbounded cumulative size, are common. Because of the memory constraints of a single computer, the classical SVM, which is fit to the entire dataset at once, is unsuitable. Furthermore, the non-smoothness of the hinge loss in SVM leads to high computational complexity. To overcome these issues, we first develop a convolution smoothing approach that achieves a smooth and convex approximation to SVM. Then an online updating SVM is proposed, in which the estimators are renewed with current data and historical summary statistics. In theory, we prove that the convolution smoothing SVM adequately approximates SVM, and that the two are asymptotically equivalent in inference. Furthermore, the online updating SVM achieves the same efficiency as the classical SVM applied to the entire dataset. Numerical experiments on both synthetic and real data also validate our new methods.
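To make the idea of convolution smoothing concrete, here is a minimal sketch that smooths the hinge loss max(0, 1 - u) by convolving it with a Gaussian kernel of bandwidth b. The Gaussian kernel, the bandwidth values, and the function names are assumptions chosen for illustration; the abstract does not say which kernel the authors use. With this kernel the smoothed loss has the closed form a*Phi(a/b) + b*phi(a/b), where a = 1 - u, and its largest gap to the hinge loss is b/sqrt(2*pi), attained at u = 1.

```python
import math
import numpy as np

# Element-wise error function without requiring SciPy.
_erf = np.vectorize(math.erf)


def hinge(u):
    """Classical hinge loss evaluated at the margin u = y * f(x)."""
    return np.maximum(0.0, 1.0 - np.asarray(u, dtype=float))


def smoothed_hinge(u, b):
    """Hinge loss convolved with a Gaussian kernel of bandwidth b.

    Assumption: Gaussian kernel (the abstract does not specify one).
    Closed form: E[max(0, a + b*Z)] with a = 1 - u and Z ~ N(0, 1).
    """
    a = 1.0 - np.asarray(u, dtype=float)
    t = a / b
    cdf = 0.5 * (1.0 + _erf(t / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * t**2) / math.sqrt(2.0 * math.pi)
    return a * cdf + b * pdf


if __name__ == "__main__":
    margins = np.linspace(-3.0, 5.0, 801)
    for b in (0.5, 0.1, 0.01):
        gap = np.max(np.abs(smoothed_hinge(margins, b) - hinge(margins)))
        # The gap shrinks linearly in b, bounded by b / sqrt(2*pi).
        print(f"b={b}: max gap={gap:.5f}, bound={b / math.sqrt(2 * math.pi):.5f}")
```

The smoothed loss is convex and differentiable everywhere, so gradient-based and Newton-type solvers apply, and shrinking b trades smoothness for a closer approximation to the original hinge loss.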
Modal regression is a good alternative to mean regression. Streaming datasets are often encountered in modern data analysis, where a series of datasets becomes available sequentially and the cumulative data size is unbounded. Traditional modal regression, which is computed on the entire dataset at once, is not suited to such data. To address this issue, an online renewable modal regression learning method is proposed, in which the modal regression estimator is renewed with the current data and the Hessian matrix of the historical data. In theory, the estimation consistency and asymptotic normality of the renewable estimator are established, which leads to the oracle property. Numerical experiments are also included to confirm the good performance of the proposed method.
Composite quantile regression (CQR) has advantages in robustness and high estimation efficiency. In modern statistical learning, we often encounter streaming datasets with unbounded cumulative size. However, limited computer memory and the non-smoothness of the CQR objective function pose challenges to both methods and algorithms. An interesting question is how to implement CQR in the streaming data setting. To address it, this article first constructs a smooth CQR and then proposes an online renewable CQR procedure. In theory, the oracle property of the proposed renewable estimator is established, which provides theoretical guarantees. Numerical experiments also support the proposed methods.
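For orientation, the standard CQR objective from the general literature can be written as below. The notation (K quantile levels, intercepts b_k, common slope beta) is an assumption for illustration and is not taken from this abstract; the choice tau_k = k/(K+1) is a common convention rather than something the abstract states.

```latex
\[
(\hat b_1,\dots,\hat b_K,\hat\beta)
  = \arg\min_{b_1,\dots,b_K,\,\beta}
    \sum_{k=1}^{K}\sum_{i=1}^{n}
    \rho_{\tau_k}\!\left(y_i - b_k - x_i^{\top}\beta\right),
\qquad
\rho_{\tau}(u) = u\,\bigl(\tau - \mathbf{1}\{u<0\}\bigr),
\qquad
\tau_k = \frac{k}{K+1}.
\]
```

The check loss \rho_\tau has a kink at u = 0, which is the non-smoothness the abstract refers to; a smooth CQR replaces \rho_\tau with a differentiable surrogate so that gradients and Hessians can be accumulated across batches.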
This paper concerns quantile regression for streaming data, where large amounts of data arrive batch by batch. Limited memory and the non-smoothness of the quantile regression loss both pose challenges in computation and in theoretical development. To address these challenges, we first introduce a convex smooth quantile loss, which is infinitely differentiable and converges uniformly to the quantile loss. Then an online renewable framework is proposed, in which the quantile regression estimator is renewed with the current data and summary statistics of the historical data. In theory, the estimation consistency and asymptotic normality of the renewable estimator are established without any restriction on the total number of data batches. This leads to the oracle property and gives a theoretical guarantee that the new method adapts to the situation where streaming datasets arrive perpetually. Numerical experiments on both synthetic and real data verify the theoretical results and illustrate the good performance of the new method. (c) 2023 Elsevier B.V. All rights reserved.
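The renewal step described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the paper's algorithm: the smooth loss is taken to be the check loss convolved with a Gaussian kernel of bandwidth h, the historical summary is the running estimate together with an accumulated Hessian J, and each batch is absorbed with Newton iterations on a quadratic surrogate for the past data. The function names, the bandwidth, the simulated data, and the least-squares warm start on the first batch are all hypothetical choices.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)


def norm_cdf(t):
    return 0.5 * (1.0 + _erf(t / math.sqrt(2.0)))


def norm_pdf(t):
    return np.exp(-0.5 * t**2) / math.sqrt(2.0 * math.pi)


def grad_hess(X, y, beta, tau, h):
    """Gradient and Hessian of the Gaussian-smoothed quantile loss on one batch."""
    u = (y - X @ beta) / h
    grad = X.T @ (norm_cdf(-u) - tau)
    hess = (X * (norm_pdf(u) / h)[:, None]).T @ X
    return grad, hess


def renew(beta_prev, J_prev, X, y, tau=0.5, h=0.5, n_iter=20):
    """Absorb one batch using only (beta_prev, J_prev) as the historical summary.

    Past batches are replaced by the quadratic surrogate
    0.5 * (beta - beta_prev)' J_prev (beta - beta_prev).
    """
    beta = beta_prev.copy()
    for _ in range(n_iter):
        g, H = grad_hess(X, y, beta, tau, h)
        step = np.linalg.solve(J_prev + H, J_prev @ (beta - beta_prev) + g)
        beta = beta - step
        if np.max(np.abs(step)) < 1e-10:
            break
    _, H = grad_hess(X, y, beta, tau, h)
    return beta, J_prev + H  # raw batch data can now be discarded


def run_stream(n_batches=20, n=500, seed=0):
    """Simulated stream (hypothetical data): median regression, true beta = (1, 2)."""
    rng = np.random.default_rng(seed)
    beta_true = np.array([1.0, 2.0])
    beta, J = None, np.zeros((2, 2))
    for _ in range(n_batches):
        X = np.column_stack([np.ones(n), rng.normal(size=n)])
        y = X @ beta_true + rng.normal(size=n)
        if beta is None:
            # Assumption: least-squares warm start on the first batch only.
            beta = np.linalg.lstsq(X, y, rcond=None)[0]
        beta, J = renew(beta, J, X, y)
    return beta, beta_true


if __name__ == "__main__":
    est, truth = run_stream()
    print("estimate:", est, "truth:", truth)
```

Memory use is constant in the number of batches: only a p-vector and a p-by-p matrix are carried forward, which is what allows the stream to continue indefinitely.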
The design of an energy management strategy for a hybrid electric vehicle typically requires an estimate of the power requested by the driver. If the driving cycle is not known a priori, a stochastic method such as a Markov chain driver model (MCDM) must be employed. For tracked vehicles, steering power, which is related to the vehicle's angular velocity, is a significant component of the driver demand. In this paper, a three-dimensional MCDM incorporating angular velocity for a tracked vehicle is proposed. Based on the nearest-neighborhood method (NNM), an online transition probability matrix (TPM) updating algorithm is implemented for the three-dimensional MCDM. Simulation results show that the TPM can be updated online as the driving cycle becomes available. Moreover, older and recent observations can be weighted appropriately by adjusting a forgetting factor.
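The online TPM update with a forgetting factor can be illustrated with a generic recursive scheme. This is a simplified one-dimensional sketch under assumptions, not the paper's three-dimensional algorithm: observations are quantized to the nearest grid point, and the row of the observed origin state is shrunk by (1 - lam) before lam is added to the observed destination. The names `nearest_state`, `update_tpm`, and `lam` (the weight on the newest observation) are hypothetical.

```python
import numpy as np


def nearest_state(value, grid):
    """Nearest-neighbor quantization: index of the grid point closest to value."""
    return int(np.argmin(np.abs(np.asarray(grid, dtype=float) - value)))


def update_tpm(P, i, j, lam):
    """Return a copy of P after observing a transition from state i to state j.

    Row i becomes (1 - lam) * P[i] + lam * e_j, so it still sums to one.
    A larger lam forgets old observations faster.
    """
    P = np.array(P, dtype=float, copy=True)
    P[i] *= 1.0 - lam
    P[i, j] += lam
    return P


if __name__ == "__main__":
    grid = np.array([0.0, 10.0, 20.0])   # hypothetical power-demand levels
    P = np.full((3, 3), 1.0 / 3.0)       # uninformative initial TPM
    demand = [1.0, 9.0, 12.0, 19.0, 8.0]  # hypothetical observed sequence
    states = [nearest_state(v, grid) for v in demand]
    for i, j in zip(states[:-1], states[1:]):
        P = update_tpm(P, i, j, lam=0.1)
    print(P)
```

Only the row of the origin state changes on each observation, so the per-step cost is linear in the number of states and the matrix stays row-stochastic throughout.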