Consider a linear regression model $y_i = x_i^T \beta + e_i$, $i = 1, 2, \ldots, n$, where $\{e_i\}$ are independent and identically distributed (iid) random variables with zero mean and known variance $\sigma^2$. Based on the maximum Lq-likelihood estimator (MLqE) and the penalized likelihood estimator (PLE), we introduce a new parametric estimator called the penalized Lq-likelihood estimator (PLqE). We investigate its oracle properties and influence function. Simulation results support the validity of our approach. Furthermore, it is shown that the PLqE is robust, while the PLE is not.
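The robustness claim can be illustrated with a minimal sketch of the unpenalized MLqE only; the penalty term that defines the PLqE is not reproduced here. The sketch assumes normal errors (the abstract only states zero mean and known variance), uses the standard Lq function $L_q(u) = (u^{1-q} - 1)/(1-q)$, and solves the resulting score equation by iteratively reweighted least squares. The function names `lq` and `mlqe_linear` and the default `q=0.8` are illustrative choices, not taken from the paper.

```python
import numpy as np

def lq(u, q):
    """Lq function: log(u) when q == 1, otherwise (u**(1-q) - 1) / (1 - q)."""
    u = np.asarray(u, dtype=float)
    if q == 1.0:
        return np.log(u)
    return (u ** (1.0 - q) - 1.0) / (1.0 - q)

def mlqe_linear(X, y, sigma, q=0.8, n_iter=100, tol=1e-10):
    """Unpenalized MLqE for y = X @ beta + e, assuming N(0, sigma^2) errors.

    Setting the gradient of sum_i lq(f_i, q) to zero gives a weighted
    least-squares equation with weights f_i**(1 - q), where f_i is the
    normal density of the i-th residual. We iterate that fixed point.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from OLS
    for _ in range(n_iter):
        r = y - X @ beta
        dens = np.exp(-0.5 * (r / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
        w = dens ** (1.0 - q)          # large residuals get tiny weights
        XtW = X.T * w                  # X^T W with W = diag(w)
        new_beta = np.linalg.solve(XtW @ X, XtW @ y)
        if np.max(np.abs(new_beta - beta)) < tol:
            return new_beta
        beta = new_beta
    return beta
```

With `q = 1` every weight equals one and the iteration returns the ordinary least-squares estimate, which is one way to see why the likelihood-based estimator offers no protection against outliers while `q < 1` does.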
The estimation of the variance function plays an extremely important role in statistical inference for regression models. In this paper we propose a variance modelling method that constructs the variance structure by combining the exponential polynomial modelling method with the kernel smoothing technique. A simple estimation method for the parameters in heteroscedastic linear regression models is developed for the case where the covariance matrix is an unknown diagonal matrix and the variance function is a positive function of the mean. The consistency and asymptotic normality of the resulting estimators are established under some mild assumptions. In particular, a simple version of the bootstrap test is adapted to test for misspecification of the variance function. Monte Carlo simulation studies are carried out to examine the finite-sample performance of the proposed methods. Finally, the methodologies are illustrated with the ozone concentration dataset.
With the rise of third parties in the machine learning pipeline, such as the service provider in "Machine Learning as a Service" (MLaaS), external data contributors in online learning, or the retraining of existing models, the need to ensure the security of the resulting machine learning models has become an increasingly important topic. The security community has demonstrated that without transparency of the data and the resulting model, there exist many potential security risks, with new risks constantly being discovered. In this paper, we focus on one of these security risks: poisoning attacks. Specifically, we analyze how attackers may interfere with the results of regression learning by poisoning the training datasets. To this end, we analyze and develop a new poisoning attack algorithm. Our attack, termed Nopt, in contrast with previous poisoning attack algorithms, can produce larger errors with the same proportion of poisoning data-points. Furthermore, we also significantly improve the state-of-the-art defense algorithm, termed TRIM, proposed by Jagielski et al. (IEEE S&P 2018), by incorporating the concept of probability estimation of clean data-points into the algorithm. Our new defense algorithm, termed Proda, demonstrates an increased effectiveness in reducing errors arising from the poisoning dataset through optimizing ensemble models. We highlight that the time complexity of TRIM had not been estimated; however, we deduce from their work that TRIM can have exponential time complexity in the worst-case scenario, in excess of Proda's logarithmic time. The performance of both our proposed attack and defense algorithms is extensively evaluated on four real-world datasets of housing prices, loans, health care, and bike sharing services. We hope that our work will inspire future research to develop more robust learning algorithms immune to poisoning attacks.
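The trimmed-loss idea behind defenses of the TRIM type can be sketched as follows. This is a simplified illustration under stated assumptions, not the published TRIM algorithm and not Proda or Nopt: it alternates between fitting ordinary least squares on a kept subset and re-selecting the points with the smallest squared residuals, and it omits any regularization term or other details of the original method. The function name `trimmed_least_squares` and the parameter `n_keep` (the assumed number of clean points) are hypothetical.

```python
import numpy as np

def trimmed_least_squares(X, y, n_keep, n_iter=50, seed=0):
    """Alternate between fitting on a subset and keeping the best-fitting points.

    n_keep is the assumed number of clean (unpoisoned) points.
    Returns the fitted coefficients and the sorted indices that were kept.
    """
    rng = np.random.default_rng(seed)
    # Start from a random subset of the assumed clean size.
    keep = np.sort(rng.choice(len(y), size=n_keep, replace=False))
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        sq_resid = (y - X @ beta) ** 2
        # Keep the n_keep points with the smallest squared residuals.
        new_keep = np.sort(np.argsort(sq_resid)[:n_keep])
        if np.array_equal(new_keep, keep):
            break  # the kept set is stable, so the fit will not change
        keep = new_keep
    return beta, keep
```

Each pass cannot increase the trimmed loss, so the loop terminates on a stable subset, but the result depends on the random starting subset and may be a local optimum.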
Time studies of harvesting and skidding tree-length logs in Aleppo pine (Pinus halepensis L.) natural coastal forests of the Chalkidiki area in northern Greece were carried out to formulate linear regression models and to evaluate productivity. The harvesting system consisted of a feller with a chainsaw for felling, delimbing and crosscutting, and a four-wheel-drive farm tractor with a 74 kW engine, equipped with a special winch attached to the tractor's three-point hitch for the extraction of tree-length logs. Operational factors such as distance, slope, volume and the time required for harvesting and extracting tree-length logs were measured and recorded. The results illustrate that the calibrated linear regression models show a strong correlation between the time needed for harvesting operations and the extraction distance from the stump to the forest road.
We propose a new system identification method, called Sign-Perturbed Sums (SPS), for constructing non-asymptotic confidence regions under mild statistical assumptions. SPS is introduced for linear regression models, including but not limited to FIR systems, and we show that the SPS confidence regions have exact confidence probabilities, i.e., they contain the true parameter with a user-chosen exact probability for any finite data set. Moreover, we also prove that the SPS regions are star convex with the Least-Squares (LS) estimate as a star center. The main assumptions of SPS are that the noise terms are independent and symmetrically distributed about zero, but they can be nonstationary, and their distributions need not be known. The paper also proposes a computationally efficient ellipsoidal outer approximation algorithm for SPS. Finally, SPS is demonstrated through a number of simulation experiments.
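A membership test for an SPS confidence region can be sketched as below, following the usual description of the method: compare the normalized sum built from the residuals at a candidate parameter with m - 1 sign-perturbed versions of it, and accept the candidate if the unperturbed sum is not among the q largest. The function name `sps_indicator`, the defaults `m=100` and `q=5` (confidence 1 - q/m = 0.95), and the use of a fixed seed to hold the random signs constant across candidates are illustrative assumptions.

```python
import numpy as np

def sps_indicator(theta, Phi, y, m=100, q=5, seed=0):
    """Return True if theta lies in the SPS region with confidence 1 - q/m.

    Phi is the (n, d) regressor matrix and y the (n,) output vector.
    The same seed must be used for every theta so that all candidates
    are tested against the same random signs and tie-breaking order.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    eps = y - Phi @ theta                      # residuals at the candidate
    R = Phi.T @ Phi / n
    L = np.linalg.cholesky(R)                  # R = L @ L.T
    signs = rng.choice([-1.0, 1.0], size=(m - 1, n))
    signs = np.vstack([np.ones(n), signs])     # row 0 is the unperturbed sum
    sums = (signs * eps) @ Phi / n             # shape (m, d)
    S = np.linalg.solve(L, sums.T).T           # apply L^{-1} to each sum
    norms = np.sum(S ** 2, axis=1)
    order = rng.permutation(m)                 # random tie-breaking
    smaller = sum(
        1 for i in range(1, m)
        if (norms[i], order[i]) < (norms[0], order[0])
    )
    rank = 1 + smaller                         # rank of the reference sum
    return rank <= m - q
```

At the least-squares estimate the unperturbed sum is zero up to rounding, so that point is always accepted, which is consistent with the LS estimate being a star center of the region.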
The paper considers a new family of explicit or fully operational two-stage Stein or hierarchical information (2SHI) estimators for linear regression models, and provides an expression for the difference between the risks of these estimators and the usual Stein-rule estimator when the variance of the disturbance is small. The condition under which the 2SHI estimators have smaller average MSE than the Stein-rule estimator is also given.
Bayesian influence measures for linear regression models have been developed mostly for normal regression models with noninformative prior distributions for the unknown parameters. In this work we extend existing results in several directions. First, we review influence measures for the ordinary normal regression model under conjugate prior distributions in a unified framework. Second, we consider elliptical regression models with noninformative prior distributions for the model parameters and investigate the influence of a given subset of observations on the posterior distributions of the location and scale parameters. We find that these influence measures are Bayesian versions of classical counterparts used to identify outliers or influential observations. Finally, we show that departures from normality within the multivariate elliptical family of distributions only affect the posterior distribution of the scale parameter. © 2000 Elsevier Science B.V. All rights reserved.
The paper considers a class of 2SHI estimators for the linear regression models and provides some results regarding the dominance in quadratic loss of this class over the OLS and usual Stein-rule estimators.
In this paper, the Schwarz Information Criterion (SIC) is proposed to locate a change point in the simple linear regression model, as well as in the multiple linear regression model. The method is then applied to a financial data set, and a change point is successfully detected.
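A minimal sketch of SIC-based change-point location in simple linear regression is given below. It assumes normal errors with a common variance before and after the change, counts 3 parameters under no change and 5 under a change (the change location itself is not counted), and uses SIC = -2 log L + p log n evaluated at the maximum likelihood estimates. These modelling choices and the names `segment_rss` and `sic_change_point` are assumptions for illustration, and the paper's exact formulation may differ.

```python
import numpy as np

def segment_rss(x, y):
    """Residual sum of squares of a straight-line fit to (x, y)."""
    A = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(A, y, rcond=None)[0]
    r = y - A @ beta
    return float(r @ r)

def sic_change_point(x, y, min_seg=3):
    """Return (k, sic_null, sic_best), where k is the size of the first segment.

    k is None when no split has a smaller SIC than the no-change model.
    """
    n = len(y)
    base = n * np.log(2.0 * np.pi) + n          # constant part of -2 log L
    sic_null = base + n * np.log(segment_rss(x, y) / n) + 3 * np.log(n)
    best_k, best_sic = None, np.inf
    for k in range(min_seg, n - min_seg + 1):
        total = segment_rss(x[:k], y[:k]) + segment_rss(x[k:], y[k:])
        sic_k = base + n * np.log(total / n) + 5 * np.log(n)
        if sic_k < best_sic:
            best_k, best_sic = k, sic_k
    if best_sic < sic_null:
        return best_k, sic_null, best_sic
    return None, sic_null, best_sic
```

A change is reported only when the best two-segment fit lowers the criterion despite the extra penalty of 2 log n, so small improvements in fit are not flagged as change points.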
Authors: Feng, Zhenghui; Zhang, Jun; Chen, Qian
Affiliations: Xiamen Univ, Sch Econ, Xiamen 361005, Peoples R China; Xiamen Univ, Wang Yanan Inst Studies Econ, Xiamen 361005, Peoples R China; Shenzhen Univ, Coll Math & Stat, Shenzhen Hong Kong Joint Res Ctr Appl Stat Sci, Inst Stat Sci, Shenzhen, Peoples R China; Shenzhen Univ, Coll Math & Stat, Shenzhen, Peoples R China
We consider estimation and hypothesis testing for linear regression measurement error models when the response variable and covariates are measured with additive distortion measurement errors, which are unknown functions of a commonly observable confounding variable. In the parameter estimation and testing part, we first propose a residual-based least squares estimator under unrestricted and restricted conditions. Then, to test a hypothesis on the parametric components, we propose a test statistic based on the normalized difference between residual sums of squares under the null and alternative hypotheses. We establish asymptotic properties for the estimators and test statistics. Further, we employ the smoothly clipped absolute deviation penalty to select relevant variables. The resulting penalized estimators are shown to be asymptotically normal and have the oracle property. In the model checking part, we suggest two test statistics for checking the validity of linear regression models. One is a score-type test statistic and the other is a model-adaptive test statistic. The quadratic form of the scaled test statistic is asymptotically chi-squared distributed under the null hypothesis and follows a noncentral chi-squared distribution under local alternatives that converge to the null hypothesis. We also conduct simulation studies to demonstrate the performance of the proposed procedure and analyze a real example for illustration.