Rates of convergence for nearest neighbor estimation are established in a general framework in terms of metric covering numbers of the underlying space. Our first result gives explicit finite-sample upper bounds for the classical independent and identically distributed (i.i.d.) random sampling problem in a separable metric space setting. The convergence rate is a function of the covering numbers of the support of the distribution. For example, for bounded subsets of $\mathbb{R}^r$, the convergence rate is $O(1/n^{2/r})$. Our main result extends the problem to allow samples drawn from a completely arbitrary random process in a separable metric space and examines the performance in terms of the individual sample sequences. We show that for every sequence of samples the asymptotic time-average of nearest neighbor risks equals twice the time-average of the conditional Bayes risks of the sequence. Finite-sample upper bounds under arbitrary sampling are again obtained in terms of the covering numbers of the underlying space. In particular, for bounded subsets of $\mathbb{R}^r$ the convergence rate of the time-averaged risk is $O(1/n^{2/r})$. We then establish a consistency result for $k_n$-nearest neighbor estimation under arbitrary sampling and prove a convergence rate matching established rates for i.i.d. sampling. Finally, we show how our arbitrary sampling results lead to some classical i.i.d. sampling results and in fact extend them to stationary sampling. Our framework and results are quite general, while the proof techniques are surprisingly elementary.
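As a rough illustration of the estimator this abstract analyzes, the following is a minimal sketch of single nearest neighbor estimation in a metric space, here the unit square with the Euclidean metric. The function names, the noiseless target `p[0] + p[1]`, and the Monte Carlo setup are assumptions made for the example and are not taken from the paper. Because the target is noiseless, the Bayes risk is zero, so the average squared error only reflects how close the nearest neighbor is.

```python
import math
import random


def nearest_neighbor_predict(train, x, dist):
    """Return the response attached to the training point closest to x.

    train is a list of (point, response) pairs; dist is any metric.
    """
    _, response = min(train, key=lambda pair: dist(pair[0], x))
    return response


def average_nn_error(n, trials=300, seed=0):
    """Monte Carlo estimate of the squared 1-NN error with n i.i.d. samples.

    Assumption for illustration: points are uniform on [0, 1]^2 and the
    response is the noiseless function p[0] + p[1].
    """
    rng = random.Random(seed)
    target = lambda p: p[0] + p[1]  # hypothetical regression function

    # Draw the i.i.d. training sample.
    train = []
    for _ in range(n):
        p = (rng.random(), rng.random())
        train.append((p, target(p)))

    # Average the squared error over fresh query points.
    total = 0.0
    for _ in range(trials):
        q = (rng.random(), rng.random())
        guess = nearest_neighbor_predict(train, q, math.dist)
        total += (guess - target(q)) ** 2
    return total / trials
```

Calling `average_nn_error` for increasing `n` gives an informal feel for how the error shrinks with sample size in dimension $r = 2$; it is a simulation, not a check of the bounds stated above.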
Let $\{(X_n, Y_n)\}_{n \ge 1}$ be a sequence of i.i.d. bivariate vectors. In this article, we study the possible limit distributions of $U_n^h(t)$, the so-called conditional U-statistics, introduced by Stute (10). They are estimators of functions of the form $m^h(t) = E\{h(Y_1, \ldots, Y_k) \mid X_1 = t_1, \ldots, X_k = t_k\}$, $t = (t_1, \ldots, t_k) \in \mathbb{R}^k$, where $E|h| < \infty$. Here $t$ is fixed. In case $t_1 = \cdots = t_k = t$ (say), we describe the limiting random variables as multiple Wiener integrals with respect to $P_t$, the conditional distribution of $Y$ given $X = t$. When the $t_i$, $1 \le i \le k$, are not all equal, we introduce and use a slightly generalized version of a multiple Wiener integral.
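To make the object of study concrete, here is a minimal sketch of a conditional U-statistic written as a kernel-weighted ratio over tuples of distinct indices, which is the usual form of Stute's estimator. The Gaussian smoothing kernel, the `bandwidth` parameter, and all function names are assumptions chosen for illustration; the abstract does not specify them.

```python
import math
from itertools import permutations


def conditional_u_statistic(xs, ys, t, kernel_h, bandwidth):
    """Kernel-weighted estimate of E[h(Y_1..Y_k) | X_1 = t_1, ..., X_k = t_k].

    xs, ys: observed samples; t: tuple of k conditioning points;
    kernel_h: the function h of k arguments; bandwidth: smoothing width.
    The Gaussian smoothing kernel is an assumption of this sketch.
    """
    k = len(t)
    smooth = lambda u: math.exp(-0.5 * u * u)

    num = 0.0
    den = 0.0
    # Sum over all ordered k-tuples of distinct sample indices.
    for idx in permutations(range(len(xs)), k):
        weight = 1.0
        for j, i in enumerate(idx):
            weight *= smooth((t[j] - xs[i]) / bandwidth)
        num += weight * kernel_h(*(ys[i] for i in idx))
        den += weight

    if den == 0.0:
        raise ValueError("no sample carries weight near t; widen the bandwidth")
    return num / den
```

With `kernel_h = lambda a, b: 0.5 * (a - b) ** 2` and `t = (t0, t0)`, the statistic targets the conditional variance of $Y$ given $X = t_0$, which is the equal-coordinates case mentioned in the abstract. The loop enumerates all ordered tuples, so it is only practical for small samples and small $k$.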
Abstract. Developments in the vast and growing literatures on nonparametric and semiparametric statistical estimation are reviewed. The emphasis is on useful methodology rather than statistical properties for their ow...
Kernel estimators of an unknown multivariate regression function are investigated. A bandwidth-selection rule is considered, which can be formulated in terms of cross validation. Under mild assumptions on the kernel and the unknown regression function, it is seen that this rule is asymptotically optimal.
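The idea of choosing a bandwidth by cross-validation can be sketched as follows. This is a minimal univariate illustration using a Nadaraya-Watson estimator with a Gaussian kernel and a leave-one-out score over a finite grid of candidates; the abstract concerns the multivariate case and does not specify the kernel, so these choices and all function names are assumptions.

```python
import math


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def nw_estimate(xs, ys, x, h, skip=None):
    """Nadaraya-Watson estimate at x with bandwidth h, optionally omitting index skip."""
    num = den = 0.0
    y_sum = 0.0
    count = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        w = gaussian_kernel((x - xi) / h)
        num += w * yi
        den += w
        y_sum += yi
        count += 1
    # If every weight underflows to zero, fall back to the plain mean.
    return num / den if den > 0.0 else y_sum / count


def loo_cv_score(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    n = len(xs)
    return sum(
        (ys[i] - nw_estimate(xs, ys, xs[i], h, skip=i)) ** 2 for i in range(n)
    ) / n


def select_bandwidth(xs, ys, candidates):
    """Return the candidate bandwidth with the smallest leave-one-out score."""
    return min(candidates, key=lambda h: loo_cv_score(xs, ys, h))
```

A typical call is `select_bandwidth(xs, ys, [0.01, 0.05, 0.2, 1.0])`. Restricting the search to a finite grid keeps the sketch simple; the asymptotic optimality claimed in the abstract refers to the authors' rule, not to this particular implementation.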
A class of maximum penalized likelihood estimators (MPLE) of the density function $f$ is constructed through the use of a rather general roughness-penalty functional. This class contains all the density estimates in the literature that arise as solutions to MPLE problems with penalties on $f^{1/2}$. In addition, the flexibility of the penalty functional permits the construction of new spline estimates with improved performance at the peaks and valleys of the density curves. The consistency of the estimators, in probability and almost surely, in the $L_p(\mathbb{R})$ norms, $p = 1, 2, \infty$, in the Hellinger metric, and in Sobolev norms is established in a unified manner. A class of penalty functionals is identified that leads to estimators approaching the optimal rates of convergence predicted in Farrell (1972). Based on the above estimates, a class of MPLE regression estimators is introduced which has the appealing property of reducing to the classical nonparametric regression estimates when a smoothing parameter goes to zero. Finally, a theoretically justifiable and numerically efficient method for a data-based choice of the smoothing parameter is proposed for further study. A number of numerical examples and graphs are presented.