This paper offers a new method for testing one-sided hypotheses in discrete multivariate data models. One-sided alternatives mean that there are restrictions on the multidimensional parameter space. The focus is on models dealing with ordered categorical data. In particular, applications are concerned with R x C contingency tables. The method has advantages over other general approaches. All tests are exact in the sense that no large sample theory or large sample distribution theory is required. Testing is unconditional although its execution is done conditionally, section by section, where a section is determined by marginal totals. This eliminates any potential nuisance parameter issues. The power of the tests is more robust than the power of the typical linear tests often recommended. Furthermore, computer programs are available to carry out the tests efficiently regardless of the sample sizes or the order of the contingency tables. Both censored data and uncensored data models are discussed.
Group testing, introduced by Dorfman (1943), has been used to reduce costs when estimating the prevalence of a binary characteristic based on a screening test of k groups that include n independent individuals in total. If the unknown prevalence is low and the screening test suffers from misclassification, it is also possible to obtain more precise prevalence estimates than those obtained from testing all n samples separately (Tu et al., 1994). In some applications, the individual binary response corresponds to whether an underlying time-to-event variable T is less than an observed screening time C, a data structure known as current status data. Given sufficient variation in the observed C values, it is possible to estimate the distribution function F of T nonparametrically, at least at some points in its support, using the pool-adjacent-violators algorithm (Ayer et al., 1955). Here, we consider nonparametric estimation of F based on group-tested current status data for groups of size k where the group tests positive if and only if any individual's unobserved T is less than the corresponding observed C. We investigate the performance of the group-based estimator as compared to the individual test nonparametric maximum likelihood estimator, and show that the former can be more precise in the presence of misclassification for low values of F(t). Potential applications include testing for the presence of various diseases in pooled samples where interest focuses on the age-at-incidence distribution rather than overall prevalence. We apply this estimator to the age-at-incidence curve for hepatitis C infection in a sample of U.S. women who gave birth to a child in 2014, where group assignment is done at random and based on maternal age. We discuss connections to other work in the literature, as well as potential extensions.
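As a rough illustration of the individual-test baseline mentioned above, the nonparametric estimate of F from ungrouped current status data can be obtained by sorting the observations by screening time and applying the pool-adjacent-violators algorithm to the binary indicators. The sketch below is not the paper's group-based estimator: it assumes distinct screening times and no misclassification, and the function names `pava` and `current_status_npmle` are hypothetical.

```python
def pava(y, w=None):
    """Nondecreasing least-squares fit to y via pool-adjacent-violators."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while adjacent blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    fit = []
    for m, _, n in blocks:
        fit.extend([m] * n)
    return fit


def current_status_npmle(c, delta):
    """Estimate F at the sorted screening times (assumed distinct).

    c:     observed screening times C_i
    delta: indicators, 1 if T_i is less than C_i and 0 otherwise
    """
    order = sorted(range(len(c)), key=lambda i: c[i])
    times = [c[i] for i in order]
    fit = pava([delta[i] for i in order])
    return times, fit


times, f_hat = current_status_npmle([3, 1, 2, 5, 4], [1, 0, 1, 0, 1])
print(list(zip(times, f_hat)))
```

Pooling is what enforces monotonicity: wherever the raw indicators decrease as the screening time increases, neighbouring values are replaced by their average.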
An important statistical objective in environmental risk analysis is estimation of minimum exposure levels, called benchmark doses (BMDs), which induce a pre-specified benchmark response in a dose-response experiment. In such settings, representations of the risk are traditionally based on a parametric dose-response model. It is a well-known concern, however, that if the chosen parametric form is misspecified, inaccurate and possibly unsafe low-dose inferences can result. We apply a nonparametric approach for calculating BMDs, based on an isotonic dose-response estimator for quantal-response data. We determine the large-sample properties of the estimator, develop bootstrap-based confidence limits on the BMDs, and explore the confidence limits' small-sample properties via a short simulation study. An example from cancer risk assessment illustrates the calculations. Copyright (c) 2012 John Wiley & Sons, Ltd.
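To make the idea of reading a BMD off a monotone dose-response estimate concrete, here is a minimal sketch. It rests on assumptions that the abstract does not state: the benchmark response is taken to be an extra risk, (R(d) - R(0)) / (1 - R(0)), the estimated risks are already nondecreasing with R(0) below 1, and the estimate is linearly interpolated between design doses. The function name `benchmark_dose` is hypothetical.

```python
def benchmark_dose(doses, risks, bmr):
    """Smallest dose whose extra risk reaches bmr, by linear interpolation.

    doses: increasing dose levels, with doses[0] the control dose
    risks: nondecreasing estimated response probabilities at those doses
    bmr:   benchmark response, a value strictly between 0 and 1
    Returns None if the extra risk never reaches bmr on the dose range.
    """
    r0 = risks[0]
    # Extra risk relative to the control response (assumed definition).
    extra = [(r - r0) / (1.0 - r0) for r in risks]
    for j in range(1, len(doses)):
        if extra[j] >= bmr:
            lo, hi = extra[j - 1], extra[j]
            # lo < bmr <= hi here, so the division is safe.
            frac = (bmr - lo) / (hi - lo)
            return doses[j - 1] + frac * (doses[j] - doses[j - 1])
    return None


print(benchmark_dose([0, 1, 2, 4], [0.10, 0.10, 0.28, 0.55], bmr=0.10))
```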
Authors: Yu, Tao; Li, Pengfei; Qin, Jing
Affiliations: Natl Univ Singapore, Dept Stat & Appl Probabil, Block S16, Level 7, 6 Sci Dr 2, Singapore 117546, Singapore; Univ Waterloo, Dept Stat & Actuarial Sci, 200 Univ Ave West, Waterloo ON N2L 3G1, Canada; NIAID, NIH, 6700B Rockledge Dr, Bethesda MD 20892, USA
In this paper, we propose a method for estimating the probability density functions in a two-sample problem where the ratio of the densities is monotone. This problem has been widely identified in the literature, but effective solution methods, in which the estimates should be probability densities and the corresponding density ratio should inherit monotonicity, are unavailable. If these conditions are not satisfied, the applications of the resultant density estimates might be limited. We propose estimates for which the ratio inherits the monotonicity property, and we explore their theoretical properties. One implication is that the corresponding receiver operating characteristic curve estimate is concave. Through numerical studies, we observe that both the density estimates and the receiver operating characteristic curve estimate from our method outperform those resulting directly from kernel density estimates, particularly when the sample size is relatively small.
Single-index models are becoming increasingly popular in many scientific applications as they offer the advantages of flexibility in regression modeling as well as interpretable covariate effects. In the context of su...
ISBN: 9781628415063 (print)
Classifier scores in many diagnostic devices, such as computer-aided diagnosis systems, are usually on an arbitrary scale, the meaning of which is unclear. Calibration of classifier scores to a meaningful scale such as the probability of disease is potentially useful when such scores are used by a physician or another algorithm. In this work, we investigated the properties of two methods for calibrating classifier scores to probability of disease. The first is a semiparametric method in which the likelihood ratio for each score is estimated based on a semiparametric proper receiver operating characteristic model, and then an estimate of the probability of disease is obtained using Bayes' theorem, assuming a known prevalence of disease. The second is a nonparametric method in which isotonic regression via the pool-adjacent-violators algorithm is used. We employed the mean square error (MSE) and the Brier score to evaluate the two methods. We evaluated the methods under two paradigms: (a) the dataset used to construct the score-to-probability mapping function is also used to calculate the performance metric (MSE or Brier score) (resubstitution); (b) an independent test dataset is used to calculate the performance metric (independent). Under our simulation conditions, the semiparametric method is found to be superior to the nonparametric method at small to medium sample sizes, and the two methods appear to converge at large sample sizes. Our simulation results also indicate that the resubstitution bias may depend on the performance metric and that, for the semiparametric method, the resubstitution bias is small when a reasonable number of cases (>100 cases per class) are available.
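Two of the ingredients named above are simple enough to sketch: converting a likelihood ratio into a probability of disease with Bayes' theorem under a known prevalence, and scoring calibrated probabilities with the Brier score. This is only an illustration; it does not estimate the likelihood ratio from a proper ROC model, and the function names are hypothetical.

```python
def posterior_from_lr(lr, prevalence):
    """Probability of disease given a likelihood ratio and known prevalence.

    Bayes' theorem in odds form: posterior odds = lr * prior odds.
    """
    num = lr * prevalence
    return num / (num + (1.0 - prevalence))


def brier_score(probs, labels):
    """Mean squared difference between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


# A likelihood ratio of 1 leaves the prevalence unchanged.
print(posterior_from_lr(1.0, 0.2))
# Uninformative predictions of 0.5 on one diseased and one healthy case.
print(brier_score([0.5, 0.5], [1, 0]))
```

Computing `brier_score` on the data used to build the mapping corresponds to the resubstitution paradigm, while computing it on held-out data corresponds to the independent paradigm.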
We consider the finite sample performance of a new nonparametric method for bioassay and benchmark analysis in risk assessment, which averages isotonic MLEs based on disjoint subgroups of dosages, and whose asymptotic behavior is essentially optimal (Bhattacharya and Lin, Stat Probab Lett 80: 1947-1953, 2010). It is compared with three other methods, including the leading kernel-based method, called DNP, due to Dette et al. (J Am Stat Assoc 100: 503-510, 2005) and Dette and Scheder (J Stat Comput Simul 80(5): 527-544, 2010). In simulation studies, the present method, termed NAM, outperforms the DNP in the majority of cases considered, although both methods generally do well. In small samples, NAM and DNP both outperform the MLE.
ISBN: 9783319930312; 9783319930305 (print)
The problem of constructing a k-monotone regression is to find a vector z ∈ R^n with the lowest squared error of approximation to a given vector y ∈ R^n (not necessarily k-monotone) under the condition that z is k-monotone. The problem can be rewritten in the form of a convex programming problem with linear constraints. The paper proposes two different approaches for finding a sparse k-monotone regression (a Frank-Wolfe-type algorithm and a k-monotone pool-adjacent-violators algorithm). A software package for this problem is developed and implemented in R. The proposed algorithms are compared using simulated data.
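One way to write the convex program described above is sketched below. It assumes the usual definition of k-monotonicity, namely that all k-th order finite differences of z are nonnegative; the abstract itself does not spell out the definition, so the constraint form is an assumption.

```latex
\min_{z \in \mathbb{R}^n} \; \sum_{i=1}^{n} (z_i - y_i)^2
\quad \text{subject to} \quad
\Delta^k z_i \ge 0, \qquad i = 1, \dots, n-k,
\qquad \text{where} \quad
\Delta^k z_i = \sum_{j=0}^{k} (-1)^{k-j} \binom{k}{j} z_{i+j}.
```

For k = 1 the constraints reduce to z_{i+1} - z_i ≥ 0, which is ordinary isotonic regression; for k = 2 they require z to be convex.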