In general, the solution to a regression problem is the minimizer of a given loss criterion and depends on the specified loss function. The nonparametric isotonic regression problem is special, in that optimal solutions can be found by solely specifying a functional. These solutions will then be minimizers under all loss functions simultaneously as long as the loss functions have the requested functional as the Bayes act. For the functional, the only requirement is that it can be defined via an identification function, with examples including the expectation, quantile, and expectile functionals. Generalizing classical results, we characterize the optimal solutions to the isotonic regression problem for identifiable functionals by rigorously treating these functionals as set-valued. The results hold in the case of totally or partially ordered explanatory variables. For total orders, we show that any solution resulting from the pool-adjacent-violators algorithm is optimal.
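As a concrete illustration of the total-order case, here is a minimal sketch of the classical pool-adjacent-violators algorithm for the mean (squared-error) functional; the function name and toy data are illustrative, and the paper's set-valued treatment of general identifiable functionals is not reproduced.

```python
import numpy as np

def pava_mean(y, w=None):
    """Pool-adjacent-violators algorithm for the mean functional.

    Returns the (weighted) least-squares isotonic fit to y under a total
    order, i.e. argmin sum_i w_i (z_i - y_i)^2  s.t.  z_1 <= ... <= z_n.
    """
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    # Each block stores (fitted value, total weight, number of points covered).
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Pool backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, n2 = blocks.pop()
            v1, w1, n1 = blocks.pop()
            blocks.append([(w1 * v1 + w2 * v2) / (w1 + w2), w1 + w2, n1 + n2])
    # Expand the block solution back to one fitted value per observation.
    return np.concatenate([np.full(n, v) for v, _, n in blocks])

y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
print(pava_mean(y))  # [1.  2.5 2.5 4.5 4.5]
```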
Single-index models are becoming increasingly popular in many scientific applications as they offer the advantages of flexibility in regression modeling as well as interpretable covariate effects. In the context of su...
Large-scale multiple testing is a fundamental problem in high dimensional statistical inference. It is increasingly common that various types of auxiliary information, reflecting the structural relationship among the hypotheses, are available. Exploiting such auxiliary information can boost statistical power. To this end, we propose a framework based on a two-group mixture model with varying probabilities of being null for different hypotheses a priori, where a shape-constrained relationship is imposed between the auxiliary information and the prior probabilities of being null. An optimal rejection rule is designed to maximize the expected number of true positives when the average false discovery rate is controlled. Focusing on the ordered structure, we develop a robust EM algorithm to estimate the prior probabilities of being null and the distribution of p-values under the alternative hypothesis simultaneously. We show that the proposed method has better power than state-of-the-art competitors while controlling the false discovery rate, both empirically and theoretically. Extensive simulations demonstrate the advantage of the proposed method. Datasets from genome-wide association studies are used to illustrate the new methodology.
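For intuition about rejection rules of this kind, here is a minimal sketch (not the paper's exact procedure or its EM estimator): hypotheses are ranked by an estimated local false discovery rate built from per-hypothesis prior null probabilities, and as many as possible are rejected while the running average of the statistic stays below the target level. The Beta alternative density, the constant prior null probability, and the function names are illustrative assumptions.

```python
import numpy as np
from scipy.stats import beta

def lfdr_rejections(p, pi0, f1, alpha=0.05):
    """Rank hypotheses by estimated local fdr and reject as many as possible
    while the running mean of the local fdr (an estimate of the marginal FDR
    of the rejection set) stays at or below alpha.

    p   : array of p-values
    pi0 : array of prior null probabilities, one per hypothesis
    f1  : callable, estimated p-value density under the alternative
    """
    p, pi0 = np.asarray(p, float), np.asarray(pi0, float)
    # Under the null, p-values are Uniform(0, 1), so the null density is 1.
    lfdr = pi0 / (pi0 + (1.0 - pi0) * f1(p))
    order = np.argsort(lfdr)
    running_mean = np.cumsum(lfdr[order]) / np.arange(1, len(p) + 1)
    k = (np.max(np.nonzero(running_mean <= alpha)[0]) + 1
         if np.any(running_mean <= alpha) else 0)
    reject = np.zeros(len(p), dtype=bool)
    reject[order[:k]] = True
    return reject

# Toy example: 10% signals with Beta(0.3, 4)-distributed p-values.
rng = np.random.default_rng(0)
p = np.concatenate([rng.uniform(size=900), rng.beta(0.3, 4, size=100)])
pi0 = np.full(1000, 0.9)
print(lfdr_rejections(p, pi0, lambda x: beta.pdf(x, 0.3, 4), alpha=0.05).sum())
```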
ISBN:
(Print) 9783319930312; 9783319930305
The problem of constructing a k-monotone regression is to find a vector z ∈ R^n with the lowest squared error of approximation to a given vector y ∈ R^n (not necessarily k-monotone) under the condition that z is k-monotone. The problem can be rewritten as a convex programming problem with linear constraints. The paper proposes two different approaches for finding a sparse k-monotone regression: a Frank-Wolfe-type algorithm and a k-monotone pool-adjacent-violators algorithm. A software package for this problem is developed and implemented in R, and the proposed algorithms are compared using simulated data.
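For small instances, a reference solution can be obtained by handing the convex program directly to a generic solver. The sketch below does this in Python rather than R, takes k-monotonicity to mean nonnegative k-th order finite differences (the paper's definition may additionally constrain lower-order differences), and is not the Frank-Wolfe-type or pool-adjacent-violators algorithm proposed in the paper.

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint

def kth_difference_matrix(n, k):
    """Matrix D with n - k rows such that D @ z gives the k-th order
    finite differences of z."""
    D = np.eye(n)
    for _ in range(k):
        D = D[1:, :] - D[:-1, :]
    return D

def k_monotone_regression(y, k):
    """Least-squares projection of y onto the cone of vectors with
    nonnegative k-th finite differences, solved as a generic QP."""
    y = np.asarray(y, float)
    n = len(y)
    cons = LinearConstraint(kth_difference_matrix(n, k),
                            lb=np.zeros(n - k), ub=np.inf)
    res = minimize(lambda z: 0.5 * np.sum((z - y) ** 2),
                   x0=np.full(n, y.mean()),      # feasible starting point
                   jac=lambda z: z - y,
                   hess=lambda z: np.eye(n),
                   constraints=[cons],
                   method="trust-constr")
    return res.x

y = np.array([0.0, 1.0, 0.5, 3.0, 2.5, 6.0])
print(np.round(k_monotone_regression(y, k=2), 3))  # convex (k = 2) fit
```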
Authors:
Yu, Tao; Li, Pengfei; Qin, Jing
Natl Univ Singapore, Dept Stat & Appl Probabil, Block S16 Level 7, 6 Sci Dr 2, Singapore 117546, Singapore
Univ Waterloo, Dept Stat & Actuarial Sci, 200 Univ Ave West, Waterloo, ON N2L 3G1, Canada
NIAID, NIH, 6700B Rockledge Dr, Bethesda, MD 20892, USA
In this paper, we propose a method for estimating the probability density functions in a two-sample problem where the ratio of the densities is monotone. This problem has been widely identified in the literature, but effective solution methods, in which the estimates should be probability densities and the corresponding density ratio should inherit monotonicity, are unavailable. If these conditions are not satisfied, the applications of the resultant density estimates might be limited. We propose estimates for which the ratio inherits the monotonicity property, and we explore their theoretical properties. One implication is that the corresponding receiver operating characteristic curve estimate is concave. Through numerical studies, we observe that both the density estimates and the receiver operating characteristic curve estimate from our method outperform those resulting directly from kernel density estimates, particularly when the sample size is relatively small.
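For context, the sketch below reproduces only the naive baseline the paper compares against: separate kernel density estimates for the two samples, whose ratio is not guaranteed to be monotone. The constrained estimator proposed in the paper is not reconstructed here; the data and names are illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Two samples whose true density ratio f1/f0 is monotone (shifted normals).
rng = np.random.default_rng(5)
x0 = rng.normal(0.0, 1.0, 80)   # e.g. non-diseased sample
x1 = rng.normal(1.0, 1.0, 80)   # e.g. diseased sample

# Unconstrained kernel estimates: nothing forces their ratio to be monotone,
# so the induced ROC curve need not be concave.
kde0, kde1 = gaussian_kde(x0), gaussian_kde(x1)
grid = np.linspace(-4.0, 5.0, 200)
ratio = kde1(grid) / kde0(grid)
print("kernel density ratio monotone on the grid:",
      bool(np.all(np.diff(ratio) >= 0)))
```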
Recently, the methods used to estimate monotonic regression (MR) models have been substantially improved, and some algorithms can now produce high-accuracy monotonic fits to multivariate datasets containing over a million observations. Nevertheless, the computational burden can be prohibitively large for resampling techniques in which numerous datasets are processed independently of each other. Here, we present efficient algorithms for estimation of confidence limits in large-scale settings that take into account the similarity of the bootstrap or jackknifed datasets to which MR models are fitted. In addition, we introduce modifications that substantially improve the accuracy of MR solutions for binary response variables. The performance of our algorithms is illustrated using data on death in coronary heart disease for a large population. This example also illustrates that MR can be a valuable complement to logistic regression.
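The sketch below shows only the naive resampling baseline that such efficient algorithms improve upon: each bootstrap dataset is refitted from scratch and pointwise percentile limits are taken. The sample data, function names, and choice of percentile limits are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def naive_bootstrap_band(x, y, grid, n_boot=200, level=0.95, seed=0):
    """Pointwise percentile bootstrap limits for a monotonic regression fit,
    refitting every bootstrap sample independently (the costly baseline)."""
    rng = np.random.default_rng(seed)
    fits = np.empty((n_boot, len(grid)))
    for b in range(n_boot):
        idx = rng.integers(0, len(x), size=len(x))   # resample with replacement
        iso = IsotonicRegression(out_of_bounds="clip").fit(x[idx], y[idx])
        fits[b] = iso.predict(grid)
    lo, hi = np.quantile(fits, [(1 - level) / 2, (1 + level) / 2], axis=0)
    return lo, hi

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 400)
y = 0.6 * (x > 0.5) + rng.normal(0, 0.3, 400)        # thresholded response
grid = np.linspace(0, 1, 21)
lo, hi = naive_bootstrap_band(x, y, grid)
print(np.round(lo, 2), np.round(hi, 2), sep="\n")
```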
Group testing, introduced by Dorfman (1943), has been used to reduce costs when estimating the prevalence of a binary characteristic based on a screening test of k groups that include n independent individuals in total. If the unknown prevalence is low and the screening test suffers from misclassification, it is also possible to obtain more precise prevalence estimates than those obtained from testing all n samples separately (Tu et al., 1994). In some applications, the individual binary response corresponds to whether an underlying time-to-event variable T is less than an observed screening time C, a data structure known as current status data. Given sufficient variation in the observed C values, it is possible to estimate the distribution function F of T nonparametrically, at least at some points in its support, using the pool-adjacent-violators algorithm (Ayer et al., 1955). Here, we consider nonparametric estimation of F based on group-tested current status data for groups of size k where the group tests positive if and only if any individual's unobserved T is less than the corresponding observed C. We investigate the performance of the group-based estimator as compared to the individual test nonparametric maximum likelihood estimator, and show that the former can be more precise in the presence of misclassification for low values of F(t). Potential applications include testing for the presence of various diseases in pooled samples where interest focuses on the age-at-incidence distribution rather than overall prevalence. We apply this estimator to the age-at-incidence curve for hepatitis C infection in a sample of U.S. women who gave birth to a child in 2014, where group assignment is done at random and based on maternal age. We discuss connections to other work in the literature, as well as potential extensions.
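A simplified sketch of the two estimators under an assumption of no misclassification: for individual current status data, the NPMLE of F at the observed screening times is the isotonic regression of the binary indicators on C; for group-tested data, the monotone group-positivity probability g(c) = 1 - (1 - F(c))^k is estimated by isotonic regression and back-transformed. The back-transformation step and all names and data are an illustrative reading, not necessarily the paper's exact estimator.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def current_status_npmle(c, delta):
    """NPMLE of F at the observed screening times from individual current
    status data: isotonic regression of the indicators delta on c
    (pool-adjacent-violators, Ayer et al., 1955)."""
    order = np.argsort(c)
    iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
    return c[order], iso.fit(c[order], delta[order]).predict(c[order])

def grouped_current_status_estimate(c_group, z_group, k):
    """Estimate of F from group-tested current status data: a group tests
    positive iff any member's T is below the group's screening time C, so
    g(c) = 1 - (1 - F(c))**k is monotone; estimate g by isotonic regression
    of the group indicators and back-transform."""
    order = np.argsort(c_group)
    iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
    g_hat = iso.fit(c_group[order], z_group[order]).predict(c_group[order])
    return c_group[order], 1.0 - (1.0 - g_hat) ** (1.0 / k)

rng = np.random.default_rng(2)
k, n_groups = 5, 200
c = rng.uniform(0.0, 3.0, n_groups)               # one screening time per group
t = rng.exponential(2.0, size=(n_groups, k))      # unobserved event times
z = (t.min(axis=1) < c).astype(float)             # group test result
c_sorted, F_grp = grouped_current_status_estimate(c, z, k)

c_ind = rng.uniform(0.0, 3.0, 500)                # individual-test comparison
t_ind = rng.exponential(2.0, 500)
_, F_ind = current_status_npmle(c_ind, (t_ind < c_ind).astype(float))
print(F_grp[:5], F_ind[:5], sep="\n")
```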
We propose a new method for risk-analytic benchmark dose (BMD) estimation in a dose-response setting when the responses are measured on a continuous scale. For each dose level d, the observation X(d) is assumed to follow a normal distribution, N(μ(d), σ²). No specific parametric form is imposed upon the mean μ(d), however. Instead, nonparametric maximum likelihood estimates of μ(d) and σ are obtained under a monotonicity constraint on μ(d). For purposes of quantitative risk assessment, a 'hybrid' form of risk function is defined for any dose d as R(d) = P[X(d) < c], where c > 0 is a constant independent of d. The BMD is then determined by inverting the additional risk function R_A(d) = R(d) - R(0) at some specified value of the benchmark response. Asymptotic theory for the point estimators is derived, and a finite-sample study is conducted, using both real and simulated data. When a large number of doses are available, we propose an adaptive grouping method for estimating the BMD, which is shown to have optimal mean integrated squared error under appropriate designs.
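The inversion step can be illustrated as follows, assuming a monotone fit of μ(d) and an estimate of σ are already available on a dose grid; the paper's nonparametric maximum likelihood estimation, asymptotic theory, and adaptive grouping are not reproduced, and the grid, cutoff c, and benchmark response are illustrative values.

```python
import numpy as np
from scipy.stats import norm

def bmd_from_fit(doses, mu_hat, sigma_hat, c, bmr=0.10):
    """Invert the additional-risk function to obtain a benchmark dose.

    R(d)   = P[X(d) < c] = Phi((c - mu(d)) / sigma)   (the 'hybrid' risk)
    R_A(d) = R(d) - R(0)
    The BMD is the smallest dose at which R_A reaches the benchmark response
    bmr; linear interpolation on the dose grid is used here.  mu_hat should
    already be a monotone (e.g. isotonic) fit over `doses`, with doses[0] = 0.
    """
    risk = norm.cdf((c - np.asarray(mu_hat, float)) / sigma_hat)
    add_risk = risk - risk[0]
    if np.all(add_risk < bmr):
        return np.inf                       # BMR not reached on this grid
    j = np.argmax(add_risk >= bmr)          # first grid point at/above the BMR
    if j == 0:
        return doses[0]
    d0, d1, r0, r1 = doses[j - 1], doses[j], add_risk[j - 1], add_risk[j]
    return d0 + (bmr - r0) * (d1 - d0) / (r1 - r0)

doses = np.array([0.0, 0.5, 1.0, 2.0, 4.0])
mu_hat = np.array([10.0, 9.6, 9.1, 8.2, 7.0])   # monotone decreasing mean fit
print(bmd_from_fit(doses, mu_hat, sigma_hat=1.5, c=7.5, bmr=0.10))
```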
ISBN:
(Print) 9781628415063
Classifier scores in many diagnostic devices, such as computer-aided diagnosis systems, are usually on an arbitrary scale, the meaning of which is unclear. Calibration of classifier scores to a meaningful scale such as the probability of disease is potentially useful when such scores are used by a physician or another algorithm. In this work, we investigated the properties of two methods for calibrating classifier scores to probability of disease. The first is a semiparametric method in which the likelihood ratio for each score is estimated based on a semiparametric proper receiver operating characteristic model, and then an estimate of the probability of disease is obtained using Bayes' theorem assuming a known prevalence of disease. The second method is nonparametric, in which isotonic regression via the pool-adjacent-violators algorithm is used. We employed the mean square error (MSE) and the Brier score to evaluate the two methods. We evaluated the methods under two paradigms: (a) the dataset used to construct the score-to-probability mapping function is also used to calculate the performance metric (MSE or Brier score) (resubstitution); (b) an independent test dataset is used to calculate the performance metric (independent). Under our simulation conditions, the semiparametric method is found to be superior to the nonparametric method at small to medium sample sizes, and the two methods appear to converge at large sample sizes. Our simulation results also indicate that the resubstitution bias may depend on the performance metric, and for the semiparametric method, the resubstitution bias is small when a reasonable number of cases (>100 cases per class) are available.
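The nonparametric method is essentially isotonic calibration, which can be sketched with standard tools as below; the simulated scores, the prevalence implicit in the data, and the resubstitution evaluation are illustrative choices, and the semiparametric ROC-model method is not reproduced.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.metrics import brier_score_loss

# Simulated classifier scores: diseased cases tend to score higher.
rng = np.random.default_rng(3)
y = np.concatenate([np.zeros(500), np.ones(500)])     # 0 = non-diseased, 1 = diseased
scores = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(1.2, 1.0, 500)])

# Nonparametric calibration: isotonic regression (PAVA) of outcome on score
# maps each score to an estimated probability of disease.
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
prob_disease = iso.fit(scores, y).predict(scores)

# Resubstitution Brier score (an independent test set would give the other paradigm).
print(brier_score_loss(y, prob_disease))
```

Note that, unlike the semiparametric method, this mapping inherits the disease prevalence of the training data rather than using an externally specified prevalence.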
The variance of the error term in ordinary regression models and linear smoothers is usually estimated by adjusting the average squared residual for the trace of the smoothing matrix (the degrees of freedom of the predicted response). However, other types of variance estimators are needed when using monotonic regression (MR) models, which are particularly suitable for estimating response functions with pronounced thresholds. Here, we propose a simple bootstrap estimator to compensate for the over-fitting that occurs when MR models are estimated from empirical data. Furthermore, we show that, in the case of one or two predictors, the performance of this estimator can be enhanced by introducing adjustment factors that take into account the slope of the response function and characteristics of the distribution of the explanatory variables. Extensive simulations show that our estimators perform satisfactorily for a great variety of monotonic functions and error distributions.
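As one concrete possibility in this spirit (not necessarily the authors' estimator), the sketch below refits the monotonic regression to bootstrap samples and averages squared residuals only over out-of-bag observations, which counteracts the downward bias of the apparent residual variance; the data, names, and out-of-bag scheme are assumptions made for illustration.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def oob_bootstrap_variance(x, y, n_boot=100, seed=0):
    """Bootstrap variance estimate for monotonic regression that avoids the
    over-fitting bias of the apparent residual variance: the MR model is
    fitted to each bootstrap sample and squared residuals are averaged only
    over the observations left out of that sample (out-of-bag)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    sq_err, count = 0.0, 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        oob = np.setdiff1d(np.arange(n), idx)
        if oob.size == 0:
            continue
        iso = IsotonicRegression(out_of_bounds="clip").fit(x[idx], y[idx])
        sq_err += np.sum((y[oob] - iso.predict(x[oob])) ** 2)
        count += oob.size
    return sq_err / count

rng = np.random.default_rng(4)
x = rng.uniform(0, 1, 300)
y = np.where(x > 0.5, 1.0, 0.0) + rng.normal(0, 0.4, 300)
print(oob_bootstrap_variance(x, y))   # compare with the true variance 0.4**2 = 0.16
```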