A new feature selection procedure based on the Kullback J-divergence between two class-conditional density functions approximated by a finite mixture of parameterized densities of a special type is presented. This procedure is especially suitable for multimodal data. Apart from finding a feature subset of any cardinality without involving any search procedure, it simultaneously yields a pseudo-Bayes decision rule. Its performance is tested on real data.
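The sketch below illustrates the general idea of ranking features by the J-divergence between class-conditional densities fitted as Gaussian mixtures. It is only a generic stand-in: the paper's special mixture parameterization, its closed-form divergence, and its search-free subset selection are not reproduced, and the function names, the Monte Carlo evaluation, and the use of scikit-learn's GaussianMixture are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def j_divergence(X0, X1, n_components=3, n_mc=20000):
    """Monte Carlo estimate of the Kullback J-divergence
    J = KL(p0 || p1) + KL(p1 || p0) between two class-conditional densities,
    each approximated by a Gaussian mixture."""
    g0 = GaussianMixture(n_components, covariance_type="full", random_state=0).fit(X0)
    g1 = GaussianMixture(n_components, covariance_type="full", random_state=0).fit(X1)
    s0, _ = g0.sample(n_mc)                      # draws from the class-0 mixture
    s1, _ = g1.sample(n_mc)                      # draws from the class-1 mixture
    kl01 = np.mean(g0.score_samples(s0) - g1.score_samples(s0))
    kl10 = np.mean(g1.score_samples(s1) - g0.score_samples(s1))
    return kl01 + kl10

def rank_features(X, y):
    """Rank individual features (binary labels 0/1 assumed) by the J-divergence
    of their class-conditional mixture approximations."""
    X0, X1 = X[y == 0], X[y == 1]
    scores = np.array([j_divergence(X0[:, [j]], X1[:, [j]]) for j in range(X.shape[1])])
    return np.argsort(scores)[::-1], scores
```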
Iterative simulation techniques are becoming standard tools in Bayesian statistics, a notable example being the Gibbs sampler, whose draws form a Markov chain. Standard practice is to run the simulation until convergence is approached, in the sense of the draws appearing to be stationary. At this point, the set of stationary draws can be used to provide an estimate of the target distribution. However, when the distributions involved are normal and the draws form a Markov chain, the target distribution can be reliably estimated by maximum likelihood (ML) using draws obtained before convergence to the target distribution. This fact suggests that the normal-based ML estimates can be exploited to estimate the mean and covariance matrix of an approximately normal target distribution before convergence is reached, and that these estimates can be used to define a restarting distribution for the simulation. Here, we describe the needed technology and explore its relevance to practice. The tentative conclusion is that the Markov-Normal restarting procedure can be computationally advantageous when the target distribution is nearly normal, especially in massively parallel or distributed computing environments where many sequences can be run for the same effective cost as one sequence.
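A minimal sketch of the restarting idea on a toy bivariate-normal target: short pilot chains are summarized by a normal approximation, which is then used to launch many restarted chains. The plain sample mean and covariance below are a deliberate simplification of the Markov-Normal ML fit described in the abstract (which models the serial dependence of the draws); the sampler, names, and numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
rho = 0.9          # target: bivariate normal, unit variances, correlation rho

def gibbs(start, n):
    """Standard two-block Gibbs sampler for the bivariate-normal target."""
    x, y = start
    out = np.empty((n, 2))
    for t in range(n):
        x = rng.normal(rho * y, np.sqrt(1 - rho**2))   # x | y
        y = rng.normal(rho * x, np.sqrt(1 - rho**2))   # y | x
        out[t] = (x, y)
    return out

# Stage 1: a few short pilot chains started from an overdispersed point.
pilot = np.vstack([gibbs(start=(5.0, -5.0), n=200) for _ in range(8)])

# Normal approximation from the (possibly pre-convergence) pilot draws.  The
# Markov-Normal method uses an ML fit that accounts for the serial dependence;
# plain sample moments are a crude stand-in for that step.
m, S = pilot.mean(axis=0), np.cov(pilot, rowvar=False)

# Stage 2: restart many chains from the fitted normal, e.g. on parallel workers.
restarts = rng.multivariate_normal(m, S, size=100)
final = np.vstack([gibbs(start=s, n=500)[-100:] for s in restarts])
print("estimated correlation:", np.corrcoef(final, rowvar=False)[0, 1])
```

In a parallel setting, each row of restarts would seed its own worker, which is where the "many sequences for the effective cost of one" argument applies.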
Asynchronous transfer mode (ATM) network design and optimization at the call level may be formulated in the framework of multirate, circuit-switched, loss networks with effective bandwidth encapsulating cell-level behavior. Each service supported on the ATM network is characterized by a rate or bandwidth requirement. Future networks will be characterized by links with very large capacities in circuits and by many rates. Various asymptotic results are given to reduce the attendant complexity of numerical calculations. A central element is a uniform asymptotic approximation (UAA) for link analyses. Moreover, a unified hybrid approach is given which allows asymptotic and nonasymptotic methods of calculation to be used cooperatively. Network loss probabilities are obtained by solving fixed-point equations. A canonical problem of route and logical network design is considered. An optimization procedure is proposed, which is guided by gradients obtained by solving a system of equations for implied costs. A novel application of the EM algorithm gives an efficient technique for calculating implied costs with changing traffic conditions. Finally, we report numerical results obtained by the software package TALISMAN, which incorporates the theoretical results. The network considered has eight nodes, 20 links, six services, and as many as 160 routes.
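For concreteness, here is a sketch of the standard exact link-level calculation that asymptotic approximations of this kind are designed to speed up: the Kaufman-Roberts recursion for per-class blocking on a single multirate loss link. This is not the paper's uniform asymptotic approximation or its network-level fixed-point machinery; the function name and the example capacity, rates, and loads are assumptions.

```python
import numpy as np

def kaufman_roberts(capacity, rates, loads):
    """Exact per-class blocking probabilities on a single multirate loss link.

    capacity : link capacity C in circuits (bandwidth units)
    rates    : bandwidth requirement b_k of each service class
    loads    : offered load a_k (Erlangs) of each class
    """
    q = np.zeros(capacity + 1)
    q[0] = 1.0
    for n in range(1, capacity + 1):
        q[n] = sum(a * b * q[n - b] for a, b in zip(loads, rates) if n >= b) / n
    q /= q.sum()                                 # normalized occupancy distribution
    # Class k is blocked whenever fewer than b_k circuits are free.
    return [q[capacity - b + 1:].sum() for b in rates]

# Example: a 155-unit link carrying three services with different effective bandwidths.
print(kaufman_roberts(155, rates=[1, 6, 25], loads=[80.0, 10.0, 2.0]))
```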
A general, recursive algorithm is presented for computing the expected Fisher information matrix for state-space model parameters. Simulation results are reported in which known Fisher information matrices for simple state-space models are estimated using both observed and expected information matrices, and the accuracy of the two approaches is compared.
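The abstract's recursive algorithm is not reproduced here, but the following sketch shows the brute-force comparison it competes with: for a simple local-level model, the observed information is computed as a finite-difference negative Hessian of the Kalman-filter log-likelihood, and the expected information is estimated by averaging that quantity over simulated series. The model, step size, and all function names are illustrative assumptions.

```python
import numpy as np

def loglik(theta, y):
    """Kalman-filter log-likelihood of a local-level model:
    y_t = mu_t + e_t,  mu_t = mu_{t-1} + w_t,  theta = (var_e, var_w)."""
    var_e, var_w = theta
    a, P, ll = 0.0, 1e7, 0.0                     # diffuse-ish initialization
    for yt in y:
        P = P + var_w                            # prediction step
        F = P + var_e
        v = yt - a
        ll += -0.5 * (np.log(2 * np.pi * F) + v * v / F)
        K = P / F                                # update step
        a, P = a + K * v, P * (1 - K)
    return ll

def observed_info(theta, y, h=1e-3):
    """Observed information: negative Hessian of loglik, by central differences."""
    theta = np.asarray(theta, dtype=float)
    H = np.zeros((2, 2))
    for i in range(2):
        for j in range(2):
            ei, ej = np.eye(2)[i] * h, np.eye(2)[j] * h
            H[i, j] = (loglik(theta + ei + ej, y) - loglik(theta + ei - ej, y)
                       - loglik(theta - ei + ej, y) + loglik(theta - ei - ej, y)) / (4 * h * h)
    return -H

def simulate(theta, n, rng):
    mu = np.cumsum(rng.normal(0.0, np.sqrt(theta[1]), n))
    return mu + rng.normal(0.0, np.sqrt(theta[0]), n)

rng = np.random.default_rng(0)
theta0, n = (1.0, 0.5), 300
# Expected information: Monte Carlo average of observed informations over replicated series.
I_expected = np.mean([observed_info(theta0, simulate(theta0, n, rng)) for _ in range(50)], axis=0)
I_observed = observed_info(theta0, simulate(theta0, n, rng))
print("expected information:\n", I_expected)
print("observed information (single series):\n", I_observed)
```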
This article focuses on the problem of predicting future measurements on a statistical unit given past measurements on the same and other similar units. We introduce a conditional predictor that uses the information contained in previous measurements. The prediction technique is based on the iterative EM algorithm, but a noniterative variant is also provided. We use the sample-reuse methodology to select an appropriate predictor. The technique is illustrated in three engineering applications. The first considers prediction in the context of marking for bucking in automatic forest harvesters. The second concerns fatigue-crack-growth data, where the interest is in predicting the future crack-growth development of the test unit, and the third concerns evaluation of pulp from the point of view of its papermaking potential.
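A small sketch of the sample-reuse (leave-one-out) selection step: two candidate predictors of a unit's future measurements, a conditional mean that uses the unit's own past and a marginal mean that ignores it, are compared by their leave-one-out prediction error over previously observed units. The Gaussian working model, the two candidates, and all names are assumptions; the EM-based predictor of the paper is not reproduced.

```python
import numpy as np

def conditional_predictor(mu, S, p):
    """Predict the last (d - p) coordinates from the first p under a normal working model."""
    def predict(x_past):
        S11, S21 = S[:p, :p], S[p:, :p]
        return mu[p:] + S21 @ np.linalg.solve(S11, x_past - mu[:p])
    return predict

def marginal_predictor(mu, S, p):
    """Ignore the unit's own past: predict with the average of the other units."""
    return lambda x_past: mu[p:]

def loo_error(X, p, make_predictor):
    """Leave-one-out squared prediction error over complete units (sample reuse)."""
    err = 0.0
    for i in range(len(X)):
        rest = np.delete(X, i, axis=0)
        f = make_predictor(rest.mean(axis=0), np.cov(rest, rowvar=False), p)
        err += np.sum((f(X[i, :p]) - X[i, p:]) ** 2)
    return err / len(X)

# Pick whichever candidate wins the sample-reuse comparison on the available units.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6)) @ rng.normal(size=(6, 6))   # correlated toy measurements
p = 3                                                    # first 3 coordinates observed
for name, f in [("conditional", conditional_predictor), ("marginal", marginal_predictor)]:
    print(name, loo_error(X, p, f))
```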
The proportions of gay men presenting with various AIDS diagnoses display temporal trends. In particular, the proportion of initial diagnoses reported as Kaposi's sarcoma (KS) has declined over time. Epidemiologists have hypothesized that (a) KS may require a cofactor whose prevalence has declined over time, or (b) KS may have a shorter incubation period than other presenting diagnoses. We examine whether the latter hypothesis, considered in a competing-risks framework, could account for the observed decline in KS. We nonparametrically estimate the relevant cause-specific hazard functions from the doubly censored data of the San Francisco City Clinic Cohort by maximizing a roughness-penalized likelihood using an EM algorithm. These estimates suggest that differences in the underlying cause-specific hazard functions account for a substantial portion of the observed trends in diagnoses.
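As a much-simplified illustration of the competing-risks quantities involved, the sketch below computes discrete-time cause-specific hazards d_k(t)/n(t) from ordinary right-censored data. It omits the double censoring, the roughness penalty, and the EM machinery of the abstract; the toy data and names are assumptions.

```python
import numpy as np

def cause_specific_hazards(times, causes, censored):
    """Discrete-time cause-specific hazards lambda_k(t) = d_k(t) / n(t) from
    right-censored competing-risks data."""
    grid = np.unique(times)
    n_at_risk = np.array([(times >= t).sum() for t in grid])
    hazards = {}
    for k in np.unique(causes[~censored]):
        d_k = np.array([np.sum((times == t) & (causes == k) & ~censored) for t in grid])
        hazards[k] = d_k / n_at_risk
    return grid, hazards

# Toy competing-risks data: "KS" vs "other" presenting diagnosis, with censoring.
rng = np.random.default_rng(0)
t_ks, t_other = rng.exponential(4.0, 300), rng.exponential(7.0, 300)
t_cens = rng.uniform(0.0, 10.0, 300)
times = np.round(np.minimum.reduce([t_ks, t_other, t_cens]), 1)
causes = np.where(t_ks <= t_other, "KS", "other")
censored = t_cens < np.minimum(t_ks, t_other)
grid, hz = cause_specific_hazards(times, causes, censored)
print({k: v[:5] for k, v in hz.items()})                 # hazards at the first grid points
```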
A very important problem in industrial applications of PCA and PLS models, such as process modelling or monitoring, is the estimation of scores when the observation vector has missing measurements. The alternative of suspending the application until all measurements are available is usually unacceptable. The problem treated in this work is that of estimating scores from an existing PCA or PLS model when new observation vectors are incomplete. Building the model with incomplete observations is not treated here, although the analysis given in this paper provides considerable insight into this problem. Several methods for estimating scores from data with missing measurements are presented and analysed: a method, termed single component projection, derived from the NIPALS algorithm for model building with missing data; a method of projection to the model plane; and data replacement by the conditional mean. Expressions are developed for the error in the scores calculated by each method. The error analysis is illustrated using simulated data sets designed to highlight problem situations. A larger industrial data set is also used to compare the approaches. In general, all the methods perform reasonably well with moderate amounts of missing data (up to 20% of the measurements). However, in extreme cases where critical combinations of measurements are missing, the conditional mean replacement method is generally superior to the other approaches.
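A minimal sketch of two of the approaches named above, applied to an existing PCA model: least-squares projection to the model plane using only the observed variables, and conditional-mean replacement of the missing variables based on the training covariance before scoring. The PLS case, single component projection, and the error expressions are not covered, and the helper names and toy data are assumptions.

```python
import numpy as np

def fit_pca(X, a):
    """Fit an a-component PCA model on complete training data."""
    mu = X.mean(axis=0)
    Xc = X - mu
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return mu, Vt[:a].T, np.cov(Xc, rowvar=False)        # mean, loadings P (d x a), covariance S

def scores_projection_to_plane(x, obs, mu, P):
    """Least-squares scores from the observed variables only: t = (P_o' P_o)^(-1) P_o' x_o."""
    Po = P[obs]
    return np.linalg.solve(Po.T @ Po, Po.T @ (x[obs] - mu[obs]))

def scores_conditional_mean(x, obs, mu, P, S):
    """Replace missing variables by their conditional mean under the training covariance,
    then score the completed observation."""
    mis = ~obs
    xc = x - mu
    xc[mis] = S[np.ix_(mis, obs)] @ np.linalg.solve(S[np.ix_(obs, obs)], xc[obs])
    return P.T @ xc

# Train on complete data, then score a new observation with two missing measurements.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8)) @ rng.normal(size=(8, 8))
mu, P, S = fit_pca(X, a=3)
x_new = X[0].copy()
obs = np.ones(8, dtype=bool); obs[[2, 5]] = False        # sensors 2 and 5 unavailable
print(scores_projection_to_plane(x_new, obs, mu, P))
print(scores_conditional_mean(x_new, obs, mu, P, S))
```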
A new theoretical point of view is discussed here in the framework of density estimation. The discrete multivariate true density is viewed as a finite-dimensional continuous random vector governed by a Markov random field structure. Estimating the density is then a problem of maximizing a conditional likelihood under a Bayesian framework. This maximization problem is expressed as a constrained optimization problem and is solved by an iterative fixed-point algorithm. However, for time-efficiency reasons, we have been interested in an approximate estimate f̂ = Bπ of the true density f, where B is a stochastic matrix and π is the raw histogram. This estimate is obtained by developing f̂ as a function of π around the uniform histogram π₀, using multivariate Taylor expansions for implicit functions (f̂ is actually an implicit function of π). The discrete setting of the problem allows us to obtain a simple analytical form for B. Although the approach is original, our density estimator is actually nothing other than a penalized maximum likelihood estimator. However, it appears to be more general than those proposed in the literature (Scott et al., 1980; Simonoff, 1983; Thompson and Tapia, 1990). In a second step, we investigate the discrimination problem on the same space, using the theory previously developed for density estimation. We also introduce an adaptive bandwidth depending on the k nearest neighbours, and we have chosen to optimize the leaving-one-out criterion. We have always kept in mind the practical implementation on a computer. Our final classification algorithm compares favourably in terms of error rate and time efficiency with the other algorithms tested, including multinormal IMSL, nearest-neighbour, and convex hull classifiers. Comparisons were performed on satellite images.
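To make the f̂ = Bπ form concrete, the sketch below applies a simple column-stochastic neighbour-averaging matrix B to a raw histogram π on a 1-D grid. The matrix derived in the paper from the Taylor expansion of the penalized-likelihood fixed point is not reproduced; the particular B, its weight, and the toy data here are assumptions made purely for illustration.

```python
import numpy as np

def smoothing_matrix(n_cells, w=0.25):
    """A simple column-stochastic matrix B that averages each histogram cell with its
    neighbours on a 1-D grid, so that B @ pi remains a probability vector."""
    B = np.zeros((n_cells, n_cells))
    for i in range(n_cells):
        B[i, i] = 1.0
        if i > 0:
            B[i, i - 1] = w
        if i + 1 < n_cells:
            B[i, i + 1] = w
    return B / B.sum(axis=0, keepdims=True)

# Raw histogram pi from a small sample and its smoothed estimate f_hat = B @ pi.
rng = np.random.default_rng(0)
counts, _ = np.histogram(rng.normal(size=60), bins=20, range=(-3.0, 3.0))
pi = counts / counts.sum()
f_hat = smoothing_matrix(20) @ pi
print(f_hat.sum())                                       # still sums to one
```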
Marking for bucking is the problem of converting a single tree stem into logs in such a way that the total stem value according to a given price list for logs is maximized. Proper marking for bucking is of crucial importance in forest harvesting. The annual production of sawn timber in Scandinavia is tens of millions of m³. Since, furthermore, sawn timber is by far the most valuable forest product in Scandinavia, any improvement in the marking for bucking procedure will yield a large profit. To solve the marking for bucking problem optimally, one has to know the whole tree stem. However, it is not economically feasible to run the whole stem through the measuring device before cutting. Therefore, it is a normal situation under computer-based marking for bucking that the first cutting decisions are made before the whole stem is known. In this paper the forest harvesting process is considered under a general growth curve model, useful especially for repeated-measures data. The main objective is to predict the unknown portion y_n^(2) of the current stem, which will be estimated from the stem data y_1, y_2, ..., y_{n-1} on the previously processed n − 1 stems and from the known diameter values y_n^(1) on the current stem. A predictor of y_n^(2), say ŷ_n^(2), jointly with the known part y_n^(1), is then used in marking for bucking. It turns out that this technique can radically improve the efficiency of harvesting, so that our results provide important knowledge for developing automatic bucking systems for modern harvesters.
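A minimal sketch of the prediction step under a Gaussian working model: the mean and covariance of complete diameter profiles are estimated from the previously processed stems, and the unmeasured upper portion of the current stem is predicted by its conditional mean given the measured lower portion. The general growth curve model of the paper (and its covariate structure) is not reproduced; the function name and the toy profiles are assumptions.

```python
import numpy as np

def predict_upper_stem(prev_stems, y_lower):
    """Predict the unmeasured upper diameters of the current stem from its measured
    lower diameters, using moments estimated from previously processed stems.

    prev_stems : (n - 1, d) array of complete diameter profiles
    y_lower    : (p,) measured diameters at the first p positions of the current stem
    """
    p = len(y_lower)
    mu = prev_stems.mean(axis=0)
    S = np.cov(prev_stems, rowvar=False)
    S11, S21 = S[:p, :p], S[p:, :p]
    # Conditional mean of the upper part given the lower part.
    return mu[p:] + S21 @ np.linalg.solve(S11, y_lower - mu[:p])

# Toy profiles: 30 earlier stems measured at 12 positions, current stem known at 5.
rng = np.random.default_rng(0)
taper = np.linspace(30.0, 10.0, 12)                      # nominal taper in cm
prev_stems = taper + rng.normal(scale=1.5, size=(30, 12))
current_lower = taper[:5] + rng.normal(scale=1.5, size=5)
y_upper_hat = predict_upper_stem(prev_stems, current_lower)
print(y_upper_hat)                                       # feeds the bucking optimization
```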
Pattern-mixture models stratify incomplete data by the pattern of missing values and formulate distinct models within each stratum. Pattern-mixture models are developed for analyzing a random sample on continuous variables y^(1), y^(2) when values of y^(2) are nonrandomly missing. Methods for scalar y^(1) and y^(2) are here generalized to vector y^(1) and y^(2) with additional fixed covariates x. Parameters in these models are identified by alternative assumptions about the missing-data mechanism. Models may be underidentified (in which case additional assumptions are needed), just-identified, or overidentified. Maximum likelihood and Bayesian methods are developed for the latter two situations, using the EM and SEM algorithms and direct and iterative simulation methods. The methods are illustrated on a data set involving alternative dosage regimens for the treatment of schizophrenia using haloperidol and on a regression example. Sensitivity to alternative assumptions about the missing-data mechanism is assessed, and the new methods are compared with complete-case analysis and maximum likelihood for a probit selection model.
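The sketch below illustrates only the stratification idea behind pattern-mixture models: rows are grouped by their missing-data pattern, within-pattern means and pattern proportions are estimated, and their weighted combination is left undefined (NaN) exactly where an identifying restriction would be required. The EM/SEM estimation, the identifying assumptions, and the covariate extensions of the abstract are not reproduced; the function name and toy data are assumptions.

```python
import numpy as np
import pandas as pd

def pattern_mixture_summary(df):
    """Stratify rows by their missing-data pattern, estimate within-pattern means and
    pattern proportions, and form the weighted overall mean.  Cells never observed
    within a pattern stay NaN -- exactly where identifying restrictions are needed."""
    pattern = df.isna().apply(lambda r: "".join("M" if m else "O" for m in r), axis=1)
    means = df.groupby(pattern).mean()              # within-pattern means
    props = pattern.value_counts(normalize=True)    # estimated pattern probabilities
    overall = means.mul(props, axis=0).sum(skipna=False)
    return means, props, overall

# Toy example: y2 nonrandomly missing (more often when y1 is large).
rng = np.random.default_rng(0)
y1 = rng.normal(size=200)
y2 = 0.5 * y1 + rng.normal(size=200)
y2[y1 > 0.8] = np.nan
means, props, overall = pattern_mixture_summary(pd.DataFrame({"y1": y1, "y2": y2}))
print(means, props, overall, sep="\n")
```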