Molecular techniques allow the survey of a large number of linked polymorphic loci in random samples from diploid populations. However, the gametic phase of haplotypes is usually unknown when diploid individuals are heterozygous at more than one locus. To overcome this difficulty, we implement an expectation-maximization (EM) algorithm leading to maximum-likelihood estimates of molecular haplotype frequencies under the assumption of Hardy-Weinberg proportions. The performance of the algorithm is evaluated for simulated data representing both DNA sequences and highly polymorphic loci with different levels of recombination. As expected, the EM algorithm is found to perform best for large samples, regardless of recombination rates among loci. To ensure that the global maximum-likelihood estimate is found, the EM algorithm should be started from several initial conditions. The present approach appears to be useful for the analysis of nuclear DNA sequences or highly variable loci. Although the algorithm can, in principle, accommodate an arbitrary number of loci, there are practical limitations because the computing time grows exponentially with the number of polymorphic loci.
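To make the idea in the abstract above concrete, the following is a minimal sketch of EM haplotype-frequency estimation for the simplest case: two biallelic loci under Hardy-Weinberg proportions. It is not the authors' implementation. The function name `em_haplotype_freqs`, the genotype encoding (number of copies of allele "1" at each locus) and the `init` parameter are assumptions made for illustration. Only double heterozygotes have unknown phase, so the E-step splits them between the two possible haplotype pairs.

```python
def em_haplotype_freqs(geno_counts, init=None, n_iter=200, tol=1e-10):
    """EM estimate of two-locus haplotype frequencies (illustrative sketch).

    geno_counts maps (g1, g2) -> number of individuals, where g1 and g2 are
    the counts (0, 1 or 2) of allele "1" at locus 1 and locus 2.
    Haplotypes are tuples (a, b) with a, b in {0, 1}.
    """
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    # Hypothetical default start: equal frequencies. Pass `init` to try
    # several starting points, as the abstract recommends.
    freqs = dict(init) if init else {h: 0.25 for h in haps}
    total_haps = 2 * sum(geno_counts.values())

    for _ in range(n_iter):
        expected = {h: 0.0 for h in haps}
        for (g1, g2), n in geno_counts.items():
            if g1 == 1 and g2 == 1:
                # E-step: phase is ambiguous only for double heterozygotes.
                cis = freqs[(1, 1)] * freqs[(0, 0)]    # pair 11 / 00
                trans = freqs[(1, 0)] * freqs[(0, 1)]  # pair 10 / 01
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                expected[(1, 1)] += n * w
                expected[(0, 0)] += n * w
                expected[(1, 0)] += n * (1 - w)
                expected[(0, 1)] += n * (1 - w)
            else:
                # At least one locus is homozygous, so phase is known.
                a1 = (1 if g1 >= 1 else 0, 1 if g1 == 2 else 0)
                a2 = (1 if g2 >= 1 else 0, 1 if g2 == 2 else 0)
                expected[(a1[0], a2[0])] += n
                expected[(a1[1], a2[1])] += n
        # M-step: expected haplotype counts divided by total haplotypes.
        new_freqs = {h: expected[h] / total_haps for h in haps}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in haps)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs
```

With more loci the number of haplotype pairs compatible with a multiply heterozygous genotype doubles with each additional heterozygous locus, which is one way to see the exponential growth in computing time noted in the abstract.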
We develop the non-parametric maximum likelihood estimator (MLE) of the full M-bh capture-recapture model, which utilizes both initial capture and recapture data and permits both heterogeneity (h) between animals and behavioural (b) response to capture. Our MLE procedure utilizes non-parametric maximum likelihood estimation of mixture distributions (Lindsay, 1983; Lindsay and Roeder, 1992) and the EM algorithm (Dempster et al., 1977). Our MLE provides the first non-parametric estimate of the bivariate capture-recapture distribution. Since non-parametric maximum likelihood estimation exists for the submodels M-h (allowing heterogeneity only), M-b (allowing behavioural response only) and M-0 (allowing no changes), we develop maximum likelihood-based model selection, specifically the Akaike information criterion (AIC) (Akaike, 1973). The AIC procedure does well in detecting behavioural response but has difficulty in detecting heterogeneity.
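As a reminder of how AIC-based selection among nested models such as M-0, M-b, M-h and M-bh works in general, here is a small sketch. It uses the standard definition AIC = 2k - 2 log L, where k is the number of free parameters and log L is the maximized log-likelihood. The function names and the input layout are assumptions for illustration, and how to count parameters for a non-parametric mixture is not specified in the abstract.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(fits):
    """Return the name of the model with the smallest AIC.

    fits maps a model name to a (maximized log-likelihood, n_params) pair.
    The layout of `fits` is a hypothetical convention for this sketch.
    """
    return min(fits, key=lambda name: aic(*fits[name]))
```

A richer model is preferred only when its gain in log-likelihood exceeds the number of extra parameters it uses.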
Solvability of the nonlinear EMS (estimate, maximize, smooth) equations in the nonnegative quadrant is established by the use of the Brouwer fixed point theorem and a priori estimates from Perron-Frobenius theory. Existence of solutions and of an a priori estimate are also proven for a generalization of the EMS equations. The a priori estimates illustrate the quantification shortcomings of the EMS algorithm and should be carefully considered both before applying the algorithm and in the choice of smoothing.
In survival analysis, the most frequently used parametric survival models are the exponential and the Weibull distributions. A random effects model based on a generalized Weibull distribution is proposed for censored correlated observations. The specific individual effects play the role of the random effects part. A conceptually simple algorithm based on the generalized linear model is given for fitting random effects Weibull models to data and for calculating the asymptotic variance-covariance matrix. The model is applied to two real data sets and the results are compared with previous work.
A continuous-time, non-linear filtering problem is considered in which both signal and observation processes are Markov chains. New finite-dimensional filters and smoothers are obtained for the state of the signal, for the number of jumps from one state to another, for the occupation time in any state of the signal, and for joint occupation times of the two processes. These estimates are then used in the expectation-maximization algorithm to improve the parameters in the model. Consequently, our filters and model are adaptive, or self-tuning.
Many stochastic process models for environmental data sets assume a process of relatively simple structure which is in some sense partially observed. That is, there is an underlying process (X(n), n ≥ 0) or (X(t), t ≥ 0) for which the parameters are of interest and physically meaningful, and an observable process (Y_n, n ≥ 0) or (Y_t, t ≥ 0) which depends on the X process but not otherwise on those parameters. Examples are wide ranging: the Y process may be the X process with missing observations; the Y process may be the X process observed with a noise component; the X process might constitute a random environment for the Y process, as with hidden Markov models; the Y process might be a lower dimensional function or reduction of the X process. In principle, maximum likelihood estimation for the X process parameters can be carried out by some form of the EM algorithm applied to the Y process data. In this paper we review some current methods for exact and approximate maximum likelihood estimation. We illustrate some of the issues by considering how to estimate the parameters of a stochastic Nash cascade model for runoff. In the case of k reservoirs, the outputs of these reservoirs form a k-dimensional vector Markov process, of which only the kth coordinate process is observed, usually at a discrete sample of time points.
A two-stage hierarchical model for analysis of discrete data with extra-Poisson variation is examined. The model consists of a Poisson distribution with a mixing lognormal distribution for the mean. A method of approximate maximum likelihood estimation of the parameters is proposed. The method uses the EM algorithm, and approximations to facilitate its implementation are derived. Approximate standard errors of the estimates are provided and a numerical example is used to illustrate the method.
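The Poisson-lognormal model described above has no closed-form marginal probability, which is why approximations are needed. The sketch below evaluates the marginal probability P(Y = y) by plain trapezoidal integration over a standard normal variable z, with the Poisson mean written as exp(mu + sigma z). This stands in for the approximations derived in the paper, which the abstract does not describe; the function name, the grid size and the integration range are assumptions.

```python
import math


def poisson_lognormal_pmf(y, mu, sigma, n_grid=2001, z_max=8.0):
    """Marginal P(Y = y) when Y | lam ~ Poisson(lam), log(lam) ~ N(mu, sigma^2).

    Integrates the Poisson probability against the standard normal density
    on [-z_max, z_max] with the trapezoid rule (illustrative choice only).
    """
    h = 2.0 * z_max / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        z = -z_max + i * h
        log_lam = mu + sigma * z
        # Work on the log scale to avoid overflow in lam ** y.
        log_pois = y * log_lam - math.exp(log_lam) - math.lgamma(y + 1)
        log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
        weight = 0.5 if i in (0, n_grid - 1) else 1.0  # trapezoid end points
        total += weight * math.exp(log_pois + log_phi)
    return total * h
```

The marginal mean is exp(mu + sigma^2 / 2) and the marginal variance exceeds the mean whenever sigma > 0, which is the extra-Poisson variation the model is meant to capture.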
This is Part II of a series concerning the PLS kernel algorithm for data sets with many variables and few objects. Here the issues of cross-validation and missing data are investigated. Both partial and full cross-validation are evaluated in terms of predictive residuals and speed and are illustrated on real examples. Two related approaches to the solution of the missing data problem are presented. One is a full EM algorithm and the other is a reduced EM algorithm which applies when the number of missing values is small. The two examples are multivariate calibration data sets. The first consists of UV-visible data measured on mixtures of four metal ions. The second consists of FT-IR measurements on mixtures of four different organic substances.
A one-year birth cohort from Northern Finland has been followed up since 1966. As a part of this study, this paper analyses the progression of myopia (nearsightedness) up to the age of 20 years. The random coefficient regression model was chosen for the analysis because of the large individual variation in the development of myopia. Maximum likelihood estimates for the parameters in the model were obtained via the expectation-maximization (EM) algorithm. It is shown how the estimated model can be used to predict future observations for an individual using the previously recorded refractive error measurements as well as other relevant data on the patient in question.
The problem of estimating the lifetime distribution based on data from independently and identically distributed stationary renewal processes is addressed. The data are incomplete. A nonparametric maximum likelihood estimate of the lifetime distribution is derived using the EM algorithm. The missing information principle is used to estimate the standard error of the estimated distribution. The methodology is applied to a problem in the nursing profession where nurses withdraw from active service for a period of time before returning to take up post at a later date. It is important that nurse manpower planners accurately predict this pattern of return. The data analysed are from the Northern Ireland nursing profession.