The log-linear modelling capabilities of existing statistical software permit a generalization of Holt's indirect estimation method by allowing several different selection curve models to be fitted to catch data from an arbitrary number of mesh sizes. This also facilitates the use of formal inferential procedures such as model selection, assessment of relative fishing powers, estimation of standard errors, and permits inclusion of knowledge regarding the shape of the population length distribution. This methodology is equally applicable to estimation of hook selectivity. (C) 1997 International Council for the Exploration of the Sea.
Low expected frequencies in tests associated with log-linear model building are treated with the aim of providing a methodology, useful for nonstatistician users, to analyse multivariate contingency tables. A procedure is provided that reproduces the decisions of a statistical analyst who is studying a multivariate contingency table and is confronted with low expected frequencies. It uses the Bayesian information criterion to select the variable over which aggregation should be done, and Shannon entropy to decide which categories should be aggregated. Prior opinions and knowledge about the feasibility of aggregating categories, within the context in which the data were collected, are included in the system. The procedure has some user-friendly techniques oriented to nonstatisticians, and it allows greater efficiency when several multivariate tables are to be analysed using variables that can be included in different log-linear models. (C) 1999 Elsevier Science B.V. All rights reserved.
One of the major objections to the standard multiple-recapture approach to population estimation is the assumption of homogeneity of individual 'capture' probabilities. Modelling individual capture heterogeneity is complicated by the fact that it shows up as a restricted form of interaction among lists in the contingency table cross-classifying list memberships for all individuals. Traditional log-linear modelling approaches to capture-recapture problems are well suited to modelling interactions among lists but ignore the special dependence structure that individual heterogeneity induces. A random-effects approach, based on the Rasch model from educational testing and introduced in this context by Darroch and co-workers and Agresti, provides one way to introduce the dependence resulting from heterogeneity into the log-linear model; however, previous efforts to combine the Rasch-like heterogeneity terms additively with the usual log-linear interaction terms suggest that a more flexible approach is required. In this paper we consider both classical multilevel approaches and fully Bayesian hierarchical approaches to modelling individual heterogeneity and list interactions. Our framework encompasses both the traditional log-linear approach and various elements from the full Rasch model. We compare these approaches on two examples, the first arising from an epidemiological study of a population of diabetics in Italy, and the second a study intended to assess the 'size' of the World Wide Web. We also explore extensions allowing for interactions between the Rasch and log-linear portions of the models in both the classical and the Bayesian contexts.
In 1960, Cohen introduced the kappa coefficient to measure chance-corrected nominal scale agreement between two raters. Since then, numerous extensions and generalizations of this inter-rater agreement measure have been proposed in the literature. This paper reviews and critiques various approaches to the study of interrater agreement, for which the relevant data comprise either nominal or ordinal categorical ratings from multiple raters. It presents a comprehensive compilation of the main statistical approaches to this problem, descriptions and characterizations of the underlying models, and discussions of related statistical methodologies for estimation and confidence-interval construction. The emphasis is on various practical scenarios and designs that underlie the development of these measures, and the interrelationships between them.
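As a small illustration of the two-rater case that this review starts from, the sketch below computes Cohen's kappa from a square table of counts in which cell (i, j) is the number of subjects placed in category i by the first rater and category j by the second. The function name and the example table are illustrative assumptions and are not taken from the paper; the multi-rater and ordinal extensions reviewed there are not covered.

```python
from typing import Sequence


def cohen_kappa(table: Sequence[Sequence[float]]) -> float:
    """Chance-corrected agreement for two raters from a k x k count table.

    table[i][j] = number of subjects rated i by rater A and j by rater B.
    (Hypothetical helper for illustration only.)
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of subjects on the diagonal.
    p_obs = sum(table[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the two raters' marginal totals.
    row_tot = [sum(table[i]) for i in range(k)]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    return (p_obs - p_exp) / (1.0 - p_exp)


if __name__ == "__main__":
    # Made-up ratings of 50 subjects into two categories.
    print(cohen_kappa([[20, 5], [10, 15]]))
```

Kappa is 1 under perfect agreement and 0 when the observed agreement equals what the marginals alone would produce.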
This article first illustrates the use of mosaic displays for the analysis of multiway contingency tables. We then introduce several extensions of mosaic displays designed to integrate graphical methods for categorical data with those used for quantitative data. The scatterplot matrix shows all pairwise (bivariate marginal) views of a set of variables in a coherent display. One analog for categorical data is a matrix of mosaic displays showing some aspect of the bivariate relation between all pairs of variables. The simplest case shows the bivariate marginal relation for each pair of variables. Another case shows the conditional relation between each pair, with all other variables partialled out. For quantitative data this represents (a) a visualization of the conditional independence relations studied by graphical models, and (b) a generalization of partial residual plots. The conditioning plot, or coplot, shows a collection of partial views of several quantitative variables, conditioned by the values of one or more other variables. A direct analog of the coplot for categorical data is an array of mosaic plots of the dependence among two or more variables, stratified by the values of one or more given variables. Each such panel then shows the partial associations among the foreground variables; the collection of such plots shows how these associations change as the given variables vary.
Capture-recapture methods are used to estimate the incidence of a disease, using a multiple-source registry. Usually, log-linear methods are used to estimate population size, assuming that not all sources of notification are dependent. Where there are categorical covariates, a stratified analysis can be performed. The multinomial logit model has occasionally been used. In this paper, the authors compare log-linear and logit models with and without covariates, and use simulated data to compare estimates from different models. The crude estimate of population size is biased when the sources are not independent. Analyses adjusting for covariates produce less biased estimates. In the absence of covariates, or where all covariates are categorical, the log-linear model and the logit model are equivalent. The log-linear model cannot include continuous variables. To minimize potential bias in estimating incidence, covariates should be included in the design and analysis of multiple-source disease registries.
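To make the "crude estimate" concrete, here is a minimal sketch of the simplest case only: two sources, no covariates, and an independence log-linear model, under which the count of cases missed by both sources is estimated as n10 * n01 / n11. The function name and the example counts are assumptions made for illustration; the paper's multi-source, covariate-adjusted log-linear and logit models are not reproduced here.

```python
from typing import Tuple


def two_source_estimate(n11: float, n10: float, n01: float) -> Tuple[float, float]:
    """Two-source capture-recapture under the independence log-linear model.

    n11: cases notified by both sources
    n10: cases notified by source 1 only
    n01: cases notified by source 2 only
    Returns (estimated cases missed by both sources, estimated total).
    (Hypothetical helper for illustration only.)
    """
    if n11 <= 0:
        raise ValueError("the overlap n11 must be positive")
    # Independence implies an odds ratio of 1: n11 * n00 = n10 * n01.
    n00 = n10 * n01 / n11
    return n00, n11 + n10 + n01 + n00


if __name__ == "__main__":
    missed, total = two_source_estimate(n11=30, n10=20, n01=10)
    print(f"missed: {missed:.2f}, total: {total:.2f}")
```

If the two sources are positively dependent, the overlap n11 is inflated and this estimate is too small, which is the bias the abstract refers to.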
We apply some log-linear modelling methods, which have been proposed for treating non-ignorable non-response, to some data on voting intention from the British General Election Survey. We find that, although some non-ignorable non-response models fit the data very well, they may generate implausible point estimates and predictions. Some explanation is provided for the extreme behaviour of the maximum likelihood estimates for the most parsimonious model. We conclude that point estimates for such models must be treated with great caution. To allow for the uncertainty about the non-response mechanism we explore the use of profile likelihood inference and find the likelihood surfaces to be very flat and the interval estimates to be very wide. To reduce the width of these intervals we propose constraining confidence regions to values where the parameters governing the non-response mechanism are plausible and study the effect of such constraints on inference. We find that the widths of these intervals are reduced but remain wide.
Background: The measure of efficacy is optimally performed by randomized controlled trials. However, low specificity of the judgement criteria is known to bias toward lower estimation, while low sensitivity increases the required sample size. A common technique for ensuring good specificity without a drop in sensitivity is to use several diagnostic tests in parallel, with each of them being specific. This approach is similar to the more general situation of case-counting from multiple data sources, and this paper explores the application of the capture-recapture method for the analysis of the estimates of efficacy. Method: An illustration of this application is derived from a study on the efficacy of pertussis vaccines where the outcome was based on ≥21 days of cough confirmed by at least one of three criteria performed independently for each subject: bacteriology, serology, or epidemiological link. Log-linear methods were applied to these data considered as three sources of information. Results: The best model considered the three simple effects and an interaction term between bacteriology and epidemiological linkage. Among the 801 children experiencing ≥21 days of cough, it was estimated that 93 cases were missed, leading to a corrected total of 413 confirmed cases. The relative vaccine efficacy estimated from the same model was 1.50 (95% confidence interval: 1.24-1.82), similar to the crude estimate of 1.59 and confirming better protection afforded by one of the two vaccines. Conclusion: This method allows supporting analysis to interpret primary estimates of vaccine efficacy.
Discrete-time discrete-state Markov chain models can be used to describe individual change in categorical variables. But when the observed states are subject to measurement error, the observed transitions between two points in time will be partially spurious. Latent Markov models make it possible to separate true change from measurement error. The standard latent Markov model is, however, rather limited when the aim is to explain individual differences in the probability of occupying a particular state at a particular point in time. This paper presents a flexible logit regression approach which allows the latent states occupied at the various points in time to be regressed on both time-constant and time-varying covariates. The regression approach combines features of causal log-linear models and latent class models with explanatory variables. In an application, pupils' interest in physics at different points in time is explained by the time-constant covariate sex and the time-varying covariate physics grade. Results for both the complete and the partially observed data are presented.
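The point about spurious transitions can be checked numerically. The sketch below assumes a two-occasion latent Markov chain with the same response (misclassification) probabilities at both occasions, and computes the probability that the observed category differs between occasions. All names and numbers are illustrative assumptions, and the covariate regression proposed in the paper is not implemented.

```python
from typing import Sequence


def apparent_change(
    initial: Sequence[float],
    transition: Sequence[Sequence[float]],
    response: Sequence[Sequence[float]],
) -> float:
    """Probability that the observed state differs between two occasions.

    initial[s]       = P(latent state s at time 1)
    transition[s][t] = P(latent state t at time 2 | latent state s at time 1)
    response[s][o]   = P(observed category o | latent state s), assumed
                       identical at both occasions (an illustrative assumption)
    """
    n_states = len(initial)
    n_cats = len(response[0])
    p_same = 0.0
    for s in range(n_states):
        for t in range(n_states):
            p_path = initial[s] * transition[s][t]
            for o in range(n_cats):
                # Same observed category o at both occasions.
                p_same += p_path * response[s][o] * response[t][o]
    return 1.0 - p_same


if __name__ == "__main__":
    no_true_change = [[1.0, 0.0], [0.0, 1.0]]
    ten_percent_error = [[0.9, 0.1], [0.1, 0.9]]
    print(apparent_change([0.5, 0.5], no_true_change, ten_percent_error))
```

With an identity transition matrix nobody truly changes state, yet a 10% misclassification rate alone makes about 18% of individuals appear to change.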
Prevalence estimates for psychiatric disturbance in mothers and their 2½-year-old children, and the variations in prevalence associated with marriage quality, social class and the child's developmental level, are presented. It was found that both the mother's psychiatric disturbance and, more specifically, depression were associated most strongly with child disturbance, poor marriage quality and low child developmental level. For toddler disturbance, the strong associations were with type and severity of disturbance in the mother, social class, marriage quality and low developmental quotient.