Search results - Inner Mongolia University Library

Model-based classification via mixtures of multivariate t-distributions

COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2011, vol. 55, no. 1, pp. 520-529

Authors: Andrews, Jeffrey L.; McNicholas, Paul D.; Subedi, Sanjeena. Affiliation: Univ Guelph, Dept Math & Stat, Guelph, ON N1G 2W1, Canada

A novel model-based classification technique is introduced based on mixtures of multivariate t-distributions. A family of four mixture models is defined by constraining, or not, the covariance matrices and the degrees of freedom to be equal across mixture components. Parameters for each of the resulting four models are estimated using a multicycle expectation-conditional maximization algorithm, where convergence is determined using a criterion based on the Aitken acceleration. A straightforward, but very effective, technique for the initialization of the unknown component memberships is introduced and compared with a popular, more sophisticated, initialization procedure. This novel four-member family is applied to real and simulated data, where it gives good classification performance, even when compared with more established techniques. (C) 2010 Elsevier B.V. All rights reserved.

Keywords: classification; Food authenticity; Mixture models; model-based classification; Multivariate t-distributions
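The abstract above mentions a convergence criterion based on the Aitken acceleration. As a minimal sketch, assuming the commonly used form in which three successive log-likelihood values give an estimate of the asymptotic log-likelihood, the idea can be illustrated as follows. The function names, the tolerance, and the exact stopping rule are assumptions made for illustration; the abstract does not state which variant the authors use.

```python
def aitken_asymptote(l_prev, l_curr, l_next):
    """Estimate the limiting log-likelihood from three successive values."""
    # Aitken acceleration factor: ratio of successive increments.
    a = (l_next - l_curr) / (l_curr - l_prev)
    # Asymptotic estimate under (approximately) linear convergence.
    return l_curr + (l_next - l_curr) / (1.0 - a)


def aitken_converged(loglik, tol=1e-5):
    """Return True when the asymptotic estimate is within tol of the latest value.

    loglik is a list of log-likelihood values, one per iteration.
    The stopping rule used here is one of several variants (an assumption).
    """
    if len(loglik) < 3:
        return False
    l_prev, l_curr, l_next = loglik[-3:]
    if l_curr == l_prev:
        return True  # no change between successive iterations
    a = (l_next - l_curr) / (l_curr - l_prev)
    if a >= 1.0:
        return False  # increments are not shrinking yet
    l_inf = aitken_asymptote(l_prev, l_curr, l_next)
    return abs(l_inf - l_next) < tol
```

For a sequence whose increments halve at each step, such as -60, -55, -52.5, the estimate recovers the limit -50 while the raw values are still far from it, which is why the criterion is less prone to stopping too early than a plain lack-of-progress test.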

Model-based classification of radar images

IEEE TRANSACTIONS ON INFORMATION THEORY, 2000, vol. 46, no. 5, pp. 1842-1854

Authors: Chiang, HC; Moses, RL; Potter, LC. Affiliation: Ohio State Univ, Dept Elect Engn, Columbus, OH 43210, USA

A Bayesian approach is presented for model-based classification of images with application to synthetic-aperture radar. Posterior probabilities are computed for candidate hypotheses using physical features estimated from sensor data along with features predicted from these hypotheses. The likelihood scoring allows propagation of uncertainty arising in both the sensor data and object models. The Bayesian classification, including the determination of a correspondence between unordered random features, is shown to be tractable, yielding a classification algorithm, a method for estimating error rates, and a tool for evaluating performance sensitivity. The radar image features used for classification are point locations with an associated vector of physical attributes; the attributed features are adopted from a parametric model of high-frequency radar scattering. With the emergence of wideband sensor technology, these physical features expand interpretation of radar imagery to access the frequency- and aspect-dependent scattering information carried in the image phase.

Keywords: model-based classification; parametric modeling; point correspondence; radar image analysis

Model-based classification via Mixtures of Multivariate t-Factor Analyzers

COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2012, vol. 41, no. 4, pp. 510-523

Authors: Steane, Michelle A.; McNicholas, Paul D.; Yada, Rickey Y. Affiliations: Univ Guelph, Dept Math & Stat, Guelph, ON N1G 2W1, Canada; Univ Guelph, Dept Food Sci, Guelph, ON N1G 2W1, Canada

A model-based classification technique is developed, based on mixtures of multivariate t-factor analyzers. Specifically, two related mixture models are developed and their classification efficacy studied. An AECM algorithm is used for parameter estimation, and convergence of these algorithms is determined using Aitken's acceleration. Two different techniques are proposed for model selection: the BIC and the ICL. Our classification technique is applied to data on red wine samples from Italy and to fatty acid measurements on Italian olive oils. These results are discussed and compared to more established classification techniques; under this comparison, our mixture models give excellent classification performance.

Keywords: Mixture models; model-based classification; Multivariate t-distribution; t-Factor analyzers
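The abstract above names the BIC and the ICL as model selection criteria. A minimal sketch of both follows, assuming the "larger is better" convention often used in the model-based clustering literature (BIC as twice the maximized log-likelihood minus a penalty, and ICL as BIC plus an entropy-type term computed from the estimated membership probabilities). The function names and the convention are assumptions for illustration; the abstract does not give the formulas.

```python
import math


def bic(loglik, n_params, n_obs):
    """BIC in the larger-is-better convention: 2*loglik - n_params*log(n_obs)."""
    return 2.0 * loglik - n_params * math.log(n_obs)


def icl(bic_value, z):
    """Approximate ICL: BIC penalized by the uncertainty of the classification.

    z is a list of rows, one per observation, each holding the estimated
    probabilities of membership in every component (each row sums to 1).
    Only the largest (MAP) probability in each row contributes.
    """
    penalty = 0.0
    for row in z:
        z_map = max(row)              # probability of the MAP component
        penalty += math.log(z_map)    # zero when the assignment is certain
    return bic_value + 2.0 * penalty
```

When every observation is assigned with certainty the two criteria coincide; the more the components overlap, the further the ICL falls below the BIC, so the ICL tends to favour models with well-separated groups.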

Model-based classification using latent Gaussian mixture models

JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2010, vol. 140, no. 5, pp. 1175-1181

Author: McNicholas, Paul D. Affiliation: Univ Guelph, Dept Math & Stat, Guelph, ON N1G 2W1, Canada

A novel model-based classification technique is introduced based on parsimonious Gaussian mixture models (PGMMs). PGMMs, which were introduced recently as a model-based clustering technique, arise from a generalization of the mixtures of factor analyzers model and are based on a latent Gaussian mixture model. In this paper, this mixture modelling structure is used for model-based classification and the particular area of application is food authenticity. Model-based classification is performed by jointly modelling data with known and unknown group memberships within a likelihood framework and then estimating parameters, including the unknown group memberships, within an alternating expectation-conditional maximization framework. Model selection is carried out using the Bayesian information criterion and the quality of the maximum a posteriori classifications is summarized using the misclassification rate and the adjusted Rand index. This new model-based classification technique gives excellent classification performance when applied to real food authenticity data on the chemical properties of olive oils from nine areas of Italy. (C) 2009 Elsevier B.V. All rights reserved.

Keywords: classification; Factor analysis; Food authenticity; Mixture models; model-based classification; model-based clustering; Parsimonious Gaussian mixture models (PGMMs)
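The abstract above summarizes classification quality with the misclassification rate and the adjusted Rand index. As a rough illustration of how these two summaries are computed from a vector of true labels and a vector of predicted labels, here is a sketch using only the standard library. The function names are hypothetical, and the misclassification rate assumes the predicted labels use the same coding as the true labels, as they do in a classification setting.

```python
from collections import Counter
from math import comb


def misclassification_rate(true_labels, predicted_labels):
    """Fraction of observations whose predicted label differs from the truth."""
    wrong = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return wrong / len(true_labels)


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same observations."""
    n = len(labels_a)
    if n < 2:
        return 1.0  # a single observation admits only one partition
    # Pair counts from the contingency table and from its margins.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_rows * sum_cols / comb(n, 2)   # agreement expected by chance
    maximum = 0.5 * (sum_rows + sum_cols)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions have one group
    return (sum_cells - expected) / (maximum - expected)
```

Unlike the misclassification rate, the adjusted Rand index does not depend on how the groups are numbered: swapping the label names leaves it unchanged, and a value of 1 indicates perfect agreement while values near 0 indicate agreement no better than chance.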

A robust approach to model-based classification based on trimming and constraints: Semi-supervised learning in presence of outliers and label noise

ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2020, vol. 14, no. 2, pp. 327-354

Authors: Cappozzo, Andrea; Greselin, Francesca; Murphy, Thomas Brendan. Affiliations: Univ Milano Bicocca, Dept Stat & Quantitat Methods, Milan, Italy; Univ Coll Dublin, Sch Math & Stat, Dublin, Ireland; Univ Coll Dublin, Insight Res Ctr, Dublin, Ireland

In a standard classification framework a set of trustworthy learning data are employed to build a decision rule, with the final aim of classifying unlabelled units belonging to the test set. Therefore, unreliable labelled observations, namely outliers and data with incorrect labels, can strongly undermine the classifier performance, especially if the training size is small. The present work introduces a robust modification to the model-based classification framework, employing impartial trimming and constraints on the ratio between the maximum and the minimum eigenvalue of the group scatter matrices. The proposed method effectively handles noise presence in both response and exploratory variables, providing reliable classification even when dealing with contaminated datasets. A robust information criterion is proposed for model selection. Experiments on real and simulated data, artificially adulterated, are provided to underline the benefits of the proposed method.

Keywords: model-based classification; Label noise; Outliers detection; Impartial trimming; Eigenvalues restrictions; Robust estimation

A comparison of model-based and regression classification techniques applied to near infrared spectroscopic data in food authentication studies

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2007, vol. 89, no. 2, pp. 102-115

Authors: Toher, Deirdre; Downey, Gerard; Murphy, Thomas Brendan. Affiliations: Trinity Coll Dublin, Sch Comp Sci & Stat, Dept Stat, Dublin 2, Ireland; Ashtown Food Res Ctr, Dublin 15, Ireland

Classification methods can be used to classify samples of unknown type into known types. Many classification methods have been proposed in the chemometrics, statistical and computer science literature. Model-based classification methods have been developed from a statistical modelling viewpoint. This approach allows for uncertainty in the classification procedure to be quantified using probabilities. Linear discriminant analysis and quadratic discriminant analysis are particular model-based classification methods. Partial least squares discriminant analysis is commonly used in food authentication studies based on spectroscopic data. This method uses partial least squares regression with a binary outcome variable for two-group classification problems. In this paper, model-based classification is compared to partial least squares discriminant analysis for its ability to correctly classify pure and adulterated honey samples when the honey has been extended by three different adulterants. Two model selection criteria are examined: the Bayesian Information Criterion and 5-fold cross validation. The methods are compared using the classification performance and the interpretability of the results. In addition, since the percentage of adulterated samples in any given sample set is unlikely to be known in a real-life setting, the ability of updating procedures within model-based clustering to accurately predict the adulterated samples, even when the proportion of pure to adulterated samples in the training data is grossly unrepresentative of the true situation, is studied in detail. The performance of both model-based and partial least squares discriminant analysis is found to be robust to the composition of the training data and to model selection method. The Bayesian Information Criterion is shown to be more robust than 5-fold cross validation as a model selection method, especially when the training data set is very small and unrepresentative of the entire data set. (c) 2

Keywords: food authenticity; NIR spectroscopy; classification; model-based classification; partial least squares regression

Model-Based Clustering and Classification Using Mixtures of Multivariate Skewed Power Exponential Distributions

JOURNAL OF CLASSIFICATION, 2023, vol. 40, no. 1, pp. 145-167

Authors: Dang, Utkarsh J.; Gallaugher, Michael P. B.; Browne, Ryan P.; McNicholas, Paul D. Affiliations: Carleton Univ, Dept Hlth Sci, Ottawa, ON, Canada; Baylor Univ, Dept Stat Sci, Waco, TX, USA; Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON, Canada; McMaster Univ, Dept Math & Stat, Hamilton, ON, Canada

Families of mixtures of multivariate power exponential (MPE) distributions have already been introduced and shown to be competitive for cluster analysis in comparison to other mixtures of elliptical distributions, including mixtures of Gaussian distributions. A family of mixtures of multivariate skewed power exponential distributions is proposed that combines the flexibility of the MPE distribution with the ability to model skewness. These mixtures are more robust to variations from normality and can account for skewness, varying tail weight, and peakedness of data. A generalized expectation-maximization approach, which combines minorization-maximization and optimization based on accelerated line search algorithms on the Stiefel manifold, is used for parameter estimation. These mixtures are implemented both in the unsupervised and semi-supervised classification frameworks. Both simulated and real data are used for illustration and comparison to other mixture families.

Keywords: Generalized expectation-maximization algorithm; Mixture models; model-based classification; model-based clustering; Multivariate skewed power exponential distribution

Stacking Model-Based Classifiers for Dealing With Multiple Sets of Noisy Labels

BIOMETRICAL JOURNAL, 2025, vol. 67, no. 2, e70042

Authors: Montani, Giulia; Cappozzo, Andrea. Affiliations: Data Reply srl, Turin, Italy; Univ Cattolica Sacro Cuore, Dept Stat Sci, Milan, Italy

Supervised learning in presence of multiple sets of noisy labels is a challenging task that is receiving increasing interest in the ever-evolving landscape of healthcare analytics. Such an issue arises when multiple annotators are tasked to manually label the same training samples, potentially giving rise to discrepancies in class assignments among the supplied labels with respect to the ground truth. Commonly, the labeling process is entrusted to a small group of domain experts, and different levels of experience and subjectivity may result in noisy training labels. To solve the classification task leveraging the availability of multiple data annotators, we introduce a novel ensemble methodology constructed by combining model-based classifiers separately trained on single sets of noisy labels. Eigenvalue Decomposition Discriminant Analysis is employed for the definition of the base learners, and six distinct averaging strategies are proposed to combine them. Two solutions necessitate a priori information, such as the partial knowledge of the ground truth labels or the annotators' level of expertise. In contrast, the remaining four approaches are entirely data-driven. A simulation study and an application on real data showcase the improved predictive performance of our proposal, while also demonstrating the ability to automatically infer annotators' expertise level as a by-product of the learning process.

Keywords: ensemble models; label noise; model-based classification; multiple labels; supervised learning

Improving model choice in classification: an approach based on clustering of covariance matrices

STATISTICS AND COMPUTING, 2024, vol. 34, no. 3, p. 100

Authors: Rodriguez-Vitores, David; Matran, Carlos. Affiliations: Univ Valladolid, Dept Stat & Operat Res, Paseo Belen 7, Valladolid 47011, Spain; Univ Valladolid, IMUVA, Paseo Belen 7, Valladolid 47011, Spain

This work introduces a refinement of the Parsimonious model for fitting a Gaussian Mixture. The improvement is based on the consideration of clusters of the involved covariance matrices according to a criterion, such as sharing Principal Directions. This and other similarity criteria that arise from the spectral decomposition of a matrix are the bases of the Parsimonious model. We show that such groupings of covariance matrices can be achieved through simple modifications of the CEM (Classification Expectation Maximization) algorithm. Our approach leads us to propose Gaussian Mixture models for model-based clustering and discriminant analysis, in which covariance matrices are clustered according to a parsimonious criterion, creating intermediate steps between the fourteen widely known parsimonious models. The added versatility not only allows us to obtain models with fewer parameters for fitting the data, but also provides greater interpretability. We show its usefulness for model-based clustering and discriminant analysis, providing algorithms to find approximate solutions verifying suitable size, shape and orientation constraints, and applying them to both simulation and real data examples.

Keywords: Parsimonious model; Gaussian mixture model; Bayesian information criterion; model-based classification; EM algorithm

Gaussian mixture models for the classification of high-dimensional vibrational spectroscopy data

JOURNAL OF CHEMOMETRICS, 2010, vol. 24, no. 11-12, pp. 719-727

Authors: Jacques, Julien; Bouveyron, Charles; Girard, Stephane; Devos, Olivier; Duponchel, Ludovic; Ruckebusch, Cyril. Affiliations: Univ Lille 1, LASIR, CNRS UMR 8516, F-59655 Villeneuve Dascq, France; Univ Lille 1, Lab Paul Painleve, CNRS UMR 8524, F-59655 Villeneuve Dascq, France; Univ Paris 01, Lab SAMM, F-75231 Paris 05, France; INRIA Rhone Alpes, MISTIS, Grenoble, France; Lab Jean Kuntzmann, Grenoble, France

In this work, a family of generative Gaussian models designed for the supervised classification of high-dimensional data is presented as well as the associated classification method called High-Dimensional Discriminant Analysis (HDDA). The features of these Gaussian models are as follows: i) the representation of the input density model is smooth; ii) the data of each class are modeled in a specific subspace of low dimensionality; iii) each class may have its own covariance structure; iv) model regularization is coupled to the classification criterion to avoid data over-fitting. To illustrate the abilities of the method, HDDA is applied on complex high-dimensional multi-class classification problems in mid-infrared and near-infrared spectroscopy and compared to state-of-the-art methods. Copyright (C) 2010 John Wiley & Sons, Ltd.

Keywords: model-based classification; high-dimensional gaussian model; generative model; vibrational spectroscopy
