Background: Histopathology images of tumor biopsies present unique challenges for applying machine learning to the diagnosis and treatment of cancer. The pathology slides are high resolution, often exceeding 1 GB, have non-uniform dimensions, and often contain multiple tissue slices of varying sizes surrounded by large empty regions. The locations of abnormal or cancerous cells, which may constitute a small portion of any given tissue sample, are not annotated. Cancer image datasets are also extremely imbalanced, with most slides being associated with relatively common cancers. Since deep representations trained on natural photographs are unlikely to be optimal for classifying pathology slide images, which have different spectral ranges and spatial structure, we here describe an approach for learning features and inferring representations of cancer pathology slides based on sparse coding. We show that conventional transfer learning using a state-of-the-art deep learning architecture pre-trained on ImageNet (RESNET) and fine-tuned for a binary tumor/no-tumor classification task achieved between 85% and 86% accuracy. However, when all layers up to the last convolutional layer in RESNET are replaced with a single feature map inferred via sparse coding, using a dictionary optimized for sparse reconstruction of unlabeled pathology slides, classification performance improves to over 93%, corresponding to a 54% error reduction. We conclude that a feature dictionary optimized for biomedical imagery may in general support better classification performance than does conventional transfer learning using a dictionary pre-trained on natural images.
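To make "a feature map inferred via sparse coding" concrete, the following is a minimal sketch of sparse inference with a fixed dictionary, using iterative soft thresholding (ISTA) on an l1-penalized reconstruction objective. The abstract does not say which solver or penalty the authors used, so the algorithm, the tiny hand-made dictionary `atoms`, and the parameters `lam`, `step`, and `n_iter` are all illustrative assumptions.

```python
def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero; small values become exactly zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def objective(atoms, signal, code, lam=0.1):
    """0.5 * squared reconstruction error + lam * l1 norm of the code."""
    recon = [sum(c * atom[i] for c, atom in zip(code, atoms))
             for i in range(len(signal))]
    error = sum((r - s) ** 2 for r, s in zip(recon, signal))
    return 0.5 * error + lam * sum(abs(c) for c in code)


def sparse_code(atoms, signal, lam=0.1, step=0.5, n_iter=300):
    """ISTA: gradient step on the reconstruction error, then soft threshold.

    `step` must not exceed 1 / (largest eigenvalue of the Gram matrix of
    the atoms); for the toy dictionary below that bound is 0.5.
    """
    code = [0.0] * len(atoms)
    for _ in range(n_iter):
        recon = [sum(c * atom[i] for c, atom in zip(code, atoms))
                 for i in range(len(signal))]
        residual = [r - s for r, s in zip(recon, signal)]
        grads = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        code = [soft_threshold(c - step * g, step * lam)
                for c, g in zip(code, grads)]
    return code


# Hypothetical overcomplete dictionary: three unit-norm atoms in 2-D.
atoms = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
signal = [1.2, 1.6]  # exactly twice the third atom
code = sparse_code(atoms, signal)
print(code, objective(atoms, signal, code))
```

In a pipeline like the one described, codes of this kind (computed per image patch against a learned dictionary) would stand in for the convolutional features before the classifier; dictionary learning itself is not shown here.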
Neurobiological studies have shown that neurons in the primary visual cortex (V1) may employ sparse representations to encode stimuli. We describe a network model for sparse coding that comprises an input layer, a base functional layer, and an output layer. We simulated standard sparse coding and sparse coding based on fast independent component analysis (ICA), and compared the time needed to train the bases, the convergence speed of the objective function, and the sparsity of the coefficient matrix. The results show that sparse coding based on fast ICA is more effective than standard sparse coding.
Authors: Lu, Xiaoqiang; Wang, Yulong; Yuan, Yuan
Affiliations: Chinese Acad Sci, Xian Inst Opt & Precis Mech, State Key Lab Transient Opt & Photon, Ctr Opt Imagery Anal & Learning OPTIMAL, Xian 710119, Peoples R China; Hubei Univ, Fac Math & Comp Sci, Wuhan 430062, Peoples R China
Sparse coding is a promising theme in computer vision. Most existing sparse coding methods are based on either the l(0) or the l(1) penalty, which often leads to unstable solutions or biased estimates. This is because the l(0) penalty is nonconvex and discontinuous, while the l(1) penalty over-penalizes the true large coefficients. In this paper, sparse coding is interpreted from a novel Bayesian perspective, which results in a new objective function through maximum a posteriori estimation. The solution of this objective function yields more stable results than the l(0) penalty and smaller reconstruction errors than the l(1) penalty. In addition, the convergence of the proposed sparse coding algorithm is established. Experiments on single image super-resolution and visual tracking demonstrate that the proposed method is more effective than other state-of-the-art methods.
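The contrast between the two penalties can be seen in the scalar case, where both have closed-form minimizers. The sketch below is only an illustration of that textbook case, not the Bayesian method of the paper; the coefficient values and `lam` are made up for the example.

```python
import math


def soft_threshold(z, lam):
    """Minimizer of 0.5*(x - z)**2 + lam*abs(x) (l1 penalty).

    Every surviving coefficient is shrunk by lam, including large ones,
    which is the over-penalization (bias) mentioned in the abstract.
    """
    return math.copysign(max(abs(z) - lam, 0.0), z)


def hard_threshold(z, lam):
    """Minimizer of 0.5*(x - z)**2 + lam*(x != 0) (l0 penalty).

    A coefficient is either kept unchanged or set to zero, so the output
    jumps discontinuously at abs(z) == sqrt(2*lam).
    """
    return z if abs(z) > math.sqrt(2.0 * lam) else 0.0


coeffs = [5.0, -3.0, 0.4, -0.1]
lam = 0.5
soft = [soft_threshold(z, lam) for z in coeffs]
hard = [hard_threshold(z, lam) for z in coeffs]
print("l1:", soft)  # large coefficients are shrunk by lam
print("l0:", hard)  # large coefficients are kept as they are
```

The l1 result biases the large coefficients 5.0 and -3.0 toward zero, while the l0 result leaves them untouched but changes abruptly near the threshold, which is one source of instability.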
In this paper, we propose a novel method based on sparse coding and classifier ensembles for tackling the image categorization problem under the framework of multi-instance learning (MIL). Specifically, a dictionary is learned from the instances of all the training bags. Each instance of a bag is represented as a sparse linear combination of the basis vectors in the dictionary, and the bag is then represented as a single feature vector derived from the sparse representations of all instances within the bag. Thus, the MIL problem is converted to a single-instance learning problem that can be solved by well-known single-instance learning methods, such as support vector machines (SVMs). Two strategies are used to improve classification performance: first, the component classifiers are obtained by repeatedly applying the above method with dictionaries of different sizes; second, the result of the classifier ensemble is used for prediction. Experimental results on the COREL data sets demonstrate the superiority of the proposed method in terms of classification accuracy as compared with state-of-the-art MIL methods. (c) 2012 Elsevier B.V. All rights reserved.
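The step that turns a bag of instances into one feature vector might look like the sketch below. The abstract does not state how the instance codes are combined, so max pooling over absolute coefficients is an assumption here, and the function name `pool_bag` and the example codes are hypothetical.

```python
def pool_bag(instance_codes):
    """Collapse the sparse codes of all instances in a bag into one vector.

    For each dictionary atom, keep the largest absolute coefficient found
    in any instance (max pooling; an assumed choice, not the paper's).
    """
    if not instance_codes:
        raise ValueError("a bag needs at least one instance")
    n_atoms = len(instance_codes[0])
    return [max(abs(code[k]) for code in instance_codes)
            for k in range(n_atoms)]


# Three instances coded against a hypothetical 4-atom dictionary.
bag = [
    [0.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, -0.4, 0.0],
    [0.2, 0.0, 0.0, 0.0],
]
bag_vector = pool_bag(bag)
print(bag_vector)
```

Each bag then contributes a single fixed-length vector, which is what allows a standard single-instance classifier such as an SVM to be trained on bags of different sizes.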
Existing color sampling-based alpha matting methods use the compositing equation to estimate alpha at a pixel from pairs of foreground (F) and background (B) samples. The quality of the matte depends on the selected (F, B) pairs. In this paper, the matting problem is reinterpreted as a sparse coding of pixel features, wherein the sum of the codes gives the estimate of the alpha matte from a set of unpaired F and B samples. A non-parametric probabilistic segmentation provides a certainty measure on the pixel belonging to foreground or background, based on which a dictionary is formed for use in sparse coding. By removing the restriction to conform to (F, B) pairs, this method allows for better alpha estimation from multiple F and B samples. The same framework is extended to videos, where the requirement of temporal coherence is handled effectively. Here, the dictionary is formed by samples from multiple frames. A multi-frame graph model, as opposed to a single image as for image matting, is proposed that can be solved efficiently in closed form. Quantitative and qualitative evaluations on a benchmark dataset are provided to show that the proposed method outperforms the current state of the art in image and video matting.
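For context, the pair-based estimate that the abstract attributes to existing methods follows from the compositing equation I = alpha * F + (1 - alpha) * B: projecting I - B onto F - B gives alpha. The sketch below shows only that baseline, not the proposed sparse coding formulation; the colors are made-up RGB triples.

```python
def estimate_alpha(pixel, fg, bg):
    """Least-squares alpha for one (F, B) pair under I = a*F + (1 - a)*B."""
    diff_fb = [f - b for f, b in zip(fg, bg)]
    denom = sum(d * d for d in diff_fb)
    if denom == 0.0:
        # F and B are identical, so alpha cannot be determined from color.
        raise ValueError("foreground and background samples coincide")
    alpha = sum((p - b) * d for p, b, d in zip(pixel, bg, diff_fb)) / denom
    return min(1.0, max(0.0, alpha))  # clamp to the valid range


fg = (1.0, 0.0, 0.0)       # hypothetical foreground sample (red)
bg = (0.0, 0.0, 1.0)       # hypothetical background sample (blue)
pixel = (0.25, 0.0, 0.75)  # 25% foreground, 75% background
alpha = estimate_alpha(pixel, fg, bg)
print(alpha)
```

Because the estimate depends entirely on which F and B are paired, a poor pair gives a poor alpha, which is the limitation the abstract's unpaired formulation is meant to remove.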
An important approach in visual neuroscience considers how the function of the early visual system relates to the statistics of its natural input. Previous work has shown how the classical receptive fields and the organization (topography) of the primary visual cortex can be viewed as efficient coding of natural images. Here, we extend the framework by considering how the responses of complex cells could be efficiently coded by a higher-order neural layer. This leads to the sparse coding of contours in natural images, and can explain certain extra-classical properties of receptive fields. (C) 2002 Elsevier Science B.V. All rights reserved.
The task of matching observations of the same person in disjoint views captured by non-overlapping cameras is known as the person re-identification problem. It is challenging owing to low-quality images, inter-object occlusions, and variations in illumination, viewpoint, and pose. Unlike previous approaches that learn Mahalanobis-like distance metrics, we propose a novel approach based on dictionary learning that takes advantage of sparse coding to encode features representing different people in a discriminative and cross-view invariant way. First, we propose a robust and discriminative feature extraction method that operates at different feature levels. The feature representations are projected to a common subspace with lower computational cost. Second, we learn a single cross-view invariant dictionary for each feature level across the different camera views, and a fusion strategy is used to generate the final matching results. Experiments show the superior performance of our approach compared with state-of-the-art methods on two publicly available benchmark datasets, VIPeR and PRID 2011.
In this paper, we examine the problem of learning sparse representations of visual patterns in the context of artificial and biological vision systems. There are a myriad of strategies for sparse coding that often result in similar properties for the learned feature set. Typically this results in a bank of Gabor-like or edge filters that are sensitive to a range of distinct angular and radial frequencies. The theory and experiments presented in this paper provide a better understanding of a number of specific properties related to low-level feature learning. These include a close examination of the role of phase pairing in complex cells, the role of depth information and its relationship to variation of intensity and chroma, and the derivation of hybrid features that borrow from both analytic forms and statistical methods. Together, these specific examples provide context for a more general discussion of effective strategies for feature learning. In particular, we make the case that imposing additional constraints, inspired by biological vision systems, on mechanisms for feature learning can be useful in guiding constrained optimization towards convergence, or towards specific desirable computational properties for the representation of visual input in artificial vision systems. (C) 2015 Elsevier B.V. All rights reserved.
Computer animation researchers have been extensively investigating 3D facial-expression synthesis for decades. However, flexible, robust production of realistic 3D facial expressions is still technically challenging. A proposed modeling framework applies sparse coding to synthesize 3D expressive faces, using specified coefficients or expression examples. It also robustly recovers facial expressions from noisy and incomplete data. This approach can synthesize higher-quality expressions in less time than the state-of-the-art techniques.
We use shift-invariant Non-negative Matrix Factorization (NMF) for decomposing continuous-valued time series into a number of characteristic primitives, i.e. the basis vectors, and their activations, which results in a model-independent and fully data-driven parts-based representation. We interpret the basis vectors as short parts of motion that are shared between all trajectories in the data set, and the activations as onset times of those parts. Extending the shift-invariant NMF with a new competition term between adjacent activations makes it possible to obtain temporally isolated activation events, which further supports this interpretation. We show that the resulting sparse and compact representation can be used for the prediction of motion trajectories, and that it can be beneficial for classification, because it allows the application of simple standard classification models with few parameters. In this paper we show that basis vectors can be extracted that can be interpreted as short motion segments. We present results on trajectory prediction, and show that the sparse representation can be used for classification of trajectories of a single joint, such as that of a hand, obtained by motion capture. (C) 2013 Elsevier B.V. All rights reserved.
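The roles of basis vectors and activations described above can be illustrated with the synthesis side of a shift-invariant model: each basis vector is placed at every onset time, scaled by its activation, and the results are summed. This sketch shows only that reconstruction step under assumed toy values; the factorization itself and the competition term from the paper are not implemented.

```python
def reconstruct(bases, activations, length):
    """Sum of shifted, scaled basis vectors (one-dimensional signal).

    bases[k] is a short non-negative motion primitive; activations[k][t]
    is the strength with which primitive k starts at time step t.
    """
    signal = [0.0] * length
    for basis, activation in zip(bases, activations):
        for onset, strength in enumerate(activation):
            if strength == 0.0:
                continue  # sparse activations: most onsets are inactive
            for offset, value in enumerate(basis):
                t = onset + offset
                if t < length:
                    signal[t] += strength * value
    return signal


# Two hypothetical primitives and their sparse onset activations.
bases = [[1.0, 2.0, 1.0], [3.0, 0.0]]
activations = [
    [1.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
signal = reconstruct(bases, activations, length=8)
print(signal)
```

With isolated activation events like these, each nonzero entry in `activations` can be read directly as "primitive k starts here", which is the interpretation the abstract relies on.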