Search Results - Inner Mongolia University Library

Uncertain LDA: Including Observation Uncertainties in Discriminative Transforms

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, vol. 38, no. 7, pp. 1479-88

Authors: Saeidi, Rahim; Astudillo, Ramon Fernandez; Kolossa, Dorothea (Aalto Univ, Dept Signal Proc & Acoust, Espoo, Uusimaa, Finland; INESC ID, Spoken Language Syst Lab, Lisbon, Portugal; Ruhr Univ Bochum, Inst Commun Acoust, Univ Str 150, Bochum, NRW, Germany)

Linear discriminant analysis (LDA) is a powerful technique in pattern recognition to reduce the dimensionality of data vectors. It maximizes discriminability by retaining only those directions that minimize the ratio of within-class to between-class variance. In this paper, using the same principles as for conventional LDA, we propose to employ uncertainties of the noisy or distorted input data in order to estimate maximally discriminant directions. We demonstrate the efficiency of the proposed uncertain LDA on two applications using state-of-the-art techniques. First, we experiment with an automatic speech recognition task, in which the uncertainty of observations is imposed by real-world additive noise. Next, we examine a full-scale speaker recognition system, considering the utterance duration as the source of uncertainty in authenticating a speaker. The experimental results show that when employing an appropriate uncertainty estimation algorithm, uncertain LDA outperforms its conventional LDA counterpart.

Keywords: Uncertainty; linear discriminant analysis; LDA; speaker recognition; speech recognition
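
The conventional LDA criterion summarized in the abstract above (keep the directions with the smallest ratio of within-class to between-class variance) can be sketched for the two-class case as follows. This is a minimal illustration of standard Fisher LDA only, not the uncertain LDA the paper proposes; the helper names (`mean_vector`, `within_class_scatter`, `solve`, `fisher_direction`) and the toy data are assumptions made for this example.

```python
import random


def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def within_class_scatter(classes):
    """Sum over classes of (x - class_mean)(x - class_mean)^T."""
    dim = len(classes[0][0])
    scatter = [[0.0] * dim for _ in range(dim)]
    for rows in classes:
        mu = mean_vector(rows)
        for x in rows:
            d = [xi - mi for xi, mi in zip(x, mu)]
            for i in range(dim):
                for j in range(dim):
                    scatter[i][j] += d[i] * d[j]
    return scatter


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (M[i][n] - sum(M[i][j] * w[j] for j in range(i + 1, n))) / M[i][i]
    return w


def fisher_direction(class_a, class_b):
    """Unit-length two-class LDA direction, proportional to S_w^{-1} (mean_b - mean_a)."""
    s_w = within_class_scatter([class_a, class_b])
    diff = [b - a for a, b in zip(mean_vector(class_a), mean_vector(class_b))]
    w = solve(s_w, diff)
    norm = sum(wi * wi for wi in w) ** 0.5
    return [wi / norm for wi in w]


# Toy data (assumed): the classes differ along axis 0; axis 1 is mostly noise.
rng = random.Random(0)
class_a = [[rng.gauss(0.0, 0.1), rng.gauss(0.0, 5.0)] for _ in range(200)]
class_b = [[rng.gauss(3.0, 0.1), rng.gauss(0.0, 5.0)] for _ in range(200)]
w = fisher_direction(class_a, class_b)
```

On this toy data the returned direction lies almost entirely along axis 0, the axis that separates the classes, while the high-variance axis 1 is suppressed. An uncertainty-aware variant would additionally have to account for per-observation uncertainty, which this sketch does not attempt.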

Sparsifying Neural Network Connections for Face Recognition

IEEE Conference on Computer Vision and Pattern Recognition

Authors: Yi Sun; Xiaogang Wang; Xiaoou Tang (SenseTime Group; Department of Electronic Engineering, The Chinese University of Hong Kong; Department of Information Engineering, The Chinese University of Hong Kong)

ISBN: (print) 9781467388528

This paper proposes to learn high-performance deep ConvNets with sparse neural connections, referred to as sparse ConvNets, for face recognition. The sparse ConvNets are learned iteratively: each time, one additional layer is sparsified and the entire model is re-trained from the initial weights learned in previous iterations. One important finding is that directly training the sparse ConvNet from scratch failed to find good solutions for face recognition, while using a previously learned denser model to properly initialize a sparser model is critical to continue learning effective features for face recognition. This paper also proposes a new neural correlation-based weight selection criterion and empirically verifies its effectiveness in selecting informative connections from previously learned models in each iteration. When taking a moderately sparse structure (26%-76% of weights in the dense model), the proposed sparse ConvNet model significantly improves the face recognition performance of the previous state-of-the-art DeepID2+ models given the same training data, while it keeps the performance of the baseline model with only 12% of the original parameters.

Keywords: Facial recognition; Connections; Iterative methods; Societies

Robust Multi-Image Based Blind Face Hallucination

IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Jin, Yonggang; Bouganis, Christos-Savvas (Univ Bristol, Bristol BS8 1TH, Avon, England; Imperial Coll London, London, England)

ISBN: (print) 9781467369640

This paper proposes a robust multi-image based blind face hallucination framework to super-resolve low-resolution (LR) faces. The proposed framework first estimates both the blurring kernel and the transformations of multiple LR faces by robust deblurring and registration in a PCA subspace. A patch-wise mixture of probabilistic PCA priors is then incorporated for face super-resolution (SR). Previous work on face SR using a PCA prior can be viewed as special cases of the framework. Experimental results on both simulated and real LR sequences demonstrate very promising performance of the proposed method.

Rotating Your Face Using Multi-task Deep Neural Network

IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Yim, Junho; Jung, Heechul; Yoo, Byungln; Choi, Changkyu; Parke, Dusik; Kim, Junmo (Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon, South Korea; Samsung Adv Inst Technol, Suwon, Gyeonggi Do, South Korea)

ISBN: (print) 9781467369640

Face recognition under viewpoint and illumination changes is a difficult problem, so many researchers have tried to solve it by producing pose- and illumination-invariant features. Zhu et al. [26] changed all arbitrary pose and illumination images to the frontal view image to use for the invariant feature. In this scheme, preserving identity while rotating the pose image is a crucial issue. This paper proposes a new deep architecture based on a novel type of multi-task learning, which can achieve superior performance in rotating to a target-pose face image from an arbitrary pose and illumination image while preserving identity. The target pose can be controlled by the user's intention. This novel type of multi-task model significantly improves identity preservation over the single-task model. By using all the synthesized controlled pose images, called Controlled Pose Images (CPI), for the pose- and illumination-invariant feature and voting among the multiple face recognition results, we clearly outperform the state-of-the-art algorithms by more than 4-6% on the MultiPIE dataset.

Keywords: Face recognition

Partial Optimality by Pruning for MAP-Inference with General Graphical Models

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, vol. 38, no. 7, pp. 1370-82

Authors: Swoboda, Paul; Shekhovtsov, Alexander; Kappes, Joerg Hendrik; Schnoerr, Christoph; Savchynskyy, Bogdan (Heidelberg Univ, Image & Pattern Anal Grp IPA, Speyerer Str 6, D-69115 Heidelberg, Germany; Graz Univ Technol, Inst Comp Graph & Vis ICG, Inffeldgasse 16, A-8010 Graz, Austria; Heidelberg Univ, Heidelberg Collaboratory Image Proc HCI, Speyerer Str 6, D-69115 Heidelberg, Germany)

We consider the energy minimization problem for undirected graphical models, also known as the MAP-inference problem for Markov random fields, which is NP-hard in general. We propose a novel polynomial-time algorithm to obtain a part of its optimal non-relaxed integral solution. Our algorithm is initialized with variables taking integral values in the solution of a convex relaxation of the MAP-inference problem and iteratively prunes those that do not satisfy our criterion for partial optimality. We show that our pruning strategy is in a certain sense theoretically optimal. Empirically, our method also outperforms previous approaches in terms of the number of persistently labelled variables. The method is very general, as it is applicable to models with arbitrary factors of an arbitrary order and can employ any solver for the considered relaxed problem. Our method's runtime is determined by the runtime of the convex relaxation solver for the MAP-inference problem.

Keywords: MAP-inference; Markov random fields; energy minimization; persistency; partial optimality; local polytope

Robust multiple homography estimation: An ill-solved problem

IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015

Authors: Szpak, Zygmunt L.; Chojnacki, Wojciech; Van Den Hengel, Anton (School of Computer Science, University of Adelaide, SA5005, Australia)

ISBN: (print) 9781467369640

The estimation of multiple homographies between two piecewise planar views of a rigid scene is often assumed to be a solved problem. We show that, contrary to popular opinion, various crucial aspects of the task have not been adequately emphasised. We are motivated by a growing body of literature in robust multi-structure estimation that purports to solve the multi-homography estimation problem but in fact does not. We demonstrate that the estimation of multiple homographies is an ill-solved problem by deriving new constraints that a set of mutually compatible homographies must satisfy, and by showing that homographies estimated with prevailing methods fail to satisfy the requisite constraints on real-world data. We also explain why incompatible homographies imply inconsistent epipolar geometries. The arguments and experiments presented in this paper signal the need for a new generation of robust multi-structure estimation methods that have the capacity to enforce constraints on projective entities such as homography matrices. © 2015 IEEE.

Keywords: computer vision

A Semantic Occlusion Model for Human Pose Estimation from a Single Depth Image

IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Rafi, Umer; Gall, Juergen; Leibe, Bastian (Rhein Westfal TH Aachen, Aachen, Germany; Univ Bonn, Bonn, Germany)

ISBN: (print) 9781467367592

Human pose estimation from depth data has made significant progress in recent years, and commercial sensors estimate human poses in real time. However, state-of-the-art methods fail in many situations when the humans are partially occluded by objects. In this work, we introduce a semantic occlusion model that is incorporated into a regression forest approach for human pose estimation from depth data. The approach exploits the context information of occluding objects, such as a table, to predict the locations of occluded joints. In our experiments on synthetic and real data, we show that our occlusion model increases the joint estimation accuracy and outperforms the commercial Kinect 2 SDK for occluded joints.

Keywords: Context; Joints; Semantics; Three-dimensional displays; Training; Training data

Associating neural word embeddings with deep image representations using Fisher Vectors

IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015

Authors: Klein, Benjamin; Lev, Guy; Sadeh, Gil; Wolf, Lior (Blavatnik School of Computer Science, Tel Aviv University, Israel)

ISBN: (print) 9781467369640

In recent years, the problem of associating a sentence with an image has gained a lot of attention. This work continues to push the envelope and makes further progress in the performance of the image annotation and image search by a sentence tasks. In this work, we use the Fisher Vector as a sentence representation by pooling the word2vec embedding of each word in the sentence. The Fisher Vector is typically taken as the gradients of the log-likelihood of descriptors with respect to the parameters of a Gaussian Mixture Model (GMM). In this work we present two other Mixture Models and derive their Expectation-Maximization and Fisher Vector expressions. The first is a Laplacian Mixture Model (LMM), which is based on the Laplacian distribution. The second is a Hybrid Gaussian-Laplacian Mixture Model (HGLMM), which is based on a weighted geometric mean of the Gaussian and Laplacian distributions. Finally, by using the new Fisher Vectors derived from HGLMMs to represent sentences, we achieve state-of-the-art results for both the image annotation and the image search by a sentence tasks on four benchmarks: Pascal1K, Flickr8K, Flickr30K, and COCO. © 2015 IEEE.

Keywords: Image annotation
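
For reference, the GMM-based Fisher Vector that the abstract above takes as its starting point is commonly written as below. This is the widely used diagonal-covariance formulation, given here only as a sketch; the symbols (descriptors $x_t$ for $t = 1, \dots, T$, mixture weights $w_k$, means $\mu_k$, standard deviations $\sigma_k$, responsibilities $\gamma_t(k)$) are assumptions of this note, and the paper's LMM and HGLMM expressions are not reproduced.

```latex
\gamma_t(k) = \frac{w_k \, \mathcal{N}(x_t; \mu_k, \sigma_k^2)}
                   {\sum_{j=1}^{K} w_j \, \mathcal{N}(x_t; \mu_j, \sigma_j^2)},
\qquad
\mathcal{G}_{\mu_k} = \frac{1}{T \sqrt{w_k}} \sum_{t=1}^{T} \gamma_t(k) \, \frac{x_t - \mu_k}{\sigma_k},
\qquad
\mathcal{G}_{\sigma_k} = \frac{1}{T \sqrt{2 w_k}} \sum_{t=1}^{T} \gamma_t(k)
    \left[ \frac{(x_t - \mu_k)^2}{\sigma_k^2} - 1 \right]
```

Concatenating $\mathcal{G}_{\mu_k}$ and $\mathcal{G}_{\sigma_k}$ over all $K$ components gives a fixed-length vector regardless of $T$, which is what allows a variable-length sentence of word embeddings to be pooled into a single representation.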

Reflectance Hashing for Material Recognition

IEEE Conference on Computer Vision and Pattern Recognition

Authors: Hang Zhang; Kristin Dana; Ko Nishino (Dept. of Electr. & Comput. Eng., Rutgers Univ., Piscataway, NJ, USA)

ISBN: (print) 9781467369657

We introduce a novel method for using reflectance to identify materials. Reflectance offers a unique signature of the material but is challenging to measure and use for recognizing materials due to its high dimensionality. In this work, one-shot reflectance of a material surface, which we refer to as a reflectance disk, is captured using a unique optical camera. The pixel coordinates of these reflectance disks correspond to the surface viewing angles. The reflectance has class-specific structure, and angular gradients computed in this reflectance space reveal the material class. These reflectance disks encode discriminative information for efficient and accurate material recognition. We introduce a framework called reflectance hashing that models the reflectance disks with dictionary learning and binary hashing. We demonstrate the effectiveness of reflectance hashing for material recognition with a number of real-world materials.

Keywords: Reflectance hashing surface properties (materials) pixel coordinates disks cultivators

CNN based common approach to handwritten character recognition of multiple scripts

13th International Conference on Document Analysis and Recognition, ICDAR 2015

Authors: Maitra, Durjoy Sen; Bhattacharya, Ujjwal; Parui, Swapan K. (Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B. T. Road, Kolkata-108, India)

ISBN: (print) 9781479918058

There are many scripts in the world, several of which are used by hundreds of millions of people. Handwritten character recognition studies of several of these scripts are found in the literature. Different hand-crafted feature sets have been used in these recognition studies. However, the convolutional neural network (CNN) has recently been used as an efficient unsupervised feature vector extractor. Although such a network can be used as a unified framework for both feature extraction and classification, it is more efficient as a feature extractor than as a classifier. In the present study, we performed a certain amount of training of a 5-layer CNN for a moderately large class character recognition problem. We used this CNN, trained for a larger class recognition problem, for feature extraction on samples of several smaller class recognition problems. In each case, a distinct Support Vector Machine (SVM) was used as the corresponding classifier. In particular, the CNN of the present study is trained using samples of a standard 50-class Bangla basic character database, and features have been extracted for 5 different 10-class numeral recognition problems of English, Devanagari, Bangla, Telugu and Oriya, each of which is an official Indian script. Recognition accuracies are comparable with the state of the art. © 2015 IEEE.

Keywords: Character recognition
