Search results - Inner Mongolia University Library

Authors: Bernal-Marin, Miguel; Bayro-Corrochano, Eduardo (CINVESTAV Unidad Guadalajara, Dept. Electrical Engineering and Computer Science, Jalisco, Mexico)

This paper presents the application of 2D and 3D Hough Transforms together with conformal geometric algebra to build 3D geometric maps using the geometric entities of lines and planes. Among several existing techniques for robot self-localization, a new approach is proposed for map matching in the Hough domain. The geometric Hough representation is formulated in such a way that one can easily relate it to the conformal geometric algebra framework; thus, the detected lines and planes can be used for algebra-of-incidence computations to find geometric constraints, useful when perceiving special configurations in 3D visual space for exploration, navigation, relocation and obstacle avoidance. We believe that this work is very useful for 2D and 3D geometric pattern recognition in robot vision tasks. © 2011 Elsevier B.V. All rights reserved.

Keywords: Hough transforms
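
As a rough illustration of the line-detection step mentioned in the abstract above, the following sketch implements the classic (rho, theta) Hough voting scheme for 2D points in plain Python. It is not the authors' conformal geometric algebra formulation; the function names `hough_lines` and `strongest_line`, the bin sizes, and the sample points are assumptions made for this example.

```python
import math
from collections import Counter


def hough_lines(points, n_theta=180, rho_step=1.0):
    """Accumulate votes for lines rho = x*cos(theta) + y*sin(theta).

    Returns a Counter keyed by (theta index, rho bin index).
    """
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(t, round(rho / rho_step))] += 1
    return votes


def strongest_line(points, n_theta=180, rho_step=1.0):
    """Return (theta in radians, rho, vote count) of the best-supported line."""
    votes = hough_lines(points, n_theta, rho_step)
    (t, r), count = votes.most_common(1)[0]
    return math.pi * t / n_theta, r * rho_step, count


# Hypothetical sample: ten points on the vertical line x = 5.
sample_points = [(5.0, 10.0 * k) for k in range(10)]
theta, rho, count = strongest_line(sample_points)
```

Each point votes for every discretized line that could pass through it, so collinear points pile up in a single accumulator cell. Finer `n_theta` or `rho_step` values sharpen the estimate at the cost of more accumulator cells.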

Saliency estimation using a non-parametric low-level vision model

Authors: Murray, Naila; Vanrell, Maria; Otazu, Xavier; Parraga, C. Alejandro (Computer Vision Center, Computer Science Department, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain)

ISBN: (print) 9781457703942

Many successful models for predicting attention in a scene involve three main steps: convolution with a set of filters, a center-surround mechanism and spatial pooling to construct a saliency map. However, integrating spatial information and justifying the choice of various parameter values remain open problems. In this paper we show that an efficient model of color appearance in human vision, which contains a principled selection of parameters as well as an innate spatial pooling mechanism, can be generalized to obtain a saliency model that outperforms state-of-the-art models. Scale integration is achieved by an inverse wavelet transform over the set of scale-weighted center-surround responses. The scale-weighting function (termed ECSF) has been optimized to better replicate psychophysical data on color appearance, and the appropriate sizes of the center-surround inhibition windows have been determined by training a Gaussian Mixture Model on eye-fixation data, thus avoiding ad-hoc parameter selection. Additionally, we conclude that the extension of a color appearance model to saliency estimation adds to the evidence for a common low-level visual front-end for different visual tasks. © 2011 IEEE.

Keywords: Color

Neighborhood Dependent Approximation by Nonlinear Embedding for Face Recognition

16th International Conference on Image Analysis and Processing (ICIAP)

Authors: Alex, Ann Theja; Asari, Vijayan K.; Mathew, Alex (Univ Dayton, Dept Elect & Comp Engn, Comp Vision & Wide Area Surveillance Lab, Dayton, OH 45469, USA)

ISBN: (print) 9783642240850

Variations in pose, illumination and expression in faces make face recognition a difficult problem. Several researchers have shown that faces of the same individual, despite all these variations, lie on a complex manifold in a higher dimensional space. Several methods have been proposed to exploit this fact to build better recognition systems, but have not succeeded to a satisfactory extent. We propose a new method to model this higher dimensional manifold with available data, and use a reconstruction technique to approximate unavailable data points. The proposed method is tested on the Sheffield (previously UMIST) database, the Extended Yale Face Database B and the AT&T (previously ORL) database of faces. Our method outperforms other manifold-based methods such as Nearest Manifold and other methods such as PCA, LDA, Modular PCA, Generalized 2D PCA and a super-resolution method for face recognition using nonlinear mappings on coherent features.

Keywords: Face recognition; Manifold Learning; Nonlinear Embedding

Proposing a CNN Based Architecture of Mid-level Vision for Feeding the WHERE and WHAT Pathways in the Brain

2nd Swarm, Evolutionary and Memetic Computing Conference (SEMCCO 2011)

Authors: Das, Apurba; Roy, Anirban; Ghosh, Kuntal (CDAC, Kolkata, India; Techno India, Kolkata, India; Indian Stat Inst, Machine Intelligence Unit, Kolkata, India; Indian Stat Inst, Ctr Soft Comp Res, Kolkata, India)

ISBN: (print) 9783642271717

In the central visual pathway originating from the eye, a bridging is required between two hierarchical tasks, that of pixel based information recording by visual pathway at low level on one hand and that of object recognition at high level on the other. Such a bridge which may be designated as a mid-level block-grained integration has here been modeled by a multi-layer flexible cellular neural network (F-CNN). The proposed CNN architecture is validated by different intermediate level tasks involving rigid and deformable pattern recognition. Execution of such tasks by the proposed architecture, it has been shown, is capable of generating valid and significant inputs for the WHERE (dorsal) and WHAT (ventral) pathways in the brain. The model includes the proposal of a feedback (also by CNN architecture) to the lower mid-level from the higher mid-level dorsal and ventral pathways for flexible cell (physiological receptive field) size adjustment in the primary visual cortex towards successful 'where' and 'what' identifications for high-level vision.

Keywords: Cellular Neural Network (CNN); dorsal and ventral pathways; visual cortical column; scale-space; attention model

2,1-norm based regression for classification

1st Asian Conference on Pattern Recognition, ACPR 2011

Authors: Ren, Chuan-Xian; Dai, Dao-Qing; Yan, Hong (Center of Computer Vision, Department of Mathematics, Sun Yat-Sen University, Guangzhou 510275, China; Department of Electric Engineering, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong)

ISBN: (print) 9781457701221

We present a novel classification method formulating an objective model by 2,1-norm based regression. The 2,1-norm based loss function is robust to outliers or large variations within the given data, and the 2,1-norm regularization term selects correlated samples across the whole training set with grouped sparsity. This constrained optimization problem can be efficiently solved by an iterative procedure. Several benchmark data sets, including facial images and gene expression data, are used to evaluate the robustness and effectiveness of the newly proposed algorithm, and the results show competitive performance. © 2011 IEEE.

Keywords: Benchmarking
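
To make the 2,1-norm in the abstract above concrete, the sketch below computes it under the common convention that it is the sum of the Euclidean norms of a matrix's rows, and contrasts it with the squared Frobenius norm. The row-wise convention, the function names, and the sample residual matrix are assumptions for illustration; the paper's own objective and solver are not reproduced here.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean (l2) norms of the rows of a matrix."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


def frobenius_sq(matrix):
    """Squared Frobenius norm: sum of all squared entries."""
    return sum(v * v for row in matrix for v in row)


# Hypothetical residual matrix: one row per sample, last row is an outlier.
residuals = [[3.0, 4.0], [0.0, 0.0], [30.0, 40.0]]

l21 = l21_norm(residuals)       # 5 + 0 + 50 = 55.0
fro = frobenius_sq(residuals)   # 25 + 0 + 2500 = 2525.0
```

In this sample the outlier row contributes 50 of 55 to the 2,1-norm but 2500 of 2525 to the squared Frobenius norm, which illustrates why a loss that grows linearly in each row's magnitude is less dominated by outlying samples.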

Webcam geo-localization using aggregate light levels

2011 IEEE Workshop on Applications of Computer Vision, WACV 2011

Authors: Jacobs, Nathan; Miskell, Kylia; Pless, Robert (University of Kentucky, United States; Washington University, St. Louis, United States)

ISBN: (print) 9781424494965

We consider the problem of geo-locating static cameras from long-term time-lapse imagery. This problem has received significant attention recently, with most methods making strong assumptions on the geometric structure of the scene. We explore a simple, robust cue that relates overall image intensity to the zenith angle of the sun (which need not be visible). We characterize the accuracy of geolocation based on this cue as a function of different models of the zenith-intensity relationship and the amount of imagery available. We evaluate our algorithm on a dataset of more than 60 million images captured from outdoor webcams located around the globe. We find that using our algorithm with images sampled every 30 minutes yields localization errors of less than 100 km for the majority of cameras. © 2010 IEEE.

Keywords: pattern recognition
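
The cue in the abstract above depends on the sun's zenith angle at the camera's location. The sketch below evaluates the standard spherical-astronomy relation cos(z) = sin(lat)·sin(decl) + cos(lat)·cos(decl)·cos(h). The function name and the choice to pass solar declination and hour angle directly as inputs are assumptions for this example; the paper's zenith-intensity models and its search over candidate locations are not shown.

```python
import math


def solar_zenith_deg(lat_deg, decl_deg, hour_angle_deg):
    """Solar zenith angle in degrees.

    lat_deg: observer latitude
    decl_deg: solar declination for the date
    hour_angle_deg: local hour angle (0 at local solar noon)
    """
    lat, decl, h = (math.radians(a) for a in (lat_deg, decl_deg, hour_angle_deg))
    cos_z = (math.sin(lat) * math.sin(decl)
             + math.cos(lat) * math.cos(decl) * math.cos(h))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_z = max(-1.0, min(1.0, cos_z))
    return math.degrees(math.acos(cos_z))


# Equinox (declination 0), local solar noon, at latitude 45 degrees north.
z_noon = solar_zenith_deg(45.0, 0.0, 0.0)
```

Because the zenith angle at a given instant varies with latitude (through `lat_deg`) and longitude (through the local hour angle), a long time series of brightness values constrains both coordinates.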

Graph based pattern matching

2011 8th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2011, Jointly with the 2011 7th International Conference on Natural Computation, ICNC'11

Authors: Pawar, Vaishali S.; Zaveri, Mukesh A. (Information Technology Department, MVP s'S KBT College of Engineering, Nashik, India; Computer Engineering Department, S. V. National Institute of Technology, Surat, India)

ISBN: (print) 9781612841816

Graphs are a very powerful and widely used tool for data representation in various fields of science and engineering. Due to their versatile representational power, graphs are widely used for dealing with structural information in different domains such as pattern recognition, computer vision, networks, biochemical applications, psycho-sociology, image interpretation, and many others. In many applications, it is necessary to find the similarity between objects. When graphs are used to represent structured objects, the problem of measuring object similarity becomes the problem of determining the similarity of graphs, which is also known as graph matching. The most usual way to achieve a good degree of graph matching is to search for a graph isomorphism. A lot of work has been done in the area of graph isomorphism between two graphs or sub-graphs. Depending on the nature of the application, the problem turns into either exact or inexact graph isomorphism. In this study, we discuss the important theoretical foundations of graph matching along with the error-correcting graph matching approach. © 2011 IEEE.

Keywords: computer vision
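
The exact variant of graph matching described in the abstract above can be sketched as a brute-force isomorphism test: try every relabeling of the nodes and check whether it maps one edge set onto the other. This is only an illustration for very small undirected graphs, since the search is factorial in the number of nodes; the function names and sample graphs are assumptions, and the error-correcting (inexact) approach discussed in the paper is not implemented here.

```python
from itertools import permutations


def _normalize(edges):
    """Store each undirected edge as a sorted tuple so (u, v) equals (v, u)."""
    return {tuple(sorted(e)) for e in edges}


def are_isomorphic(n, edges_a, edges_b):
    """Exact isomorphism test for two undirected graphs on nodes 0..n-1."""
    a, b = _normalize(edges_a), _normalize(edges_b)
    if len(a) != len(b):
        return False  # different edge counts can never match
    for perm in permutations(range(n)):
        mapped = {tuple(sorted((perm[u], perm[v]))) for u, v in a}
        if mapped == b:
            return True
    return False


# Hypothetical sample graphs on four nodes.
path = [(0, 1), (1, 2), (2, 3)]
relabeled_path = [(2, 0), (0, 3), (3, 1)]
star = [(0, 1), (0, 2), (0, 3)]
```

Here `are_isomorphic(4, path, relabeled_path)` succeeds because both graphs are paths, while `are_isomorphic(4, path, star)` fails even though the edge counts agree, because no relabeling gives a path a node of degree three.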

Algorithms for Adaptive Nonlinear Pattern Recognition

Conference on Mathematics of Data/Image Pattern Coding, Compression, and Encryption with Applications XIII

Authors: Schmalz, Mark S.; Ritter, Gerhard X.; Hayden, Eric; Key, Gary (Univ Florida, Dept Comp & Informat Sci & Engn, Gainesville, FL 32611, USA; Frontier Technol Inc, Altamonte Springs, FL 32701, USA)

ISBN: (print) 9780819487469

In Bayesian pattern recognition research, static classifiers have featured prominently in the literature. A static classifier is essentially based on a static model of input statistics, thereby assuming input ergodicity that is not realistic in practice. Classical Bayesian approaches attempt to circumvent the limitations of static classifiers, which can include brittleness and narrow coverage, by training extensively on a data set that is assumed to cover more than the subtense of expected input. Such assumptions are not realistic for more complex pattern classification tasks, for example, object detection using pattern classification applied to the output of computer vision filters. In contrast, we have developed a two-step process that can render the majority of static classifiers adaptive, such that the tracking of input nonergodicities is supported. First, we developed operations that dynamically insert (or, respectively, delete) training patterns into (respectively, from) the classifier's pattern database, without requiring that the classifier's internal representation of its training database be completely recomputed. Second, we developed and applied a pattern replacement algorithm that uses the aforementioned pattern insertion/deletion operations. This algorithm is designed to optimize the pattern database for a given set of performance measures, thereby supporting closed-loop, performance-directed optimization. This paper presents theory and algorithmic approaches for the efficient computation of adaptive linear and nonlinear pattern recognition operators that use our pattern insertion/deletion technology - in particular, tabular nearest-neighbor encoding (TNE) and lattice associative memories (LAMs). Of particular interest is the classification of nonergodic datastreams that have noise corruption with time-varying statistics. The TNE and LAM based classifiers discussed herein have been successfully applied to the computation of object classification in hyperspectral re

Keywords: pattern recognition; Adaptive pattern classification; Neural networks
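
The idea of inserting and deleting training patterns without rebuilding the classifier, as described in the abstract above, can be illustrated with a plain 1-nearest-neighbor classifier, whose pattern database is just a collection that supports both operations directly. This is a stand-in technique chosen for brevity, not the TNE or LAM classifiers of the paper; the class name `AdaptiveNearestNeighbor` and its methods are hypothetical.

```python
import math


class AdaptiveNearestNeighbor:
    """1-NN classifier whose pattern database can change at run time."""

    def __init__(self):
        self._patterns = {}  # pattern id -> (feature tuple, label)
        self._next_id = 0

    def insert(self, features, label):
        """Add a training pattern and return its id; nothing is recomputed."""
        pid = self._next_id
        self._patterns[pid] = (tuple(features), label)
        self._next_id += 1
        return pid

    def delete(self, pid):
        """Remove a previously inserted pattern by id."""
        del self._patterns[pid]

    def classify(self, features):
        """Return the label of the closest stored pattern (Euclidean distance)."""
        if not self._patterns:
            raise ValueError("pattern database is empty")
        query = tuple(features)
        _, label = min(self._patterns.values(),
                       key=lambda p: math.dist(p[0], query))
        return label


clf = AdaptiveNearestNeighbor()
old_id = clf.insert((0.0, 0.0), "a")
clf.insert((10.0, 10.0), "b")
```

A replacement policy of the kind the abstract mentions could then call `delete` on patterns that hurt a chosen performance measure and `insert` fresher ones, so the database follows input statistics that drift over time.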

Towards a practical lipreading system

Authors: Zhou, Ziheng; Zhao, Guoying; Pietikainen, Matti (Machine Vision Group, Computer Science and Engineering Laboratory, University of Oulu, P.O. Box 4500, FI-90014 Oulu, Finland)

ISBN: (print) 9781457703942

A practical lipreading system can be considered either as subject-dependent (SD) or subject-independent (SI). An SD system is user-specific, i.e., customized for some particular user, while an SI system has to cope with a large number of users. These two types of systems pose different challenges and have to be treated differently. In this paper, we propose a simple deterministic model to tackle the problem. The model first seeks a low-dimensional manifold where visual features extracted from the frames of a video can be projected onto a continuous deterministic curve embedded in a path graph. Moreover, it can map arbitrary points on the curve back into the image space, making it suitable for temporal interpolation. Based on the model, we develop two separate strategies for SD and SI lipreading. The former is turned into a simple curve-matching problem while for the latter, we propose a video-normalization scheme to improve the system developed by Zhao et al. We evaluated our system on the OuluVS database and achieved recognition rates more than 20% higher than the ones reported by Zhao et al. in both SD and SI testing scenarios. © 2011 IEEE.

Bilayer Segmentation of Webcam Videos Using Tree-Based Classifiers

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, Vol. 33, No. 1, pp. 30-42

Authors: Yin, Pei; Criminisi, Antonio; Winn, John; Essa, Irfan (Microsoft Corp, Redmond, WA 98052, USA; Microsoft Res Cambridge, Cambridge CB3 0FB, England; Georgia Inst Technol, Sch Interact Comp, Coll Comp, Atlanta, GA 30332, USA)

This paper presents an automatic segmentation algorithm for video frames captured by a (monocular) webcam that closely approximates depth segmentation from a stereo camera. The frames are segmented into foreground and background layers that comprise a subject (participant) and other objects and individuals. The algorithm produces correct segmentations even in the presence of large background motion with a nearly stationary foreground. This research makes three key contributions: First, we introduce a novel motion representation, referred to as "motons," inspired by research in object recognition. Second, we propose estimating the segmentation likelihood from the spatial context of motion. The estimation is efficiently learned by random forests. Third, we introduce a general taxonomy of tree-based classifiers that facilitates both theoretical and experimental comparisons of several known classification algorithms and generates new ones. In our bilayer segmentation algorithm, diverse visual cues such as motion, motion context, color, contrast, and spatial priors are fused by means of a conditional random field (CRF) model. Segmentation is then achieved by binary min-cut. Experiments on many sequences of our videochat application demonstrate that our algorithm, which requires no initialization, is effective in a variety of scenes, and the segmentation results are comparable to those obtained by stereo systems.

Keywords: computer vision; image understanding; machine learning; decision tree; random forests; boosting; motion analysis
