Search Results - Inner Mongolia University Library

Boosted Random Ferns for Object Detection

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, vol. 40, no. 2, pp. 272-288

Authors: Villamizar, Michael; Andrade-Cetto, Juan; Sanfeliu, Alberto; Moreno-Noguer, Francesc. Affiliations: CSIC UPC Inst Robot & Informat Ind Barcelona 08028 Spain; Inst Robot & Informat Ind Barcelona 08028 Spain

In this paper we introduce the Boosted Random Ferns (BRFs) to rapidly build discriminative classifiers for learning and detecting object categories. At the core of our approach we use standard random ferns, but we introduce four main innovations that let us bring ferns from an instance to a category level while still retaining efficiency. First, we define binary features in the histogram of oriented gradients domain (as opposed to the intensity domain), allowing for a better representation of intra-class variability. Second, both the positions where ferns are evaluated within the sliding window and the locations of the binary features for each fern are not chosen completely at random; instead, we use a boosting strategy to pick the most discriminative combination of them. This is further enhanced by our third contribution, which is to adapt the boosting strategy to enable sharing of binary features among different ferns, yielding high recognition rates at a low computational cost. Finally, we show that training can be performed online, for sequentially arriving images. Overall, the resulting classifier can be trained very efficiently, evaluated densely at all image locations in about 0.1 seconds, and provides detection rates similar to those of competing approaches that require expensive and significantly slower processing. We demonstrate the effectiveness of our approach through thorough experimentation on publicly available datasets, in which we compare against the state of the art on tasks of both 2D detection and 3D multi-view estimation.

Keywords: image processing and computer vision; object detection; random ferns; boosting; online-boosting
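
The random-fern idea mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes that a fern is a short list of binary comparisons whose outcomes are concatenated into an integer that indexes a lookup table. The names `fern_index` and `classifier_score`, the flat feature vector standing in for HOG bins, and the table values are all hypothetical; this is not the authors' implementation, and the boosting step that would select features and learn the tables is omitted.

```python
from typing import List, Sequence, Tuple

# Hypothetical binary feature: compare two entries of a flat feature vector
# (a stand-in for two HOG bins inside the sliding window).
BinaryFeature = Tuple[int, int]


def fern_index(features: Sequence[float], fern: Sequence[BinaryFeature]) -> int:
    """Concatenate the outcomes of the fern's binary tests into one integer."""
    index = 0
    for a, b in fern:
        bit = 1 if features[a] > features[b] else 0
        index = (index << 1) | bit
    return index


def classifier_score(
    features: Sequence[float],
    ferns: Sequence[Sequence[BinaryFeature]],
    tables: Sequence[List[float]],
) -> float:
    """Sum one lookup-table vote per fern (the tables are assumed to be learned elsewhere)."""
    return sum(table[fern_index(features, fern)] for fern, table in zip(ferns, tables))


features = [0.2, 0.9, 0.5, 0.1]
fern = [(1, 0), (3, 2), (2, 0)]    # three tests give an index in [0, 8)
print(fern_index(features, fern))  # 5, i.e. binary 101
```

Because each fern costs only a few comparisons and one table lookup, evaluating many ferns at every window position stays cheap, which is consistent with the efficiency claim in the abstract.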

Globally convergent autocalibration using interval analysis

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2004, vol. 26, no. 12, pp. 1633-1638

Authors: Fusiello, A; Benedetti, A; Farenzena, M; Busti, A. Affiliations: Univ Verona Dipartimento Informat I-37134 Verona Italy; KLA Tencor Milpitas CA 95035 USA

We address the problem of autocalibration of a moving camera with unknown constant intrinsic parameters. Existing autocalibration techniques use numerical optimization algorithms whose convergence to the correct result cannot, in general, be guaranteed. To address this problem, we have developed a method in which an interval branch-and-bound method is employed for numerical minimization. Thanks to the properties of Interval Analysis, this method converges to the global solution with mathematical certainty and arbitrary accuracy. The only input it requires from the user is a set of point correspondences and a search interval. The cost function is based on the Huang-Faugeras constraint of the essential matrix. A recently proposed interval extension based on Bernstein polynomial forms has been investigated to speed up the search for the solution. Finally, experimental results are presented.

Keywords: image processing and computer vision; camera calibration; modeling from video; interval arithmetic; 3D/stereo scene analysis; self-calibration
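
To make the interval branch-and-bound idea in the abstract above concrete, here is a minimal one-dimensional sketch in Python. Everything in it is an illustrative assumption: the toy cost f(x) = (x^2 - 2)^2 stands in for the Huang-Faugeras cost, the helper names are hypothetical, and plain floating point is used where a rigorous implementation would round interval endpoints outward.

```python
import heapq
from typing import Callable, Tuple

Interval = Tuple[float, float]


def i_sub(a: Interval, b: Interval) -> Interval:
    """Interval subtraction: [a] - [b]."""
    return (a[0] - b[1], a[1] - b[0])


def i_sqr(a: Interval) -> Interval:
    """Interval square; the lower bound is 0 when the interval contains 0."""
    lo, hi = a
    squares = (lo * lo, hi * hi)
    low = 0.0 if lo <= 0.0 <= hi else min(squares)
    return (low, max(squares))


def cost(x: float) -> float:
    """Toy cost with a single minimizer, sqrt(2), on [0, 3]."""
    return (x * x - 2.0) ** 2


def cost_bounds(x: Interval) -> Interval:
    """Interval extension of the toy cost: encloses cost over the whole box."""
    return i_sqr(i_sub(i_sqr(x), (2.0, 2.0)))


def branch_and_bound(
    bounds_fn: Callable[[Interval], Interval],
    point_fn: Callable[[float], float],
    box: Interval,
    tol: float = 1e-6,
) -> Interval:
    """Best-first search: split the most promising box, prune the hopeless ones."""
    best_upper = point_fn(0.5 * (box[0] + box[1]))
    heap = [(bounds_fn(box)[0], box)]
    while heap:
        lower, (lo, hi) = heapq.heappop(heap)
        if lower > best_upper:
            continue  # this box cannot contain the global minimum
        mid = 0.5 * (lo + hi)
        best_upper = min(best_upper, point_fn(mid))
        if hi - lo < tol:
            return (lo, hi)
        for child in ((lo, mid), (mid, hi)):
            child_lower = bounds_fn(child)[0]
            if child_lower <= best_upper:
                heapq.heappush(heap, (child_lower, child))
    return box  # fallback; not reached for the toy cost above


print(branch_and_bound(cost_bounds, cost, (0.0, 3.0)))
```

The certainty described in the abstract comes from the pruning rule: a box is discarded only when its guaranteed lower bound exceeds a cost value already attained, so the global minimizer is never thrown away.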

Confluence: A Robust Non-IoU Alternative to Non-Maxima Suppression in Object Detection

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, vol. 45, no. 10, pp. 11561-11574

Authors: Shepley, Andrew J.; Falzon, Greg; Kwan, Paul; Brankovic, Ljiljana. Affiliations: Univ New England Sch Sci & Technol Armidale NSW 2350 Australia; Flinders Univ S Australia Adelaide SA 5000 Australia; Melbourne Inst Technol Sydney NSW 3000 Australia

Confluence is a novel non-Intersection over Union (IoU) alternative to Non-Maxima Suppression (NMS) in bounding box post-processing in object detection. It overcomes the inherent limitations of IoU-based NMS variants to provide a more stable, consistent predictor of bounding box clustering by using a normalized Manhattan Distance inspired proximity metric to represent bounding box clustering. Unlike Greedy and Soft NMS, it does not rely solely on classification confidence scores to select optimal bounding boxes, instead selecting the box which is closest to every other box within a given cluster and removing highly confluent neighboring boxes. Confluence is experimentally validated on the MS COCO and CrowdHuman benchmarks, improving Average Precision by 0.2-2.7% and 1-3.8% respectively and Average Recall by 1.3-9.3% and 2.4-7.3% when compared against Greedy and Soft-NMS variants. Quantitative results are supported by extensive qualitative analysis, and threshold sensitivity analysis experiments support the conclusion that Confluence is more robust than NMS variants. Confluence represents a paradigm shift in bounding box processing, with potential to replace IoU in bounding box regression processes.

Keywords: computer vision; edge and feature detection; feature representation; image processing and computer vision; machine learning; confluence; non-maxima suppression; object detection; deep learning
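
One plausible reading of the "normalized Manhattan Distance inspired proximity metric" in the abstract above is sketched below in Python. The normalization (rescaling the coordinates of each box pair to [0, 1] before summing corner distances) and the unweighted selection rule are assumptions made for illustration; the abstract notes that confidence scores also play a part, which this sketch leaves out. The names `proximity` and `most_confluent` are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def _normalize(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1] using their own min and max."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0] * len(values)
    return [(v - lo) / span for v in values]


def proximity(a: Box, b: Box) -> float:
    """Sum of Manhattan distances between matching corners, after per-pair normalization."""
    xs = _normalize([a[0], a[2], b[0], b[2]])
    ys = _normalize([a[1], a[3], b[1], b[3]])
    return (
        abs(xs[0] - xs[2]) + abs(xs[1] - xs[3])
        + abs(ys[0] - ys[2]) + abs(ys[1] - ys[3])
    )


def most_confluent(boxes: Sequence[Box]) -> int:
    """Index of the box with the smallest total proximity to all other boxes."""
    totals = [
        sum(proximity(box, other) for j, other in enumerate(boxes) if j != i)
        for i, box in enumerate(boxes)
    ]
    return min(range(len(boxes)), key=totals.__getitem__)


cluster = [(0, 0, 10, 10), (1, 1, 11, 11), (2, 2, 12, 12)]
print(most_confluent(cluster))  # 1: the middle box is closest to the other two
```

Normalizing per pair makes the measure independent of box scale, so the same threshold could in principle apply to small and large objects alike.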

Discriminative Optimization: Theory and Applications to Computer Vision

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2019, vol. 41, no. 4, pp. 829-843

Authors: Vongkulbhisal, Jayakorn; De la Torre, Fernando; Costeira, Joao P. Affiliations: Carnegie Mellon Univ ECE Dept Pittsburgh PA 15213 USA; Univ Lisbon ISR IST Lisbon Portugal; Carnegie Mellon Univ Facebook Inc Pittsburgh PA USA; Carnegie Mellon Univ Inst Robot Pittsburgh PA USA

Many computer vision problems are formulated as the optimization of a cost function. This approach faces two main challenges: designing a cost function with a local optimum at an acceptable solution, and developing an efficient numerical method to search for this optimum. While designing such functions is feasible in the noiseless case, the stability and location of local optima are mostly unknown under noise, occlusion, or missing data. In practice, this can result in undesirable local optima or in no local optimum at the expected place. On the other hand, numerical optimization algorithms in high-dimensional spaces are typically local and often rely on expensive first- or second-order information to guide the search. To overcome these limitations, we propose Discriminative Optimization (DO), a method that learns search directions from data without the need for a cost function. DO explicitly learns a sequence of updates in the search space that leads to stationary points that correspond to the desired solutions. We provide a formal analysis of DO and illustrate its benefits on the problems of 3D registration, camera pose estimation, and image denoising. We show that DO outperformed or matched state-of-the-art algorithms in terms of accuracy, robustness, and computational efficiency.

Keywords: Optimization; gradient methods; iterative methods; image processing and computer vision; machine learning
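
The phrase "learns a sequence of updates in the search space" in the abstract above can be illustrated with a toy one-dimensional sketch in Python. It assumes updates of the form x_next = x - D * h(x), with each scalar map D fitted by least squares on training instances whose solutions are known. The feature `tanh(x - d)`, the function names, and the training data are hypothetical choices for illustration, not the formulation used in the paper.

```python
import math
from typing import List, Sequence, Tuple

Instance = Tuple[float, float, float]  # (starting point x0, data d, known solution x_star)


def feature(x: float, d: float) -> float:
    """Hypothetical feature of the current estimate x given the data d."""
    return math.tanh(x - d)


def train_maps(instances: Sequence[Instance], num_maps: int) -> List[float]:
    """Fit one scalar map per step by least squares, then advance every estimate."""
    xs = [x0 for x0, _, _ in instances]
    maps: List[float] = []
    for _ in range(num_maps):
        hs = [feature(x, d) for x, (_, d, _) in zip(xs, instances)]
        errs = [x - x_star for x, (_, _, x_star) in zip(xs, instances)]
        denom = sum(h * h for h in hs)
        # Minimizes sum((err - step * h)^2) over the scalar step.
        step = sum(e * h for e, h in zip(errs, hs)) / denom if denom > 0 else 0.0
        maps.append(step)
        xs = [x - step * h for x, h in zip(xs, hs)]
    return maps


def apply_maps(x0: float, d: float, maps: Sequence[float]) -> float:
    """Run the learned update sequence from x0; no cost function is evaluated."""
    x = x0
    for step in maps:
        x = x - step * feature(x, d)
    return x


train = [(d + off, d, d) for d in (0.0, 1.0) for off in (-3.0, -1.0, 1.0, 3.0)]
maps = train_maps(train, num_maps=5)
print(apply_maps(2.5, 0.0, maps))  # estimate for an unseen start; the target is 0.0
```

Each least-squares fit can never increase the summed squared error on the training set, because a zero step is always an admissible choice. That is the sense in which the sequence is learned to move estimates toward the desired solutions.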

DAISY: An Efficient Dense Descriptor Applied to Wide-Baseline Stereo

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2010, vol. 32, no. 5, pp. 815-830

Authors: Tola, Engin; Lepetit, Vincent; Fua, Pascal. Affiliations: Ecole Polytech Fed Lausanne Comp Vis Lab EPFL IC ISIM CVLab Stn 14 CH-1015 Lausanne Switzerland; ISA INRIA Team Paris France; INRIA Sophia Antipolis Sophia Antipolis France

In this paper, we introduce a local image descriptor, DAISY, which is very efficient to compute densely. We also present an EM-based algorithm to compute dense depth and occlusion maps from wide-baseline image pairs using this descriptor. This yields much better results in wide-baseline situations than the pixel- and correlation-based algorithms that are commonly used in narrow-baseline stereo. Also, using a descriptor makes our algorithm robust against many photometric and geometric transformations. Our descriptor is inspired by earlier ones such as SIFT and GLOH but can be computed much faster for our purposes. Unlike SURF, which can also be computed efficiently at every pixel, it does not introduce artifacts that degrade the matching performance when used densely. It is important to note that our approach is the first algorithm that attempts to estimate dense depth maps from wide-baseline image pairs, and we show that it is a good one through many experiments on depth estimation accuracy and occlusion detection, and by comparing it against other descriptors on laser-scanned ground truth scenes. We also tested our approach on a variety of indoor and outdoor scenes with different photometric and geometric transformations, and our experiments support our claim that it is robust against these.

Keywords: image processing and computer vision; dense depth map estimation; local descriptors

Human-computer interaction for the generation of image processing applications

INTERNATIONAL JOURNAL OF HUMAN-COMPUTER STUDIES, 2011, vol. 69, no. 4, pp. 201-219

Authors: Clouard, Regis; Renouf, Arnaud; Revenu, Marinette. Affiliations: ENSICAEN GREYC F-14050 Caen France; Univ Caen F-14050 Caen France

The development of customized image processing applications is time-consuming and requires high-level skills. This paper describes the design of an interactive application generation system oriented towards producing image processing software programs. The description is focused on two models which constitute the core of the human-computer interaction. First, the formulation model identifies and organizes information that is assumed necessary and sufficient for developing image processing applications. This model is represented as a domain ontology which provides primitives for the formulation language. Second, the interaction model defines ways to acquire such information from end-users. The result of the interaction is an application ontology from which suitable software is generated. This model emphasizes the gradual emergence of a semantics of the problem through purely symbolic representations. Based on these two models, a prototype system has been implemented to conduct experiments. (C) 2010 Elsevier Ltd. All rights reserved.

Keywords: Human-computer interaction; Knowledge acquisition; Ontology design; image processing and computer vision

Tracking by an Optimal Sequence of Linear Predictors

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2009, vol. 31, no. 4, pp. 677-692

Authors: Zimmermann, Karel; Matas, Jiri; Svoboda, Tomas. Affiliation: Czech Tech Univ Fac Elect Engn Dept Cybernet Prague 12135 2 Czech Republic

We propose a learning approach to tracking that explicitly minimizes the computational complexity of the tracking process subject to a user-defined probability of failure (loss-of-lock) and precision. The tracker is formed by a Number of Sequences of Learned Linear Predictors (NoSLLiP). Robustness of NoSLLiP is achieved by modeling the object as a collection of local motion predictors; object motion is estimated by the outlier-tolerant RANSAC algorithm from local predictions. The efficiency of the NoSLLiP tracker stems 1) from the simplicity of the local predictors and 2) from the fact that all design decisions (the number of local predictors used by the tracker, their computational complexity, i.e., the number of observations the prediction is based on, their locations, and the number of RANSAC iterations) are subject to the optimization (learning) process. All time-consuming operations are performed during the learning stage; tracking is reduced to only a few hundred integer multiplications in each step. On a PC with 1xK8 3200+, a predictor evaluation requires about 30 μs. The proposed approach is verified on publicly available sequences with approximately 12,000 frames with ground truth. Experiments demonstrate superiority in frame rates and robustness with respect to the SIFT detector, Lucas-Kanade tracker, and other trackers.

Keywords: image processing and computer vision; scene analysis; tracking

Shape Recognition and Pose Estimation for Mobile Augmented Reality

IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2011, vol. 17, no. 10, pp. 1369-1379

Authors: Hagbi, Nate; Bergig, Oriel; El-Sana, Jihad; Billinghurst, Mark. Affiliations: Ben Gurion Univ Negev Visual Media Lab IL-84105 Beer Sheva Israel; Univ Canterbury HIT Lab NZ Canterbury New Zealand

Nestor is a real-time recognition and camera pose estimation system for planar shapes. The system allows shapes that carry contextual meanings for humans to be used as Augmented Reality (AR) tracking targets. The user can teach the system new shapes in real time. New shapes can be shown to the system frontally, or they can be automatically rectified according to previously learned shapes. Shapes can be automatically assigned virtual content by classification according to a shape class library. Nestor performs shape recognition by analyzing contour structures and generating projective-invariant signatures from their concavities. The concavities are further used to extract features for pose estimation and tracking. Pose refinement is carried out by minimizing the reprojection error between sample points on each image contour and its library counterpart. Sample points are matched by evolving an active contour in real time. Our experiments show that the system provides stable and accurate registration, and runs at interactive frame rates on a Nokia N95 mobile phone.

Keywords: Multimedia information systems; artificial, augmented, and virtual realities; image processing and computer vision; scene analysis; tracking

GUSignal: An Informatics Tool to Analyze Glucuronidase Gene Expression in Arabidopsis Thaliana Roots

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2023, vol. 20, no. 2, pp. 1073-1080

Authors: Herrera-Romero, Bryan; Almeida-Galarraga, Diego; Salum, Graciela M.; Villalba-Meneses, Fernando; Gudino-Gomezjurado, Marco Esteban. Affiliation: Univ Invest Tecnol Expt Yachay Escuela Ciencias Biol & Ingn Urcuqui 100115 Ecuador

The uidA gene encodes a glucuronidase (GUS) enzyme that has been used as a biotechnological tool in recent years. When the uidA gene is fused to a gene's promoter region, it is possible to evaluate the activity of that promoter in response to a stimulus. Arabidopsis thaliana has served as the biological platform to elucidate molecular and regulatory signaling responses in plants. Transgenic lines of A. thaliana, tagged with the uidA gene, have helped explain how plants modify their hormonal pathways depending on the environmental conditions. The information extracted from microscopic images of these transgenic plants is often qualitative and in many publications is not subjected to quantification. In this paper we report the development of an informatics tool based on computer vision for processing and analyzing digital images in order to analyze the expression of the GUS signal in A. thaliana roots, which is strongly correlated with the intensity of the grayscale images. This means that the presence of the GUS-induced color indicates where the gene has been actively expressed, as our statistical analysis demonstrated after treatment of A. thaliana DR5::GUS with naphthaleneacetic acid (0.0001 mM and 1 mM). GUSignal is a free informatics tool that aims to make image analysis fast and systematic, since it executes specific and ordered instructions, and to offer a segmented analysis by areas or regions of interest, providing quantitative results of the image intensity levels.

Keywords: Arabidopsis thaliana; digitalization and image capture; image processing and computer vision; image processing software
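
The region-of-interest quantification described in the abstract above can be pictured with a minimal Python sketch. It assumes the image is a plain list of rows of grayscale values (0 to 255) and that a region is an axis-aligned rectangle. The function name `roi_mean_intensity` and the sample image are hypothetical and do not reflect GUSignal's actual interface.

```python
from typing import List, Sequence


def roi_mean_intensity(
    image: Sequence[Sequence[float]],
    top: int,
    left: int,
    height: int,
    width: int,
) -> float:
    """Mean grayscale value inside a rectangular region of interest."""
    rows = image[top:top + height]
    pixels: List[float] = [p for row in rows for p in row[left:left + width]]
    if not pixels:
        raise ValueError("region of interest is empty")
    return sum(pixels) / len(pixels)


# Hypothetical 2 x 4 grayscale image: dark left half, bright right half.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]
print(roi_mean_intensity(image, top=0, left=0, height=2, width=2))  # 0.0
print(roi_mean_intensity(image, top=0, left=2, height=2, width=2))  # 255.0
```

Comparing such per-region means across treatments is one simple way to turn a qualitative staining image into numbers that can be analyzed statistically.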

An investigation of the modified direction feature for cursive character recognition

PATTERN RECOGNITION, 2007, vol. 40, no. 2, pp. 376-388

Authors: Blumenstein, Michael; Liu, Xin Yu; Verma, Brijesh. Affiliations: Griffith Univ Sch Informat & Commun Technol Gold Coast Mail Centre Qld 9726 Australia; Univ Cent Queensland Sch Informat Technol Rockhampton Qld 4702 Australia

This paper describes and analyses the performance of a novel feature extraction technique for the recognition of segmented/cursive characters that may be used in the context of a segmentation-based handwritten word recognition system. The modified direction feature (MDF) extraction technique builds upon the previously proposed direction feature (DF) technique, which extracts direction information from the structure of character contours. This principle was extended so that the direction information is integrated with a technique for detecting transitions between background and foreground pixels in the character image. In order to improve on the DF extraction technique, a number of modifications were undertaken. To describe the character contour more effectively, the direction number determination technique was redesigned. Also, an additional global feature was introduced to improve the recognition accuracy for those characters that were most frequently confused with patterns of similar appearance. MDF was tested using a neural network-based classifier and compared to the DF and transition feature (TF) extraction techniques. MDF outperformed both DF and TF techniques using a benchmark dataset and compared favourably with the top results in the literature. A recognition accuracy of above 89% is reported on characters from the CEDAR dataset. (c) 2006 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.

Keywords: handwritten character recognition; pattern recognition; image processing and computer vision; neural networks
