Local descriptors are the ground layer of feature-based recognition systems for still images and video. We propose a new framework for the design of local descriptors and their evaluation. This framework is based on decomposing descriptors into three levels: primitive extraction, primitive coding and code aggregation. With this framework, we are able to explain most of the popular descriptors in the literature, such as HOG, HOF or SURF. The framework provides an efficient and rigorous approach for the evaluation of local descriptors, and allows us to uncover the best parameters for each descriptor family. Moreover, we are able to extend usual descriptors by changing the code aggregation or adding new primitive coding methods. The experiments are carried out on image (VOC 2007) and video datasets (KTH, Hollywood2, UCF11 and UCF101), and achieve performance equal to or better than the literature. (C) 2014 Elsevier Ltd. All rights reserved.
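To make the three-level decomposition concrete, the following minimal sketch builds a HOG-like descriptor in NumPy: gradients are the extracted primitives, magnitude-weighted orientation binning is the primitive coding, and sum-pooling over spatial cells is the code aggregation. The function names, hard-assignment coding and absence of block normalization are illustrative simplifications, not the paper's reference implementation.

```python
import numpy as np

def extract_primitives(img):
    """Primitive extraction: per-pixel gradient magnitude and orientation."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ori = np.arctan2(gy, gx) % np.pi            # unsigned orientation in [0, pi)
    return mag, ori

def code_primitives(mag, ori, n_bins=9):
    """Primitive coding: hard-assign each orientation to a histogram bin."""
    bins = np.minimum((ori / np.pi * n_bins).astype(int), n_bins - 1)
    codes = np.zeros(mag.shape + (n_bins,))
    rows, cols = np.indices(mag.shape)
    codes[rows, cols, bins] = mag               # magnitude-weighted one-hot code
    return codes

def aggregate_codes(codes, cell=8):
    """Code aggregation: sum-pool the codes over non-overlapping spatial cells."""
    h, w, n_bins = codes.shape
    h, w = h - h % cell, w - w % cell
    blocks = codes[:h, :w].reshape(h // cell, cell, w // cell, cell, n_bins)
    return blocks.sum(axis=(1, 3)).reshape(-1)  # flattened descriptor

# Usage: descriptor = aggregate_codes(code_primitives(*extract_primitives(patch)))
```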
Many computer vision problems are formulated as the optimization of a cost function. This approach faces two main challenges: designing a cost function with a local optimum at an acceptable solution, and developing an efficient numerical method to search for this optimum. While designing such functions is feasible in the noiseless case, the stability and location of local optima are mostly unknown under noise, occlusion, or missing data. In practice, this can result in undesirable local optima or in no local optimum in the expected place. On the other hand, numerical optimization algorithms in high-dimensional spaces are typically local and often rely on expensive first- or second-order information to guide the search. To overcome these limitations, we propose Discriminative Optimization (DO), a method that learns search directions from data without the need for a cost function. DO explicitly learns a sequence of updates in the search space that leads to stationary points corresponding to the desired solutions. We provide a formal analysis of DO and illustrate its benefits on the problems of 3D registration, camera pose estimation, and image denoising. We show that DO outperformed or matched state-of-the-art algorithms in terms of accuracy, robustness, and computational efficiency.
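The core mechanism can be sketched as a sequence of learned linear update maps applied to hand-crafted features of the current estimate. The sketch below is a hedged illustration of that idea using ridge regression toward ground-truth parameters; the feature function, regularization and training details are assumptions, not the paper's exact formulation.

```python
import numpy as np

def train_do(feature_fn, X0, X_true, n_steps=10, lam=1e-3):
    """Learn a sequence of linear update maps D_k so that
       x <- x - D_k @ feature_fn(x) moves samples toward their ground truth.
       X0, X_true: (N, d_param) initial and target parameter vectors."""
    maps, X = [], X0.copy()
    for _ in range(n_steps):
        H = np.stack([feature_fn(x) for x in X])            # (N, d_feat)
        R = X - X_true                                       # residuals (N, d_param)
        # ridge regression: D minimizes ||H @ D.T - R||^2 + lam * ||D||^2
        D = np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ R).T
        maps.append(D)
        X = X - H @ D.T                                      # apply the learned update
    return maps

def run_do(feature_fn, x, maps):
    """Inference: apply the learned update maps in sequence (no cost function)."""
    for D in maps:
        x = x - D @ feature_fn(x)
    return x
```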
Confluence is a novel non-Intersection-over-Union (IoU) alternative to Non-Maxima Suppression (NMS) for bounding box post-processing in object detection. It overcomes the inherent limitations of IoU-based NMS variants and provides a more stable, consistent predictor of bounding box clustering by using a proximity metric inspired by the normalized Manhattan distance to represent bounding box clustering. Unlike Greedy and Soft NMS, it does not rely solely on classification confidence scores to select optimal bounding boxes; instead, it selects the box that is closest to every other box within a given cluster and removes highly confluent neighboring boxes. Confluence is experimentally validated on the MS COCO and CrowdHuman benchmarks, improving Average Precision by 0.2--2.7% and 1--3.8%, respectively, and Average Recall by 1.3--9.3% and 2.4--7.3%, when compared against Greedy and Soft-NMS variants. Quantitative results are supported by extensive qualitative analysis, and threshold sensitivity analysis experiments support the conclusion that Confluence is more robust than NMS variants. Confluence represents a paradigm shift in bounding box processing, with the potential to replace IoU in bounding box regression processes.
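As a hedged illustration of the kind of proximity measure described, the sketch below computes a min-max-normalized Manhattan distance between two corner-parameterized boxes; the exact normalization, thresholding and confidence weighting used by Confluence may differ.

```python
import numpy as np

def confluence_proximity(box_a, box_b):
    """Normalized Manhattan-distance proximity between two boxes given as
       (x1, y1, x2, y2). Coordinates are min-max normalized over the pair so
       the measure is scale-invariant; this is a sketch, not the reference code."""
    pts = np.array([box_a, box_b], dtype=float)              # (2, 4)
    xs, ys = pts[:, [0, 2]], pts[:, [1, 3]]
    xs = (xs - xs.min()) / max(xs.max() - xs.min(), 1e-9)    # normalize x corners
    ys = (ys - ys.min()) / max(ys.max() - ys.min(), 1e-9)    # normalize y corners
    return np.abs(xs[0] - xs[1]).sum() + np.abs(ys[0] - ys[1]).sum()

# Boxes whose proximity falls below a chosen threshold are treated as one
# cluster; the retained box is the one closest (most confluent) to the rest,
# optionally weighted by its classification confidence.
```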
In many 3D object-detection and pose-estimation problems, runtime performance is of critical importance. However, there usually is time to train the system, which we will show to be very useful. Assuming that several registered images of the target object are available, we developed a keypoint-based approach that is effective in this context by formulating wide-baseline matching of keypoints extracted from the input images to those found in the model images as a classification problem. This shifts much of the computational burden to a training phase, without sacrificing recognition performance. As a result, the algorithm is robust, accurate, and fast enough for frame-rate performance. This reduction in runtime computational complexity is our first contribution. Our second contribution is to show that, in this context, a simple and fast keypoint detector suffices to support detection and tracking even under large perspective and scale variations. While earlier methods require a detector that produces very repeatable results in general, which usually is very time-consuming, we simply find the most repeatable keypoints for the specific target object during the training phase. We have incorporated these ideas into a real-time system that detects planar, nonplanar, and deformable objects. It then estimates the pose of the rigid ones and the deformations of the others.
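A hedged sketch of the classification view of matching is given below: each model keypoint becomes a class, training patches come from synthetically warped views collected offline, and at runtime a detected patch is simply classified. The random forest over raw intensities and the 16-pixel patch size are illustrative stand-ins for the paper's classifier.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_keypoint_classifier(warped_views, keypoints, patch=16):
    """Each model keypoint is a class; samples are patches around its location
       in many warped views of the model image (offline training phase)."""
    X, y = [], []
    h = patch // 2
    for img, kps in zip(warped_views, keypoints):   # kps: list of (class_id, x, y)
        for cls, x, yy in kps:
            p = img[yy - h:yy + h, x - h:x + h]
            if p.shape == (patch, patch):
                X.append(p.ravel())
                y.append(cls)
    return RandomForestClassifier(n_estimators=50).fit(np.array(X), np.array(y))

def match_keypoint(clf, img, x, y, patch=16):
    """Runtime: classify a detected keypoint patch into one of the model keypoints."""
    h = patch // 2
    p = img[y - h:y + h, x - h:x + h].ravel()[None, :]
    return clf.predict(p)[0], clf.predict_proba(p).max()
```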
Wide field-of-view (FOV) cameras, which capture a larger scene area than narrow-FOV cameras, are used in many applications including 3D reconstruction, autonomous driving, and video surveillance. However, wide-angle images contain distortions that violate the assumptions underlying pinhole camera models, resulting in object distortion, difficulties in estimating scene distance, area, and direction, and preventing the use of off-the-shelf deep models trained on undistorted images for downstream computer vision tasks. Image rectification, which aims to correct these distortions, can solve these problems. In this paper, we comprehensively survey progress in wide-angle image rectification, from transformation models to rectification methods. Specifically, we first present a detailed description and discussion of the camera models used in different approaches. Then, we summarize several distortion models, including radial distortion and projection distortion. Next, we review both traditional geometry-based image rectification methods and deep learning-based methods, where the former formulate distortion parameter estimation as an optimization problem and the latter treat it as a regression problem by leveraging the power of deep neural networks. We evaluate the performance of state-of-the-art methods on public datasets and show that although both kinds of methods can achieve good results, these methods only work well for specific camera models and distortion types. We also provide a strong baseline model and carry out an empirical study of different distortion models on synthetic datasets and real-world wide-angle images. Finally, we discuss several potential research directions that are expected to further advance this area in the future.
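As a small worked example of one of the distortion models discussed, the sketch below applies and inverts a two-coefficient polynomial radial distortion in normalized coordinates; fisheye and division models, also covered by the survey, require different formulas.

```python
import numpy as np

def distort(xy, k1, k2):
    """Polynomial radial distortion in normalized image coordinates:
       x_d = x_u * (1 + k1*r^2 + k2*r^4)."""
    r2 = np.sum(xy ** 2, axis=-1, keepdims=True)
    return xy * (1.0 + k1 * r2 + k2 * r2 ** 2)

def undistort(xy_d, k1, k2, n_iter=20):
    """Invert the distortion by fixed-point iteration: divide the distorted point
       by the distortion factor evaluated at the current undistorted estimate.
       Converges for moderate distortion; a sketch, not a production solver."""
    xy_u = xy_d.copy()
    for _ in range(n_iter):
        r2 = np.sum(xy_u ** 2, axis=-1, keepdims=True)
        xy_u = xy_d / (1.0 + k1 * r2 + k2 * r2 ** 2)
    return xy_u

# Usage: points are in normalized coordinates (principal point subtracted,
# divided by focal length), e.g. undistort(np.array([[0.3, -0.2]]), 0.1, 0.01)
```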
Experiments on public datasets demonstrate the effectiveness of this method, which reaches human-level performance and outperforms current state-of-the-art methods with 92.8% on the extended Cohn-Kanade (CK+) dataset and 87.0% on FERPLUS.
“A locally-processed light-weight deep neural network for detecting colorectal polyps in wireless capsule endoscopes” proposes a light-weight DNN model that has the potential to run locally in the WCE [2].
[...]only images indicating potential diseases are transmitted, saving energy on data transmission.
Background subtraction is an important video processing task that aims to separate the foreground from a video in order to make post-processing tasks efficient.
[...]several different techniques have been proposed for this task, but most of them do not perform well on videos with variations in both the foreground and the background.
In “Background subtraction in videos using LRMF and CWM algorithm,” a novel background subtraction technique is proposed that progressively fits a subspace to the background, obtained from L1 low-rank matrix factorization using the cyclic weighted median algorithm, while the foreground is modeled with a mixture-of-Gaussians noise distribution [3].
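To illustrate why weighted medians arise in L1 low-rank fitting, the hedged sketch below fits a rank-1 background model under an L1 loss by cyclic weighted-median updates; the actual method in [3] uses a richer noise model and a progressively fitted subspace rather than this simplified rank-1 version.

```python
import numpy as np

def weighted_median(values, weights):
    """Return the point m minimizing sum_j weights[j] * |values[j] - m|."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cdf = np.cumsum(w)
    return v[np.searchsorted(cdf, 0.5 * cdf[-1])]

def l1_rank1_background(X, n_iter=10):
    """Fit a rank-1 background X ~ u v^T under an L1 loss with cyclic
       weighted-median updates. X is (pixels x frames); the residual
       X - u v^T is taken as the foreground."""
    n, m = X.shape
    u = np.median(X, axis=1)                  # initial background appearance
    v = np.ones(m)
    for _ in range(n_iter):
        for j in range(m):                    # update temporal coefficients
            mask = np.abs(u) > 1e-9
            v[j] = weighted_median(X[mask, j] / u[mask], np.abs(u[mask]))
        for i in range(n):                    # update spatial background
            mask = np.abs(v) > 1e-9
            u[i] = weighted_median(X[i, mask] / v[mask], np.abs(v[mask]))
    background = np.outer(u, v)
    return background, X - background         # background, foreground residual
```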
Preventing accidental injuries to toddlers requires thorough, consistent supervision, but this isn't always practical. A proposed vision-based system detects three fall risk factors in the home environment to help caregivers supervise nearby toddlers when they can't give them continuous attention. The crucial technical challenge is to differentiate a human from other foreground objects in the images. Unlike previous systems, this one uses multiple dynamic motion cues for human detection in addition to cues related to human appearance.
Many different deep networks have been used to approximate, accelerate or improve traditional image operators. Many of these traditional operators contain parameters that need to be tweaked to obtain satisfactory results; we refer to them as "parameterized image operators". However, most existing deep networks trained for these operators are designed for only one specific parameter configuration, which does not meet the needs of real scenarios that usually require flexible parameter settings. To overcome this limitation, we propose a new decoupled learning algorithm that learns from the operator parameters to dynamically adjust the weights of a deep network for image operators, denoted as the base network. The learned algorithm takes the form of another network, namely the weight learning network, which can be jointly trained end-to-end with the base network. Experiments demonstrate that the proposed framework can be successfully applied to many traditional parameterized image operators. To accelerate parameter tuning in practical scenarios, the proposed framework can be further extended to dynamically change the weights of only a single layer of the base network while sharing most of the computation cost. We demonstrate that this cheap parameter-tuning extension of the proposed decoupled learning framework even outperforms state-of-the-art alternative approaches.
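A minimal sketch of the decoupled idea, assuming a PyTorch-style hypernetwork: a small weight-learning MLP maps the operator parameter to the kernel of a convolution in the base network, and the two are trained jointly. The layer sizes, class name and single-layer scope are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightLearningConv(nn.Module):
    """A conv layer whose kernel is produced by a small 'weight learning' MLP
       conditioned on the operator parameter(s) gamma, so one model covers a
       whole family of parameter settings."""
    def __init__(self, in_ch, out_ch, k=3, n_params=1):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
        self.weight_net = nn.Sequential(
            nn.Linear(n_params, 64), nn.ReLU(),
            nn.Linear(64, out_ch * in_ch * k * k),
        )

    def forward(self, x, gamma):
        # gamma: (n_params,) tensor holding the operator parameters for this input
        w = self.weight_net(gamma).view(self.out_ch, self.in_ch, self.k, self.k)
        return F.conv2d(x, w, padding=self.k // 2)

# Usage (img_batch is a hypothetical (N, 3, H, W) tensor):
# layer = WeightLearningConv(3, 16); y = layer(img_batch, torch.tensor([0.5]))
```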
We propose to model the statistics of natural images with the large class of stochastic processes called Infinitely Divisible Cascades (IDCs). IDCs were first introduced in one dimension to provide multifractal time series to model the so-called intermittency phenomenon in hydrodynamical turbulence. We have extended the definition of scalar IDCs from one to N dimensions and commented on the relevance of such a model to fully developed turbulence in [1]. In this paper, we focus on the particular 2D case. IDCs appear to be good candidates to model the statistics of natural images. They share most of the usual properties of natural images and appear to be consistent with several independent theoretical and experimental approaches in the literature. We point out the interest of IDCs for applications to procedural texture synthesis.
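As a rough illustration of cascade-based texture synthesis, the sketch below generates a 2D texture with a discrete dyadic log-normal multiplicative cascade. This is a simpler relative of IDCs, not the continuous IDC construction from the paper; parameter names and values are illustrative.

```python
import numpy as np

def lognormal_cascade_2d(n_levels=8, sigma=0.2, seed=0):
    """Discrete dyadic multiplicative cascade: at each level the field is
       upsampled by 2 and multiplied by i.i.d. mean-one log-normal factors,
       producing a multifractal texture; sigma controls the intermittency."""
    rng = np.random.default_rng(seed)
    field = np.ones((1, 1))
    for _ in range(n_levels):
        field = np.kron(field, np.ones((2, 2)))             # upsample by 2 in each axis
        mult = rng.lognormal(mean=-0.5 * sigma**2, sigma=sigma, size=field.shape)
        field *= mult                                        # mean-one multipliers
    return field                                             # (2**n_levels)^2 texture

# Usage: texture = lognormal_cascade_2d(n_levels=8, sigma=0.3)
```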