ISBN (print): 0769521584
We investigate the use of the L-infinity cost function in geometric vision problems. This cost function measures the maximum of a set of model-fitting errors, rather than the sum of squares (the L-2 cost function) that is commonly used in least-squares fitting. We investigate its use in two problems: multiview triangulation and motion recovery from omnidirectional cameras, though the results may also apply to other related problems. It is shown that for these problems the L-infinity cost function is significantly simpler than the L-2 cost. In particular, L-infinity minimization involves finding the minimum of a cost function with a single local (and hence global) minimum on a convex parameter domain. The problem may be recast as a constrained minimization problem and solved using commonly available software. The optimal solution was reliably achieved on problems of small dimension.
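To make the difference between the two cost functions concrete, here is a minimal sketch on a toy one-dimensional fitting problem. It is not the triangulation or motion-recovery formulation from the paper; the function names and the sample observations are assumptions chosen for illustration. For a single scalar estimate, the L-2 cost is minimized by the mean of the observations and the L-infinity cost by their midrange.

```python
def residuals(estimate, observations):
    """Signed model-fitting errors for a scalar estimate (toy model)."""
    return [obs - estimate for obs in observations]


def l2_cost(errors):
    """Sum-of-squares (L-2) cost."""
    return sum(e * e for e in errors)


def linf_cost(errors):
    """Maximum absolute error (L-infinity) cost."""
    return max(abs(e) for e in errors)


# Hypothetical observations, chosen only for illustration.
observations = [1.0, 2.0, 6.0]

# The mean minimizes the L-2 cost for this toy model.
mean = sum(observations) / len(observations)

# The midrange minimizes the L-infinity cost for this toy model.
midrange = (min(observations) + max(observations)) / 2

for name, estimate in [("mean", mean), ("midrange", midrange)]:
    errs = residuals(estimate, observations)
    print(name, estimate, "L2:", l2_cost(errs), "Linf:", linf_cost(errs))
```

In this toy case each estimate is best under its own cost and worse under the other, which is the trade-off between penalizing the total error and penalizing the single worst error.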
ISBN (print): 9781665448994
Shadow removal is an important computer vision task aiming at the detection and successful removal of the shadow produced by an occluded light source and a photorealistic restoration of the image contents. Decades of research produced a multitude of hand-crafted restoration techniques and, more recently, learned solutions from shadowed and shadow-free training image pairs. In this work, we propose a single-image shadow removal solution via self-supervised learning by using a conditioned mask. We rely on self-supervision and jointly learn deep models to remove and add shadows to images. We derive two variants for learning from paired images and unpaired images, respectively. Our validation on the recently introduced ISTD and USR datasets demonstrates large quantitative and qualitative improvements over the state of the art for both paired and unpaired learning settings.
ISBN (digital): 9781538661000
ISBN (print): 9781538661000
We introduce the first benchmark for a new problem, recognizing human action adverbs (HAA): "Adverbs Describing Human Actions" (ADHA). We demonstrate some key features of ADHA: a semantically complete set of adverbs describing human actions, a set of common, describable human actions, and an exhaustive labelling of simultaneously emerging actions in each video. We conduct an in-depth analysis of applying currently effective action recognition and image captioning models to adverb recognition, and the results reveal that such methods are unsatisfactory. Furthermore, we propose a novel three-stream hybrid model to tackle the HAA problem, which achieves better performance and relatively promising results.
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
Recognizing risky human behavior in driving is an important visual recognition problem. In this paper, we propose a multi-view temporal action localization system based on grayscale video to achieve action recognition in naturalistic driving. Specifically, we adopt Swin Transformer as the feature extractor, and a single framework to detect boundaries and classes at the same time. We also improve multiple loss functions to explicitly constrain the embedded feature distributions. Our proposed framework achieves an overall F1-score of 0.3154 on the A2 dataset.
ISBN (print): 0818672587
Foveated vision and two-mode tracking, as inspired by the human oculomotor system, are often used in active vision systems. The purpose of this paper is to provide answers to the following basic questions which arise from implementations. First, is it beneficial to have foveated vision and what is the optimal size of the foveal window? Second, is there a need for two control mechanisms (smooth pursuit and saccade) for improved performance and how can one efficiently switch between them? In order to do so, a setup is proposed in which these strategies can be evaluated in a systematic manner. It is shown that the fovea appears as a compromise between the tightness of the tracking specifications and computational constraints. Introducing a model for the latter and postulating some a priori knowledge of the target behavior, it is possible to compute the size of the fovea in an optimal way. As a by-product, 'smooth pursuit' can be defined in a natural way, and the use of a two-mode tracking scheme is justified. The second mode, i.e., 'saccadic control', aims at re-centering the target on the fovea so that the smooth pursuit controller can continue to operate. It is shown that a control strategy can indeed be defined so that this objective can be met under appropriate operating conditions.
ISBN (print): 0818658274
In this paper we present a practical pattern recognition system that is invariant with respect to translation, scale and rotation of objects. The system is also insensitive to large variations of the threshold used. Zernike moments are used as feature vectors, and we compare them with Hu's seven moment invariants. For a practical machine vision system, three key issues are discussed: pattern normalization, fast computation of Zernike moments, and classification using the k-NN rule. In testing, the system recognizes a set of 62 alphanumeric machine-printed characters with different sizes, at arbitrary orientations, and with different thresholds, where the size of the characters varies from 10 × 10 to 512 × 512 pixels.
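The classification step named above, the k-NN rule, can be sketched in a few lines. This is a generic illustration, not the system described in the abstract: the two-dimensional feature vectors stand in for moment features, and the function name, labels, and choice of Euclidean distance are all assumptions.

```python
import math
from collections import Counter


def knn_classify(query, samples, k=3):
    """Label `query` by majority vote among its k nearest samples.

    `samples` is a list of (feature_vector, label) pairs; distance is
    Euclidean, which is an assumption made for this sketch.
    """
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors standing in for moment features.
samples = [
    ((0.0, 0.0), "A"),
    ((0.1, 0.2), "A"),
    ((1.0, 1.0), "B"),
    ((0.9, 1.1), "B"),
    ((1.2, 0.8), "B"),
]

print(knn_classify((0.05, 0.1), samples))
print(knn_classify((1.0, 0.9), samples))
```

Because the rule depends only on distances between feature vectors, any invariance to translation, scale, or rotation has to come from the features themselves.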
ISBN (print): 9780769549903
Symmetry is a pervasive phenomenon presenting itself in all forms and scales in natural and man-made environments. Its detection plays an essential role at all levels of human as well as machine perception. The recent resurging interest in computational symmetry for computer vision and computer graphics applications has motivated us to conduct a US NSF funded symmetry detection algorithm competition as a workshop affiliated with the Computer Vision and Pattern Recognition (CVPR) conference, 2013. This competition sets a more complete benchmark for computer vision symmetry detection algorithms. In this report we explain the evaluation metric and the automatic execution of the evaluation workflow. We also present and analyze the algorithms submitted, and show their results on three test sets of real-world images depicting reflection, rotation and translation symmetries, respectively. This competition establishes a performance baseline for future work on symmetry detection.
ISBN (print): 9781665448994
Image deblurring and super-resolution (SR) are computer vision tasks aiming to restore image detail and spatial scale, respectively. Only a few recent works address the joint task, as conventional methods deal with SR or deblurring separately. To address this issue, we focus on designing a novel Pixel-Guided dual-branch attention network (PDAN) that handles both tasks jointly. We then propose a novel loss function that better focuses on large- and medium-range errors. Extensive experiments demonstrate that the proposed PDAN with the novel loss function not only generates remarkably clear HR images but also achieves compelling results for joint image deblurring and SR tasks. In addition, our method achieves second place in the NTIRE 2021 Challenge on track 1 of the Image Deblurring Challenge.
ISBN (print): 0780342364
We investigate the application of Support Vector Machines (SVMs) in computer vision. SVM is a learning technique developed by V. Vapnik and his team (AT&T Bell Labs) that can be seen as a new method for training polynomial, neural network, or Radial Basis Function classifiers. The decision surfaces are found by solving a linearly constrained quadratic programming problem. This optimization problem is challenging because the quadratic form is completely dense and the memory requirements grow with the square of the number of data points. We present a decomposition algorithm that guarantees global optimality, and can be used to train SVMs over very large data sets. The main idea behind the decomposition is the iterative solution of sub-problems and the evaluation of optimality conditions, which are used both to generate improved iterative values and to establish the stopping criteria for the algorithm. We present experimental results of our implementation of SVM, and demonstrate the feasibility of our approach on a face detection problem that involves a data set of 50,000 data points.
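The quadratic growth in memory mentioned above is easy to quantify with back-of-the-envelope arithmetic. The sketch below assumes the dense quadratic form is stored as a full n × n matrix of 8-byte floating-point values; both the storage layout and the entry size are assumptions for illustration, not details taken from the paper.

```python
def dense_matrix_bytes(n_points, bytes_per_entry=8):
    """Bytes needed to store a dense n x n matrix (assumed 8-byte entries)."""
    return n_points * n_points * bytes_per_entry


n = 50_000  # data set size quoted in the abstract
total = dense_matrix_bytes(n)
print(f"{n} points -> {total / 1e9:.1f} GB")

# Doubling the number of points quadruples the storage.
print(dense_matrix_bytes(2 * n) // total)
```

Under these assumptions the full matrix for 50,000 points would need about 20 GB, which suggests why solving a sequence of smaller sub-problems is attractive.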
ISBN (print): 9781665448994
While makeup virtual try-on is now widespread, parametrizing a computer graphics rendering engine for synthesizing images of a given cosmetics product remains a challenging task. In this paper, we introduce an inverse computer graphics method for automatic makeup synthesis from a reference image, by learning a model that maps an example portrait image with makeup to the space of rendering parameters. This method can be used by artists to automatically create realistic virtual cosmetics image samples, or by consumers to virtually try on makeup extracted from their favorite reference image.