ISBN: (print) 076952334X
Active appearance models (AAMs) provide a framework for modeling the joint shape and texture of an image. An AAM is a compact representation of both factors in a conditionally linear model. However, the standard AAM framework does not handle images that have missing features, nor does it allow certain structures in the image to be modified while neighboring ones are left undeformed. We introduce the layered active appearance model (LAAM), which allows for missing features, occlusion, and substantial spatial rearrangement of features, and which provides a more general representation that extends the applicability of the active appearance model.
A novel framework is developed for automatic behaviour profiling and abnormality sampling/detection without any manual labelling of the training dataset. Natural groupings of behaviour patterns are discovered through unsupervised model selection and feature selection on the eigenvectors of a normalised affinity matrix. Our experiments demonstrate that a behaviour model trained on an unlabelled dataset is superior to models trained on the same dataset with labels in detecting abnormality in unseen video.
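As a rough illustration of the spectral step mentioned above, the following sketch computes the leading eigenvectors of a normalised affinity matrix. The Gaussian similarity, the symmetric normalisation, and the names `spectral_embedding`, `sigma`, and `k` are assumptions made for this example; the abstract does not specify them, and the model selection and feature selection stages are not shown.

```python
import numpy as np

def spectral_embedding(features, sigma=1.0, k=2):
    """Return the k largest eigenvalues/eigenvectors of a normalised affinity matrix."""
    features = np.asarray(features, dtype=float)
    # Pairwise squared Euclidean distances between feature vectors.
    diff = features[:, None, :] - features[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Gaussian affinity (an assumed choice of similarity measure).
    affinity = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # Symmetric normalisation D^{-1/2} A D^{-1/2}.
    inv_sqrt_deg = 1.0 / np.sqrt(affinity.sum(axis=1))
    normalised = affinity * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]
    # eigh returns eigenvalues in ascending order; keep the k largest.
    eigvals, eigvecs = np.linalg.eigh(normalised)
    order = np.argsort(eigvals)[::-1][:k]
    return eigvals[order], eigvecs[:, order]
```

For two well-separated groups of points, the sign pattern of the second eigenvector typically splits the data into those groups, which is why clustering is often performed on these eigenvectors rather than on the raw features.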
Pictures taken by a rotating camera cover the viewing sphere surrounding the center of rotation. Once a set of images has been registered and blended on the sphere, what remains to be done to obtain a flat panorama is to project the spherical image onto a picture plane. This step is unfortunately not obvious: the surface of the sphere cannot be flattened onto a page without some form of distortion. The objective of this paper is to discuss the difficulties and opportunities connected with the projection from the viewing sphere to the image plane. We first explore a number of alternatives to the commonly used linear perspective projection. These are 'global' projections and do not depend on image content. We then show that multiple projections may coexist successfully in the same mosaic: these projections are chosen locally and depend on what is present in the pictures. We show that such multi-view projections can produce more compelling results than the global projections.
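To make the idea of a 'global' projection concrete, here is a minimal sketch of three standard sphere-to-plane mappings, each taking a viewing direction given as longitude and latitude in radians, with the view centered at (0, 0). The abstract does not name the alternatives it studies, so the stereographic and central cylindrical projections are chosen here purely for illustration, and the function names are hypothetical.

```python
import math

def perspective(lon, lat):
    """Linear perspective (gnomonic) projection; valid only for |lon|, |lat| < pi/2."""
    return math.tan(lon), math.tan(lat) / math.cos(lon)

def stereographic(lon, lat):
    """Stereographic projection centered on the viewing direction (0, 0)."""
    k = 2.0 / (1.0 + math.cos(lat) * math.cos(lon))
    return k * math.cos(lat) * math.sin(lon), k * math.sin(lat)

def cylindrical(lon, lat):
    """Central cylindrical projection: longitude maps linearly to x."""
    return lon, math.tan(lat)

# Horizontal stretch of each projection at 80 degrees from the view center.
lon = math.radians(80.0)
for project in (perspective, stereographic, cylindrical):
    x, _ = project(lon, 0.0)
    print(f"{project.__name__:>13}: x = {x:.3f}")
```

The comparison shows the familiar weakness of linear perspective: it stretches content far more strongly than the other two mappings as the field of view approaches 180 degrees.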
Image fusion, as a way of combining multiple image signals into a single fused image, has in recent years been extensively researched for a variety of multisensor applications. Choosing an optimal fusion approach for each application from the plethora of available algorithms, however, remains a largely open issue. The small number of metrics proposed so far provide only a rough numerical estimate of fusion performance, with limited understanding of the relative merits of different fusion schemes. This paper proposes a method for comprehensive, objective image fusion performance characterisation using a fusion evaluation framework based on gradient information representation. The method provides an in-depth analysis of fusion performance by quantifying the information contribution of each sensor, fusion gain, fusion information loss, and fusion artifacts (artificial information created). It is demonstrated on the evaluation of an extensive dataset of multisensor images fused with a wide range of established image fusion algorithms. The results demonstrate and quantify a number of well-known issues concerning the performance of these schemes and provide useful insight into a number of more subtle yet important fusion performance effects not immediately accessible to an observer.
In many vision problems, the appearances of the observed images, e.g. human facial images, are often influenced by multiple underlying factors. In this paper, a kernel-based factorization framework is proposed to analyze a multifactor dataset. Specifically, we perform N-mode Singular Value Decomposition (N-mode SVD) in a higher-dimensional feature space instead of the input space by using kernel approaches. Given an input sample, its specific underlying factors, all of which may be absent from the training set, can be extracted and translated from one sample to another by using kernel-based 'translation'. Therefore, our framework is suitable for new image synthesis and underlying factor recognition. We demonstrate the capabilities of our framework on ensembles of facial images subject to different person identities, viewpoints, and illuminations, obtaining high-quality synthetic faces and high face recognition accuracy.
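For readers unfamiliar with N-mode SVD, the sketch below shows the plain input-space version: each mode of a data tensor is unfolded into a matrix, an SVD yields an orthonormal factor matrix per mode, and the core tensor is obtained by projecting onto those factors. This is not the kernel-based variant the abstract proposes; the helper names `unfold`, `mode_product`, and `n_mode_svd` are hypothetical, and the tensor layout in the usage example is an assumption.

```python
import numpy as np

def unfold(tensor, mode):
    """Flatten a tensor into a matrix whose rows index the given mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply a tensor by a matrix along one mode."""
    moved = np.moveaxis(tensor, mode, -1)   # bring the chosen mode to the last axis
    result = moved @ matrix.T               # contract that axis with the matrix
    return np.moveaxis(result, -1, mode)    # restore the original axis order

def n_mode_svd(tensor):
    """Return the core tensor and one orthonormal factor matrix per mode."""
    factors = []
    for mode in range(tensor.ndim):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors

# Hypothetical layout: (identities, viewpoints, pixels).
rng = np.random.default_rng(0)
data = rng.standard_normal((3, 4, 5))
core, factors = n_mode_svd(data)
rebuilt = core
for mode, u in enumerate(factors):
    rebuilt = mode_product(rebuilt, u, mode)
print(np.allclose(rebuilt, data))
```

Each factor matrix captures the variation along one underlying factor, which is what makes it possible to manipulate one factor while holding the others fixed.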
We present a two-layer hierarchical formulation to exploit different levels of contextual information in images for robust classification. Each layer is modeled as a conditional field that allows one to capture arbitrary observation-dependent label interactions. The proposed framework has two main advantages. First, it encodes both short-range interactions (e.g., pixelwise label smoothing) and long-range interactions (e.g., relative configurations of objects or regions) in a tractable manner. Second, the formulation is general enough to be applied to different domains ranging from pixelwise image labeling to contextual object detection. The parameters of the model are learned using a sequential maximum-likelihood approximation. The benefits of the proposed framework are demonstrated on four different datasets and comparison results are presented.
We present a 2D model-based approach to localizing the human body in images viewed from arbitrary and unknown angles. The central component is a statistical shape representation of the nonrigid and articulated body contours, where a nonlinear deformation is decomposed based on the concept of parts. Several image cues are combined to relate the body configuration to the observed image, with self-occlusion explicitly treated. To accommodate large viewpoint changes, a mixture of view-dependent models is employed. Inference is done by direct sampling of the posterior mixture, using Sequential Monte Carlo (SMC) simulation enhanced with annealing and kernel moves. The fitting method is independent of the number of mixture components and does not require the preselection of a "correct" viewpoint. The models were trained on a large number of interactively labeled gait images. Preliminary tests demonstrated the feasibility of the proposed approach.
This paper develops new theory for the optimal placement of photometric stereo lighting in the presence of camera noise. We show that for three lights, any triplet of orthogonal light directions minimises the uncertainty in scaled normal computation. The assumptions are that the camera noise is additive and normally distributed, and uncertainty is defined as the expectation of the squared distance of the scaled normal to the ground truth. If the camera noise has zero mean and variance $\sigma^2$, the optimal (minimum) uncertainty in the scaled normal is $3\sigma^2$. For the case of $n > 3$ lights, we show that the minimum uncertainty is $9\sigma^2/n$, and identify sets of light configurations which reach this theoretical minimum.
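The stated minima can be checked numerically under a simple model. The sketch below assumes intensities I = Lb + noise, where the rows of L are unit light directions and b is the scaled normal, and assumes b is estimated by least squares; under those assumptions the expected squared error is σ² · trace((LᵀL)⁻¹). The function name `expected_sq_error` and the specific light sets are illustrative and are not taken from the paper.

```python
import numpy as np

def expected_sq_error(lights, sigma=1.0):
    """Expected squared error of the least-squares scaled normal for given light directions."""
    lights = np.asarray(lights, dtype=float)
    # Normalise each row so every light direction has unit length (an assumption).
    lights = lights / np.linalg.norm(lights, axis=1, keepdims=True)
    # Covariance of the estimate is sigma^2 (L^T L)^{-1}; its trace is the expected squared error.
    return sigma ** 2 * np.trace(np.linalg.inv(lights.T @ lights))

orthogonal = np.eye(3)                            # three mutually orthogonal lights
skewed = [[1, 0, 0], [0, 1, 0], [1, 1, 1]]        # three non-orthogonal lights
six_lights = np.vstack([np.eye(3), np.eye(3)])    # n = 6: two orthogonal triplets

print(expected_sq_error(orthogonal))   # matches 3 * sigma^2
print(expected_sq_error(skewed))       # larger than 3 * sigma^2
print(expected_sq_error(six_lights))   # matches 9 * sigma^2 / 6
```

The n-light bound follows from the same expression: LᵀL has trace n when the rows are unit vectors, and the sum of the reciprocals of its three eigenvalues is smallest when they all equal n/3, which gives 9σ²/n.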
The goal of deconvolution is to recover an image x from its convolution with a known blurring function. This is equivalent to inverting the linear system y = Hx. In this paper we consider the generalized problem where the system matrix H is an arbitrary non-negative matrix. Linear inverse problems can be solved by adding a regularization term to impose spatial smoothness. To avoid oversmoothing, the regularization term must preserve discontinuities; this results in a particularly challenging energy minimization problem. Where H is diagonal, as occurs in image denoising, the energy function can be minimized by techniques such as graph cuts, which have proven to be very effective for problems in early vision. When H is non-diagonal, however, the data cost for a pixel to have a given intensity depends on the hypothesized intensities of nearby pixels, so existing graph cut methods cannot be applied. This paper shows how to use graph cuts to obtain a discontinuity-preserving solution to a linear inverse system with an arbitrary non-negative system matrix. We use a dynamically chosen approximation to the energy which can be minimized by graph cuts; minimizing this approximation also decreases the original energy. Experimental results are shown for MRI reconstruction from Fourier data.
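The kind of energy being described can be written down for a 1-D signal as a data term ||y - Hx||² plus a smoothness term over neighboring pixels. In the sketch below the truncated-linear penalty, the weight `lam`, and the threshold `trunc` are assumptions chosen as one common discontinuity-preserving form; the abstract does not state which regularizer is used, and this code only evaluates the energy rather than minimizing it with graph cuts.

```python
import numpy as np

def energy(x, y, H, lam=1.0, trunc=2.0):
    """Data term ||y - Hx||^2 plus a truncated-linear smoothness term on a 1-D chain."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Data term: with a diagonal H each pixel contributes independently;
    # with a non-diagonal H the residual at one pixel depends on its neighbors.
    data = np.sum((y - H @ x) ** 2)
    # Truncation caps the penalty, so a large jump costs no more than `trunc`.
    smooth = np.sum(np.minimum(np.abs(np.diff(x)), trunc))
    return data + lam * smooth

blur = np.array([[0.5, 0.5, 0.0],
                 [0.0, 0.5, 0.5],
                 [0.0, 0.0, 1.0]])   # a small non-diagonal, non-negative H
step = [0.0, 0.0, 4.0]               # candidate signal with one sharp edge

print(energy(step, blur @ np.array(step), blur))   # consistent observation: only the capped edge penalty remains
print(energy(step, step, blur))                    # observation inconsistent with the blur: data term is nonzero
```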
In this paper we present a new learning framework for image style transforms. Considering that images in different style representations constitute different vector spaces, we propose a novel framework called Coupled Space Learning to learn the relations between different spaces and use them to infer images in one style from images in another. Observing that for each style, only the components correlated to the space of the target style are useful for inference, we first develop the Correlative Component Analysis to pursue the embedded hidden subspaces that best preserve the inter-space correlation information. Then we develop the Coupled Bidirectional Transform algorithm to estimate the transforms between the two embedded spaces, where the coupling between the forward transform and the backward transform is explicitly taken into account. To enhance the capability of modelling complex data, we further develop the Coupled Gaussian Mixture Model to generalize our framework to a mixture-model architecture. The effectiveness of the framework is demonstrated in applications including face super-resolution and bidirectional portrait style transforms.