ISBN:
(print) 9781467388511
Objective image quality assessment (IQA) models aim to automatically predict human visual perception of image quality and are of fundamental importance in the fields of image processing and computer vision. With an increasing number of IQA models proposed, fairly comparing their performance becomes a major challenge due to the enormous size of image space and the limited resources for subjective testing. The standard approach in the literature is to compute several correlation metrics between subjective mean opinion scores (MOSs) and objective model predictions on several well-known subject-rated databases that contain distorted images generated from a few dozen source images, which, however, provide an extremely limited representation of real-world images. Moreover, most IQA models developed on these databases involve machine learning and/or manual parameter tuning to boost their performance, so their generalization capabilities are questionable. Here we propose a novel methodology to compare IQA models. We first build a database that contains 4,744 source natural images, together with 94,880 distorted images created from them. We then propose a new mechanism, namely group MAximum Differentiation (gMAD) competition, which automatically selects subsets of image pairs from the database that provide the strongest test to let the IQA models compete with each other. Subjective testing on the selected subsets reveals the relative performance of the IQA models and provides useful insights into potential ways to improve them. We report gMAD competition results among 16 well-known IQA models; the framework is extendable, allowing future IQA models to be added to the competition.
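The core of the gMAD selection step can be illustrated with a small sketch: given two models' scores over an image pool, pick the pair that one model (the "defender") considers equal in quality while the other (the "attacker") rates as maximally different. The scores below are random stand-ins for the models' predictions, and the function is a simplified reading of the abstract, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in scores: in the real competition these would be two IQA models'
# quality predictions over the 94,880 distorted images (higher = better).
n_images = 10_000
scores_attacker = rng.uniform(0, 100, n_images)
scores_defender = rng.uniform(0, 100, n_images)

def gmad_pair(attacker, defender, level, tol=0.5):
    """Return the pair of images that the defender rates (nearly) equally,
    at the given quality level, but on which the attacker disagrees most."""
    on_level = np.flatnonzero(np.abs(defender - level) < tol)
    if on_level.size < 2:
        return None
    best = on_level[np.argmax(attacker[on_level])]
    worst = on_level[np.argmin(attacker[on_level])]
    return best, worst

pair = gmad_pair(scores_attacker, scores_defender, level=50.0)
```

Subjective testing on such maximally informative pairs then decides whether the attacker's claimed quality difference is real, which is what makes the test strong despite the huge image pool.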
ISBN:
(print) 0818608625
The previous implementations of the authors' epipolar-plane image-analysis mapping technique demonstrated the feasibility and benefits of the approach, but were carried out for restricted camera geometries. The question of more general geometries made the technique's utility for autonomous navigation uncertain. The authors have developed a generalization of the analysis that (a) enables varying view direction, including varying over time; (b) provides three-dimensional connectivity information for building coherent spatial descriptions of observed objects; and (c) operates sequentially, allowing initiation and refinement of scene feature estimates while the sensor is in motion. To implement this generalization it was necessary to develop an explicit description of the evolution of images over time. They achieved this by building a process that creates a set of two-dimensional manifolds defined at the zeros of a three-dimensional spatiotemporal Laplacian. These manifolds represent explicitly both the spatial and temporal structure of the temporally evolving imagery and are termed spatiotemporal surfaces.
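The construction can be illustrated on a synthetic volume: a Gaussian spot drifting through an x-y-t stack. The sign changes of a discrete 3D Laplacian trace the spot's boundary through time, i.e. a two-dimensional manifold in the spatiotemporal volume. The sizes, motion, and 6-neighbor Laplacian below are arbitrary illustrative choices, not the paper's operator.

```python
import numpy as np

# Small x-y-t volume containing a Gaussian spot that drifts in x over time.
T, H, W = 16, 32, 32
t, y, x = np.meshgrid(np.arange(T), np.arange(H), np.arange(W),
                      indexing="ij")
cx = 8 + 0.8 * t                                  # spot center at time t
V = np.exp(-((x - cx) ** 2 + (y - 16) ** 2) / (2 * 3.0 ** 2))

# Discrete 3D (spatiotemporal) Laplacian: 6-neighbor second differences.
lap = (np.roll(V, 1, 0) + np.roll(V, -1, 0)
       + np.roll(V, 1, 1) + np.roll(V, -1, 1)
       + np.roll(V, 1, 2) + np.roll(V, -1, 2) - 6 * V)

# Zero crossings along x: adjacent voxels with opposite Laplacian sign.
# The set of such voxels approximates the spatiotemporal surface.
zc = lap[:, :, :-1] * lap[:, :, 1:] < 0
```

Because the zero set is extracted from the full volume rather than frame by frame, connectivity across time comes for free, which is what supports the sequential initiation and refinement described in the abstract.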
Steerable functions find application in numerous problems in image processing, computer vision, and computer graphics. As such, it is important to develop the appropriate mathematical tools to analyze them. In this paper, we introduce the mathematics of Lie group theory in the context of steerable functions and present a canonical decomposition of these functions under any transformation group. The theory presented in this paper can be applied and extended in various ways.
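A concrete instance of the steerability that the Lie-group analysis generalizes: the first x- and y-derivatives of a 2D Gaussian form a basis from which the derivative in any direction is synthesized exactly by two interpolation coefficients. A minimal numerical check (grid size and angle are arbitrary):

```python
import numpy as np

# 2D Gaussian and its first partial derivatives on a small grid.
x, y = np.meshgrid(np.linspace(-3, 3, 61), np.linspace(-3, 3, 61))
g = np.exp(-(x ** 2 + y ** 2) / 2)
gx = -x * g          # d/dx of the Gaussian
gy = -y * g          # d/dy of the Gaussian

theta = 0.7          # arbitrary steering angle, in radians

# Steered filter: linear combination of the two basis derivatives.
steered = np.cos(theta) * gx + np.sin(theta) * gy

# Direct construction: derivative along the unit vector (cos t, sin t).
u = np.cos(theta) * x + np.sin(theta) * y
direct = -u * g

assert np.allclose(steered, direct)   # exact steerability, up to rounding
```

The rotation group acting on this pair of basis functions is the simplest case of the canonical decomposition the paper develops for arbitrary transformation groups.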
The authors present region-based image processing algorithms that take part in task sequencing for stereo vision. The algorithms described precede stereo matching. Requirements on image similarity provide helpful additional knowledge for their improvement. A recursive region-division algorithm using a threshold method based on contrast maximization is described. The regions are processed in parallel and absorb their noise before being thresholded. Subsequent morphological processes improve similarity and matching results. Additional knowledge is produced by analytical processes and is useful for both segmentation and match control. Results are presented for a stereo pair of gray-level images.
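The abstract does not spell out the contrast-maximization criterion used for thresholding; one standard contrast-maximizing rule is Otsu's, which picks the gray level maximizing between-class variance. A sketch under that assumption (the test image is synthetic):

```python
import numpy as np

def contrast_max_threshold(img):
    """Gray level maximizing between-class variance (Otsu's criterion).
    This is one standard contrast-maximization rule; the paper's exact
    criterion is not given in the abstract."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                    # class-0 probability mass
    mu = np.cumsum(p * np.arange(256))      # class-0 first moment
    mu_t = mu[-1]                           # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    sigma_b[np.isnan(sigma_b)] = 0          # degenerate one-class splits
    return int(np.argmax(sigma_b))

rng = np.random.default_rng(1)
# Bimodal test image: dark background near 60, bright region near 190.
img = np.clip(np.concatenate([
    rng.normal(60, 10, 5000), rng.normal(190, 10, 5000)
]), 0, 255).astype(np.uint8).reshape(100, 100)
t = contrast_max_threshold(img)
```

Applied recursively to each resulting region, such a rule yields the recursive region division the abstract describes.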
Mathematical morphology supplies powerful tools for low-level image analysis, with applications in robotic vision, visual inspection, medicine, texture analysis, and many other areas. Many of the mentioned applications...
A simple imaging range sensor is described, based on the measurement of focal error, as described by A. Pentland (1982 and 1987). The current implementation can produce range over a 1 m³ workspace with a measured standard error of 2.5% (4.5 significant bits of data). The system is implemented using relatively inexpensive commercial image-processing equipment. Experience shows that this ranging technique can be both economical and practical for tasks that require quick and reliable but coarse estimates of range. Examples of such tasks are initial target acquisition or obtaining the initial coarse estimate of stereo disparity in a coarse-to-fine stereo algorithm.
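The ranging principle can be sketched with the thin-lens model: a point off the focus plane images to a blur circle whose diameter grows with focal error, so a measured blur diameter can be inverted for range. All optical parameters below are invented for illustration; this is the textbook model, not the sensor's calibration.

```python
def blur_diameter(u, f, A, v):
    """Blur-circle diameter for a point at distance u, for a thin lens of
    focal length f, aperture diameter A, and sensor distance v, assuming
    the object lies beyond the in-focus plane."""
    v_u = 1.0 / (1.0 / f - 1.0 / u)      # ideal image distance (thin lens)
    return A * (v - v_u) / v_u

def range_from_blur(b, f, A, v):
    """Invert the thin-lens blur model to recover object distance."""
    v_u = A * v / (A + b)                # image distance implied by blur b
    return 1.0 / (1.0 / f - 1.0 / v_u)

# Hypothetical optics: 50 mm lens, 25 mm aperture, focused near 5 m.
f, A, v = 0.05, 0.025, 0.0505
u_true = 8.0                             # object at 8 m
b = blur_diameter(u_true, f, A, v)
u_est = range_from_blur(b, f, A, v)      # round trip recovers the range
```

In practice the blur diameter is estimated from the image itself, and its measurement noise sets the ~2.5% standard error the abstract reports.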
ISBN:
(print) 9781467388511
Semantic object parsing is a fundamental task for understanding objects in detail in the computer vision community, where incorporating multi-level contextual information is critical for achieving such fine-grained pixel-level recognition. Prior methods often leverage contextual information by post-processing predicted confidence maps. In this work, we propose a novel deep Local-Global Long Short-Term Memory (LG-LSTM) architecture to seamlessly incorporate short-distance and long-distance spatial dependencies into the feature learning over all pixel positions. In each LG-LSTM layer, local guidance from neighboring positions and global guidance from the whole image are imposed on each position to better exploit complex local and global contextual information. Individual LSTMs for distinct spatial dimensions are also utilized to intrinsically capture various spatial layouts of semantic parts in the images, yielding distinct hidden and memory cells of each position for each dimension. In our parsing approach, several LG-LSTM layers are stacked and appended to the intermediate convolutional layers to directly enhance visual features, allowing network parameters to be learned in an end-to-end way. The long chains of sequential computation by stacked LG-LSTM layers also enable each pixel to sense a much larger region for inference, benefiting from the memorization of previous dependencies in all positions along all dimensions. Comprehensive evaluations on three public datasets demonstrate the significant superiority of our LG-LSTM over other state-of-the-art methods.
The computation of optical flow from image derivatives is biased in regions of non-uniform gradient distributions. A least-squares or total-least-squares approach to computing optical flow from image derivatives, even in regions of consistent flow, can lead to a systematic bias dependent upon the direction of the optical flow, the distribution of the gradient directions, and the distribution of the image noise. The bias is a consistent underestimation of length and a directional error. Similar results hold for various methods of computing optical flow in the spatiotemporal frequency domain. The predicted bias in the optical flow is consistent with psychophysical evidence of human judgment of the velocity of moving plaids, and provides an explanation of the Ouchi illusion. Correction of the bias requires accurate estimates of the noise distribution; the failure of the human visual system to make these corrections illustrates both the difficulty of the task and the feasibility of using either the distorted optical flow or the undistorted normal flow in tasks requiring higher-level processing.
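The underestimation can be reproduced in a few lines: generate brightness-constancy constraints with a non-uniform gradient-direction distribution, perturb the measured gradients with noise, and solve by ordinary least squares. The recovered speed falls systematically short of the true speed (an errors-in-variables effect); the distributions and noise level below are illustrative, not those of the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
v_true = np.array([1.0, 0.0])        # true image velocity (pixels/frame)
n = 50_000

# Spatial gradients with a non-uniform direction distribution.
angles = rng.normal(0.0, 0.4, n)     # directions clustered near the x-axis
mags = rng.uniform(0.5, 1.5, n)
G = np.column_stack([mags * np.cos(angles), mags * np.sin(angles)])

# Temporal derivatives from brightness constancy: Ix*u + Iy*v + It = 0.
It = -G @ v_true

# Measured gradients are corrupted by noise (errors in the regressors).
Gn = G + rng.normal(0.0, 0.5, G.shape)

# Ordinary least squares on the noisy constraints.
v_ls, *_ = np.linalg.lstsq(Gn, -It, rcond=None)
speed = np.linalg.norm(v_ls)         # systematically below 1.0
```

With noise only in It the least-squares estimate would be unbiased; it is the noise in the spatial gradients, interacting with the gradient-direction distribution, that produces the attenuation the abstract analyzes.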
ISBN:
(print) 9781467388511
This paper approximates the 3D geometry of a scene by a small number of 3D planes. The method is especially suited to man-made scenes, and only requires two calibrated wide-baseline views as inputs. It relies on the computation of a dense but noisy 3D point cloud, as obtained for example by matching DAISY descriptors [35] between the views. It then segments one of the two reference images, and adopts a multi-model fitting process to assign a 3D plane to each region, when the region is not detected as occluded. A pool of 3D plane hypotheses is first derived from the 3D point cloud, to include planes that reasonably approximate the part of the 3D point cloud observed from each reference view between randomly selected triplets of 3D points. The hypothesis-to-region assignment problem is then formulated as an energy-minimization problem, which simultaneously optimizes an original data-fidelity term, the assignment smoothness over neighboring regions, and the number of assigned planar proxies. The synthesis of intermediate viewpoints demonstrates the effectiveness of our 3D reconstruction, and thereby the relevance of our proposed data-fidelity metric.
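The three-term energy can be illustrated on a toy instance: a handful of regions in a chain, a few plane hypotheses, a data-fidelity term per region-plane pair, a Potts smoothness term over neighbors, and a per-model cost on the number of planes used. All costs are invented and the brute-force solver stands in for the paper's optimizer.

```python
import itertools
import numpy as np

# data_cost[region, plane]: made-up fit residual of each plane hypothesis
# against the 3D points observed in each of 4 chained regions.
data_cost = np.array([
    [0.1, 1.0, 1.0],
    [0.2, 0.8, 1.0],
    [1.0, 0.1, 1.0],
    [1.0, 0.2, 0.9],
])
lam_smooth = 0.3   # Potts penalty when neighboring regions disagree
lam_label = 0.5    # cost per distinct plane used (few planar proxies)

def energy(labels):
    e = data_cost[np.arange(len(labels)), labels].sum()   # data fidelity
    e += lam_smooth * sum(a != b for a, b in zip(labels, labels[1:]))
    e += lam_label * len(set(labels))                     # model count
    return e

# Exhaustive minimization: feasible here (3^4 labelings), whereas the
# paper uses a proper multi-model fitting optimizer.
best = min(itertools.product(range(3), repeat=4), key=energy)
```

The minimizer groups the first two regions on one plane and the last two on another: the smoothness and label-count terms suppress spurious single-region planes that the data term alone would admit.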
An approach to feature extraction that eliminates binarization by extracting features directly from gray-scale images is presented. It not only allows the processing of poor-quality input (e.g., low-contrast, dirty images), but also offers the possibility of significantly lower resolution for digitization.