This paper presents a new thermal image enhancement algorithm based on combined local and global image processing in the frequency domain. The approach exploits the fact that the relationship between stimulus and perception is logarithmic. The basic idea is to apply logarithmic transform histogram matching with a spatial equalization approach to different image blocks. The resulting image is a weighted mean of all processed blocks. The weights for each locally and globally enhanced image are obtained by optimizing the measure of enhancement (EME). Experimental results illustrate the performance of the proposed algorithm on real thermal images in comparison with traditional methods.
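The block-weighting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it works in the spatial domain rather than the frequency domain, uses plain histogram equalization instead of histogram matching, and sets the weights proportional to each candidate's EME instead of optimizing them. The EME form used here (mean over blocks of 20·log10(max/min)) is the commonly cited definition and is assumed to match the paper's. All function names, the block sizes, and the 8-bit intensity range are hypothetical choices for illustration.

```python
import numpy as np


def eme(image, block=8, eps=1e-6):
    """Measure of enhancement: mean over blocks of 20*log10(max/min)."""
    img = np.asarray(image, dtype=np.float64)
    rows = img.shape[0] - img.shape[0] % block   # crop to whole blocks
    cols = img.shape[1] - img.shape[1] % block
    tiles = img[:rows, :cols].reshape(rows // block, block, cols // block, block)
    tile_max = tiles.max(axis=(1, 3))
    tile_min = tiles.min(axis=(1, 3))
    # eps avoids division by zero in completely dark blocks
    return float(np.mean(20.0 * np.log10((tile_max + eps) / (tile_min + eps))))


def log_transform(image):
    """Logarithmic intensity mapping (assumes an 8-bit range, 0..255)."""
    img = np.asarray(image, dtype=np.float64)
    return 255.0 * np.log1p(img) / np.log1p(255.0)


def equalize(image):
    """Histogram equalization of one image or block (8-bit range assumed)."""
    img = np.clip(np.asarray(image), 0, 255).astype(np.uint8)
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist) / img.size
    return 255.0 * cdf[img]


def equalize_blocks(image, block=32):
    """Local enhancement: equalize each block independently."""
    img = np.asarray(image, dtype=np.float64)
    out = np.empty_like(img)
    for r in range(0, img.shape[0], block):
        for c in range(0, img.shape[1], block):
            out[r:r + block, c:c + block] = equalize(img[r:r + block, c:c + block])
    return out


def fuse_by_eme(candidates, block=8):
    """Weighted mean of candidate images; weights proportional to EME.

    Assumption: proportional weights stand in for the paper's EME optimization.
    """
    scores = np.array([eme(c, block) for c in candidates])
    total = scores.sum()
    if total > 0:
        weights = scores / total
    else:
        weights = np.full(len(candidates), 1.0 / len(candidates))
    fused = sum(w * np.asarray(c, dtype=np.float64) for w, c in zip(weights, candidates))
    return fused, weights
```

A typical call would pass one globally equalized image and one or more block-equalized images, for example `fuse_by_eme([equalize(log_transform(img)), equalize_blocks(log_transform(img), 32)])`, and receive the fused image together with the weights that were used.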
In this work, a non-gradient descent learning (NGDL) scheme was proposed for deep feedforward neural networks (DNN). It is known that an autoencoder can be used as the building blocks of the multi-layer perceptron (ML...
ISBN: 9781538644591; 9781538644584 (print)
Stereoscopic-3D (S3D) displays are widely used but can cause visual discomfort for viewers. One aspect of this issue is the movement of the gaze point between different depth fields. Here we aim to analyze the relationship between eye movement patterns and the visual comfort experienced when viewing S3D images. Rather than simply labeling eye movement data according to categories such as gaze, saccade and so on, we deploy a nonparametric Bayesian method to analyze and cluster several eye movement patterns and to relate them to visual comfort. The results are relevant to the prediction of visual comfort in S3D images by automatic algorithms.
The tasks of action recognition and object classification are fundamental in computer vision systems. Even subtasks, such as recognition of atomic motions and single objects, form the basis for understanding the situation in the work area and the scene in general. This is especially important in video surveillance systems designed to ensure security. Thus, improving the effectiveness of recognition and classification methods is one of the primary tasks of computer vision. However, the visual methods implemented in such video surveillance systems encounter difficulties such as inhomogeneous backgrounds, uncontrolled operating environments, and irregular illumination. To address these drawbacks, the paper presents a model for combining visible-range images and depth images. This model improves the quality of recognized images and supports the construction of a more informative descriptor, which also positively affects recognition efficiency. Our results show that the model performs well in fusing visible images and depth maps.
The presence of random extra pulses during quasi-closed glottal cycle phases may constitute a distinct voice quality type relevant to the clinical care of disordered voices. In this paper, we propose for this voice ty...
Digital images used in the investigation of a crime often undergo several concurrent enhancement operations for improved automated analysis. The challenges are related to the large size of the data and the complexity of forensic image processing. Our purpose is to provide a smart cloud system for image processing on PCs and smartphones with limited computational complexity. This paper presents a new thermal image enhancement algorithm based on combined local and global image processing in the frequency domain. The approach exploits the fact that the relationship between stimulus and perception is logarithmic. The basic idea is to apply logarithmic transform histogram matching with a spatial equalization approach to different image blocks. The resulting image is a weighted mean of all processed blocks. The weights for each locally and globally enhanced image are obtained by optimizing the measure of enhancement (EME). Experimental results illustrate the performance of the proposed cloud system on real thermal images in comparison with traditional methods.
Frame-level visual features are generally aggregated in time with techniques such as LSTM, Fisher Vectors, and NetVLAD to produce a robust video-level representation. Here we introduce a learnable aggregation technique whose primary objective is to retain the short-time temporal structure between frame-level features and their spatial interdependencies in the representation. It can also be easily adapted to cases where training samples are very scarce. We evaluate the method on a real-fake expression prediction dataset to demonstrate its superiority. Our method obtains a score of 65% on the test dataset in the official MAP evaluation, which differs from the best reported result in the Chalearn Challenge (66.7%) by only one misclassified decision. Lastly, we believe that this method can be extended to other problems, such as action/event recognition, in the future.
In vision science, cascades of Linear+Nonlinear transforms are very successful in modeling a number of perceptual experiences [1]. However, the conventional literature usually focuses only on describing the forward input-output transform. Instead, in this work we present the mathematics of such cascades beyond the forward transform, namely the Jacobian matrices and the inverse. The fundamental reason for this analytical treatment is that it offers useful insight into the psychophysics, the physiology, and the function of the visual system. For instance, we show how the trends of the sensitivity (volume of the discrimination regions) and the adaptation of the receptive fields can be identified in the expression of the Jacobian w.r.t. the stimulus. This matrix also tells us which regions of the stimulus space are encoded more efficiently in multi-information terms. The Jacobian w.r.t. the parameters shows which aspects of the model have a bigger impact on the response, and hence their relative relevance. The analytic inverse implies conditions for the response and model parameters to ensure appropriate decoding. From the experimental and applied perspective, (a) the Jacobian w.r.t. the stimulus is necessary in new experimental methods based on the synthesis of visual stimuli with interesting geometrical properties, (b) the Jacobian matrices w.r.t. the parameters are convenient to learn the model from classical experiments or alternative goal optimization, and (c) the inverse is a promising model-based alternative to blind machine-learning methods for neural decoding that do not include meaningful biological information. The theory is checked by building and testing a vision model that actually follows the modular program suggested in [1]. Our illustrative derivable and invertible model consists of a cascade of modules that account for brightness, contrast, energy masking, and wavelet masking. To stress the generality of this modular setting we show exa
In this paper, we present a new video quality metric targeted for use within cloud-based video storage systems. Because of the limited capacity of storage solutions currently in use, it is common for stored videos to have low perceived quality. The ability to predict the quality of a coded video sequence in a fully automatic fashion is crucial for selecting optimal coding parameters. We have developed a general quality estimation approach that is applicable to different kinds of video content and useful for online correction of the perceived quality of stored video sequences. The proposed objective video quality metric is designed to correlate highly with human quality assessment results. A high level of conformity between predicted and perceived video quality is achieved by using a convolutional neural network trained on a large volume of video data in the framework of generative adversarial learning, combined with carefully selected regularization techniques. Evaluation of the developed VQA metric on commonly used datasets shows equal or better correlation with MOS than current state-of-the-art approaches.
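The abstract does not say which correlation measure is used to compare the metric with MOS. As a hedged illustration, the sketch below computes the two coefficients most often reported for quality metrics, Pearson linear correlation (PLCC) and Spearman rank-order correlation (SROCC), using NumPy only. The function names and the sample scores are hypothetical and are not taken from the paper.

```python
import numpy as np


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    x = np.asarray(values, dtype=np.float64)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=np.float64)
    ranks[order] = np.arange(1, len(x) + 1)
    for v in np.unique(x):
        tied = x == v
        ranks[tied] = ranks[tied].mean()
    return ranks


def plcc(predicted, mos):
    """Pearson linear correlation between predicted scores and MOS."""
    return float(np.corrcoef(predicted, mos)[0, 1])


def srocc(predicted, mos):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(average_ranks(predicted), average_ranks(mos))


# Hypothetical scores for four sequences (illustrative values only).
predicted_quality = [1.0, 2.0, 3.0, 4.0]
mean_opinion_scores = [1.0, 4.0, 9.0, 16.0]
print("PLCC:", plcc(predicted_quality, mean_opinion_scores))
print("SROCC:", srocc(predicted_quality, mean_opinion_scores))
```

In this made-up example the predictions order the sequences exactly as the MOS values do, so the rank correlation is perfect while the linear correlation is slightly lower, which is why the two coefficients are usually reported together.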