ISBN (print): 9789811030055; 9789811030048
In this paper we propose a fully convolutional encoder-decoder framework for image residual transformation tasks. Instead of using only a per-pixel loss function, the proposed framework learns an end-to-end mapping combined with a perceptual loss function that depends on low-level features from a pre-trained network. We show that the mapping function can handle noise-free images by introducing an identity mapping, and we analyze the interplay between the neural network and the underlying noise distribution that it seeks to learn. We also show how to construct a uniform transform, which is then used to make a single deep neural network work well across different noise levels. Compared with previous approaches, ours achieves better performance. The experimental results indicate the efficiency of the proposed algorithm in coping with image denoising tasks.
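The loss combination described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the network, the feature layers, or the loss weight, so `low_level_features` (finite-difference gradients standing in for pre-trained network features), the `weight` parameter, and the sign convention "denoised = noisy minus predicted residual" are all hypothetical.

```python
import numpy as np

def low_level_features(image):
    """Hypothetical stand-in for pre-trained network features:
    horizontal and vertical finite-difference gradients."""
    dx = image[:, 1:] - image[:, :-1]
    dy = image[1:, :] - image[:-1, :]
    return dx, dy

def pixel_loss(pred, target):
    """Per-pixel mean squared error."""
    return float(np.mean((pred - target) ** 2))

def perceptual_loss(pred, target):
    """Mean squared error between the stand-in feature maps."""
    pairs = zip(low_level_features(pred), low_level_features(target))
    return float(np.mean([np.mean((fp - ft) ** 2) for fp, ft in pairs]))

def combined_loss(noisy, predicted_residual, clean, weight=0.1):
    """Assumed residual formulation: denoised = noisy - predicted residual.
    A zero residual therefore gives the identity mapping."""
    denoised = noisy - predicted_residual
    return pixel_loss(denoised, clean) + weight * perceptual_loss(denoised, clean)

rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(16, 16))
noise = rng.normal(0.0, 0.1, size=clean.shape)
noisy = clean + noise
print(combined_loss(noisy, noise, clean))                 # perfect residual
print(combined_loss(noisy, np.zeros_like(noisy), clean))  # no denoising at all
```

With a noise-free input, predicting a zero residual leaves the image unchanged, which is one way to read the identity mapping mentioned in the abstract.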
ISBN (print): 9783319336183; 9783319336176
In this paper, we propose a fast solution for the problem of illuminant color estimation. We present a physics-based algorithm that uses the mean projections maximization assumption. We investigated this hypothesis on a large image dataset and used it afterwards to estimate the illuminant color. The proposed algorithm reduces the illuminant estimation problem to an uncentred PCA problem. Evaluating the algorithm on two well-known image datasets yields lower angular errors.
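The reduction to uncentred PCA can be illustrated with a short sketch. It assumes that "mean projections maximization" means choosing the unit direction that maximizes the mean squared projection of the pixel colors, which is the leading eigenvector of the second-moment matrix computed without mean subtraction. The function names and the synthetic scene are hypothetical and are not taken from the paper.

```python
import numpy as np

def estimate_illuminant_uncentred_pca(image):
    """Return a unit-norm RGB illuminant estimate for an H x W x 3 image."""
    pixels = np.asarray(image, dtype=np.float64).reshape(-1, 3)
    # Uncentred PCA: raw second-moment matrix, with no mean subtraction.
    second_moment = pixels.T @ pixels / pixels.shape[0]
    eigenvalues, eigenvectors = np.linalg.eigh(second_moment)
    direction = eigenvectors[:, -1]  # eigh returns eigenvalues in ascending order
    # The sign of an eigenvector is arbitrary; choose the non-negative one.
    if direction.sum() < 0:
        direction = -direction
    return direction / np.linalg.norm(direction)

def angular_error_degrees(estimate, ground_truth):
    """Angle between two illuminant vectors, the usual evaluation metric."""
    e = np.asarray(estimate, dtype=np.float64)
    g = np.asarray(ground_truth, dtype=np.float64)
    cosine = np.dot(e, g) / (np.linalg.norm(e) * np.linalg.norm(g))
    return float(np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0))))

# Synthetic scene (an assumption): random reflectances under one illuminant.
rng = np.random.default_rng(0)
true_illuminant = np.array([1.0, 0.8, 0.6])
scene = rng.uniform(0.0, 1.0, size=(32, 32, 3)) * true_illuminant
estimate = estimate_illuminant_uncentred_pca(scene)
print(estimate, angular_error_degrees(estimate, true_illuminant))
```

The whole estimate costs one 3 x 3 eigendecomposition per image, which is consistent with the abstract's claim of a fast solution.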
ISBN (print): 9789811030024; 9789811030017
In this paper, we generalize the orthogonal Fourier-Mellin moments (OFMMs) to the fractional orthogonal Fourier-Mellin moments (FOFMMs), which are based on fractional radial polynomials. We propose a new method to construct FOFMMs by using a continuous parameter t (t > 0). The fractional radial polynomials of FOFMMs have the same number of zeros as those of OFMMs of the same degree, but the zeros of the FOFMM polynomials are more uniformly distributed than those of OFMMs, and the first zero is closer to the origin. A recursive method is also given to reduce computation time and improve numerical stability. Experimental results show that the proposed FOFMMs have better performance.
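One way to build fractional radial polynomials from the standard OFMM polynomials Q_n(r) is the substitution Q_n^t(r) = sqrt(t) · r^(t-1) · Q_n(r^t). The abstract does not state the formula, so this form is an assumption; it is chosen because it keeps the polynomials orthogonal on [0, 1] with weight r and reduces to the OFMM case at t = 1. The sketch below uses direct evaluation rather than the recursive method mentioned in the abstract, and checks orthogonality numerically.

```python
import numpy as np
from math import factorial

def ofmm_coefficients(n):
    """Coefficients c_s of the OFMM radial polynomial Q_n(r) = sum_s c_s r^s."""
    return np.array([
        (-1) ** (n + s) * factorial(n + s + 1)
        / (factorial(n - s) * factorial(s) * factorial(s + 1))
        for s in range(n + 1)
    ], dtype=np.float64)

def fractional_radial_polynomial(n, t, r):
    """Assumed form: Q_n^t(r) = sqrt(t) * r**(t - 1) * Q_n(r**t), with t > 0."""
    if t <= 0:
        raise ValueError("t must be positive")
    r = np.asarray(r, dtype=np.float64)
    q = np.polynomial.polynomial.polyval(r ** t, ofmm_coefficients(n))
    return np.sqrt(t) * r ** (t - 1) * q

def radial_inner_product(n, m, t, nodes=200):
    """Gauss-Legendre estimate of the integral of Q_n^t(r) Q_m^t(r) r over [0, 1]."""
    x, w = np.polynomial.legendre.leggauss(nodes)
    r = 0.5 * (x + 1.0)  # map nodes from [-1, 1] to [0, 1]
    w = 0.5 * w
    qn = fractional_radial_polynomial(n, t, r)
    qm = fractional_radial_polynomial(m, t, r)
    return float(np.sum(w * qn * qm * r))

print(radial_inner_product(2, 2, t=2.0))  # compare with 1 / (2 * (2 + 1))
print(radial_inner_product(1, 3, t=2.0))  # different orders
```

Under this form, each zero z of Q_n maps to z^(1/t), so the parameter t moves the zeros without changing how many there are.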
ISBN (print): 9789811030024; 9789811030017
Person re-identification, which aims at recognizing a person of interest across spatially disjoint camera views, is still a challenging task. Many approaches have emerged in recent years, and some of them achieve good matching results. Given a probe image, we observe that the ranking results generated by different approaches differ from each other. Considering that these conventional methods are all reasonable, we propose an Adaptive Multi-Metric Fusion (AMMF) method, which fuses the existing ranking results with query-specific weights. Experiments on two challenging databases, VIPeR and EthZ, demonstrate that the proposed method achieves further performance improvement.
This work provides a theoretical framework for colorimetric scene reproduction under different illuminants. First, based on the principle of the CIE color matching experiment and the image production pipeline in digital still cameras, we analyze and conclude that the Color Matching Function (CMF) of the color space determined by a display is the best Spectral Response Function (SRF) for scene reproduction under certain conditions. Then, we ask what the best SRFs are for illuminants other than D65. To answer this, we propose a new imaging formula in which the SRF changes with the illumination. Unlike previous imaging formulae, which suit the RAW format under D65 illuminants, our imaging formula is a theoretical one that can achieve color constancy for an arbitrary camera under any illuminant and is suitable for final image formats. We also propose two methods to solve for the optimal SRF: the pseudo-inverse solution and the Brandford method. Finally, we compare the proposed method with a commonly used previous method, and the experimental results validate the performance of the proposed method.
ISBN (print): 9789811030055; 9789811030048
Image denoising, which aims to recover a clean image from a noisy one, is a classical yet still active topic in low-level vision due to its high value in various practical applications. Existing image denoising methods generally assume that the noisy image is generated by adding additive white Gaussian noise (AWGN) to the clean image. Following this assumption, synthetic noisy images with ideal AWGN, rather than real noisy images, are usually used to test the performance of denoising methods. Such synthetic noisy images, however, lack the necessary image quantization procedure, which implies that some pixel intensity values may even be negative or higher than the maximum of the value interval (e.g., 255), leading to a violation of the image coding. This naturally raises the question: what is the difference between the two kinds of denoised images, with and without the quantization setting? In this paper, we first give an empirical study to answer this question. Experimental results demonstrate that the pixel value range of the denoised images with the quantization setting tends to be narrower than that without the quantization setting, and also narrower than that of the ground-truth images. To resolve this unwanted effect of quantization, we then propose an empirical trick for the state-of-the-art weighted nuclear norm minimization (WNNM) based denoising method, such that the pixel value interval of the denoised image with the quantization setting accords with that of the corresponding ground-truth image. As a result, our findings can provide a deeper understanding of the effect of quantization and its possible solutions.
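The gap between ideal AWGN and a stored 8-bit image can be seen in a few lines. The sketch below is illustrative only: the gradient test image, the noise level `sigma=25.0`, and the round-then-clip quantization rule are assumptions, not the paper's experimental setup.

```python
import numpy as np

def add_awgn(clean, sigma, rng):
    """Add zero-mean Gaussian noise; the result is real-valued and unbounded."""
    return clean.astype(np.float64) + rng.normal(0.0, sigma, size=clean.shape)

def quantize_to_uint8(image):
    """Round to integers and clip to [0, 255], as an 8-bit image would store it."""
    return np.clip(np.rint(image), 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
# Hypothetical test image: a horizontal gradient covering the full 8-bit range.
clean = np.tile(np.arange(256, dtype=np.uint8), (64, 1))

noisy = add_awgn(clean, sigma=25.0, rng=rng)
quantized = quantize_to_uint8(noisy)

# Fraction of pixels that ideal AWGN pushes outside the valid interval.
out_of_range = float(np.mean((noisy < 0) | (noisy > 255)))
print("ideal AWGN range:", noisy.min(), noisy.max())
print("quantized range: ", quantized.min(), quantized.max())
print("fraction outside [0, 255]:", out_of_range)
```

Pixels near 0 or 255 are the ones affected: clipping removes part of their noise, so the quantized noise is no longer identically distributed across the image.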
ISBN (print): 9789811030024; 9789811030017
The combination of traditional methods (e.g., ACF) and Convolutional Neural Networks (CNNs) has achieved great success in pedestrian detection. Despite its effectiveness, the design of this approach is intricate. In this paper, we present an end-to-end network based on Faster R-CNN and a neural cascade classifier for pedestrian detection. Unlike Faster R-CNN, which only makes use of the last convolutional layer, we utilize features from multiple layers and feed them to a neural cascade classifier. Such an architecture favors more low-level features and implements a hard negative mining process in the network. Both factors are important in pedestrian detection. The neural cascade classifier is jointly trained with the Faster R-CNN in our unified network. The proposed network achieves performance comparable to the state of the art on the Caltech pedestrian dataset with a more concise framework and faster processing speed. Meanwhile, the detection results obtained by our method are tighter and more accurate.
ISBN (print): 9783319464787; 9783319464770
Convolutional neural networks (CNNs) have been widely used in the computer vision community, significantly improving the state of the art. In most of the available CNNs, the softmax loss function is used as the supervision signal to train the deep model. In order to enhance the discriminative power of the deeply learned features, this paper proposes a new supervision signal, called center loss, for the face recognition task. Specifically, the center loss simultaneously learns a center for the deep features of each class and penalizes the distances between the deep features and their corresponding class centers. More importantly, we prove that the proposed center loss function is trainable and easy to optimize in CNNs. With the joint supervision of softmax loss and center loss, we can train robust CNNs to obtain deep features that meet the two key learning objectives, inter-class dispersion and intra-class compactness, as much as possible; both are essential to face recognition. It is encouraging to see that our CNNs (with such joint supervision) achieve state-of-the-art accuracy on several important face recognition benchmarks: Labeled Faces in the Wild (LFW), YouTube Faces (YTF), and MegaFace Challenge. In particular, our new approach achieves the best results on MegaFace (the largest public domain face benchmark) under the small-training-set protocol (fewer than 500,000 images and fewer than 20,000 persons), significantly improving on previous results and setting a new state of the art for both face recognition and face verification tasks.
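The joint supervision described above can be sketched in NumPy. Several details are assumptions that go beyond the abstract: the distance is taken to be squared Euclidean and is averaged over the batch, the weight `lam` balances the two terms, and `update_centers` is a simplified moving-average rule with a hypothetical step size `alpha`.

```python
import numpy as np

def softmax_loss(logits, labels):
    """Mean cross-entropy of softmax(logits) against integer class labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def center_loss(features, labels, centers):
    """Half the mean squared distance from each feature to its class center."""
    diffs = features - centers[labels]
    return float(0.5 * np.sum(diffs ** 2, axis=1).mean())

def joint_loss(logits, features, labels, centers, lam=0.01):
    """Softmax loss plus lam times center loss (lam is an assumed weight)."""
    return softmax_loss(logits, labels) + lam * center_loss(features, labels, centers)

def update_centers(features, labels, centers, alpha=0.5):
    """Simplified rule: move each center toward its class mean in the batch."""
    new_centers = centers.copy()
    for c in np.unique(labels):
        class_mean = features[labels == c].mean(axis=0)
        new_centers[c] += alpha * (class_mean - centers[c])
    return new_centers

features = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.array([0, 1])
centers = np.zeros((2, 2))
logits = np.zeros((2, 2))

print(joint_loss(logits, features, labels, centers))
centers = update_centers(features, labels, centers)
print(center_loss(features, labels, centers))
```

The softmax term separates the classes, while the center term pulls features of the same class together, matching the two objectives named in the abstract.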
ISBN (print): 9789811030055; 9789811030048
Dictionary Learning (DL) and Sparse Representation Classification (SRC) have recently shown great success in face recognition. Practice has proven that SRC is highly robust against noise and occlusion in face images. Our work focuses on a new low-quality character recognition method based on DL and Sparse Representation (SR). SRC is introduced to deal with the low quality of character images, such as broken strokes, noise, and fuzziness. We also apply the linear combination of an over-complete dictionary to recognize characters with different fonts and sizes. A dictionary learning method based on factor analysis is also proposed to make the dictionary more discriminative. Experiments show that our method not only recognizes characters with different fonts and sizes, but is also robust against broken strokes, noise, and fuzziness. Our method is also efficacious, as it does not require complex preprocessing procedures such as binarization and refinement.
ISBN (print): 9783319464787; 9783319464770
Most existing skeleton-based representations for action recognition cannot effectively capture the spatio-temporal motion characteristics of joints and are not robust enough to noise from depth sensors and estimation errors of joints. In this paper, we propose a novel low-level representation for the motion of each joint by tracking its trajectory and segmenting it into several semantic parts called motionlets. During this process, the disturbance of noise is reduced by trajectory fitting, sampling and segmentation. Then we construct an undirected complete labeled graph to represent a video by combining these motionlets and their spatio-temporal correlations. Furthermore, a new graph kernel called the subgraph-pattern graph kernel (SPGK) is proposed to measure the similarity between graphs. Finally, the SPGK is directly used as the kernel of an SVM to classify videos. To evaluate our method, we perform a series of experiments on several public datasets, and our approach achieves performance comparable to state-of-the-art approaches.