ISBN: 9781509053162 (print)
In this contribution, a novel image quality enhancement algorithm based on a convolutional network is proposed for low bit rate image compression. Specifically, a downsampling step is performed to generate a lower-resolution image for low bit rate compression. At the decoder side, the image is first upsampled to the original resolution, and its quality is then further enhanced by the proposed deep convolutional network. In particular, an optional image quality improvement network can be applied for further enhancement after the first network. With the help of the deep network, more detailed and high-frequency information can be recovered while maintaining the consistency of contour areas, leading to better visual quality. Another benefit is that the proposed approach is fully compatible with all third-party image codec pipelines. Experimental results show that the proposed scheme significantly outperforms JPEG in low bit rate image compression.
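The encoder/decoder flow described above can be sketched as follows. This is only an illustration under stated assumptions: block averaging and nearest-neighbour upsampling stand in for whatever resampling filters the authors use, and `codec` and `enhance` are hypothetical placeholders for the third-party codec and the enhancement network, neither of which is specified in the abstract.

```python
import numpy as np

def downsample(img, factor=2):
    """Average non-overlapping factor x factor blocks (assumed resampling filter)."""
    h, w = img.shape
    h2, w2 = h - h % factor, w - w % factor  # crop so the size divides evenly
    blocks = img[:h2, :w2].reshape(h2 // factor, factor, w2 // factor, factor)
    return blocks.mean(axis=(1, 3))

def upsample(img, factor=2):
    """Nearest-neighbour upsampling back to the original grid (assumed filter)."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def identity(img):
    """Placeholder for a stage this sketch does not model."""
    return img

def compress_and_restore(img, codec=identity, enhance=identity, factor=2):
    """Downsample, pass through a codec, upsample, then enhance."""
    low_res = downsample(img, factor)    # encoder side: reduce resolution
    decoded = codec(low_res)             # any third-party encode/decode round trip
    restored = upsample(decoded, factor) # decoder side: back to original size
    return enhance(restored)             # hypothetical enhancement network
```

Because the codec is passed in as a callable, the resampling and enhancement stages stay independent of it, which mirrors the compatibility claim in the abstract.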
ISBN: 9781509053162 (print)
Recent work has shown that image memorability, in general, can be reliably predicted using some state-of-the-art features. However, existing methods are not effective in predicting the memorability of natural-scene images and remain far from human performance. In this paper, we propose a novel method to improve the effectiveness of memorability prediction for natural-scene images. Specifically, we argue that some HSV colors have either a positive or a negative impact on the memorability of natural-scene images in our Natural Scene Image Memorability (NSIM) dataset. We then develop an HSV-based feature for memorability prediction. Finally, the HSV-based feature is combined with other efficient state-of-the-art features in our approach to predict the memorability of natural-scene images. Experimental results validate the effectiveness of our method.
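One simple way to build a color feature in HSV space is a normalised hue histogram. The sketch below is an assumption for illustration only: the abstract does not define the HSV-based feature, and the function name `hue_histogram` and the choice of 12 bins are hypothetical.

```python
import colorsys

def hue_histogram(pixels, bins=12):
    """Return a normalised hue histogram for an iterable of (r, g, b) pixels in 0..255."""
    counts = [0] * bins
    n = 0
    for r, g, b in pixels:
        # colorsys expects channel values in [0, 1] and returns hue in [0, 1)
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        idx = min(int(h * bins), bins - 1)  # clamp to guard against rounding
        counts[idx] += 1
        n += 1
    return [c / n for c in counts] if n else [0.0] * bins
```

A vector like this could be concatenated with other features before training a predictor; how the authors combine features is not stated in the abstract.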
ISBN: 9781509053162 (print)
This paper proposes a novel algorithm for compressive sensing (CS) reconstruction of color images. First, to better describe color image characteristics, we take inter-channel correlation into consideration and present two types of regularization: inter-channel correlation-based nonlocal low-rank (ICNL) regularization and inter-channel correlation-based total variation (ICTV) regularization. Both regularization terms are then incorporated into the minimization problem, and an efficient algorithm based on a split-Bregman technique is proposed to solve the joint formulation. To demonstrate the effectiveness of the proposed approach, it is compared with four benchmark methods in experiments carried out on several color images at different subrates.
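Split-Bregman solvers for total-variation problems typically rely on a soft-thresholding (shrinkage) step. The sketch below shows that operator together with a plain anisotropic total variation of a single channel. It does not reproduce the ICNL or ICTV terms, whose definitions are not given in the abstract; the function names are hypothetical.

```python
import numpy as np

def shrink(x, threshold):
    """Soft-thresholding: move each entry toward zero by threshold, clipping at zero."""
    return np.sign(x) * np.maximum(np.abs(x) - threshold, 0.0)

def total_variation(channel):
    """Anisotropic total variation: sum of absolute horizontal and vertical differences."""
    dx = np.diff(channel, axis=1)  # differences between horizontally adjacent pixels
    dy = np.diff(channel, axis=0)  # differences between vertically adjacent pixels
    return np.abs(dx).sum() + np.abs(dy).sum()
```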
ISBN: 9781509061839 (print)
In recent years, deep convolutional neural networks (DCNNs) have shown excellent performance in the image recognition field. A DCNN is a type of multi-layer neural network that can automatically obtain feature representations from input data. The Neocognitron, proposed by Kunihiko Fukushima in the 1980s, is a prototype of the DCNN. It was inspired by the hierarchical structure in the mammalian primary visual cortex. More recently, a method called Network In Network (NIN) has been proposed. This network architecture embeds translationally symmetric micro networks in a DCNN, and experiments have shown that it achieves higher classification accuracy than a conventional DCNN. However, NIN has not been analyzed from a physiological point of view as thoroughly as DCNNs have. We focused on the similarities between the processing of NIN, which accumulates the feature extraction filters of a DCNN, and a property of the mammalian visual system called "orientation continuity", in which the preferred orientations of neighboring cells change continuously, and pointed out the relationships between them. We also studied and pointed out the relevance of neurophysiological knowledge to the processing results obtained in the higher layers of NIN.
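The micro network in NIN is commonly implemented as a stack of 1x1 convolutions, which amounts to applying the same small multilayer perceptron to the channel vector at every spatial position. The sketch below illustrates that idea only; the name `mlpconv_1x1` and the layer sizes are hypothetical, and the preceding spatial convolution of a full NIN block is omitted.

```python
import numpy as np

def relu(x):
    """Rectified linear activation."""
    return np.maximum(x, 0.0)

def mlpconv_1x1(feature_map, weights, biases):
    """Apply a shared per-pixel MLP to a feature map of shape (H, W, C_in).

    weights is a list of (C_in, C_out) matrices and biases a list of (C_out,)
    vectors; each pair acts as one 1x1 convolution followed by ReLU.
    """
    out = feature_map
    for w, b in zip(weights, biases):
        out = relu(out @ w + b)  # matmul broadcasts over the two spatial axes
    return out
```

Because the same weights are used at every position, the operation is translation-equivariant, which is the sense in which the micro networks are translationally symmetric.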
ISBN: 9781509053162 (print)
The VLAD (vector of locally aggregated descriptors) representation, derived from BoF and the Fisher kernel, has shown its efficiency in the field of image search. However, assigning each local descriptor to a single codeword is a hard voting process, which does not consider the uncertainty and plausibility of that codeword. In this paper, we propose an approach that combines VLAD with locality-constrained linear coding and, unlike the original VLAD, considers several nearest neighbors when assigning local descriptors and computing weights. To evaluate the proposed method, experiments are conducted on several image classification benchmarks, using VLAD for comparison. The experimental results show that our method consistently outperforms VLAD in terms of classification accuracy, while producing a feature representation of the same dimension without much additional computational cost.
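A minimal sketch of the idea, assuming the commonly used approximated form of locality-constrained linear coding (solve a small regularised least-squares problem over the k nearest codewords, then normalise the weights to sum to one). The exact weighting in the paper may differ; `llc_weights`, `soft_vlad`, `k`, and `reg` are hypothetical names and parameters.

```python
import numpy as np

def llc_weights(x, neighbors, reg=1e-4):
    """Approximate LLC weights of descriptor x over its nearest codewords (rows)."""
    k = len(neighbors)
    z = neighbors - x                 # shift codewords so that x is the origin
    cov = z @ z.T                     # local covariance, shape (k, k)
    cov = cov + (reg * np.trace(cov) + 1e-12) * np.eye(k)  # keep it invertible
    w = np.linalg.solve(cov, np.ones(k))
    return w / w.sum()                # enforce the sum-to-one constraint

def soft_vlad(descriptors, codebook, k=3):
    """Aggregate weighted residuals to the k nearest codewords of each descriptor."""
    vlad = np.zeros(codebook.shape)
    for x in descriptors:
        dists = np.linalg.norm(codebook - x, axis=1)
        nearest = np.argsort(dists)[:k]           # indices of the k closest codewords
        w = llc_weights(x, codebook[nearest])
        vlad[nearest] += w[:, None] * (x - codebook[nearest])
    vlad = vlad.ravel()                           # same dimension as plain VLAD
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad
```

With `k=1` the weight is always 1, so the sketch reduces to ordinary hard-assignment VLAD, and the output length is the codebook size times the descriptor dimension in both cases.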
ISBN: 9781509053162 (print)
The key problem of frame rate up-conversion (FRUC) is to obtain true motion vectors (MVs), especially at motion boundaries. In this paper, we propose a novel FRUC algorithm based on motion-region segmentation. Based on the temporal consistency of regions, motion regions are determined by categorizing the true MVs of detected feature points. Then, constrained by the spatial smoothness of MVs within a region, the true motions are propagated to the entire frame. This motion-region segmentation-based method yields a more truthful motion vector field and better interpolated frames. Experiments show that, compared with state-of-the-art methods, the proposed algorithm produces videos of better quality in terms of both objective and subjective evaluation.
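The propagation step can be illustrated with a deliberately simplified sketch: given a region label map and sparse feature-point motion vectors, every pixel of a region receives the median vector of the feature points inside it. This is an assumed stand-in, not the authors' smoothness-constrained propagation, and the function name and inputs are hypothetical.

```python
import numpy as np

def propagate_motion(labels, points, vectors):
    """Fill each region with the median motion vector of its feature points.

    labels: (H, W) integer region map; points: list of (row, col) positions;
    vectors: (N, 2) motion vectors, one per feature point.
    """
    vectors = np.asarray(vectors, dtype=float)
    field = np.zeros(labels.shape + (2,))
    point_labels = np.array([labels[r, c] for r, c in points])
    for region in np.unique(labels):
        members = vectors[point_labels == region]
        if len(members):  # regions without feature points keep a zero vector
            field[labels == region] = np.median(members, axis=0)
    return field
```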
ISBN: 9781509053162 (print)
We propose a novel integrated framework that combines self-learning super-resolution (SR) with dual-learning noise reduction (NR) for compressed images. In contrast to existing learning-based denoising approaches, dual-learning-based joint SR and NR is achieved by adding a denoised training set. This makes the proposed framework more suitable for the noise in highly compressed images, because it can refer to closer patches in the training set. The framework is also robust to SR artifacts, since it is designed to learn a single process that performs NR and SR simultaneously. Experimental results show that the proposed joint SR and NR framework achieves higher objective and subjective quality than processing NR and SR individually.
ISBN: 9781509053162 (print)
We propose perceptual contrast enhancement of dark images based on textural coefficients. The textural coefficient indicates the degree of texture in the intensity and is used to adaptively stretch the dynamic range of an image. First, we calculate the gray-level differences between a central pixel and its adjacent pixels. Because some differences are already clearly noticeable to the human eye, we use only the unnoticeable differences to obtain the textural coefficient. We apply the just noticeable difference (JND) of the human visual system (HVS) to obtain a proper threshold. Then, we apply a Gaussian kernel to the textural coefficients to avoid excessive differences between adjacent ones. Finally, we perform optimal contrast tone mapping to obtain a mapping function. Experimental results show that the proposed method successfully enhances dark regions while avoiding over-enhancement in bright regions, without halo artifacts or tone distortion.
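The first steps above can be sketched as follows, under two loudly stated assumptions: the textural coefficient is taken to be the sum of sub-threshold absolute differences over the 8-neighbourhood, and the JND is replaced by a single fixed threshold (a real JND model would vary with background luminance). Neither choice is confirmed by the abstract, and the Gaussian smoothing and tone-mapping steps are omitted.

```python
import numpy as np

def textural_coefficient(gray, jnd_threshold=3.0):
    """Sum, per pixel, the absolute differences to its 8 neighbours that fall below the threshold."""
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")  # replicate borders so every pixel has 8 neighbours
    total = np.zeros_like(gray)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # skip the central pixel itself
            neighbor = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            diff = np.abs(gray - neighbor)
            # keep only differences assumed to be unnoticeable (below the threshold)
            total += np.where(diff < jnd_threshold, diff, 0.0)
    return total
```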
ISBN: 9781509053162 (print)
In a general 3D scene, a depth image is correlated with its corresponding color image, and many filtering methods have been proposed to improve the quality of depth images by exploiting this correlation. Unlike conventional methods, this paper jointly employs both depth and color information to improve the quality of a compressed depth image by means of iterative guidance. First, because of noise and blurring in the compressed image, depth pre-filtering is needed to remove artifact noise. Considering that the geometric structure received in the distorted depth image is more reliable than that of its color image, the color information is merged with the depth image to obtain a depth-merged color image. The depth image and its corresponding depth-merged color image are then used to refine the distorted depth image with a joint iterative guidance filtering method. In this way, the useful structural information contained in the distorted depth image is preserved by relying on the depth itself, while the corresponding color structural information is used to improve the quality of the depth image. We demonstrate the efficiency of the proposed filtering method by comparing the objective and visual quality of the synthesized images with those of many existing depth filtering methods.
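To illustrate the general idea of filtering a depth image under the guidance of a second image, the sketch below uses a joint bilateral filter. This is a swapped-in, well-known technique, not the joint iterative guidance filter of the paper; the function name and the parameters `radius`, `sigma_s`, and `sigma_r` are assumptions.

```python
import numpy as np

def joint_bilateral(depth, guide, radius=1, sigma_s=1.0, sigma_r=10.0):
    """Smooth depth with weights from spatial distance and guide-image similarity."""
    depth = np.asarray(depth, dtype=float)
    guide = np.asarray(guide, dtype=float)
    h, w = depth.shape
    pad_d = np.pad(depth, radius, mode="edge")
    pad_g = np.pad(guide, radius, mode="edge")
    num = np.zeros((h, w))
    den = np.zeros((h, w))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rows = slice(radius + dy, radius + dy + h)
            cols = slice(radius + dx, radius + dx + w)
            spatial = np.exp(-(dy * dy + dx * dx) / (2.0 * sigma_s ** 2))
            # neighbours that look different in the guide contribute little
            similarity = np.exp(-((guide - pad_g[rows, cols]) ** 2) / (2.0 * sigma_r ** 2))
            weight = spatial * similarity
            num += weight * pad_d[rows, cols]
            den += weight
    return num / den  # den >= 1 because the centre pixel always has weight 1
```

Calling the function repeatedly, feeding each output back in as the new depth input, gives a rough analogue of iterative guidance, though the paper's update rule is not described in the abstract.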
ISBN: 9781467399531 (print)
Massive visual traffic data have recently become available, which provides an opportunity for intelligent traffic analysis. Timely processing is particularly necessary for traffic analysis. In this paper, we study time-bounded aggregation analytics on large visual traffic data. We first find that the current MapReduce framework cannot work well because of two challenges: first, significant diversity exists in both data distributions and processing times; second, no a priori knowledge of these distributions and time costs is available. However, we also observe spatial and temporal locality in data values and processing times. Based on these observations, we design TaG, an augmented MapReduce framework for time-bounded traffic analytics jobs. In particular, we propose a novel sampling algorithm that exploits traffic data localities and stratifies samples based on data distributions and processing times. It runs in an iterative, adaptive manner without a priori knowledge. Moreover, we propose a heuristic scheduling algorithm that takes batch processing overhead into account. Further, we refine the load balancing mechanism based on the locality of data processing times in order to respect job time bounds. We implement TaG on Hadoop and conduct extensive experiments on a large traffic image dataset. Evaluations on different data sizes show that TaG is able to achieve high accuracy within different time bounds.
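The core of sampling-based aggregation can be illustrated with a textbook stratified estimator of a total. This sketch assumes the strata and per-stratum sample sizes are already known; TaG's iterative, adaptive stratification and its scheduling are not reproduced, and the function name and arguments are hypothetical.

```python
import random

def stratified_estimate(strata, sample_sizes, rng=None):
    """Estimate the sum over all strata from a random sample of each stratum.

    strata maps a stratum name to its list of values; sample_sizes maps the
    same names to the number of items to sample from that stratum.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    total = 0.0
    for name, values in strata.items():
        n = min(sample_sizes.get(name, 0), len(values))
        if n == 0:
            continue  # unsampled strata contribute nothing to this estimate
        sample = rng.sample(values, n)
        total += len(values) * (sum(sample) / n)  # scale the sample mean to the stratum size
    return total
```

Sampling more heavily from strata with high variance or low processing cost is the usual way to trade accuracy against a time bound, which is the kind of decision the abstract attributes to the sampling algorithm.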