ISBN: 9781538644584 (print)
This paper proposes a new image inpainting algorithm based on matrix rank minimization and the locally linear embedding (LLE) method. Assuming that the intensity values of an image belong to an unknown manifold that can be mapped into a linear subspace by an unknown function, the image inpainting problem is formulated as minimizing the sum of the ranks of submatrices of a Hankel matrix. To solve this problem, the paper modifies the iterative partial matrix shrinkage (IPMS) algorithm to obtain an inpainting algorithm. Numerical examples show that the proposed algorithm recovers missing pixels efficiently.
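The link between Hankel matrices and rank can be illustrated with a small one-dimensional sketch. It is not the paper's method: it does not implement IPMS or LLE, and the helper names `hankel` and `rank` are hypothetical. It only shows that a signal following a linear recurrence gives a low-rank Hankel matrix, and that a corrupted sample raises the rank, which is the property a rank-minimizing inpainting formulation can exploit.

```python
from fractions import Fraction


def hankel(signal, rows):
    """Build the Hankel matrix H[i][j] = signal[i + j] with `rows` rows."""
    cols = len(signal) - rows + 1
    if rows < 1 or cols < 1:
        raise ValueError("rows must be between 1 and len(signal)")
    return [[signal[i + j] for j in range(cols)] for i in range(rows)]


def rank(matrix):
    """Exact matrix rank via Gaussian elimination over rational numbers."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == n_rows:
            break
    return r


# A linear ramp obeys x[k+2] = 2*x[k+1] - x[k], so its Hankel matrix has low rank.
clean = [1, 2, 3, 4, 5, 6, 7]
# Illustrative corruption: one sample replaced by an arbitrary wrong value.
corrupted = [1, 2, 3, 9, 5, 6, 7]

clean_rank = rank(hankel(clean, 3))
corrupted_rank = rank(hankel(corrupted, 3))
```

In this toy case the clean signal yields a lower rank than the corrupted one, so choosing the missing value that minimizes the rank would recover the ramp.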
ISBN: 9781538661192 (print)
In this paper, we propose a novel hierarchical subspace regression algorithm based on the edge orientations of compressed face image patches. It consists of two phases: training and restoration. In the training phase, the distribution of face edge orientations (EO) is used to classify the image patches into shallow subspaces. Then, k-means clustering is used to partition each EO-based shallow subspace into deep subspaces, and a linear mapping is trained for each deep subspace. In the restoration phase, an appropriate linear mapping, selected based on the EO of the compressed input image patch, is applied to generate the restored output image patch. The experimental results show that the method achieves better peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) than existing popular algorithms, and that it effectively removes blocking artifacts and zigzag effects, improving visual quality.
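For readers unfamiliar with the PSNR metric mentioned above, the following is a minimal sketch of the standard definition, 10·log10(MAX² / MSE). The function name `psnr`, the nested-list image representation, and the 8-bit peak value of 255 are assumptions for illustration; the abstract does not describe the evaluation code.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images.

    Images are given as lists of rows; max_value is the assumed peak intensity.
    """
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of rows")
    squared_error = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        if len(ref_row) != len(res_row):
            raise ValueError("rows must have the same length")
        for r, s in zip(ref_row, res_row):
            squared_error += (r - s) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the restored patch is closer to the reference; identical images give infinity.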
ISBN: 9781538644584 (print)
Tile-based viewport adaptation has recently become a popular method for 360-degree video streaming. Our demonstration adopts a rate-mixed transmission approach that uses a VR adaptation agent at the server end for viewport-based streaming; the approach is client-compatible and scales to different users. Field-of-view (FOV) prediction is applied to improve the viewing experience. We verify the feasibility of the system on head-mounted displays and show that it can reduce bandwidth consumption by up to 36%.
ISBN: 9781538644584 (print)
In this paper, we optimize a Linphone-based real-time video communication system. First, we replace H.264 with HEVC to reduce bandwidth pressure, and we reduce the buffer delay by configuring appropriate encoding parameters. Second, we add a bitrate adaptation algorithm based on additive increase and multiplicative decrease (AIMD) to the system, which effectively reduces the packet loss rate in networks with large bandwidth fluctuations.
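The AIMD rule itself is simple enough to sketch. The version below is a generic illustration, not the paper's algorithm or any Linphone API: the function name `aimd_update`, the loss signal, the step size, the decrease factor, and the bitrate limits are all hypothetical values chosen for the example.

```python
def aimd_update(bitrate_kbps, loss_detected,
                increase_kbps=50.0, decrease_factor=0.5,
                min_kbps=100.0, max_kbps=4000.0):
    """Return the next target bitrate under additive increase, multiplicative decrease.

    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    if loss_detected:
        # Multiplicative decrease: back off quickly when the network reports loss.
        candidate = bitrate_kbps * decrease_factor
    else:
        # Additive increase: probe for spare bandwidth in small steps.
        candidate = bitrate_kbps + increase_kbps
    # Clamp to the range the encoder is assumed to support.
    return max(min_kbps, min(max_kbps, candidate))


def simulate(start_kbps, loss_events):
    """Apply the update once per feedback interval and return the bitrate trace."""
    trace = [start_kbps]
    for lost in loss_events:
        trace.append(aimd_update(trace[-1], lost))
    return trace
```

Calling `simulate(1000.0, [False, False, True, False])` steps the rate up twice, halves it on the loss event, then resumes the slow increase, which produces the sawtooth pattern typical of AIMD.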
In this paper, we focus on the problem of unsupervised domain adaptation for semantic segmentation. Previous works usually focus on adversarial learning at either the pixel level or the feature level. However, global structure knowledge is often neglected in adversarial learning, for two possible reasons. First, the result of pixel-level adversarial learning does not necessarily preserve the semantic consistency of the input image. Second, global structure knowledge is not embedded to regularize feature-level adversarial learning. In this work, we propose a framework for unsupervised domain adaptation in semantic segmentation that effectively incorporates pixel-level adversarial learning, feature-level adversarial learning, and a self-training strategy. Our framework embeds global structure knowledge into the adversarial training step to tackle the problem of structure misalignment. Consequently, the proposed framework achieves state-of-the-art semantic segmentation domain adaptation results on the task of transferring from GTA5 to Cityscapes.
ISBN: 9781538644584 (print)
Taking advantage of the popularity of deep convolutional neural networks (CNNs), we have developed a very simple image quality assessment method that rivals the state of the art. We show that the convolutional layer outputs (deep features) of a CNN capture the local structural information of spatial regions of different sizes in the input image. The learned convolutional kernels contain a much richer set of weights and thus capture much more local structural information than hand-crafted ones. Because deep features learned from large datasets already contain very rich multi-resolution structural image information, they can be used directly to calculate the visual distortion of an image, without introducing further complicated computation. We present experimental results demonstrating that this is indeed the case, and that a simple cosine distance between deep features is as good as state-of-the-art methods for full-reference image quality assessment.
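The distance computation can be sketched as follows. Feature extraction is out of scope here: the two input vectors stand in for deep features of the reference and distorted images, and how the paper flattens or pools feature maps is not stated in the abstract. The function name `cosine_distance` is an assumption for illustration.

```python
import math


def cosine_distance(features_a, features_b):
    """Return 1 - cosine similarity between two equal-length feature vectors."""
    if len(features_a) != len(features_b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(features_a, features_b))
    norm_a = math.sqrt(sum(x * x for x in features_a))
    norm_b = math.sqrt(sum(y * y for y in features_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)
```

A distance near 0 means the two feature vectors point in almost the same direction, which under this scheme would be read as low visual distortion.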
Blind image quality assessment (BIQA) metrics based on deep neural networks (DNNs) currently achieve the best evaluation accuracy, and the depth of the network plays a crucial role in deep learning-based BIQA metrics. However, training a DNN for quality assessment is known to be hard because labeled data are scarce, and obtaining quality labels for a large number of images is very time-consuming and costly. Training a deep BIQA metric directly is therefore very likely to lead to over-fitting. To solve this problem, we introduce a weakly supervised approach for learning a deep BIQA metric. First, we pre-train a novel encoder-decoder architecture on training data with weak quality annotations. The annotation is the error map between the distorted image and its undistorted version, which roughly describes the distribution of distortion and can be easily acquired for training. Next, we fine-tune the pre-trained encoder on the quality-labeled data set. Moreover, we use group convolution to reduce the number of parameters of the proposed metric and further reduce the risk of over-fitting. These training strategies, which reduce the risk of over-fitting, enable us to build a very deep neural network for BIQA with better performance. Experimental results show that the proposed model achieves state-of-the-art performance on images with different distortion types.
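A weak annotation of this kind is cheap to compute whenever the undistorted image is available. The sketch below uses a per-pixel absolute difference; the abstract does not specify the exact form of the error map, so the absolute difference, the function name `error_map`, and the nested-list image format are all assumptions.

```python
def error_map(distorted, reference):
    """Per-pixel absolute difference between two equally sized grayscale images.

    The absolute difference is an assumed choice; a squared error would work similarly.
    """
    if len(distorted) != len(reference):
        raise ValueError("images must have the same number of rows")
    result = []
    for dist_row, ref_row in zip(distorted, reference):
        if len(dist_row) != len(ref_row):
            raise ValueError("rows must have the same length")
        result.append([abs(d - r) for d, r in zip(dist_row, ref_row)])
    return result
```

Large values in the map mark where the distortion is concentrated, which is the spatial signal the pre-training target is meant to carry.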
The task of person re-identification (re-id) is to find the same pedestrian across non-overlapping cameras. The performance of person re-id is often affected by background clutter, yet existing segmentation algorithms struggle to obtain perfect foreground person images. To effectively leverage the body (foreground) cue while still paying attention to discriminative information in the background (e.g., a companion or vehicle), we propose a cross-learning strategy that takes both the foreground and other discriminative information into account. In addition, since existing foreground segmentation results always contain noise, we use Label Smoothing Regularization (LSR) to strengthen generalization during learning. In our experiments, we select two state-of-the-art person re-id methods to verify the effectiveness of the proposed cross-learning strategy. The experiments are carried out on two publicly available person re-id datasets, and clear performance improvements are observed on both.
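Label smoothing replaces the one-hot training target with a mixture of the one-hot vector and a uniform distribution: q(k) = (1 - ε)·[k = y] + ε/K for K classes. A minimal sketch of that standard formula follows; the function name `smooth_labels` and the default ε = 0.1 are assumptions, since the abstract gives no hyperparameters.

```python
def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Return the label-smoothed target distribution for one training sample.

    epsilon is the smoothing strength; 0.1 is an illustrative default only.
    """
    if not 0 <= true_class < num_classes:
        raise ValueError("true_class must be a valid class index")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    uniform_share = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform_share if k == true_class else uniform_share
        for k in range(num_classes)
    ]
```

Because every class keeps a small share of the probability mass, the network is penalized less for being unsure on samples whose labels or inputs are noisy.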
ISBN: 9781538644584 (print)
In this paper, we propose a near-duplicate image retrieval method based on multiple features. Combining the deep features extracted from the VGG relu6 layer with improved local feature descriptors, we attempt to simulate the near-duplicate image retrieval process of the human brain through a two-layer retrieval structure. Inspired by the CROW feature, we calculate weights on a shallow VGG pooling layer and extract regions of interest for screening SURF feature points. In addition, a center weight is proposed to improve the VLAD algorithm. Experiments show that our method retrieves not only results that are visually similar to the query image, but also results that contain the visually prominent parts of the image.
Image upscaling to obtain high-quality digital images is an active research topic because of its applications in the consumer electronics industry. Traditional image upscaling techniques have low computational complexity and are suitable for real-time processing, but the reconstructed image often contains artifacts and undesirable visual effects. The relationship between image interpolation and super-resolution leads us to assume that the interpolated image can be further optimized and may be considered part of a super-resolution algorithm. In this paper, we propose a new image super-resolution method that combines fast image interpolation with iterative back-projection. The method does not require any external pre-trained datasets and has low computation time, while the quality of the reconstructed image is comparable to that of high-complexity methods such as dictionary-based methods and deep convolutional neural networks.
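The idea of refining an interpolated image by back-projection can be sketched in one dimension. This is a toy illustration under stated assumptions, not the paper's method: the degradation model is taken to be factor-2 averaging, the initial estimate is plain linear interpolation, and all function names are hypothetical.

```python
def downsample(signal):
    """Assumed degradation model: average each adjacent pair (factor 2)."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal) - 1, 2)]


def upsample_repeat(signal):
    """Spread each low-resolution value over the two samples it covers."""
    return [value for value in signal for _ in range(2)]


def upsample_linear(signal):
    """Factor-2 linear interpolation; the last sample is held at the boundary."""
    out = []
    for i, value in enumerate(signal):
        following = signal[i + 1] if i + 1 < len(signal) else value
        out.append(float(value))
        out.append((value + following) / 2.0)
    return out


def back_project(low_res, iterations=5, step=1.0):
    """Refine an interpolated estimate so that it re-degrades to the observed input."""
    estimate = upsample_linear(low_res)  # fast initial guess
    for _ in range(iterations):
        simulated = downsample(estimate)  # what the estimate would look like at low resolution
        error = [obs - sim for obs, sim in zip(low_res, simulated)]
        correction = upsample_repeat(error)  # project the error back to high resolution
        estimate = [e + step * c for e, c in zip(estimate, correction)]
    return estimate
```

After refinement, downsampling the estimate reproduces the observed low-resolution signal, which plain interpolation alone does not guarantee.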