ISBN (Print): 0819444111
The performance of superresolution video enhancement relies heavily on the robustness and accuracy of motion estimation. In this paper, we propose a novel and efficient block matching motion estimation algorithm suitable for estimating the general motion present in low-resolution video frames. We exploit the spatial correlations between motion vectors and apply a coarse-to-fine multi-stage scheme to obtain dense motion fields. We incorporate our motion estimation technique into the Projection Onto Convex Sets (POCS) superresolution framework. Experimental results show that the resulting high-resolution images are visually sharper, with significant PSNR improvement.
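The core of any block matching scheme is an exhaustive search for the displacement minimizing a block-difference criterion. A minimal sketch of single-stage full-search matching with a sum-of-absolute-differences (SAD) cost is shown below; the paper's method additionally exploits spatial motion-vector correlation and a coarse-to-fine multi-stage refinement, which this sketch omits.

```python
import numpy as np

def block_match(ref, cur, block=8, search=3):
    """For each block of `cur`, find the displacement (dy, dx) into `ref`
    minimizing the SAD cost over a +/- `search` window (full search)."""
    h, w = cur.shape
    mvs = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            y0, x0 = by * block, bx * block
            blk = cur[y0:y0 + block, x0:x0 + block]
            best, best_mv = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = y0 + dy, x0 + dx
                    if y < 0 or x < 0 or y + block > h or x + block > w:
                        continue  # candidate block falls outside the frame
                    sad = np.abs(ref[y:y + block, x:x + block] - blk).sum()
                    if sad < best:
                        best, best_mv = sad, (dy, dx)
            mvs[by, bx] = best_mv
    return mvs
```

A coarse-to-fine variant would run this on downsampled frames first and use the upscaled vectors to seed a narrower search at full resolution.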
ISBN (Print): 9781665475921
A tensor display is a type of 3D light field display composed of multiple transparent screens and a backlight; it can render a scene with correct depth, allowing a 3D scene to be viewed without glasses. Analyses of state-of-the-art tensor displays assume that the content is Lambertian. To extend their capabilities, we analyze the limitations of displaying non-Lambertian scenes and propose a new method to factorize non-Lambertian scenes using disparity analysis. Moreover, we demonstrate a new tensor display prototype with three layers of full-HD content at 60 fps. The evaluation results verify that, compared with the state of the art, the proposed non-Lambertian rendering method achieves higher quality for non-Lambertian scenes, both in simulation and on the prototyped tensor display.
ISBN (Print): 9781728185514
With the rapid development of whole-brain imaging technology, large numbers of brain images are being produced, creating a strong demand for efficient brain image compression methods. At present, the most commonly used compression methods are all based on the 3-D wavelet transform, such as JP3D. However, traditional 3-D wavelet transforms are designed manually under certain assumptions about the signal, and brain images are not as ideal as assumed; moreover, the transforms are not directly optimized for the compression task. To solve these problems, we propose a trainable 3-D wavelet transform based on the lifting scheme, in which the predict and update steps are replaced by 3-D convolutional neural networks. The proposed transform is then embedded into an end-to-end compression scheme called iWave3D, which is trained on a large number of brain images to directly minimize the rate-distortion loss. Experimental results demonstrate that our method outperforms JP3D significantly, by 2.012 dB in terms of average BD-PSNR.
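A key property of the lifting scheme that makes it attractive for learned transforms is that it is invertible for *any* choice of predict and update operators, so they can be replaced by trained networks without losing perfect reconstruction. A 1-D sketch with plain callables in place of the paper's 3-D CNNs (the Haar-like predict/update below are illustrative stand-ins, not the learned operators):

```python
import numpy as np

def lifting_forward(x, predict, update):
    """One lifting step: split into even/odd samples, predict the odds
    from the evens, then update the evens from the prediction residual."""
    even, odd = x[::2], x[1::2]
    detail = odd - predict(even)     # high-pass residual
    approx = even + update(detail)   # low-pass approximation
    return approx, detail

def lifting_inverse(approx, detail, predict, update):
    """Undo the two lifting steps in reverse order; exact for any
    predict/update, which is why CNNs can be dropped in."""
    even = approx - update(detail)
    odd = detail + predict(even)
    x = np.empty(even.size + odd.size)
    x[::2], x[1::2] = even, odd
    return x
```

In iWave3D the same structure is applied along all three axes of the volume, with the two callables realized as 3-D convolutional networks trained end-to-end against the rate-distortion loss.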
ISBN (Print): 9781728180687
Car counting in drone-based images is a challenging task in computer vision. Most advanced counting methods are based on density maps. Usually, density maps are first generated by convolving ground-truth point maps with a Gaussian kernel for later model learning (generation); the counting network then learns to predict density maps from input images (estimation). Most studies focus on the estimation problem while overlooking the generation problem. In this paper, a training framework is proposed that learns to generate density maps and trains the generation and estimation subnetworks jointly. Experiments demonstrate that our method outperforms other density-map-based methods and shows the best performance on drone-based car counting.
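The fixed-kernel generation step this paper replaces is easy to sketch: stamp a normalized Gaussian at each annotated car center, so the map integrates to the object count. The function below is that classical baseline (the paper's contribution is to learn this step jointly instead of fixing sigma by hand):

```python
import numpy as np

def density_map(points, shape, sigma=2.0, radius=8):
    """Convolve a ground-truth point map with a normalized Gaussian.
    Because each stamp sums to 1, the map's integral equals the count
    (for points at least `radius` pixels from the border)."""
    ax = np.arange(-radius, radius + 1)
    yy, xx = np.meshgrid(ax, ax, indexing="ij")
    kernel = np.exp(-(yy**2 + xx**2) / (2 * sigma**2))
    kernel /= kernel.sum()
    H, W = shape
    dm = np.zeros(shape)
    for y, x in points:
        # clip the stamp to the image bounds
        y0, y1 = max(0, y - radius), min(H, y + radius + 1)
        x0, x1 = max(0, x - radius), min(W, x + radius + 1)
        dm[y0:y1, x0:x1] += kernel[y0 - (y - radius):y1 - (y - radius),
                                   x0 - (x - radius):x1 - (x - radius)]
    return dm
```

The count is then recovered as `density_map(...).sum()`, which is exactly the quantity a counting network is trained to reproduce.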
ISBN (Print): 0780374029
In this article, a novel optimized image processing algorithm that replicates the fast motion perception of the human visual system [1], [2] is presented. Rapidly approaching objects are detected using a CMOS pixel array with internal parallel image processing. The new algorithm is discussed with several examples. The response time is about 10 μs.
ISBN (Print): 9798331529543; 9798331529550
Previous Deepfake detection methods perform well within their training domains, but their effectiveness diminishes significantly with new synthesis techniques. Recent studies have revealed that detection models draw decision boundaries based on facial identity instead of synthetic artifacts, leading to poor cross-domain performance. To address this issue, we propose FRIDAY, a novel training method that attenuates facial identity using a face recognizer. Specifically, we first train a face recognizer using the same backbone as the Deepfake detector. We then freeze the recognizer and use it during the detector's training to suppress facial identity information. This is achieved by feeding input images into both the recognizer and the detector, then minimizing the similarity of their feature embeddings using our Facial Identity Attenuating loss. This encourages the detector to produce embeddings distinct from the recognizer's, effectively attenuating facial identity. Comprehensive experiments demonstrate that our approach significantly improves detection performance on both in-domain and cross-domain datasets.
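The idea of "minimizing the similarity of feature embeddings" can be illustrated with a cosine-similarity penalty between the detector's and the frozen recognizer's embeddings. The exact formulation of the paper's Facial Identity Attenuating loss is not given in the abstract, so the form below is an assumption chosen only to show the mechanism:

```python
import numpy as np

def identity_attenuating_loss(det_emb, rec_emb):
    """Assumed sketch of an identity-attenuating penalty: the mean
    absolute cosine similarity between detector embeddings and the
    frozen face recognizer's embeddings. Minimizing it pushes the
    detector toward features orthogonal to facial identity.
    Inputs: (batch, dim) arrays."""
    d = det_emb / np.linalg.norm(det_emb, axis=-1, keepdims=True)
    r = rec_emb / np.linalg.norm(rec_emb, axis=-1, keepdims=True)
    cos = (d * r).sum(axis=-1)
    return np.abs(cos).mean()  # 0 when embeddings are orthogonal
```

In training this term would be added to the ordinary real/fake classification loss, with gradients flowing only into the detector since the recognizer is frozen.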
ISBN (Print): 9781728180687
Access to technologies such as mobile phones has contributed to a significant increase in the volume of digital visual data (images and videos). In addition, photo editing software is becoming increasingly powerful and easy to use. In some cases, these tools can be used to produce forgeries intended to change the semantic meaning of a photo or video (e.g., fake news). Digital image forensics (DIF) has two main objectives: the detection (and localization) of forgery, and the identification of the origin of the acquisition (i.e., sensor identification). Since 2005, many classical methods for DIF have been designed, implemented, and tested on several databases. Meanwhile, innovative approaches based on deep learning have emerged in other fields and have surpassed traditional techniques. In the context of DIF, deep learning methods mainly use convolutional neural networks (CNNs) combined with substantial preprocessing modules. This is an active domain, and two ways of applying preprocessing have been studied: prior to the network, or incorporated into it. None of the existing studies on digital image forensics provides a comprehensive overview of the preprocessing techniques used with deep learning methods. The core objective of this article is therefore to review the preprocessing modules associated with CNN models.
ISBN (Print): 9781665475921
In recent years, many deep convolutional neural networks have been successfully applied to single image super-resolution (SISR). Even when using small convolution kernels, those methods still require a large number of parameters and a large amount of computation. To tackle this problem, we propose a novel framework that extracts features more efficiently. Inspired by the idea of depthwise separable convolution, we improve the standard residual block and propose the inverted bottleneck block (IBNB). The IBNB replaces the small-sized convolution kernel with a large-sized convolution kernel without introducing additional computation. The proposed IBNB shows that large-kernel convolution is viable for SISR. Comprehensive experiments demonstrate that our method surpasses most methods by 0.10~0.32 dB in quantitative metrics with fewer parameters.
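Why a large kernel need not cost more can be checked with simple parameter arithmetic: a depthwise large-kernel convolution plus pointwise mixing (the inverted-bottleneck pattern; the exact IBNB layout is an assumption here, with an assumed expansion ratio of 4) is cheaper than two dense 3x3 convolutions at the same width.

```python
def conv_params(cin, cout, k, groups=1):
    """Parameter count of a k x k convolution layer (bias ignored)."""
    return cin * cout * k * k // groups

C = 64  # channel width, chosen for illustration

# Standard residual block: two dense 3x3 convolutions.
standard = 2 * conv_params(C, C, 3)

# Assumed inverted-bottleneck layout: a 7x7 depthwise convolution
# followed by 1x1 expand (4x) and 1x1 project convolutions.
ibnb = (conv_params(C, C, 7, groups=C)      # depthwise 7x7
        + conv_params(C, 4 * C, 1)          # pointwise expand
        + conv_params(4 * C, C, 1))         # pointwise project
```

Under these assumptions the large-kernel block uses roughly half the parameters of the standard residual block, which is the budget that makes large kernels affordable for SISR.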
ISBN (Print): 9798331529543; 9798331529550
With the advancement of deep learning techniques, learned image compression (LIC) has surpassed traditional compression methods. However, these methods typically require training separate models to achieve optimal rate-distortion performance, leading to increased time and resource consumption. To tackle this challenge, we propose leveraging multi-gain and inverse multi-gain unit pairs to enable variable rate adaptation within a single model. Nevertheless, experiments have shown that rate-distortion performance may degrade at certain bitrates. Therefore, we introduce weighted probability assignment, where different selection probabilities are assigned during training based on lambda values, to increase the model's training frequency under specific bitrate conditions. To validate our approach, extensive experiments were conducted on Transformer-based and CNN-based models. The experimental results validate the efficiency of our proposed method.
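A multi-gain / inverse multi-gain pair can be pictured as a bank of per-channel scaling vectors, one per target rate, applied to the latent before quantization and inverted after decoding. The sketch below illustrates that mechanism only; the class name, initialization, and shapes are assumptions, and the real units are learned jointly with the codec:

```python
import numpy as np

class GainUnit:
    """Assumed sketch of a multi-gain / inverse multi-gain pair: one
    positive per-channel gain vector per supported rate point. Scaling
    the latent up before quantization spends more bits on it (higher
    rate); the inverse unit undoes the scaling at the decoder."""

    def __init__(self, n_rates, channels, rng=None):
        rng = rng or np.random.default_rng(0)
        # exp(.) keeps every gain strictly positive
        self.gains = np.exp(rng.normal(size=(n_rates, channels)))

    def forward(self, latent, rate_idx):
        return latent * self.gains[rate_idx]   # encoder side

    def inverse(self, latent, rate_idx):
        return latent / self.gains[rate_idx]   # decoder side
```

The paper's weighted probability assignment would then sample `rate_idx` non-uniformly during training, visiting the bitrates whose rate-distortion performance lags more often.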
ISBN (Print): 0819444111
This paper presents a simple technique for estimating the spatial location from which a given image was taken. The basic assumption is that the scene portrayed in the image is planar. The method is based on the acquisition of a new set of images closely resembling the given image. The location is recovered from the parameters describing the camera's pose during the acquisition of the new image that shows the highest degree of correlation with the original image. An example application of this technique is discussed in the paper.
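The selection step amounts to scoring each newly acquired image against the query with a normalized correlation measure and returning the best match, whose recorded camera pose becomes the location estimate. A minimal sketch of that selection (the correlation measure used in the paper is not specified in the abstract; zero-mean normalized cross-correlation is assumed here):

```python
import numpy as np

def best_pose_index(query, candidates):
    """Return the index of the candidate image with the highest
    zero-mean normalized cross-correlation against `query`; the pose
    recorded for that acquisition is the location estimate."""
    q = (query - query.mean()) / query.std()
    best_i, best_c = -1, -np.inf
    for i, img in enumerate(candidates):
        c = (img - img.mean()) / img.std()
        corr = (q * c).mean()  # in [-1, 1], 1 for a perfect match
        if corr > best_c:
            best_i, best_c = i, corr
    return best_i
```

In practice each candidate would be captured at a known, logged camera pose, so `best_pose_index` maps straight to a position in space.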