As an important biometric feature of every person, face data has faced serious risks of leakage in recent years. Lawbreakers can use face recognition systems (FRS) to analyze the leaked face data and then correlate ot...
The recent state of the art on monocular 3D face reconstruction from image data has made some impressive advancements, thanks to the advent of Deep Learning. However, it has mostly focused on input coming from a singl...
ISBN: 9781665448994 (print)
In this paper, an image processing method called VRHI is developed to enhance single hazy images. More specifically, inspired by the visual characteristics of haze, a haze density estimation model is designed to predict the haze distribution. Based on this estimated haze distribution, a quadtree-based recursive strategy is then proposed to locate the atmospheric light. Finally, by combining a global adjustment mechanism with the atmospheric scattering model, the haze in an image can be removed using the estimated parameters. Notably, VRHI searches for the unknown parameters over the whole image, thereby avoiding undesirable artifacts such as over-enhancement and color distortion. Extensive experiments on real-world images and well-known dehazing datasets show that VRHI outperforms state-of-the-art techniques in robustness and effectiveness.
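The two classical ingredients named in this abstract, a quadtree search for the atmospheric light and inversion of the atmospheric scattering model I = J·t + A·(1 − t), can be illustrated with a minimal sketch. This is not the VRHI implementation: the quadrant score used here (mean brightness) is an assumed stand-in for the paper's haze density estimate, and the function names and the `t_min` floor are hypothetical.

```python
def recover_scene(hazy, transmission, airlight, t_min=0.1):
    """Invert I = J*t + A*(1 - t) per pixel for a grayscale image in [0, 1]."""
    out = []
    for row_i, row_t in zip(hazy, transmission):
        out.append([
            # Floor the transmission (assumed t_min) to avoid amplifying noise.
            min(1.0, max(0.0, (i - airlight) / max(t, t_min) + airlight))
            for i, t in zip(row_i, row_t)
        ])
    return out


def quadtree_airlight(img, min_size=2):
    """Descend into the brightest quadrant until the block is small; return its mean."""
    top, left, h, w = 0, 0, len(img), len(img[0])

    def block_mean(t, l, bh, bw):
        total = sum(img[r][c] for r in range(t, t + bh) for c in range(l, l + bw))
        return total / (bh * bw)

    while h > min_size and w > min_size:
        hh, hw = h // 2, w // 2
        quads = [
            (top, left, hh, hw),
            (top, left + hw, hh, w - hw),
            (top + hh, left, h - hh, hw),
            (top + hh, left + hw, h - hh, w - hw),
        ]
        # Assumed score: mean brightness (the paper uses its haze density model).
        top, left, h, w = max(quads, key=lambda q: block_mean(*q))
    return block_mean(top, left, h, w)
```

Given a hazy image and a transmission map, `quadtree_airlight` supplies the value A that `recover_scene` needs; how the transmission is estimated is left open here.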
ISBN: 9798350365474 (digital)
ISBN: 9798350365481 (print)
We introduce FlowIBR, a novel approach for efficient monocular novel view synthesis of dynamic scenes. Existing techniques already show impressive rendering quality but tend to focus on optimization within a single scene without leveraging prior knowledge, resulting in long optimization times per scene. FlowIBR circumvents this limitation by integrating a neural image-based rendering method, pretrained on a large corpus of widely available static scenes, with a per-scene optimized scene flow field. Utilizing this flow field, we bend the camera rays to counteract the scene dynamics, thereby presenting the dynamic scene as if it were static to the rendering network. The proposed method reduces per-scene optimization time by an order of magnitude while achieving rendering quality comparable to existing methods, all on a single consumer-grade GPU.
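The ray-bending idea can be sketched as follows: points sampled along a straight camera ray are displaced by a scene flow field before being handed to a renderer that assumes a static scene. This is only an illustration under assumptions; `bend_ray`, its arguments, and the `flow` callable are hypothetical names, and the actual flow parameterization in FlowIBR is not described in the abstract.

```python
def bend_ray(origin, direction, depths, flow, time):
    """Sample points along a camera ray and displace each one by the scene flow."""
    bent = []
    for s in depths:
        # Point on the straight ray at depth s.
        point = tuple(o + s * d for o, d in zip(origin, direction))
        # Hypothetical flow field: maps (point, time) to a 3D offset.
        offset = flow(point, time)
        bent.append(tuple(p + f for p, f in zip(point, offset)))
    return bent
```

With a zero flow field the ray stays straight, which corresponds to the static-scene case the pretrained renderer was trained on.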
ISBN: 9781665445092 (print)
Instance segmentation, the task of identifying and separating each individual object of interest in an image, is an actively studied research topic in computer vision. Although many feed-forward networks produce high-quality binary segmentation on different types of images, their final result relies heavily on the post-processing step, which separates instances from the binary mask. In comparison, existing iterative methods extract a single object at a time using discriminative knowledge-based properties (e.g., shapes and boundaries) without relying on post-processing. However, they do not scale well to a large number of objects. To exploit the advantages of conventional sequential segmentation methods without impairing scalability, we propose a novel iterative deep reinforcement learning agent that learns how to differentiate multiple objects in parallel. By constructing a relational graph between pixels, we design a reward function that encourages separating pixels of different objects and grouping pixels that belong to the same instance. We demonstrate that the proposed method can efficiently perform instance segmentation of many objects without heavy post-processing.
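A reward of the kind described, one that favors grouping pixels of the same instance and separating pixels of different instances, can be sketched over the edges of a pixel graph. The exact reward in the paper is not given in the abstract; the pairwise-agreement form below, and the names `pairwise_reward`, `pred`, `truth`, and `edges`, are assumptions for illustration.

```python
def pairwise_reward(pred, truth, edges):
    """Fraction of graph edges whose same/different relation matches the ground truth."""
    agree = sum(
        # An edge is rewarded when the prediction and the ground truth agree on
        # whether its two pixels belong to the same instance.
        (pred[i] == pred[j]) == (truth[i] == truth[j])
        for i, j in edges
    )
    return agree / len(edges)
```

Because only same/different relations are compared, the reward does not depend on which label value is assigned to which instance.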
Image-to-image translation is an important and challenging problem in computer vision and image processing. Diffusion models (DM) have shown great potential for high-quality image synthesis and have achieved competitive performance on the task of image-to-image translation. However, most existing diffusion models treat image-to-image translation as a conditional generation process and suffer heavily from the gap between distinct domains. In this paper, a novel image-to-image translation method based on the Brownian Bridge Diffusion Model (BBDM) is proposed, which models image-to-image translation as a stochastic Brownian Bridge process and learns the translation between two domains directly through a bidirectional diffusion process rather than a conditional generation process. To the best of our knowledge, it is the first work that proposes a Brownian Bridge diffusion process for image-to-image translation. Experimental results on various benchmarks demonstrate that the proposed BBDM achieves competitive performance in both visual inspection and quantitative metrics.
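For intuition, a textbook Brownian bridge pinned at a source sample x0 (at t = 0) and a target sample y (at t = T) has mean (1 − t/T)·x0 + (t/T)·y and variance t(T − t)/T, so the noise vanishes at both endpoints. The sketch below samples from that textbook process; the variance schedule actually used in BBDM may differ, and `bridge_sample` and its `scale` parameter are hypothetical.

```python
import math
import random


def bridge_sample(x0, y, t, T, scale=1.0, rng=random):
    """Sample x_t from a Brownian bridge pinned at x0 (t=0) and y (t=T)."""
    m = t / T  # interpolation weight: 0 at x0, 1 at y
    # Textbook bridge variance (assumed schedule); zero at both endpoints.
    std = math.sqrt(scale * t * (T - t) / T)
    return [(1.0 - m) * a + m * b + std * rng.gauss(0.0, 1.0) for a, b in zip(x0, y)]
```

Unlike a standard diffusion forward process, which ends in pure noise, this process ends exactly at the target-domain sample, which is what allows the translation to be learned between the two domains directly.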
Optimal transport (OT) is a rising research area to overcome distribution shifts in real-world data, which has been widely applied in visual signal processing tasks due to its appealing mathematical properties. Howeve...
ISBN: 9798350365474 (digital)
ISBN: 9798350365481 (print)
Most mobile device Image Signal Processing (ISP) pipelines operate directly on RAW image data for all processing tasks. However, the rise of super-high-resolution cameras on mobile devices has led to increased memory demands for multi-frame ISP pipelines. In this work, we introduce a novel ISP pipeline that operates on a learned compressed domain, aiming to conserve memory for the inputs of downstream ISP modules. We utilize RGB image compression to define a compressed latent domain, preserving both semantic information and high-frequency details. To facilitate mapping of raw images to the compressed domain, we develop a transfer learning strategy. All downstream processing tasks, including demosaicing, single- and multi-frame denoising, and registration, are performed on this compressed latent domain. We demonstrate the effectiveness of our compressed domain ISP pipeline on both public and internal datasets. Remarkably, our pipeline achieves ISP performance similar to non-compression methods while significantly reducing mobile memory requirements.
ISBN: 9781665448994 (print)
Multimedia services are constantly trying to deliver better image quality to users. To meet this need, they must have an effective and reliable tool to assess perceptual image quality. This is particularly true for image restoration (IR) algorithms, where the image quality assessment (IQA) metric plays a key role in their development. For instance, recent advances in IR algorithms, which are mainly due to the adoption of generative adversarial network (GAN)-based methods, have clearly shown the need for a reliable IQA metric that is highly correlated with human judgment. In this paper, we propose an ensemble of gradient boosting (EGB) metric based on selected feature similarities and ensemble learning. First, we analyzed the capability of features extracted by different layers of a deep convolutional neural network (CNN) to characterize the perceptual quality distance between the reference and distorted/processed images. We observed that a subset of these layers is more relevant to the IQA task. Accordingly, we exploited these selected layers to compute feature similarities, which are then used as input to a regression network to predict the image quality score. The regression network consists of three gradient boosting regression models that are combined to derive the final quality score. Experiments were performed on the Perceptual Image Processing Algorithms (PIPAL) dataset, which was used in the NTIRE 2021 perceptual image quality assessment challenge. The results show that the proposed metric significantly outperforms state-of-the-art methods on the IQA task.
ISBN: 9798350365474 (digital)
ISBN: 9798350365481 (print)
Blurry images usually exhibit similar blur at various locations across the image domain, a property barely exploited by current blind deblurring neural networks. We show that when patches with similar underlying blur can be extracted, jointly processing the stack of patches yields higher accuracy than handling them separately. Our collaborative scheme is implemented in a neural architecture with a pooling layer on the stack dimension. We present three practical patch extraction strategies for image sharpening, camera shake removal and optical aberration correction, and validate the proposed approach on both synthetic and real-world benchmarks. For each blur instance, the proposed collaborative strategy yields significant quantitative and qualitative improvements.
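Pooling over the stack dimension can be sketched on plain feature vectors: each of the N patches contributes a C-dimensional feature, the pooled descriptor summarizes the whole stack, and it is shared back with every patch. The abstract does not specify the pooling operator or how the pooled features are reused; max pooling and concatenation are assumptions here, and the function names are hypothetical.

```python
def stack_pool(features):
    """Max-pool over the stack (patch) dimension of an N x C feature list."""
    return [max(channel) for channel in zip(*features)]


def collaborate(features):
    """Append the shared pooled descriptor to every patch's own features."""
    pooled = stack_pool(features)
    return [list(f) + pooled for f in features]
```

The pooled descriptor is invariant to the order of the patches in the stack, so the scheme does not depend on how the extracted patches are arranged.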