In this work, we present DeepEraser, an effective deep network for generic text removal. DeepEraser utilizes a recurrent architecture that erases the text in an image via iterative operations. Our idea comes from the ...
详细信息
In Versatile Video Coding (VVC), local affine motion compensation (LAMC) is adopted to handle complex motions, such as rotation and zooming. However, it is inefficient to use LAMC to handle the global motion due to th...
详细信息
ISBN:
(纸本)9781665475938
In Versatile Video Coding (VVC), local affine motion compensation (LAMC) is adopted to handle complex motions, such as rotation and zooming. However, it is inefficient to use LAMC to handle the global motion due to the following two reasons. First, the use of LAMC may lead to some extra bit cost on the affine motion model parameters. Second, the precision of LAMC is restricted by the MV precision of the control points. Therefore, in this paper, we propose a global homography motion compensation (GHMC) framework to better characterize the global motion. For each coding block, an extra mode is added to perform motion compensation based on an 8-parameter global homography motion model. In addition, an extrapolation scheme is designed to derive the parameters from reference frames to save the bit cost for signaling them. The proposed framework is implemented into the VVC reference software VTM-6.0. Experimental results show that, on average, 0.69% and 0.66% BD-rate reduction is achieved under Low Delay P and Low Delay B configurations, respectively, for sequences with rich complex global motions.
The Segment Anything Model (SAM) exhibits remarkable versatility and zero-shot learning abilities, owing largely to its extensive training data (SA-1B). Recognizing SAM’s dependency on manual guidance given its categ...
详细信息
3D-aware image generation necessitates extensive training data to ensure stable training and mitigate the risk of overfitting. This paper first considers a novel task known as One-shot 3D Generative Domain Adaptation ...
详细信息
Light field, as a new data representation format in multimedia, has the ability to capture both intensity and direction of light rays. However, the additional angular information also brings a large volume of data. Cl...
详细信息
ISBN:
(纸本)9781665492584
Light field, as a new data representation format in multimedia, has the ability to capture both intensity and direction of light rays. However, the additional angular information also brings a large volume of data. Classical coding methods are not effective to describe the relationship between different views, leading to redundancy left. To address this problem, we propose a novel light field compression scheme based on implicit neural representation to reduce redundancies between views. We store the information of a light field image implicitly in an neural network and adopt model compression methods to further compress the implicit representation. Extensive experiments have demonstrated the effectiveness of our proposed method, which achieves comparable rate-distortion performance as well as superior perceptual quality over traditional methods.
Composite regularization models are widely used in sparse signal processing, making multiple regularization parameters selection a significant problem to be solved. Variety kinds of composite regularization models are...
详细信息
Over the past few years, the advancement of Multimodal Large Language Models (MLLMs) has captured the wide interest of researchers, leading to numerous innovations to enhance MLLMs’ comprehension. In this paper, we p...
详细信息
Synthetic aperture radar (SAR) tomography (TomoSAR) has garnered significant attention due to its capability for three-dimensional reconstruction. Compressed sensing (CS) methods are widely employed to address the Tom...
详细信息
This paper is centered on gridless direction of arrival (DoA) estimation for single-snapshot data collected by non-uniform linear arrays (NLAs) in the context of automotive applications. While recent single-snapshot D...
详细信息
Existing learning-based methods for blind image quality assessment (BIQA) are heavily dependent on large amounts of annotated training data, and usually suffer from a severe performance degradation when encountering t...
详细信息
暂无评论