ISBN (digital): 9781665481281
ISBN (print): 9781665481281
This paper presents a study of motion coding schemes for learned video compression. Most learned video compression systems explicitly signal optical flow maps to characterize motion between video frames for motion compensation. The flow maps, usually of the same size as the video frames, represent a considerable portion of the compressed bitstream. This work studies several schemes to make a non-linear prediction of the flow maps for efficient motion coding. These include signaling an incremental flow map between a coding frame and a motion-compensated frame derived from the flow map predictor. In forming the flow map predictor, we propose a learned motion extrapolation module and a motion forward warping scheme. They are further incorporated into two novel approaches, termed double warping and frame synthesis with motion forward warping, in creating an inter-frame predictor by combining the incremental flow and the flow map predictor. Extensive experiments are conducted to analyze the merits and faults of these variants, and demonstrate their superiority to predictive motion coding and intra motion coding.
This paper provides a description and analysis of technology included in two contributions to the Call for Proposals for Video Compression with Capability beyond HEVC. This Call for Proposals was issued jointly by the Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG). The contribution emphasized a flexible and rectangular partitioning structure, which was combined with both new and existing coding tools. New coding tools included methods for improved motion vector coding, quantization signaling, and deblocking, while existing tools were largely methods studied in the Joint Exploration Model software. Results show the efficacy of the approach. Using the sequences and test conditions defined in the Call for Proposals, the described approach provided an average bitrate reduction, relative to an HEVC anchor, of 41.2% and 35.7% for 4K and HD test sequences, respectively. Moreover, the method achieved a compression performance of 34.3% and 32.2% for high dynamic range content using the perceptual quantizer (PQ) and Hybrid Log-Gamma (HLG) transfer functions, respectively.
ISBN (digital): 9781665496209
ISBN (print): 9781665496209
This work addresses motion coding in end-to-end learned video compression. The efficiency of motion coding is critical at low bit rates, at which a large portion of the bitstream signals motion information. Most end-to-end learned video codecs adopt an intra-coding approach to coding motion information as individual optical flow maps. Some recent studies introduce predictive motion coding to encode optical flow map residuals. Still, motion coding remains an active research area for learned video compression. We present an incremental optical flow coding scheme. It first leverages an extrapolated flow together with the reference frame in estimating an incremental flow between the reference and the target frames for efficient motion coding. It then derives the final flow map for motion compensation by integrating the incremental and the extrapolated flows in a double-warping scheme. Experimental results on commonly used datasets show the superiority of our method over predictive motion coding and other advanced schemes.
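The idea of integrating an extrapolated flow and an incremental flow through two successive warps can be illustrated with a toy sketch. The code below is not the authors' learned network. It assumes a 1-D signal, integer displacements, backward warping with nearest-sample lookup, and border clamping; the function names `backward_warp` and `compose_flows` are hypothetical. Under those assumptions, warping twice gives the same result as warping once with a single composed flow.

```python
def clamp(index, length):
    """Clamp a sample index to the valid range [0, length - 1]."""
    return min(max(index, 0), length - 1)


def backward_warp(signal, flow):
    """Backward warping: output[x] = signal[x + flow[x]], clamped at the borders."""
    n = len(signal)
    return [signal[clamp(x + flow[x], n)] for x in range(n)]


def compose_flows(extrapolated, incremental):
    """Single flow equivalent to warping by `extrapolated`, then by `incremental`.

    The second warp reads position mid = x + incremental[x] of the first
    warp's output, and that output was itself read from
    mid + extrapolated[mid] in the reference. The clamped displacement
    (mid - x) is used so the identity also holds at the borders.
    """
    n = len(extrapolated)
    combined = []
    for x in range(n):
        mid = clamp(x + incremental[x], n)
        combined.append((mid - x) + extrapolated[mid])
    return combined


reference = [10, 20, 30, 40, 50, 60]
extrapolated_flow = [1, 1, 1, 1, 1, 0]   # coarse motion predicted from the past
incremental_flow = [0, 1, 0, -1, 0, 0]   # small correction that would be signaled

# Double warping: reference -> motion-compensated frame -> final prediction.
double_warped = backward_warp(backward_warp(reference, extrapolated_flow),
                              incremental_flow)

# Single warp with the composed flow gives the same prediction.
single_warped = backward_warp(reference,
                              compose_flows(extrapolated_flow, incremental_flow))
print(double_warped == single_warped)
```

With fractional flows and bilinear sampling, as used in practice, the two routes agree only approximately, because each interpolation step blurs the signal slightly.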
Early experimental work suggested that the retina’s main role was to detect changes in brightness and contrast, namely working as a light detector, and that most of the complex computations in the visual system happened upstream in the *** reality, there is a growing wealth of literature indicating that the retina itself processes multiple channels of visual information (contrast, motion, orientation, etc.), making it much more complex than it originally appeared. For instance, there now appear to be over 20 types of retinal ganglion cells. To this end, the work in this thesis will focus on the identification and characterization of a single type of retinal ganglion cell in the mouse retina. In the first section of my results, I will show that this cell type, identified as the only GFP+ ganglion cell in the transgenic Hb9::eGFP retina, is a directionally selective ganglion cell (DSGC) that preferentially responds to objects moving upward through the visual field. This cell has a pronounced morphological asymmetry that helps it to synergistically (along with asymmetric inhibition) generate directionally selective responses. In the second results section, I will describe a novel phenomenon exhibited by Hb9+ DSGCs: thanks to gap junction mediated signals, Hb9+ cells are able to anticipate moving stimuli and correct for lags that are inherent in visual signals generated by photoreceptors. In the third results section I will elucidate the mechanisms for the gap junction mediated anticipatory signals outlined in the second results section. Together, these results provide a significant advancement in our understanding of how the retina processes moving stimuli and provide a compelling example of how chemical and electrical synapses interact to allow for exquisite signal multiplexing.
When a brief flash is quickly presented aligned with a moving target, the flash typically appears to lag behind the moving stimulus. This effect is widely known in the literature as a flash-lag illusion (FLI). The flash-lag is an example of a motion-induced position shift. Since auditory deprivation leads to both enhanced visual skills and impaired temporal abilities, both crucial for the perception of the flash-lag effect, here we hypothesized that lack of audition could influence the FLI. 13 early deaf and 18 hearing individuals were tested in a visual FLI paradigm to investigate this hypothesis. As expected, results demonstrated a reduction of the flash-lag effect following early deafness, both in the central and peripheral visual fields. Moreover, only for deaf individuals, there is a positive correlation between the flash-lag effect in the peripheral and central visual field, suggesting that the mechanisms underlying the effect in the center of the visual field expand to the periphery following deafness. Overall, these findings reveal that lack of audition early in life profoundly impacts early visual processing underlying the flash-lag effect.
In the field of crowd behavior analysis, existing methods mainly focus on using local representations inspired by models found in other disciplines (e.g., fluid dynamics and social dynamics) to describe motion patterns. However, less attention is paid to exploiting motion structures (e.g., visual information contained in trajectories) for behavior analysis. In this paper, we consider both local characteristics and global structures of a motion vector field, and propose the Curl and Divergence of motion Trajectories (CDT) descriptors to describe collective motion patterns. To this end, a trajectory-based motion coding algorithm is designed to extract the CDT descriptors. For each motion vector field we construct its conjugate field, in which each vector is perpendicular to the counterpart in the original vector field. The trajectories in the motion and corresponding conjugate fields indicate the tangential and radial motion structures, respectively. By integrating curl (and divergence, respectively) along the tangential paths (and the radial paths, respectively), the CDT descriptors are extracted. We show that the proposed motion descriptors are scale- and rotation-invariant for effective crowd behavior analysis. For concreteness, we apply the CDT descriptors to identify five typical crowd behaviors (lane, clockwise arch, counterclockwise arch, bottleneck and fountainhead) with a pipeline including motion decomposition. Extensive experimental results on two benchmark datasets demonstrate the effectiveness of the CDT descriptors for describing and classifying crowd behaviors.
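The building blocks named in this abstract (a conjugate field of perpendicular vectors, curl, and divergence) can be sketched on a small grid. This is only an illustration of those quantities, not the CDT extraction pipeline, which additionally integrates them along trajectories. The sketch assumes a unit-spaced grid, central differences, and a 90-degree counterclockwise rotation for the conjugate field (the abstract does not state the rotation direction); `conjugate_field` and `curl_and_divergence` are hypothetical names.

```python
def conjugate_field(u, v):
    """Rotate every vector 90 degrees counterclockwise: (u, v) -> (-v, u)."""
    conj_u = [[-value for value in row] for row in v]
    conj_v = [row[:] for row in u]
    return conj_u, conj_v


def curl_and_divergence(u, v, row, col):
    """Central-difference curl and divergence at an interior grid point.

    x runs along columns and y along rows, with unit grid spacing.
    """
    du_dx = (u[row][col + 1] - u[row][col - 1]) / 2.0
    du_dy = (u[row + 1][col] - u[row - 1][col]) / 2.0
    dv_dx = (v[row][col + 1] - v[row][col - 1]) / 2.0
    dv_dy = (v[row + 1][col] - v[row - 1][col]) / 2.0
    return dv_dx - du_dy, du_dx + dv_dy


size, center = 5, 2
# A pure rotation about the grid centre: u = -(y - c), v = (x - c).
u = [[-(r - center) for _ in range(size)] for r in range(size)]
v = [[(c - center) for c in range(size)] for _ in range(size)]

curl, div = curl_and_divergence(u, v, center, center)

# The conjugate of a rotational field is a radial field, so the roles of
# curl and divergence swap.
conj_u, conj_v = conjugate_field(u, v)
conj_curl, conj_div = curl_and_divergence(conj_u, conj_v, center, center)
print(curl, div, conj_curl, conj_div)
```

In this example the original field has nonzero curl and zero divergence, while its conjugate has zero curl and nonzero divergence, which matches the abstract's pairing of tangential structure with the motion field and radial structure with the conjugate field.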
Efficient image sequence coding exploits both intra- and inter-frame correlations. Set partition coding (SPC) is efficient in intra-frame de-correlation for still images. Based on SPC, a novel image sequence coding system, called motion differential SPC (M-D-SPC), is presented in this paper. It removes inter-frame redundancy by re-using the significance map of a previously SPC coded frame. Every frame is encoded and decoded separately from other frames. Furthermore, there is no reconstruction of encoded frames in the encoder, as is done with interframe prediction methods. The M-D-SPC exhibits an auxiliary key frame coding framework, which achieves higher coding efficiency than all-intra-coding schemes while maintaining the beneficial features of SPC all-intra-coding, such as computational simplicity, rate scalability, error non-propagation, and random frame access. SPIHT-based simulations on hyperspectral images, 3D/4D medical images, and video show greater compression efficiency than the standard intraframe coding method of motion JPEG2000. (c) 2012 Elsevier Inc. All rights reserved.
ISBN (print): 0819459763
Since hierarchical variable size block matching and bidirectional motion compensation are used in the motion-compensated embedded zero block coding (MC-EZBC), the motion information consists of a motion vector quadtree map and motion vectors. In the conventional motion coding scheme, the quadtree structure is coded directly, the motion vector modes are coded with Huffman codes, and the motion vector differences are coded by an m-ary arithmetic coder with 0-order models. In this paper we propose a new motion coding scheme which uses an extension of the CABAC algorithm and new context modeling for quadtree structure coding and mode coding. In addition, we use a new scalable motion coding method which scales the motion vector quadtrees according to the rate-distortion slope of the tree nodes. Experimental results show that the new coding scheme increases the efficiency of the motion coding by more than 25%. The performance of the system is improved accordingly, especially at low bit rates. Moreover, with the scalable motion coding, the subjective and objective coding performance is further enhanced in low bit rate scenarios.
The authors introduce a method to reconstruct shape deformations of objects using linear interpolation of spherical harmonic (SH) representation parameters. It is shown that uniform shape deformation results in fairly linear changes in SH parameters. Therefore, shape changes can be reconstructed once the characteristics of the deformation are obtained in the SH domain.
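A minimal sketch of the interpolation step mentioned here, assuming the SH parameters of the start and end shapes are available as equal-length coefficient lists in the same order. The function name `interpolate_sh` and the coefficient values are hypothetical, and the sketch does not cover fitting SH coefficients to a surface or reconstructing the surface from them.

```python
def interpolate_sh(coeffs_start, coeffs_end, t):
    """Linearly blend two SH coefficient vectors.

    t = 0 returns the start shape's coefficients, t = 1 the end shape's.
    Coefficients may be real or complex numbers.
    """
    if len(coeffs_start) != len(coeffs_end):
        raise ValueError("coefficient vectors must have the same length")
    return [(1 - t) * a + t * b for a, b in zip(coeffs_start, coeffs_end)]


# Hypothetical coefficients for a shape before and after deformation.
start = [1.0, 0.0, 2.0]
end = [3.0, 4.0, 2.0]

halfway = interpolate_sh(start, end, 0.5)
print(halfway)
```

Because the abstract reports that uniform deformation changes the SH parameters in a fairly linear way, intermediate shapes can be approximated by evaluating this blend at intermediate values of `t`.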
A parallel H.263 video encoder, which utilises spatial parallelism, has been modelled using a multi-threaded program. Spatial parallelism is a technique where an image is subdivided into equal parts (as far as physically possible) and each part is processed by a separate processor, which computes motion and texture coding for its own part of the image. This method leads to a performance increase, which is roughly in proportion to the number of parallel processors used.
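The partitioning scheme described here can be sketched as follows. This is an illustrative model, not the encoder from the paper: `split_rows`, `encode_strip`, and `encode_parallel` are hypothetical names, the image is a plain list of pixel rows, and `encode_strip` is a placeholder that sums pixel values where a real encoder would perform motion estimation and texture coding.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(height, workers):
    """Return (start, stop) row ranges whose sizes differ by at most one row."""
    base, extra = divmod(height, workers)
    ranges, start = [], 0
    for index in range(workers):
        stop = start + base + (1 if index < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def encode_strip(image, start, stop):
    """Placeholder for per-strip motion and texture coding: sums the pixels."""
    return sum(sum(row) for row in image[start:stop])


def encode_parallel(image, workers):
    """Process each horizontal strip of the image in its own worker thread."""
    ranges = split_rows(len(image), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(encode_strip, image, start, stop)
                   for start, stop in ranges]
        # Collect results in strip order so they can be concatenated.
        return [future.result() for future in futures]


# A 10-row, 4-column test image whose pixel value equals its row index.
image = [[row] * 4 for row in range(10)]
strip_results = encode_parallel(image, workers=3)
print(split_rows(10, 3), strip_results)
```

In CPython, threads running pure-Python CPU-bound work do not execute in parallel because of the global interpreter lock, so this sketch models the structure of the partitioning rather than the speed-up; a process pool or native code would be needed to observe a gain that scales with the number of processors.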