In many state-of-the-art compression systems, signal transformation is an integral part of the encoding and decoding process, where transforms provide compact representations for the signals of interest. This paper introduces a class of transforms called graph-based transforms (GBTs) for video compression, and proposes two different techniques to design GBTs. In the first technique, we formulate an optimization problem to learn graphs from data and provide solutions for optimal separable and nonseparable GBT designs, called GL-GBTs. The optimality of the proposed GL-GBTs is also theoretically analyzed based on Gaussian-Markov random field (GMRF) models for intra and inter predicted block signals. The second technique develops edge-adaptive GBTs (EA-GBTs) in order to flexibly adapt transforms to block signals with image edges (discontinuities). The advantages of EA-GBTs are both theoretically and empirically demonstrated. Our experimental results show that the proposed transforms can significantly outperform the traditional Karhunen-Loeve transform (KLT).
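As a rough illustration of what a graph-based transform is built from, the sketch below constructs the combinatorial Laplacian of a path graph and checks a well-known special case: with unit edge weights, the DCT-II basis vectors are eigenvectors of that Laplacian. This is not the GL-GBT or EA-GBT design procedure from the abstract; the function names and the choice of a path graph are assumptions made only for illustration.

```python
import math

def path_laplacian(n, weights=None):
    """Combinatorial Laplacian L = D - W of an n-node path graph.

    weights[i] is the weight of the edge between nodes i and i+1
    (hypothetical parameter; defaults to unit weights).
    """
    if weights is None:
        weights = [1.0] * (n - 1)
    L = [[0.0] * n for _ in range(n)]
    for i, w in enumerate(weights):
        L[i][i] += w
        L[i + 1][i + 1] += w
        L[i][i + 1] -= w
        L[i + 1][i] -= w
    return L

def dct2_basis_vector(n, k):
    """Unnormalized k-th DCT-II basis vector of length n."""
    return [math.cos(math.pi * k * (2 * j + 1) / (2 * n)) for j in range(n)]

def laplacian_residual(n, k):
    """Max |L u - lambda u| for the k-th DCT-II vector on a unit-weight path."""
    L = path_laplacian(n)
    u = dct2_basis_vector(n, k)
    lam = 2.0 - 2.0 * math.cos(math.pi * k / n)  # eigenvalue paired with u
    Lu = [sum(L[i][j] * u[j] for j in range(n)) for i in range(n)]
    return max(abs(Lu[i] - lam * u[i]) for i in range(n))
```

Calling `laplacian_residual(8, k)` for each `k` in `range(8)` should give values at floating-point noise level. Passing non-uniform `weights` (for example, a small weight on the edge that crosses an image discontinuity) changes the Laplacian and therefore its eigenvectors, which is the general idea behind adapting a transform to the graph.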
Experimental and theoretical studies have tried to gain insights into the involvement of the Temporal Parietal Junction (TPJ) in a broad range of cognitive functions like memory, attention, language, self-agency and theory of mind. Recent investigations have demonstrated the partition of the TPJ into discrete subsectors. Nonetheless, whether these subsectors play different roles or implement an overarching function remains debated. Here, based on a review of available evidence, we propose that the left TPJ codes both matches and mismatches between expected and actual sensory, motor, or cognitive events while the right TPJ codes mismatches. These operations help keep track of statistical contingencies in personal, environmental, and conceptual space. We show that this hypothesis can account for the participation of the TPJ in disparate cognitive functions, including "humour", and explain: a) the higher incidence of spatial neglect in right brain damage; b) the different emotional reactions that follow left and right brain damage; c) the hemispheric lateralisation of optimistic bias mechanisms; d) the lateralisation of mechanisms that regulate routine and novelty behaviours. We propose that match and mismatch operations are aimed at approximating "free energy", in terms of the free energy principle of decision-making. By approximating "free energy", the match/mismatch TPJ system supports both information seeking to update one's own beliefs and the pleasure of being right in one's own current choices. This renewed view of the TPJ has relevant clinical implications because the misfunctioning of TPJ-related "match" and "mismatch" circuits in unilateral brain damage can produce low-dimensional deficits of active-inference and predictive coding that can be associated with different neuropsychological disorders. (c) 2022 Elsevier B.V. All rights reserved.
The use of forward models (mechanisms that predict the future state of a system) is well established in cognitive and computational neuroscience. We compare and contrast two recent, but interestingly divergent, accounts of the place of forward models in the human cognitive architecture. On the Auxiliary Forward Model (AFM) account, forward models are special-purpose prediction mechanisms implemented by additional circuitry distinct from core mechanisms of perception and action. On the Integral Forward Model (IFM) account, forward models lie at the heart of all forms of perception and action. We compare these neighbouring but importantly different visions and consider their implications for the cognitive sciences. We end by asking what kinds of empirical research might offer evidence favouring one or the other of these approaches.
A crude but commonly used technique for compressing ordered scientific data consists of simply retaining every s-th datum (with a value of s = 10 generally the default) and discarding the remainder. Should the value of a discarded datum be required afterwards, an approximation is generated by linear interpolation of the two nearest retained values. Despite the widespread use of this and similar techniques, there is little by way of theoretical analysis of their expected performance. First, we quantify the accuracy achieved by linear interpolation when approximating values discarded by decimation, obtaining both deterministic bounds in terms of appropriate smoothness measures of the data and probabilistic bounds in terms of statistics of the data. Second, we investigate the efficiency of the lossless compression scheme consisting of decimation coupled with encoding of the interpolation errors. In particular, we bound the expected compression ratio in terms of the appropriate measures of the data. Finally, we provide numerical illustrations of the practical performance of the algorithm on some real datasets.
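The decimation-plus-interpolation scheme described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, and holding the last retained value for any samples past the final retained index is an assumption, since the abstract does not say how the tail is handled.

```python
def decimate(data, s=10):
    """Keep every s-th datum (indices 0, s, 2s, ...)."""
    return data[::s]

def reconstruct(kept, s, n):
    """Approximate all n original values from the retained ones.

    Discarded values are linearly interpolated between the two nearest
    retained values. Samples after the last retained index simply hold
    the last retained value (an assumption of this sketch).
    """
    out = []
    for i in range(n):
        k, r = divmod(i, s)
        if r == 0 or k + 1 >= len(kept):
            out.append(kept[k])
        else:
            t = r / s  # fractional position between kept[k] and kept[k + 1]
            out.append((1 - t) * kept[k] + t * kept[k + 1])
    return out
```

For data that is exactly linear between retained samples the reconstruction error is zero; for curved data the error grows with the curvature and with s, which is the kind of dependence the deterministic bounds in the abstract quantify.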
Video communication through wireless channels is still a challenging problem due to the limitations in bandwidth and the presence of channel errors. Since many video sources are originally coded at a high rate and without considering the different channel conditions that may be encountered later, a means to repurpose this content for delivery over a dynamic wireless channel is needed. Transcoding is typically used to reduce the rate and change the format of the originally encoded video source to match network conditions and terminal capabilities. Given the existence of channel errors that can easily corrupt video quality, there is also the need to make the bitstream more resilient to transmission errors. In this article we provide an overview of the error resilience tools found in today's video coding standards and describe a variety of techniques that may be used to achieve error-resilient video transcoding.
Real-time semantic segmentation (SS) is a major task for various vision-based applications such as self-driving. Due to the limited computing resources and stringent performance requirements, streaming videos from camera-embedded mobile devices to edge servers for SS is a promising approach. While there are increasing efforts on task-oriented video compression, most SS-applicable algorithms apply more uniform compression, as the sensitive regions are less obvious and concentrated. Such processing results in low compression performance and significantly limits the capacity of edge servers supporting real-time SS. In this paper, we propose STAC, a novel task-oriented DNN-driven video compressive streaming algorithm tailored for SS, to strike an accuracy-bitrate balance and adapt to time-varying bandwidth. It exploits the DNN's gradients as sensitivity metrics for fine-grained spatial adaptive compression and includes a temporal adaptive scheme that integrates spatial adaptation with predictive coding. Furthermore, we design a new bandwidth-aware neural network, serving as a compatible configuration tuner to fit time-varying bandwidth and content. STAC is evaluated in a system with a commodity mobile device and an edge server with real-world network traces. Experiments show that STAC can save up to 63.7-75.2% of bandwidth or improve accuracy by 3.1-9.5% compared to state-of-the-art algorithms, while being capable of adapting to time-varying bandwidth.
Delta Modulation (DM) is a simple waveform coding algorithm used mostly when timely data delivery is more important than the transmitted data quality. While the implementation of DM is fairly simple and inexpensive, it suffers from several limitations, such as slope overload and granular noise, which can be overcome using Adaptive Delta Modulation (ADM). This paper presents a novel 2-digit ADM with six-level quantization using variable-length coding, for encoding time-varying signals modelled by a Laplacian distribution. Two quantizer variants are employed: a distortion-constrained quantizer, which is optimally designed for minimal mean-squared error (MSE), and a rate-constrained quantizer, which is suboptimal in the minimal MSE sense but enables minimal loss in SQNR for the target bit rate. Experimental results using a real speech signal are provided, indicating that the proposed configuration outperforms the baseline ADM algorithms, including Constant Factor Delta Modulation (CFDM), Continuously Variable Slope Delta Modulation (CVSDM), 2-digit and 2-bit ADM, and operates in a much wider dynamic range.
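For readers unfamiliar with the baseline, plain (non-adaptive) delta modulation with a fixed step size can be sketched as follows. This is only the textbook 1-bit scheme, not the 2-digit ADM proposed in the abstract; the function names, the zero initial estimate, and the tie-breaking rule (`>=` maps to bit 1) are assumptions of this sketch.

```python
def dm_encode(signal, step):
    """1-bit delta modulation: emit 1 if the input is at or above the
    running estimate, else 0, then move the estimate by +/- step."""
    bits = []
    estimate = 0.0  # assumed initial state, shared with the decoder
    for x in signal:
        bit = 1 if x >= estimate else 0
        estimate += step if bit else -step
        bits.append(bit)
    return bits

def dm_decode(bits, step):
    """Rebuild the staircase approximation from the bit stream."""
    estimate = 0.0
    out = []
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out
```

With `step = 0.5`, the input `[2.0, 4.0, 6.0, 8.0]` rises faster than the staircase can follow, so every bit is 1 and the decoded values lag behind (slope overload). A constant input instead makes the output oscillate around it by one step (granular noise). Adaptive schemes vary the step size to reduce both effects.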
A growing percentage of the world population now uses image and video coding technologies on a regular basis. These technologies are behind the success and quick deployment of services and products such as digital pictures, digital television, DVDs, and Internet video communications. Today's digital video coding paradigm represented by the ITU-T and MPEG standards mainly relies on a hybrid of block-based transform and interframe predictive coding approaches. In this coding framework, the encoder architecture has the task of exploiting both the temporal and spatial redundancies present in the video sequence, which is a rather complex exercise. As a consequence, all standard video encoders have a much higher computational complexity than the decoder (typically five to ten times more complex), mainly due to the temporal correlation exploitation tools, notably the motion estimation process. This type of architecture is well-suited for applications where the video is encoded once and decoded many times, i.e., one-to-many topologies, such as broadcasting or video-on-demand, where the cost of the decoder is more critical than the cost of the encoder.
Real-time transmission of image and video requires a high degree of processing and computing power. An emerging technique called compressed sensing is used to address this issue and lower the sampling rate of signals. This paper presents an effective compressed sensing based prediction measurement (CSPM) encoder compatible with wireless multimedia sensor networks. CSPM encoding focuses on a significant reduction in data storage and saving in transmission energy. The compression performance of the CSPM method is evaluated using metrics such as compression ratio and bit rate. The video is reconstructed by the orthogonal matching pursuit algorithm. The recovered video quality is analyzed by peak signal to noise ratio and structural similarity index. The transmission of encoded data is tested in a real-time environment using Telos B motes. The experimental results show that the CSPM encoding technique is able to deliver the video at good quality and achieve a high compression ratio of 90.7% compared to conventional encoders.
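One of the quality metrics named above, peak signal to noise ratio, follows the standard definition PSNR = 10 log10(peak^2 / MSE). The sketch below computes it for two equal-length sequences of pixel values; the function name and the default peak of 255 (8-bit samples) are assumptions, and the abstract does not state which peak value or colour channel the authors used.

```python
import math

def psnr(reference, recovered, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    if len(reference) != len(recovered):
        raise ValueError("sequences must have the same length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, recovered)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)
```

Higher values mean the recovered frame is closer to the reference; identical inputs give infinity under this definition.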
In JPEG-LS, simple edge detection techniques are used in determining the predictive value of each pixel. These techniques only detect horizontal/vertical edges and have only been optimized for the prediction of pixels in the locality of such edges. Thus, JPEG-LS produces large prediction errors in the locality of diagonal edges. We propose a low complexity technique that accurately detects diagonal edges and efficiently predicts pixels, based on the information available within the standard predictive template of JPEG-LS. We show that the proposed technique outperforms JPEG-LS in terms of predicted mean squared error, by margins of up to 15%. (C) 2003 Elsevier B.V. All rights reserved.
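The "simple edge detection" in standard JPEG-LS is commonly described as the median edge detector (MED) predictor, which looks at the left (a), upper (b), and upper-left (c) neighbours of the current pixel. The sketch below shows that baseline predictor only; it is not the diagonal-edge technique proposed in the abstract, and the function name is hypothetical.

```python
def med_predict(a, b, c):
    """JPEG-LS style MED prediction for the current pixel.

    a: left neighbour, b: upper neighbour, c: upper-left neighbour.
    """
    if c >= max(a, b):
        # c is the largest: an edge is assumed, predict the smaller neighbour
        return min(a, b)
    if c <= min(a, b):
        # c is the smallest: an edge is assumed, predict the larger neighbour
        return max(a, b)
    # no edge detected: planar prediction
    return a + b - c
```

For example, `med_predict(10, 200, 200)` returns 10 and `med_predict(50, 60, 55)` returns 55. Because the rule only switches between a, b, and the planar estimate, it has no case for an edge running diagonally through the template, which is the weakness the abstract targets.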