To facilitate large-scale deployment of convolutional networks, integer-arithmetic-only inference has been shown to be effective: it not only reduces computational cost but also ensures cross-platform consistency. However, previous studies on integer networks usually report a decline in inference accuracy, given the same number of parameters as floating-point-number (FPN) networks. In this paper, we propose to finetune and quantize a well-trained FPN convolutional network to obtain an integer convolutional network. Our key idea is to adjust the upper bound of a bounded rectified linear unit (ReLU), which replaces the normal ReLU and effectively controls the dynamic range of activations. By exploiting the tradeoff between the learning ability and the quantization error of networks, we preserve full accuracy after quantization and obtain efficient integer networks. Our experiments on ResNet for image classification demonstrate that our 8-bit integer networks achieve state-of-the-art performance compared with Google's TensorFlow and NVIDIA's TensorRT. Moreover, we experiment on VDSR for image super-resolution and on VRCNN for compression artifact reduction, both of which are regression tasks that natively require high inference accuracy. Besides matching the performance of the corresponding FPN networks, our integer networks require only 1/4 of the memory and run 2× faster on GPUs.
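The role of the bounded ReLU's upper bound can be illustrated with a small sketch. This is not the authors' implementation: the function names, the uniform 8-bit quantizer, and the example bound of 6.0 are all assumptions chosen for illustration. It shows only the general tradeoff the abstract refers to: a smaller bound gives a finer quantization step inside the range but clips more of the activation.

```python
def bounded_relu(x, upper):
    """Clip an activation to the range [0, upper]."""
    return min(max(x, 0.0), upper)


def quantize(x, upper, bits=8):
    """Map a bounded-ReLU activation to an unsigned integer code.

    Assumption: a plain uniform quantizer over [0, upper].
    """
    levels = (1 << bits) - 1  # 255 codes above zero for 8 bits
    return int(round(bounded_relu(x, upper) / upper * levels))


def dequantize(q, upper, bits=8):
    """Map an integer code back to a real-valued activation."""
    levels = (1 << bits) - 1
    return q * upper / levels


def max_abs_error(values, upper, bits=8):
    """Largest reconstruction error over a list of activations."""
    return max(
        abs(v - dequantize(quantize(v, upper, bits), upper, bits))
        for v in values
    )


if __name__ == "__main__":
    inside = [i * 0.01 for i in range(601)]  # activations within [0, 6]
    print("in-range error:", max_abs_error(inside, upper=6.0))
    print("clipping error:", max_abs_error([10.0], upper=6.0))
```

For activations inside the range, the error is at most half a quantization step, upper / (2 × 255) for 8 bits; an activation above the bound is clipped, and its error grows with its distance from the bound.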
Deep learning (DL) methods have shown great potential for remote sensing image change detection recently, but still suffer from several limitations. Within the identical semantic concept, significant but irrelevant ch...
ISBN: 9781728173221 (print)
Recently, reinforced adaptive bitrate (ABR) algorithms have achieved remarkable success in tile-based 360-degree video streaming. However, they rely heavily on accurate viewport prediction. To alleviate this issue, we propose a hierarchical reinforcement-learning (RL) based ABR algorithm, dubbed 360HRL. Specifically, 360HRL consists of a top agent and a bottom agent. The former decides whether to download a new segment for continuous playback or to re-download an old segment to correct wrong bitrate decisions caused by inaccurate viewport estimation, and the latter selects bitrates for the tiles in the chosen segment. In addition, 360HRL adopts a two-stage training methodology. In the first stage, the bottom agent is trained in an environment where the top agent always chooses to download a new segment. In the second stage, the bottom agent is fixed and the top agent is optimized with the help of a heuristic decision rule. Experimental results demonstrate that 360HRL outperforms existing RL-based ABR algorithms across a broad range of network conditions and quality of experience (QoE) objectives.
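The two-level decision structure described above can be sketched as follows. This is only an illustration of the control flow, not 360HRL itself: the policies are stand-in callables rather than trained agents, and the names, the state dictionary, the action strings, and the rule for picking which buffered segment to re-download are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]                          # hypothetical observation
TopPolicy = Callable[[State], str]                # returns "new" or "redownload"
BottomPolicy = Callable[[State, int], List[int]]  # per-tile bitrate indices


def always_new(state: State) -> str:
    """Top agent as used in stage-one training: always fetch a new segment."""
    return "new"


def hierarchical_step(
    top: TopPolicy,
    bottom: BottomPolicy,
    state: State,
    next_segment: int,
    buffered: List[int],
) -> Tuple[str, int, List[int]]:
    """One decision step: the top agent picks the segment, the bottom the bitrates."""
    action = top(state)
    if action == "redownload" and buffered:
        # Assumption: re-download the earliest buffered, not yet played segment.
        segment = buffered[0]
    else:
        # Nothing to re-download, or the top agent chose a new segment.
        action, segment = "new", next_segment
    return action, segment, bottom(state, segment)


if __name__ == "__main__":
    def lowest_bitrate(state: State, segment: int) -> List[int]:
        return [0, 0, 0, 0]  # placeholder bottom agent for four tiles

    print(hierarchical_step(always_new, lowest_bitrate, {}, 5, [3, 4]))
```

Passing `always_new` as the top policy mirrors the first training stage, in which only the bottom agent's bitrate choices vary.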
ISBN: 9781728173221 (print)
We have witnessed the rapid development of learned image compression (LIC). The latest LIC models have outperformed almost all traditional image compression standards in terms of rate-distortion (RD) performance. However, the time complexity of LIC models is still underexplored, which limits their practical application in industry. Even with GPU acceleration, LIC models still suffer from long coding times, especially on the decoder side. In this paper, we analyze and test a few prevailing and representative LIC models, and compare their complexity with that of traditional codecs, including H.265/HEVC intra and H.266/VVC intra. We provide a comprehensive analysis of every module in the LIC models, and investigate how bitrate changes affect coding time. We observe that the time complexity bottleneck lies mainly in entropy coding and context modelling. Although this paper focuses on experimental statistics, our analysis reveals some insights for further acceleration of LIC models, such as model modification for parallel computing, model pruning, and a more parallel context model.
Many multi-dimensional (M-D) graph signals appear in the real world, such as digital images, sensor network measurements and temperature records from weather observation stations. It is a key challenge to design a tra...
Graph signal processing (GSP) has emerged as a powerful framework for analyzing data on irregular domains. In recent years, many classical techniques in signal processing (SP) have been successfully extended to GSP. A...
Domain generalization in person re-identification is a highly important, meaningful, and practical task in which a model trained with data from several source domains is expected to generalize well to unseen target doma...
Geometry-based point cloud compression (G-PCC) can achieve remarkable compression efficiency for point clouds. However, it still leads to serious attribute compression artifacts, especially under low bitrate scenarios...
Volumetric image compression has become an urgent task to effectively transmit and store images produced in biological research and clinical practice. At present, the most commonly used volumetric image compression me...
Existing blind image quality assessment (BIQA) methods are mostly designed in a disposable way and cannot evolve with unseen distortions adaptively, which greatly limits the deployment and application of BIQA models i...