Search results - Inner Mongolia University Library

Picture Coding Symposium, PCS

Authors: Henan Wang, Hanxin Zhu, Zhibo Chen. CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (print) 9781665492584

Light field, as a new data representation format in multimedia, can capture both the intensity and the direction of light rays. However, the additional angular information also brings a large volume of data. Classical coding methods do not effectively describe the relationship between different views, which leaves redundancy in the coded data. To address this problem, we propose a novel light field compression scheme based on implicit neural representation to reduce redundancies between views. We store the information of a light field image implicitly in a neural network and adopt model compression methods to further compress the implicit representation. Extensive experiments have demonstrated the effectiveness of our proposed method, which achieves comparable rate-distortion performance as well as superior perceptual quality over traditional methods.

Keywords: Image coding; Redundancy; Pipelines; Neural networks; Rate-distortion; Light fields; Encoding


Objective quality assessment for image retargeting based on hybrid distortion pooled model


International Workshop on Quality of Multimedia Experience, QoMEx

Authors: Jianxin Lin, Lingling Zhu, Zhibo Chen, Xiaoming Chen. CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (print) 9781479989591

With the increasing popularity of mobile devices, there are more and more screens with heterogeneous resolutions. To solve the mismatch that arises when images are displayed on different screens, various image retargeting techniques have been proposed. However, few effective objective quality assessment metrics for image retargeting have been proposed. In this paper, we propose an objective image retargeting quality assessment method based on a Hybrid Distortion Pooled Model (HDPM) that considers image local similarity, content information loss and image structural distortion. The proposed HDPM method measures the retargeted image's local similarity by matching similar blocks using Scale-Invariant Feature Transform (SIFT) features and computing the similarity of the corresponding blocks with structural similarity (SSIM). Furthermore, the content information loss in the retargeted image, which is regarded as the SIFT feature loss, is taken into account. In addition, the proposed method considers the image's structural distortion, which is measured with the gray-level co-occurrence matrix (GLCM). To evaluate the effectiveness of the proposed method, extensive experiments have been conducted, and the results show improved consistency between the proposed HDPM method and the corresponding subjective evaluations.

Keywords: Distortion; Quality assessment; Correlation; Feature extraction; Distortion measurement; Image edge detection; Image quality


Efficient Integer-Arithmetic-Only Convolutional Networks with Bounded ReLU


IEEE International Symposium on Circuits and Systems (ISCAS)

Authors: Hengrui Zhao, Dong Liu, Houqiang Li. CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

To facilitate large-scale deployment of convolutional networks, integer-arithmetic-only inference has been shown to be effective: it not only reduces computational cost but also ensures cross-platform consistency. However, previous studies on integer networks usually report a decline in inference accuracy, given the same number of parameters as floating-point-number (FPN) networks. In this paper, we propose to finetune and quantize a well-trained FPN convolutional network to obtain an integer convolutional network. Our key idea is to adjust the upper bound of a bounded rectified linear unit (ReLU), which replaces the normal ReLU and effectively controls the dynamic range of activations. Based on the tradeoff between the learning ability and the quantization error of networks, we manage to preserve full accuracy after quantization and obtain efficient integer networks. Our experiments on ResNet for image classification demonstrate that our 8-bit integer networks achieve state-of-the-art performance compared with Google's TensorFlow and NVIDIA's TensorRT. Moreover, we experiment on VDSR for image super-resolution and on VRCNN for compression artifact reduction, both of which are regression tasks that natively require high inference accuracy. Besides matching the performance of the corresponding FPN networks, our integer networks have only 1/4 of the memory cost and run 2× faster on GPUs.
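The tradeoff named in the abstract can be illustrated with a minimal sketch. It assumes a uniform 8-bit quantizer over the range [0, bound], which the abstract does not specify; the function names are hypothetical and this is not the authors' implementation. A smaller bound shrinks the rounding error but clips more activations.

```python
def bounded_relu(x, bound):
    """Clamp x to [0, bound]; the bound caps the activation's dynamic range."""
    return min(max(x, 0.0), bound)


def quantize(x, bound, bits=8):
    """Map an activation to an unsigned integer code (assumed uniform quantizer)."""
    levels = (1 << bits) - 1  # 255 for 8 bits
    return round(bounded_relu(x, bound) / bound * levels)


def dequantize(q, bound, bits=8):
    """Map an integer code back to the real value it represents."""
    levels = (1 << bits) - 1
    return q * bound / levels


def max_quantization_error(bound, bits=8):
    """Worst-case rounding error for inputs inside [0, bound]."""
    return bound / (2 * ((1 << bits) - 1))
```

With this quantizer, halving the bound halves the worst-case rounding error, while every activation above the new bound is clipped to it.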

Keywords: Upper bound; Quantization (signal); Image coding; Superresolution; Dynamic range; Task analysis; Image classification


Padding-Aware Learned Image Compression


IEEE International Symposium on Circuits and Systems (ISCAS)

Authors: Haotian Zhang, Junqi Liao, Yiheng Jiang, Li Li, Dong Liu. CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

For current learned image compression methods, padding input images is necessary to meet the resolution requirements of down-sampling layers. However, the impact of padding has not been studied thoroughly, and most previous studies ignore padded images in the training process. In this paper, we analyze the impact of padding on compression performance. Then, we propose a padding-aware training (PAT) strategy that handles the padding effect during training. Specifically, our PAT strategy computes the loss on the pre-padding image through a masking operation. Finally, our systematic experimental results show that images with different resolutions tend to favor different padding modes. Therefore, we further propose to conduct padding mode decision in the encoding process for rate-distortion optimization. Experiments demonstrate that our proposed PAT strategy and padding mode decision effectively compensate for the performance drop caused by padding.
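The masking operation described above can be sketched as follows. The sketch assumes zero padding on the bottom and right and a mean-squared-error distortion term; neither detail is stated in the abstract, and both function names are hypothetical.

```python
def pad_to_multiple(image, multiple, fill=0.0):
    """Pad a 2-D list on the bottom and right so both sides are multiples of `multiple`."""
    height, width = len(image), len(image[0])
    padded_h = -(-height // multiple) * multiple  # round up to the next multiple
    padded_w = -(-width // multiple) * multiple
    padded = [row + [fill] * (padded_w - width) for row in image]
    padded += [[fill] * padded_w for _ in range(padded_h - height)]
    return padded


def masked_mse(reconstruction, padded_target, height, width):
    """Mean squared error over the original height x width region only."""
    total = 0.0
    for r in range(height):
        for c in range(width):
            diff = reconstruction[r][c] - padded_target[r][c]
            total += diff * diff
    return total / (height * width)
```

Because the loop never visits padded positions, errors in the padded border do not contribute to the loss.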


Chain Code-Based Occupancy Map Coding for Video-Based Point Cloud Compression


IEEE Visual Communications and Image Processing (VCIP)

Authors: Runyu Yang, Ning Yan, Li Li, Dong Liu, Feng Wu. CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (digital) 9781728180687

ISBN: (print) 9781728180694

In video-based point cloud compression (V-PCC), the occupancy map video indicates whether a 2-D pixel corresponds to a valid 3-D point. In the current design of V-PCC, the occupancy map video is compressed losslessly with High Efficiency Video Coding (HEVC). However, the coding tools in HEVC are designed specifically for natural images and are thus unsuitable for the occupancy map. In this paper, we present a novel quadtree-based scheme for lossless occupancy map coding. In this scheme, the occupancy map is first divided into several coding tree units (CTUs). Then, each CTU is divided into coding units (CUs) recursively using a quadtree. The quadtree partition terminates when one of three conditions is satisfied. First, all the pixels have the same value. Second, the pixels in the CU take only two values and can be separated by a continuous edge whose endpoints lie on the sides of the CU; the continuous edge is then coded using chain code. Third, the CU reaches the minimum size. This scheme simplifies the block partitioning design of HEVC and uses simpler yet more effective coding tools. Experimental results show a significant reduction in bit-rate and complexity compared with the occupancy map coding scheme in V-PCC. In addition, this scheme is also very efficient for compressing semantic maps.
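The recursive partition can be sketched as below. This is a simplified illustration, not the authors' codec: it implements only the first and third termination conditions (uniform block, minimum size) and omits the chain-coded edge condition, and it assumes a square CTU whose side is a power of two. The function names and the leaf tuple layout are hypothetical.

```python
def is_uniform(block, top, left, size):
    """True when every pixel in the square block has the same value."""
    first = block[top][left]
    return all(block[r][c] == first
               for r in range(top, top + size)
               for c in range(left, left + size))


def quadtree_partition(occupancy, top, left, size, min_size, leaves=None):
    """Recursively split a square CTU; return leaf CUs as (top, left, size, kind)."""
    if leaves is None:
        leaves = []
    if is_uniform(occupancy, top, left, size):
        leaves.append((top, left, size, "uniform"))    # condition 1
    elif size <= min_size:
        leaves.append((top, left, size, "min_size"))   # condition 3
    else:
        half = size // 2
        for dr in (0, half):
            for dc in (0, half):
                quadtree_partition(occupancy, top + dr, left + dc,
                                   half, min_size, leaves)
    return leaves
```

A uniform leaf needs only its single value to be signaled, which is why large uniform regions of an occupancy map are cheap under this kind of partition.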

Keywords: Encoding; Three-dimensional displays; Image coding; Semantics; Markov processes; Geometry; Copper


Improving Compression Artifact Reduction via End-to-End Learning of Side Information


IEEE Visual Communications and Image Processing (VCIP)

Authors: Haichuan Ma, Dong Liu, Feng Wu. CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (digital) 9781728180687

ISBN: (print) 9781728180694

We propose to improve neural network-based compression artifact reduction by transmitting side information for the neural network. The side information consists of artifact descriptors that are obtained by analyzing the original and compressed images in the encoder. In the decoder, the received descriptors are used as additional input to a well-designed conditional post-processing neural network. To reduce the transmission overhead, the entire model is optimized under the rate-distortion constraint via end-to-end learning. Experimental results show that introducing the side information greatly improves the ability of the post-processing neural network, and improves the rate-distortion performance.

Keywords: Image coding; Neural networks; Decoding; Training; Feature extraction; Computational modeling; Transform coding


LEARNED SCALABLE IMAGE COMPRESSION WITH BIDIRECTIONAL CONTEXT DISENTANGLEMENT NETWORK


arXiv, 2018

Authors: Zhang, Zhizheng; Chen, Zhibo; Lin, Jianxin; Li, Weiping. CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

In this paper, we propose a learned scalable/progressive image compression scheme based on deep neural networks (DNN), named Bidirectional Context Disentanglement Network (BCD-Net). For learning hierarchical representations, we first adopt bit-plane decomposition to decompose the information coarsely before the deep-learning-based transformation. However, the information carried by different bit-planes is not only unequal in entropy but also of different importance for reconstruction. We thus take the hidden features corresponding to different bit-planes as the context and design a network topology with bidirectional flows to disentangle the contextual information for more effective compressed representations. Our proposed scheme enables us to obtain compressed codes with scalable rates via one-pass encoding and decoding. Experimental results demonstrate that our proposed model outperforms state-of-the-art DNN-based scalable image compression methods in both PSNR and MS-SSIM metrics. In addition, our proposed model achieves better performance in the MS-SSIM metric than conventional scalable image codecs. The effectiveness of our technical components is also verified through ablation experiments. Copyright © 2018, The Authors. All rights reserved.
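The bit-plane decomposition step that precedes the learned transform can be sketched on its own. The sketch assumes unsigned 8-bit pixel values and covers only the decomposition and a progressive reconstruction from the most significant planes; it does not model BCD-Net itself, and the function names are hypothetical.

```python
def to_bit_planes(pixels, bits=8):
    """Split integer pixels into bit planes, most significant plane first."""
    return [[(p >> b) & 1 for p in pixels] for b in range(bits - 1, -1, -1)]


def from_bit_planes(planes, bits=8):
    """Rebuild pixels from the first len(planes) most significant planes."""
    pixels = [0] * len(planes[0])
    for index, plane in enumerate(planes):
        shift = bits - 1 - index
        pixels = [p | (bit << shift) for p, bit in zip(pixels, plane)]
    return pixels
```

Decoding only a prefix of the planes gives a coarse reconstruction, and each further plane refines it. This is the sense in which the planes carry information of unequal importance for reconstruction.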

Keywords: Deep neural networks


Visualization of user interests in online music services


IEEE International Conference on Multimedia and Expo Workshops (ICMEW)

Authors: Jingxian Zhang, Dong Liu. CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (print) 9781479947164

Online music services have become a popular way for end users to obtain music, and user interests, as reflected by downloading records, are crucial for service providers to understand users and thus to provide personalization. However, the raw downloading records are huge in volume and difficult to analyze intuitively. We study a visualization approach to analyzing downloading records so as to present user interests. To reveal the underlying relevance between music tracks, we use not only the metadata of the music (especially genres) but also collaborative relevance that is voted by users. To present time-varying user interests, we design several new figures, namely the Bean plot, the Instrument plot, and the Transitional Pie plot, that are capable of displaying different aspects of the variation in user interests. We have performed experiments with a real-world data set, and the results show the effectiveness of our proposed visualization method. Our work may also inspire the visualization of time-varying data in other applications.

Keywords: Instruments; Collaboration; Color; Statistical distributions; Data visualization; Neck; Educational institutions


Deeply Exploit Depth Information for Object Detection


IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Authors: Saihui Hou, Zilei Wang, Feng Wu. CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

This paper addresses how to coordinate depth with RGB more effectively in order to boost the performance of RGB-D object detection. In particular, we investigate two primary ideas under the CNN model: property derivation and property fusion. First, we propose that depth can be used not only as a type of extra information besides RGB but also to derive more visual properties for comprehensively describing the objects of interest. A two-stage learning framework consisting of property derivation and fusion is therefore constructed. Here the properties can be derived either from the provided color or depth, or from their pairs (e.g. the geometry contour adopted in this paper). Second, we explore how to fuse different properties in feature learning, which, under the CNN model, boils down to the layer at which the properties should be fused together. The analysis shows that different semantic properties should be learned separately and combined before being passed into the final classifier. Such a detection approach is in accordance with the mechanism of the primary neural cortex (V1) in the brain. We experimentally evaluate the proposed method on a challenging dataset and achieve state-of-the-art performance.

Keywords: Object detection; Visualization; Image color analysis; Feature extraction; Geometry; Gravity; Computer vision


Fast genetic multi-operator image retargeting


IEEE Visual Communications and Image Processing (VCIP)

Authors: Lingling Zhu, Zhibo Chen. CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (print) 9781509053179

Content-aware image retargeting has attracted substantial interest in the research community. However, there is still no method that can adequately preserve important image content and structure, without introducing conspicuous visible deformation, in a relatively short period of time. To address this problem, we propose a Fast Genetic Multi-operator (FGM) method that integrates multiple retargeting operators. To improve efficiency, the FGM method uses Genetic Algorithms (GAs) to reach the optimal operator ratio, with saliency and the Gray-Level Co-occurrence Matrix (GLCM) as the energy function. The FGM method not only preserves salient content and structure well but also greatly reduces the computational complexity. Experimental results demonstrate that our method outperforms state-of-the-art image retargeting methods.

Keywords: Genetic algorithms; Biological cells; Genetics; Distortion; Computational complexity; Visualization

