Search results - Inner Mongolia University Library

arXiv, 2024

Authors: Wang, Henan; Zhu, Hanxin; Chen, Zhibo. CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

Light field, a new data representation format in multimedia, can capture both the intensity and the direction of light rays. However, the additional angular information also brings a large volume of data. Classical coding methods cannot effectively describe the relationship between different views, which leaves redundancy in the coded data. To address this problem, we propose a novel light field compression scheme based on implicit neural representation to reduce redundancies between views. We store the information of a light field image implicitly in a neural network and adopt model compression methods to further compress the implicit representation. Extensive experiments have demonstrated the effectiveness of our proposed method, which achieves rate-distortion performance comparable to, and perceptual quality superior to, traditional methods. Copyright © 2024, The Authors. All rights reserved.

Keywords: Redundancy

M-LVC: Multiple frames prediction for learned video compression

arXiv, 2020

Authors: Lin, Jianping; Liu, Dong; Li, Houqiang; Wu, Feng. CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027

We propose an end-to-end learned video compression scheme for low-latency scenarios. Previous methods are limited to using only the single previous frame as a reference. Our method instead uses multiple previous frames as references. In our scheme, the motion vector (MV) field is calculated between the current frame and the previous one. With multiple reference frames and the associated multiple MV fields, our designed network can generate a more accurate prediction of the current frame, yielding a smaller residual. Multiple reference frames also help generate the MV prediction, which reduces the coding cost of the MV field. We use two deep auto-encoders to compress the residual and the MV, respectively. To compensate for the compression error of the auto-encoders, we further design an MV refinement network and a residual refinement network, which also make use of the multiple reference frames. All the modules in our scheme are jointly optimized through a single rate-distortion loss function. We use a step-by-step training strategy to optimize the entire scheme. Experimental results show that the proposed method outperforms the existing learned video compression methods in low-latency mode. Our method also performs better than H.265 in both PSNR and MS-SSIM. Our code and models are publicly available. Copyright © 2020, The Authors. All rights reserved.

Keywords: Image compression
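
The M-LVC abstract above rests on the idea that a motion-compensated prediction leaves a smaller residual to code. The sketch below illustrates only that idea with a classical stand-in: one global integer motion vector and a single reference frame, rather than the learned networks and multiple references the authors describe. All function names and the toy frames are assumptions made for illustration.

```python
from typing import List, Tuple

Frame = List[List[int]]


def motion_compensate(reference: Frame, mv: Tuple[int, int]) -> Frame:
    """Shift the reference by one global integer motion vector (dy, dx).

    (dy, dx) means the content moved dy rows down and dx columns right
    since the reference frame. Out-of-range samples are clamped to the border.
    """
    height, width = len(reference), len(reference[0])
    dy, dx = mv
    predicted = []
    for y in range(height):
        row = []
        for x in range(width):
            src_y = min(max(y - dy, 0), height - 1)
            src_x = min(max(x - dx, 0), width - 1)
            row.append(reference[src_y][src_x])
        predicted.append(row)
    return predicted


def residual(current: Frame, predicted: Frame) -> Frame:
    """Sample-wise difference that would still have to be coded."""
    return [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(current, predicted)]


def energy(frame: Frame) -> int:
    """Sum of squared samples, used here as a rough proxy for coding cost."""
    return sum(v * v for row in frame for v in row)


# Toy frames (hypothetical): a single bright sample moves one column right.
reference = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 0, 0, 0],
]
current = [
    [0, 0, 0, 0],
    [0, 0, 9, 0],
    [0, 0, 0, 0],
]

no_mc = energy(residual(current, reference))
with_mc = energy(residual(current, motion_compensate(reference, (0, 1))))
print(no_mc, with_mc)
```

In this toy case the residual energy drops to zero once the correct motion vector is applied. A real codec would estimate the vector per block or per pixel and would still have to spend bits on the vector itself.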

Quality Assessment of Stereoscopic 360-degree Images from Multi-viewports

Picture Coding Symposium, PCS

Authors: Jiahua Xu; Ziyuan Luo; Wei Zhou; Wenyuan Zhang; Zhibo Chen. CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

Objective quality assessment of stereoscopic panoramic images has become a challenging problem owing to the rapid growth of 360-degree content. Unlike traditional 2D image quality assessment (IQA), 3D omnidirectional IQA involves more complex aspects, especially the unlimited field of view (FoV) and extra depth perception, which make it difficult to evaluate the quality of experience (QoE) of 3D omnidirectional images. In this paper, we propose a multi-viewport based full-reference stereo 360 IQA model. Because viewports change freely when browsing in a head-mounted display, our proposed approach processes the image inside the FoV rather than a projected one such as the equirectangular projection (ERP). In addition, since overall QoE depends on both image quality and depth perception, we utilize features estimated from the difference map between the left and right views, which can reflect disparity. The depth perception features along with binocular image qualities are employed to further predict the overall QoE of 3D 360 images. The experimental results on our public Stereoscopic OmnidirectionaL Image quality assessment Database (SOLID) show that the proposed method achieves a significant improvement over some well-known IQA metrics and can accurately reflect the overall QoE of perceived images.

Convolutional Neural Network-Based Residue Super-Resolution for Video Coding

IEEE Visual Communications and Image Processing (VCIP)

Authors: Kang Liu; Dong Liu; Houqiang Li; Feng Wu. CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (print) 9781538644591; 9781538644584

Inspired by the progress in image and video super-resolution (SR) achieved by convolutional neural networks (CNNs), we propose a CNN-based residue SR method for video coding. Unlike previous works that operate in the pixel domain, i.e. down- and up-sampling of the image or video frame, we propose to perform down- and up-sampling in the residue domain. Specifically, for each block, we perform motion estimation and compensation to obtain the residual signal at the original resolution, then we down-sample the residue and compress it at low resolution, and perform residue SR using a trained CNN model. We design a new CNN for residue SR with the help of the motion compensated prediction signal. We integrate the residue SR method into the High Efficiency Video Coding (HEVC) scheme, providing mode decision at the level of coding tree unit. Experimental results show that our method achieves on average 4.0% and 2.8% BD-rate reduction under low-delay P and low-delay B configurations, respectively.

Keywords: Encoding; Signal resolution; Video coding; Spatial resolution; Delays; Convolution
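
The residue-domain pipeline described in the abstract above (predict, subtract, down-sample the residue, restore it, add the prediction back) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: nearest-neighbour up-sampling stands in for the trained CNN, the transform, quantization and entropy coding of the low-resolution residue are omitted, and every function name is hypothetical.

```python
from typing import List

Block = List[List[float]]


def subtract(a: Block, b: Block) -> Block:
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a: Block, b: Block) -> Block:
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def downsample_2x(block: Block) -> Block:
    """Average each 2x2 neighbourhood (block sides must be even)."""
    return [
        [
            (block[y][x] + block[y][x + 1] + block[y + 1][x] + block[y + 1][x + 1]) / 4.0
            for x in range(0, len(block[0]), 2)
        ]
        for y in range(0, len(block), 2)
    ]


def upsample_2x(block: Block) -> Block:
    """Nearest-neighbour up-sampling; an assumed placeholder for the CNN."""
    out: Block = []
    for row in block:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def reconstruct(original: Block, prediction: Block) -> Block:
    """Run the residue through the low-resolution path and rebuild the block."""
    residue = subtract(original, prediction)   # residue at original resolution
    low_res = downsample_2x(residue)           # this is what would be coded
    restored = upsample_2x(low_res)            # residue SR step (placeholder)
    return add(prediction, restored)


# Hypothetical 4x4 block whose residue is flat, so no detail is lost.
prediction = [[10.0] * 4 for _ in range(4)]
original = [[12.0] * 4 for _ in range(4)]
reconstructed = reconstruct(original, prediction)
print(reconstructed == original)
```

A flat residue survives the low-resolution path unchanged. A residue with fine detail would not, which is the gap the abstract's CNN-based super-resolution is meant to narrow.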

Image annotation via social diffusion analysis with common interests

IEEE International Conference on Multimedia and Expo Workshops (ICMEW)

Authors: Chenyi Lei; Dong Liu. CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

Automatic annotation of images is of crucial importance in image retrieval and management systems. Most existing annotation methods rely on a content-based approach, whose effectiveness is restricted by the semantic gap between low-level features and semantic annotations, as well as by the irrelevance between annotations and image content. Recently, social media analysis has been investigated for image annotation. Inspired by the abundant social diffusion records of images in online social networks, we propose a novel image annotation approach based on social diffusion analysis. We present a common-interest model to interpret social diffusion, i.e. different images have different social diffusion routes due to the preferences of users, and such preferences are represented as common interests of pairwise users rather than personalized interests. We propose an image annotation framework that consists of learning of common interests, feature extraction from social diffusion records, and automatic annotation by learning to rank. Experimental results on a real-world dataset show that our proposed approach outperforms content-based and user-preference-based annotation methods.

Keywords: Abstracts; Vectors; Positron emission tomography

Objective quality assessment for image retargeting based on hybrid distortion pooled model

International Workshop on Quality of Multimedia Experience, QoMEx

Authors: Jianxin Lin; Lingling Zhu; Zhibo Chen; Xiaoming Chen. CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (print) 9781479989591

With the increasing popularity of mobile devices, there are more and more screens with heterogeneous resolutions. To solve the mismatch that arises when images are displayed on different screens, various image retargeting techniques have been proposed. However, few effective objective quality assessment metrics for image retargeting have been proposed. In this paper, we propose an objective image retargeting quality assessment method based on a Hybrid Distortion Pooled Model (HDPM) that considers image local similarity, content information loss and image structural distortion. The proposed HDPM method measures the retargeted image's local similarity by matching similar blocks using Scale-Invariant Feature Transform (SIFT) features and computing the corresponding blocks' similarity by structural similarity (SSIM). Furthermore, the content information loss in the retargeted image, which is regarded as the SIFT feature loss, is taken into account. We also consider the image's structural distortion, which is measured using the gray-level co-occurrence matrix (GLCM). To evaluate the effectiveness of the proposed method, extensive experiments have been conducted, and the results show improved consistency between the proposed HDPM method and the corresponding subjective evaluations.

Keywords: Distortion; Quality assessment; Correlation; Feature extraction; Distortion measurement; Image edge detection; Image quality
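
The abstract above names the gray-level co-occurrence matrix (GLCM) as the basis of its structural-distortion term. As a reminder of what a GLCM is, the sketch below counts co-occurring gray levels at a fixed pixel offset and derives a contrast statistic from the counts. How HDPM actually turns the GLCM into a distortion score is not stated in the abstract; the contrast statistic, the function names and the toy image are assumptions chosen for illustration.

```python
from typing import List


def glcm(image: List[List[int]], levels: int, dy: int = 0, dx: int = 1) -> List[List[int]]:
    """Count how often gray level i is followed by gray level j at offset (dy, dx)."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
    return counts


def contrast(counts: List[List[int]]) -> float:
    """Mean squared gray-level difference over all counted pixel pairs."""
    total = sum(sum(row) for row in counts)
    weighted = sum(
        (i - j) ** 2 * c
        for i, row in enumerate(counts)
        for j, c in enumerate(row)
    )
    return weighted / total


# Toy 4-level image (hypothetical data).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

matrix = glcm(image, levels=4)      # horizontal neighbour, one pixel to the right
texture_contrast = contrast(matrix)
for row in matrix:
    print(row)
print(texture_contrast)
```

Comparing such statistics between a source image and its retargeted version is one plausible way to quantify structural change, though the paper may pool the matrix differently.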

Hybrid transform for HEVC-based lossless coding

IEEE International Symposium on Circuits and Systems (ISCAS)

Authors: Fangdong Chen; Jinlei Zhang; Houqiang Li. CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (print) 9781479934331

High Efficiency Video Coding (HEVC) with the transform bypass mode is simple but inefficient for lossless coding. For this reason, we propose a novel transform to further eliminate the redundancy between residues of different blocks in intra prediction. Depending on the intra prediction mode, the proposed transform adapts to exploit the correlations of residues formed by different modes. To obtain the parameters of the transform matrix accurately, an approach similar to the Wiener filtering method is adopted. Experimental results show that, on top of the lossless coding mode in HEVC, our method achieves a 7.4% bit-rate reduction on average for the All Intra Main configuration. Compared with other representative algorithms, our proposal still shows an improvement in the compression ratio, without substantial increases in computational complexity in the encoder or decoder.

Keywords: Encoding; Transforms; Video coding; Redundancy; Prediction algorithms; Proposals; Standards

Deep Local and Global Spatiotemporal Feature Aggregation for Blind Video Quality Assessment

IEEE Visual Communications and Image Processing (VCIP)

Authors: Wei Zhou; Zhibo Chen. CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (digital) 9781728180687

ISBN: (print) 9781728180694

In recent years, deep learning has achieved promising success for multimedia quality assessment, especially for image quality assessment (IQA). However, since there exist more complex temporal characteristics in videos, very little work has been done on video quality assessment (VQA) by exploiting powerful deep convolutional neural networks (DCNNs). In this paper, we propose an efficient VQA method named Deep SpatioTemporal video Quality assessor (DeepSTQ) to predict the perceptual quality of various distorted videos in a no-reference manner. In the proposed DeepSTQ, we first extract local and global spatiotemporal features by pre-trained deep learning models without fine-tuning or training from scratch. The composited features consider distorted video frames as well as frame difference maps from both global and local views. Then, the feature aggregation is conducted by the regression model to predict the perceptual video quality. Finally, experimental results demonstrate that our proposed DeepSTQ outperforms state-of-the-art quality assessment algorithms.

Keywords: Quality assessment; Video recording; Feature extraction; Spatiotemporal phenomena; Databases; Indexes; Deep learning

Tensor oriented no-reference light field image quality assessment

arXiv, 2019

Authors: Zhou, Wei; Shi, Likun; Chen, Zhibo. CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027

Light field image (LFI) quality assessment is becoming increasingly important because it helps guide the acquisition, processing and application of immersive media. However, due to the inherent high-dimensional characteristics of LFI, LFI quality assessment becomes a multi-dimensional problem that requires consideration of the quality degradation in both spatial and angular dimensions. Therefore, we propose a novel Tensor oriented No-reference Light Field image Quality evaluator (Tensor-NLFQ) based on tensor theory. Specifically, since the LFI is regarded as a low-rank 4D tensor, the principal components of four oriented sub-aperture view stacks are obtained via Tucker decomposition. Then, the Principal Component Spatial Characteristic (PCSC) is designed to measure the spatial-dimensional quality of LFI considering its global naturalness and local frequency properties. Finally, the Tensor Angular Variation Index (TAVI) is proposed to measure angular consistency quality by analyzing the structural similarity distribution between the first principal component and each view in the view stack. Extensive experimental results on four publicly available LFI quality databases demonstrate that the proposed Tensor-NLFQ model outperforms state-of-the-art 2D, 3D, multi-view, and LFI quality assessment algorithms. Copyright © 2019, The Authors. All rights reserved.

Keywords: Tensors

Combining directional intra prediction and intra block copy with block partition for HEVC

IEEE International Conference on Image Processing

Authors: Yue Li; Li Li; Dong Liu; Houqiang Li; Feng Wu. CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

The directional intra prediction (DIP) modes in HEVC are capable of predicting local continuous image features. Recently, intra block copy (IBC) has been proposed for screen content coding, aiming at predicting non-local recurrent image features. For natural video, we observe that recurrent features are often irregular and not aligned with blocks. Thus, we propose a combination of DIP and IBC with block partition for better intra prediction, where one block can be divided into several partitions, each of which may choose between DIP and IBC. We study an intra prediction scheme with the proposed combination, especially the rate-distortion optimization and entropy coding in the scheme. Preliminary experimental results show that the proposed combined intra prediction achieves up to 5.8% bit-rate saving compared to the HEVC anchor.

Keywords: Electronics packaging; Rate-distortion; Optimization; Entropy coding; Motion estimation; Complexity theory
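
The abstract above lets each partition of a block choose between DIP and IBC under rate-distortion optimization. A common way to make such a choice is to minimize the Lagrangian cost J = D + λR per partition, and the sketch below shows that selection. The cost form is the standard one rather than anything confirmed by the abstract, and the distortion and rate figures, the λ value and the function names are all hypothetical.

```python
from typing import Dict, List, Tuple

# (distortion, rate in bits) measured for one prediction mode on one partition
Candidate = Tuple[float, float]


def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def choose_modes(partitions: List[Dict[str, Candidate]], lam: float) -> List[str]:
    """Pick, for every partition independently, the mode with the lowest cost."""
    chosen = []
    for candidates in partitions:
        best = min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))
        chosen.append(best)
    return chosen


# Hypothetical measurements for a block split into two partitions.
partitions = [
    {"DIP": (40.0, 10.0), "IBC": (90.0, 6.0)},    # smooth local texture
    {"DIP": (120.0, 12.0), "IBC": (30.0, 20.0)},  # repeated non-local pattern
]

modes = choose_modes(partitions, lam=5.0)
print(modes)
```

With these numbers the first partition keeps DIP and the second switches to IBC. Choosing per partition independently ignores the cost of signalling the partition shape and the mode flags, which a full encoder would also have to count.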
