Search Results - Inner Mongolia University Library

IEEE Visual Communications and Image Processing (VCIP)

Authors: Wei Zhou, Zhibo Chen (CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China)

ISBN: (digital) 9781728180687

ISBN: (print) 9781728180694

In recent years, deep learning has achieved promising success in multimedia quality assessment, especially image quality assessment (IQA). However, because videos have more complex temporal characteristics, very little work has been done on video quality assessment (VQA) using powerful deep convolutional neural networks (DCNNs). In this paper, we propose an efficient VQA method named Deep SpatioTemporal video Quality assessor (DeepSTQ) to predict the perceptual quality of various distorted videos in a no-reference manner. In the proposed DeepSTQ, we first extract local and global spatiotemporal features with pre-trained deep learning models, without fine-tuning or training from scratch. The composite features consider distorted video frames as well as frame difference maps from both global and local views. Then, feature aggregation is performed by a regression model to predict the perceptual video quality. Finally, experimental results demonstrate that our proposed DeepSTQ outperforms state-of-the-art quality assessment algorithms.

Keywords: Quality assessment; Video recording; Feature extraction; Spatiotemporal phenomena; Databases; Indexes; Deep learning
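
The abstract above mentions frame difference maps as one input to feature extraction. A minimal sketch of how such maps could be computed is given below. It assumes grayscale frames stored as nested lists and a signed difference between consecutive frames; the function name `frame_difference_maps` and both choices are illustrative assumptions, not details taken from the paper.

```python
def frame_difference_maps(frames):
    """Return signed per-pixel differences between consecutive frames.

    `frames` is a list of equally sized 2-D lists of grayscale intensities.
    (Assumption: the paper may use a different difference definition.)
    """
    diffs = []
    for prev, cur in zip(frames, frames[1:]):
        diff = [[c - p for p, c in zip(prev_row, cur_row)]
                for prev_row, cur_row in zip(prev, cur)]
        diffs.append(diff)
    return diffs


# Three tiny 2x2 frames yield two difference maps.
frames = [
    [[10, 10], [20, 20]],
    [[12, 10], [20, 25]],
    [[12, 11], [18, 25]],
]
maps = frame_difference_maps(frames)
```

Each difference map has the same size as a frame, so it could be fed to the same pre-trained feature extractor as the frames themselves.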

Deep Learning-Based Nonlinear Transform for HEVC Intra Coding

IEEE Visual Communications and Image Processing (VCIP)

Authors: Kun Yang, Dong Liu, Feng Wu (CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China)

ISBN: (digital) 9781728180687

ISBN: (print) 9781728180694

In the hybrid video coding framework, transform is adopted to exploit the dependency within the input signal. In this paper, we propose a deep learning-based nonlinear transform for intra coding. Specifically, we incorporate the directional information into the residual domain. Then, a convolutional neural network model is designed to achieve better decorrelation and energy compaction than the conventional discrete cosine transform. This work has two main contributions. First, we propose to use the intra prediction signal to reduce the directionality in the residual. Second, we present a novel loss function to characterize the efficiency of the transform during the training. To evaluate the compression performance of the proposed transform, we implement it into the High Efficiency Video Coding reference software. Experimental results demonstrate that the proposed method achieves up to 1.79% BD-rate reduction for natural videos.

Keywords: Transforms; Discrete cosine transforms; Neural networks; Transform coding; Video coding; Image coding; Decoding
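
The abstract above uses the conventional discrete cosine transform as the baseline for decorrelation and energy compaction. The sketch below illustrates what energy compaction means for that baseline only: an orthonormal 1-D DCT-II applied to a smooth ramp concentrates most of the signal energy in the first coefficient. The helper names and the sample signal are illustrative assumptions, and the learned transform from the paper is not reproduced here.

```python
import math


def dct2_orthonormal(signal):
    """Orthonormal 1-D DCT-II of a list of floats."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        coeffs.append(scale * total)
    return coeffs


def energy(values):
    """Sum of squared values."""
    return sum(v * v for v in values)


# A smooth ramp stands in for a residual row (hypothetical data).
residual = [1.0, 2.0, 3.0, 4.0]
coeffs = dct2_orthonormal(residual)

# Fraction of the total energy carried by the first (DC) coefficient.
dc_share = coeffs[0] ** 2 / energy(coeffs)
```

Because the transform is orthonormal, the total energy of `coeffs` equals that of `residual`; compaction shows up as a large `dc_share`.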

Improving Compression Artifact Reduction via End-to-End Learning of Side Information

IEEE Visual Communications and Image Processing (VCIP)

Authors: Haichuan Ma, Dong Liu, Feng Wu (CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China)

ISBN: (digital) 9781728180687

ISBN: (print) 9781728180694

We propose to improve neural network-based compression artifact reduction by transmitting side information for the neural network. The side information consists of artifact descriptors that are obtained by analyzing the original and compressed images in the encoder. In the decoder, the received descriptors are used as additional input to a well-designed conditional post-processing neural network. To reduce the transmission overhead, the entire model is optimized under the rate-distortion constraint via end-to-end learning. Experimental results show that introducing the side information greatly improves the ability of the post-processing neural network, and improves the rate-distortion performance.

Keywords: Image coding; Neural networks; Decoding; Training; Feature extraction; Computational modeling; Transform coding

Chain Code-Based Occupancy Map Coding for Video-Based Point Cloud Compression

IEEE Visual Communications and Image Processing (VCIP)

Authors: Runyu Yang, Ning Yan, Li Li, Dong Liu, Feng Wu (CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China)

ISBN: (digital) 9781728180687

ISBN: (print) 9781728180694

In video-based point cloud compression (V-PCC), an occupancy map video is used to indicate whether a 2-D pixel corresponds to a valid 3-D point. In the current design of V-PCC, the occupancy map video is compressed losslessly with High Efficiency Video Coding (HEVC). However, the coding tools in HEVC are designed specifically for natural images and are thus unsuitable for the occupancy map. In this paper, we present a novel quadtree-based scheme for lossless occupancy map coding. In this scheme, the occupancy map is first divided into several coding tree units (CTUs). Then, each CTU is divided recursively into coding units (CUs) using a quadtree. The quadtree partition terminates when one of three conditions is satisfied. First, all the pixels have the same value. Second, the pixels in the CU take only two values and can be separated by a continuous edge whose endpoints lie on the sides of the CU; the continuous edge is then coded using chain code. Third, the CU reaches the minimum size. This scheme simplifies the block partitioning design of HEVC and provides simpler yet more effective coding tools. Experimental results show a significant reduction in bit-rate and complexity compared with the occupancy map coding scheme in V-PCC. In addition, this scheme is also very efficient for compressing semantic maps.

Keywords: Encoding; Three-dimensional displays; Image coding; Semantics; Markov processes; Geometry; Copper
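
The quadtree termination rules described in the abstract above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it handles only the first condition (uniform block) and the third (minimum size), and leaves out the edge-separability test of the second condition. The separate `chain_code` helper shows a generic 4-direction chain code for a pixel path; the direction numbering is an assumption, and the paper may use a different variant.

```python
def partition(occ, x, y, size, min_size, leaves):
    """Recursively split a square region of a binary occupancy map.

    Appends (x, y, size, reason) tuples to `leaves` and returns it.
    """
    values = {occ[r][c] for r in range(y, y + size)
              for c in range(x, x + size)}
    if len(values) == 1:
        leaves.append((x, y, size, "uniform"))      # condition 1
    elif size <= min_size:
        leaves.append((x, y, size, "min_size"))     # condition 3
    else:
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                partition(occ, x + dx, y + dy, half, min_size, leaves)
    return leaves


# Hypothetical numbering: 0 = right, 1 = down, 2 = left, 3 = up.
DIRECTIONS = {(1, 0): 0, (0, 1): 1, (-1, 0): 2, (0, -1): 3}


def chain_code(points):
    """Encode a 4-connected path of (x, y) points as direction symbols."""
    return [DIRECTIONS[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(points, points[1:])]


occ = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
leaves = partition(occ, 0, 0, 4, 2, [])
edge = chain_code([(0, 0), (1, 0), (1, 1), (2, 1)])
```

In this toy map, three of the four 2x2 quadrants are uniform and stop at once, while the bottom-right quadrant is mixed and stops only because it has reached the minimum size.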

Improving Semantic Segmentation via Label Propagation and Temporal Consistency

201. IEEE International Conference on Signal, Information and Data Processing, ICSIDP 201.

Authors: Qin, Feiyu; Cao, Lumeng; Chen, Xuejin (CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China)

ISBN: (print) 9781728123455

Semantic segmentation is a fundamental task in indoor scene understanding. Most previous supervised approaches rely on densely annotated image datasets. Because few images with segmentation labels are available, the performance of existing networks is greatly limited. In this paper, we exploit temporal correlation in video frames to improve the performance and robustness of segmentation networks. Two effective learning strategies are proposed to propagate information from a few labeled frames to their immediate neighbor frames. First, we scale up the training dataset for supervised semantic segmentation networks by generating pseudo ground truth for neighboring frames from a labeled frame using a filtered homography transformation. Furthermore, we introduce a self-supervised loss function to ensure temporal consistency between the segmentation results of adjacent frames. The experimental results demonstrate that our proposed method outperforms state-of-the-art techniques for semantic segmentation on the NYU-Depth V2 dataset. © 201. IEEE.

Keywords: Semantic segmentation
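
The abstract above generates pseudo ground truth for neighboring frames by warping labels with a homography. The sketch below shows the basic geometric step under stated assumptions: sparse labels stored as a dictionary, a known 3x3 homography, and nearest-pixel rounding. The function names are hypothetical, and the filtering step mentioned in the abstract is not modeled.

```python
def apply_homography(h, x, y):
    """Map point (x, y) through a 3x3 homography given as nested lists."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return u, v


def propagate_labels(labels, h, width, height):
    """Warp sparse pixel labels {(x, y): class_id} into a neighboring frame.

    Labels that land outside the frame are dropped.
    """
    pseudo = {}
    for (x, y), cls in labels.items():
        u, v = apply_homography(h, x, y)
        px, py = round(u), round(v)
        if 0 <= px < width and 0 <= py < height:
            pseudo[(px, py)] = cls
    return pseudo


# A pure translation by (+2, +1) stands in for an estimated homography.
shift = [[1, 0, 2], [0, 1, 1], [0, 0, 1]]
labels = {(0, 0): 3, (5, 5): 7}
pseudo = propagate_labels(labels, shift, width=6, height=6)
```

Here the label at (0, 0) moves to (2, 1), while the label at (5, 5) falls outside the 6x6 frame and is discarded.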

Human Machine Joint Decision Making in Distorted Surveillance Scenario 2

2nd China Symposium on Cognitive Computing and Hybrid Intelligence, CCHI 201.

Authors: Liu, Sen; Zhao, Shuxin; Pang, Yingxue; Chen, Zhibo (CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China)

ISBN: (print) 9781728140919

There are many human-machine joint decision-making scenarios in real-world applications, such as driving assistance, suspect identification, and medical diagnosis. Existing algorithms propose that the machine should offer a rejection option when its risk or uncertainty score is high, so that the input can be passed to a human to make the decision. This is an interesting algorithmic model of human-machine collaboration, but it implicitly assumes that humans are more trustworthy than machines. Such an assumption ignores human bias and inconsistency, especially in scenarios where machines have better recognition ability than humans. In this work, we investigate the human-machine joint decision-making problem in distorted surveillance videos, where machines are shown experimentally to be comparable to human beings in tolerance to distortion, and sometimes even stronger. We propose a new human-machine joint decision-making framework that considers the confidences of both machine and human. To obtain the human's confidence, we build a real-life human decision-making database and propose a deep neural network to estimate it. Then, a confidence alignment method and a decision rule are proposed to output the final decision. Experiments demonstrate that the proposed framework requires less human intervention and makes more accurate decisions in several human-machine joint decision-making scenarios. © 201. IEEE.

Keywords: Deep neural networks
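
The abstract above compares machine confidence with an estimated human confidence before deciding who makes the final call. The sketch below is a purely hypothetical decision rule in that spirit: it assumes both confidences are already aligned to the same scale and keeps the machine's answer whenever its confidence is at least the estimated human confidence. The actual alignment method and decision rule in the paper are not described in the abstract and are not reproduced here.

```python
def joint_decision(machine_label, machine_conf, human_conf_estimate, ask_human):
    """Return (final_label, human_was_asked).

    `ask_human` is a callable that is invoked only when the machine defers.
    Assumption: both confidences lie on a comparable, already aligned scale.
    """
    if machine_conf >= human_conf_estimate:
        return machine_label, False
    return ask_human(), True


# The machine is confident enough, so no human intervention is needed.
kept = joint_decision("car", 0.9, 0.7, lambda: "truck")

# The machine is less confident than the estimated human, so it defers.
deferred = joint_decision("car", 0.4, 0.7, lambda: "truck")
```

Counting how often the second element is `True` gives the human intervention rate that such a rule trades off against accuracy.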

M-LVC: Multiple Frames Prediction for Learned Video Compression

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Jianping Lin, Dong Liu, Houqiang Li, Feng Wu (CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China)

ISBN: (digital) 9781728171685

ISBN: (print) 9781728171692

We propose an end-to-end learned video compression scheme for low-latency scenarios. Previous methods are limited to using only the single previous frame as a reference. Our method introduces the use of multiple previous frames as references. In our scheme, the motion vector (MV) field is calculated between the current frame and the previous one. With multiple reference frames and the associated multiple MV fields, our designed network can generate a more accurate prediction of the current frame, yielding a smaller residual. Multiple reference frames also help generate the MV prediction, which reduces the coding cost of the MV field. We use two deep auto-encoders to compress the residual and the MV, respectively. To compensate for the compression error of the auto-encoders, we further design an MV refinement network and a residual refinement network, which also make use of the multiple reference frames. All the modules in our scheme are jointly optimized through a single rate-distortion loss function. We use a step-by-step training strategy to optimize the entire scheme. Experimental results show that the proposed method outperforms existing learned video compression methods in low-latency mode. Our method also performs better than H.265 in both PSNR and MS-SSIM. Our code and models are publicly available.

Keywords: Video compression; Image coding; Motion compensation; Entropy; Encoding; Motion estimation; Transforms
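
The abstract above optimizes all modules through a single rate-distortion loss and reports quality in PSNR. The sketch below shows the common form of such an objective, rate plus a weighted distortion, together with the standard PSNR formula. The weighting form `rate + lam * distortion`, the use of mean squared error as the distortion, and all sample numbers are assumptions for illustration; the paper's exact loss is not given in the abstract.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally long pixel sequences."""
    return sum((o - r) ** 2
               for o, r in zip(original, reconstructed)) / len(original)


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB for the given peak value."""
    error = mse(original, reconstructed)
    if error == 0:
        return float("inf")
    return 10.0 * math.log10(peak * peak / error)


def rd_loss(rate_bpp, distortion, lam):
    """Rate-distortion objective: rate plus lambda-weighted distortion."""
    return rate_bpp + lam * distortion


original = [100, 120, 140, 160]
decoded = [102, 118, 141, 157]
d = mse(original, decoded)
quality_db = psnr(original, decoded)
loss = rd_loss(rate_bpp=0.25, distortion=d, lam=0.01)
```

Raising `lam` makes distortion more expensive relative to rate, which pushes a trained codec toward higher quality at higher bit-rates.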

Edge-Guided Panoramic Video Stitching with Limited Overlap

201. IEEE International Conference on Signal, Information and Data Processing, ICSIDP 201.

Authors: Xie, Chaoyu; Chen, Xuejin (CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China)

ISBN: (print) 9781728123455

Video stitching remains a challenging problem in computer vision. In this paper, we propose a novel edge-guided method to stitch multiple videos that have small overlapped regions. Our algorithm consists of three steps: (1) spherical projection of the input video frames based on camera calibration, (2) edge detection and edge-guided feature matching for video registration, and (3) seam optimization to eliminate distortions and ghosts in the composited panoramic videos. The experimental results and user studies demonstrate that our method is robust to videos that have small overlapped regions and produces more visually pleasing panoramic videos than state-of-the-art techniques. © 201. IEEE.

Keywords: Edge detection
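
Step (1) of the algorithm in the abstract above is a spherical projection of the input frames. The sketch below shows the standard spherical warp of a single pixel coordinate, assuming a pinhole camera with a known focal length in pixels and a known principal point. The function name, parameters, and sample values are illustrative assumptions; the paper's calibration-based procedure may differ in detail.

```python
import math


def spherical_project(x, y, focal, cx, cy):
    """Map an image pixel to spherical-warp coordinates.

    (cx, cy) is the principal point and `focal` the focal length in pixels.
    """
    dx, dy = x - cx, y - cy
    theta = math.atan2(dx, focal)                   # horizontal angle
    phi = math.atan2(dy, math.hypot(dx, focal))     # vertical angle
    return focal * theta + cx, focal * phi + cy


# The principal point maps to itself.
center = spherical_project(320, 240, 500, 320, 240)

# A pixel one focal length to the right of the center is pulled inward.
u, v = spherical_project(820, 240, 500, 320, 240)
```

Pixels far from the center are compressed toward it, which is what lets frames from rotated cameras line up on a common sphere.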

Semantically Scalable Image Coding With Compression of Feature Maps

IEEE International Conference on Image Processing

Authors: Ning Yan, Dong Liu, Houqiang Li, Feng Wu (CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China)

ISBN: (digital) 9781728163956

ISBN: (print) 9781728163963

In this paper, we consider a novel image coding paradigm, termed semantically scalable coding. In the new paradigm, coded bitstream serves for multiple different semantic analysis tasks, and different tasks require different semantic granularities of the image. Thus, the bitstream is designed to be scalable in the sense that progressive decoding of the bitstream provides coarse-to-fine semantic granularities. As a concrete example, we consider the task of coarse-grained and fine-grained image classification. We present a method to compress the multiple deep feature maps that are intermediate representations of an image passing a trained deep network. The deep-layer feature maps can serve for coarse-grained image classification while the shallow-layer feature maps can serve for fine-grained image classification. Experimental results demonstrate the feasibility of the proposed method, as well as the advantage of the semantically scalable coding paradigm.

Keywords: Image coding; Task analysis; Semantics; Visualization; Correlation; Decoding; Encoding

MSight: An Edge-Cloud Infrastructure-based Perception System for Connected Automated Vehicles

arXiv, 2023

Authors: Zhang, Rusheng; Meng, Depu; Shen, Shengyin; Zou, Zhengxia; Li, Houqiang; Liu, Henry X. (The Department of Civil and Environmental Engineering, University of Michigan, Ann Arbor, MI 48109, United States; The University of Michigan Transportation Research Institute, 2901 Baxter Rd, Ann Arbor, MI 48109, United States; The Department of Guidance, Navigation and Control, School of Astronautics, Beihang University, Beijing 100191, China; The CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, China; Mcity, University of Michigan, Ann Arbor, MI 48109, United States)

As vehicular communication and networking technologies continue to advance, infrastructure-based roadside perception emerges as a pivotal tool for connected automated vehicle (CAV) applications. Due to their elevated positioning, roadside sensors, including cameras and lidars, often enjoy unobstructed views with diminished object occlusion. This provides them a distinct advantage over onboard perception, enabling more robust and accurate detection of road objects. This paper presents MSight, a cutting-edge roadside perception system specifically designed for CAVs. MSight offers real-time vehicle detection, localization, tracking, and short-term trajectory prediction. Evaluations underscore the system's capability to uphold lane-level accuracy with minimal latency, revealing a range of potential applications to enhance CAV safety and efficiency. Presently, MSight operates 24/7 at a two-lane roundabout in the City of Ann Arbor, Michigan. Copyright © 2023, The Authors. All rights reserved.

Keywords: Roadsides
