In 2021, the newest MPEG standard was published as MPEG-5 low complexity enhancement video coding (LCEVC). Unlike typical video codecs, LCEVC is an enhancement codec, meaning it works in combination with other codecs to produce more efficiently compressed video. Thanks to its simplified architecture, it is designed to be deployed as a software enhancer that uses hardware blocks more efficiently. Despite being relatively new, it has already been adopted for a major next-generation television system (TV 3.0 in Brazil) and is being deployed across a full spectrum of applications, from broadcast to broadband. In this article, we focus on future applications of LCEVC, from high dynamic range, 8K, and immersive video to the metaverse, explaining how this new standard can make a positive impact on these applications.
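The enhancement-layer idea can be illustrated with a minimal sketch, assuming a generic base codec exposed through hypothetical base_encode/base_decode callables (not the actual LCEVC API): the source is downscaled and coded by the base codec, and the enhancement layer carries the residual between the source and the upscaled base reconstruction.

```python
import numpy as np

def lcevc_style_encode(frame, base_encode, base_decode, scale=2):
    """Illustrative enhancement-codec layering: encode a downscaled base,
    then code the residual between the source and the upscaled base."""
    # Downscale the source frame and encode it with the base codec.
    low = frame[::scale, ::scale]                   # naive decimation for illustration
    base_bitstream = base_encode(low)
    # Reconstruct what the decoder will see and upscale it back.
    base_recon = base_decode(base_bitstream)
    upscaled = np.kron(base_recon, np.ones((scale, scale)))  # naive upsampling
    # The enhancement layer carries the residual details (it would then be
    # transformed, quantized, and entropy-coded).
    residual = frame.astype(np.int16) - upscaled.astype(np.int16)
    return base_bitstream, residual
```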
With the emergence of light field imaging in recent years, the compression of its elementary image array (EIA) has become a significant problem. Our coding framework includes modeling and reconstruction. For the modeling, the covariance-matrix form of the 4D Epanechnikov kernel (4D EK) and its correlated statistics were deduced to obtain the 4D Epanechnikov mixture models (4D EMMs). A 4D Epanechnikov mixture regression (4D EMR) was proposed based on this 4D EK, and a 4D adaptive model selection (4D AMLS) algorithm was designed to realize optimal modeling of the pseudo video sequence (PVS) of the extracted key-EIA. A linear function-based reconstruction (LFBR) was proposed based on the correlation between adjacent elementary images (EIs). The decoded images showed clear outline reconstruction and superior coding efficiency compared to high-efficiency video coding (HEVC) and JPEG 2000 below approximately 0.05 bpp. This work realized an unprecedented theoretical application by (1) proposing the 4D Epanechnikov kernel theory, (2) proposing the 4D Epanechnikov mixture regression and applying it to the modeling of the pseudo video sequence of light field images, (3) using 4D adaptive model selection to choose the optimal number of models, and (4) employing a linear function-based reconstruction according to the content similarity.
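For reference, a minimal sketch of the covariance-matrix form of the multivariate Epanechnikov kernel and the corresponding mixture density, in standard textbook notation (the paper's exact parameterization may differ); here d = 4 and c_d denotes the normalizing constant.

```latex
K_{\Sigma}(\mathbf{x}) =
\begin{cases}
  c_d\,|\Sigma|^{-1/2}\left(1 - (\mathbf{x}-\boldsymbol{\mu})^{\top}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right),
    & (\mathbf{x}-\boldsymbol{\mu})^{\top}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu}) \le 1,\\[4pt]
  0, & \text{otherwise,}
\end{cases}
\qquad
f(\mathbf{x}) = \sum_{k=1}^{K} \pi_k\,K_{\Sigma_k}(\mathbf{x}), \quad \sum_{k}\pi_k = 1.
```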
This article introduces the ISO/IEC MPEG Immersive Video (MIV) standard, MPEG-I Part 12, which is undergoing standardization. The draft MIV standard provides support for viewing immersive volumetric content captured by multiple cameras with six degrees of freedom (6DoF) within a viewing space that is determined by the camera arrangement in the capture rig. The bitstream format and decoding processes of the draft specification along with aspects of the Test Model for Immersive Video (TMIV) reference software encoder, decoder, and renderer are described. The use cases, test conditions, quality assessment methods, and experimental results are provided. In the TMIV, multiple texture and geometry views are coded as atlases of patches using a legacy 2-D video codec, while optimizing for bitrate, pixel rate, and quality. The design of the bitstream format and decoder is based on the visual volumetric video-based coding (V3C) and video-based point cloud compression (V-PCC) standard, MPEG-I Part 5.
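A minimal sketch of the patch-to-view copying that atlas-based coding implies, with hypothetical patch metadata fields (atlas position, view position, size); this is illustrative only and not the normative MIV reconstruction process.

```python
import numpy as np

def unpack_patches(atlas, patches, view_shape):
    """Copy rectangular patches from a decoded atlas back into a view.

    Each patch dict is assumed (hypothetically) to carry:
      'atlas_xy': top-left corner in the atlas,
      'view_xy' : top-left corner in the source view,
      'size'    : (height, width) of the patch.
    """
    view = np.zeros(view_shape, dtype=atlas.dtype)
    for p in patches:
        ay, ax = p['atlas_xy']
        vy, vx = p['view_xy']
        h, w = p['size']
        view[vy:vy + h, vx:vx + w] = atlas[ay:ay + h, ax:ax + w]
    return view
```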
We observe that data access and processing take a significant amount of time in large-scale deep learning training tasks (DLTs) on image datasets. Three factors contribute to this problem: (1) the massive and recurrent accesses to large numbers of small files; (2) the repeated, expensive decoding computation on each image; and (3) the frequent communication between computation nodes and storage nodes. Existing work has addressed some aspects of these problems; however, no end-to-end solutions have been proposed. In this article, we propose DIESEL+, an all-in-one system which accelerates the entire I/O pipeline of deep learning training tasks. DIESEL+ contains several components: (1) local metadata snapshot; (2) per-task distributed caching; (3) chunk-wise shuffling; (4) GPU-assisted image decoding; and (5) online region-of-interest (ROI) decoding. The metadata snapshot removes the bottleneck on metadata access when frequently reading large numbers of files. The per-task distributed cache spans the worker nodes of a DLT task to reduce the I/O pressure on the underlying storage. The chunk-based shuffle method converts small file reads into large chunk reads, so that performance is improved without sacrificing training accuracy. The GPU-assisted image decoding and the online ROI method minimize the image decoding workloads and reduce the cost of data movement between nodes. These techniques are seamlessly integrated into the system. In our experiments, DIESEL+ outperforms existing systems by a factor of two to three in overall training time.
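A minimal sketch of the chunk-wise shuffling idea (generic helper, not DIESEL+'s actual API): sample order is randomized at two levels, so storage sees large sequential chunk reads rather than many small file reads while training still receives a shuffled stream.

```python
import random

def chunkwise_shuffle(sample_ids, chunk_size, seed=0):
    """Group samples into fixed-size chunks, shuffle the chunk order,
    then shuffle samples inside each chunk. Reads stay chunk-sequential."""
    rng = random.Random(seed)
    chunks = [sample_ids[i:i + chunk_size]
              for i in range(0, len(sample_ids), chunk_size)]
    rng.shuffle(chunks)                  # coarse-grained randomness across chunks
    order = []
    for chunk in chunks:
        chunk = list(chunk)
        rng.shuffle(chunk)               # fine-grained randomness within a chunk
        order.extend(chunk)
    return order
```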
JPEG-domain enhancement improves the visual quality of JPEG images by directly manipulating the decoded DCT (discrete cosine transform) coefficients, which inevitably leads to mixed compression and enhancement artifacts. Existing forensic methods that consider only JPEG artifacts are ill-suited to such mixed artifacts: they suffer a considerable performance decline in compression parameter estimation and cannot estimate the enhancement parameter. This work attempts to characterize the mixed artifacts and to further estimate both the enhancement and compression parameters of JPEG-domain enhanced images. First, a statistical likelihood function is proposed to characterize the periodicity of DCT coefficients; it measures how well an enhanced image is de-enhanced back to its JPEG compressed version given the compression and enhancement parameters. The proposed likelihood function reaches its maximum when the parameters match their true values. Then, a forensic method for enhancement detection and parameter estimation is developed based on the proposed likelihood function for two kinds of classical JPEG-domain enhancement. Specifically, JPEG-domain enhanced images are detected by thresholding a scalar feature computed upon the likelihoods, and the enhancement and compression parameters are estimated by locating the maximal likelihood. In addition, a mathematical proof of the de-enhancement feasibility is provided. Experimental results demonstrate that the proposed method outperforms the compared methods in both enhancement detection and parameter estimation.
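A minimal sketch of the estimation loop the abstract describes, with placeholder functions: de-enhance the image under each candidate parameter pair and keep the pair that maximizes a periodicity-based score of the resulting DCT coefficients. Here de_enhance and periodicity_likelihood are hypothetical stand-ins for the paper's actual operators, not their definitions.

```python
import itertools

def estimate_parameters(image, enhancement_grid, quality_grid,
                        de_enhance, periodicity_likelihood):
    """Grid-search the (enhancement, compression) parameter pair that best
    explains the observed DCT statistics of a JPEG-domain enhanced image."""
    best, best_score = None, float('-inf')
    for e, q in itertools.product(enhancement_grid, quality_grid):
        candidate = de_enhance(image, e)               # undo the hypothesized enhancement
        score = periodicity_likelihood(candidate, q)   # how JPEG-like (quality q) it looks
        if score > best_score:
            best, best_score = (e, q), score
    return best, best_score
```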
A large body of forensic research focuses on operation detection to reveal evidence of forgery in digital images. In early works, analysts first modeled the probability distribution of a single operation and designed forensic tools based on feature extraction and machine-learning classifiers. As feature dimensionality grows and multiple-operation detection scenarios arise, the physical meaning of the features gradually becomes ambiguous. In particular, since deep learning was introduced into forensic research, automatic feature selection and high-accuracy decision making have concealed the intrinsic forensic clues. In this paper, we explore the availability of features for operation detection in an operation chain, which we call forensicability. An anti-forensic attack algorithm is introduced to formulate the impact of the following operation on the features. We propose two measurements, namely attack angle and scale and a mutual information scale, to indicate how the forensic features vary after the image is manipulated by the following operation. The uncoupled relationship between operations can be revealed by our methods. In the experiments, four operation chains involving ten operations are considered as case studies. The results are encouraging and improve the explainability of forensic methods based on high-dimensional features.
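A minimal sketch, under the assumption that the angle and scale compare the mean feature shift caused by the target operation alone with the shift observed after the following operation, and that the mutual-information scale is estimated between features and operation labels; this is a generic approximation, not the paper's exact definitions.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def attack_angle_and_scale(feat_before, feat_after_op, feat_after_chain):
    """Angle (degrees) and relative magnitude between the feature shift caused
    by the operation alone and the shift after the following operation."""
    d_op = feat_after_op.mean(axis=0) - feat_before.mean(axis=0)
    d_chain = feat_after_chain.mean(axis=0) - feat_before.mean(axis=0)
    cos = np.dot(d_op, d_chain) / (np.linalg.norm(d_op) * np.linalg.norm(d_chain))
    angle = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    scale = np.linalg.norm(d_chain) / np.linalg.norm(d_op)
    return angle, scale

def mutual_information_scale(features, labels):
    """Average mutual information between feature dimensions and operation labels."""
    return mutual_info_classif(features, labels).mean()
```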
Steganalysis in real-world applications often exhibits a skewed sample distribution, which poses a massive challenge for steganography detection. Conventional steganalysis algorithms are not effective when the training data distribution is imbalanced and may fail in such scenarios. To address the imbalanced data distribution issue in steganalysis, a novel framework termed adaptive cost-sensitive feature learning via F-measure maximization is proposed, inspired by the fact that the F-measure is a more suitable performance metric than accuracy for imbalanced data. We investigate an adaptive cost-sensitive strategy that generates and assigns a different weight to each misclassified instance. This scheme adaptively determines the weights according to the intra-class and inter-class costs arising from the imbalanced distribution. Features corresponding to the largest F-measure can be obtained by solving a series of adaptive cost-sensitive feature learning problems with optimization theory. In this way, the learned features are the most representative features for distinguishing cover and stego images, so the imbalance problem in steganalysis is significantly alleviated. Extensive experiments on various imbalanced steganalysis tasks show the superiority of the proposed method over state-of-the-art methods; it recognizes more minority samples and has excellent classification performance.
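For context, the F-measure being maximized is the standard one; a hedged sketch of the objective follows, with w_i denoting the adaptive per-instance misclassification costs (notation assumed for illustration, not taken from the paper).

```latex
F_{\beta} = \frac{(1+\beta^{2})\,P\,R}{\beta^{2}P + R},
\qquad
P = \frac{TP}{TP + FP}, \quad R = \frac{TP}{TP + FN},
\qquad
\min_{\theta}\ \sum_{i} w_i\,\ell\bigl(y_i, f_{\theta}(x_i)\bigr),
```

where the weights w_i are set larger for minority-class misclassifications so that maximizing F_beta and minimizing the weighted loss are aligned.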
Adaptive Fourier decomposition (AFD) is a newly developed signal processing tool that can adaptively decompose any single signal using a Szegő kernel dictionary. To process multiple signals, a novel stochastic-AFD (SAFD) theory was recently proposed. The innovation of this study is twofold. First, an SAFD-based general multi-signal sparse representation learning algorithm is designed and implemented for the first time in the literature, which can be used in many signal and image processing areas. Second, a novel SAFD-based image compression framework is proposed. The algorithm design and implementation of the SAFD theory and image compression methods are presented in detail. The proposed compression methods are compared with 13 other state-of-the-art compression methods, including JPEG, JPEG2000, BPG, and other popular deep learning-based methods. The experimental results show that our methods achieve the best balanced performance. The proposed methods are based on single-image adaptive sparse representation learning and require no pre-training. In addition, the decompression quality or compression efficiency can be easily adjusted by a single parameter, namely the decomposition level. Our method is supported by a solid mathematical foundation, which has the potential to become a new core technology in image compression.
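A minimal sketch of the greedy, level-by-level selection that adaptive decomposition methods of this kind perform: a generic matching-pursuit-style loop over a dictionary of unit-norm atoms. The actual SAFD atoms are parameterized Szegő kernels and its selection rule differs in detail; the sketch only illustrates how the decomposition level acts as the single quality/rate parameter.

```python
import numpy as np

def greedy_decompose(signal, dictionary, levels):
    """Generic greedy sparse decomposition: at each level pick the dictionary
    atom most correlated with the residual and subtract its projection.
    `dictionary` is an (n_atoms, n_samples) array of unit-norm atoms."""
    residual = signal.astype(float).copy()
    selected = []                             # (atom index, coefficient) pairs
    for _ in range(levels):
        corr = dictionary @ residual          # correlation with every atom
        k = int(np.argmax(np.abs(corr)))      # maximal selection principle
        coef = corr[k]
        residual -= coef * dictionary[k]
        selected.append((k, coef))
    return selected, residual
```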
In general, image restoration involves mapping from low-quality images to their high-quality counterparts. Such an optimal mapping is usually nonlinear and learnable by machine learning. Recently, deep convolutional neural networks have proven promising for learning such mappings. It is desirable for an image processing network to handle three vital tasks well, namely: 1) super-resolution; 2) denoising; and 3) deblocking. It is commonly recognized that these tasks have strong correlations, which enables us to design a general framework supporting all of them. In particular, the selection of feature scales is known to significantly impact the performance on these tasks. To this end, we propose the cross-scale residual network to exploit scale-related features among the three tasks. The proposed network can extract spatial features across different scales and establish cross-temporal feature reusage, so as to handle different tasks in a general framework. Our experiments show that the proposed approach outperforms state-of-the-art methods in both quantitative and qualitative evaluations for multiple image restoration tasks.
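A minimal, hypothetical PyTorch sketch of a cross-scale residual block: features are processed at the original and a downsampled scale, fused, and added back residually. This illustrates the general idea only, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossScaleResidualBlock(nn.Module):
    """Processes features at two scales and fuses them with a residual connection."""
    def __init__(self, channels):
        super().__init__()
        self.conv_full = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv_half = nn.Conv2d(channels, channels, 3, padding=1)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x):
        full = F.relu(self.conv_full(x))                        # full-resolution branch
        half = F.interpolate(x, scale_factor=0.5, mode='bilinear',
                             align_corners=False)
        half = F.relu(self.conv_half(half))                     # half-resolution branch
        half = F.interpolate(half, size=x.shape[-2:], mode='bilinear',
                             align_corners=False)
        return x + self.fuse(torch.cat([full, half], dim=1))    # residual fusion
```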
We present a simple and effective approach for non-blind image deblurring, combining classical techniques and deep learning. In contrast to existing methods that deblur the image directly in the standard image space, we propose to perform an explicit deconvolution process in a feature space by integrating a classical Wiener deconvolution framework with learned deep features. A multi-scale cascaded feature refinement module then predicts the deblurred image from the deconvolved deep features, progressively recovering detail and small-scale structures. The proposed model is trained in an end-to-end manner and evaluated on scenarios with simulated Gaussian noise, saturated pixels, or JPEG compression artifacts as well as real-world images. Moreover, we present detailed analyses of the benefit of the feature-based Wiener deconvolution and of the multi-scale cascaded feature refinement as well as the robustness of the proposed approach. Our extensive experimental results show that the proposed deep Wiener deconvolution network facilitates deblurred results with visibly fewer artifacts and quantitatively outperforms state-of-the-art non-blind image deblurring methods by a wide margin.
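The classical Wiener step at the core of the method can be sketched as follows: a plain single-channel frequency-domain Wiener filter with a scalar noise-to-signal ratio. In the paper the deconvolution operates on learned multi-channel deep features rather than pixels, so this is only the underlying textbook operation.

```python
import numpy as np

def wiener_deconvolve(blurred, kernel, nsr=1e-2):
    """Frequency-domain Wiener deconvolution of a single channel.
    `nsr` is the assumed noise-to-signal power ratio."""
    H = np.fft.fft2(kernel, s=blurred.shape)     # blur kernel spectrum (zero-padded)
    Y = np.fft.fft2(blurred)
    # Wiener filter: conj(H) / (|H|^2 + NSR), applied to the blurred spectrum.
    X = np.conj(H) * Y / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(X))
```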