We propose DeepPCC, an end-to-end learning-based approach for the lossy compression of large-scale object point clouds. For both the geometry and attribute components, we introduce the Multiscale Neighborhood Information Aggregation (NIA) mechanism, which applies resolution downscaling progressively (i.e., dyadic downsampling of geometry and average pooling of attributes) and combines sparse convolution and local self-attention at each resolution scale for effective feature representation. Under a simple autoencoder structure, scale-wise NIA blocks are stacked as the analysis and synthesis transforms in the encoder-decoder pair to best characterize spatial neighbors for accurate approximation of geometry occupancy probability and attribute intensity. Experiments demonstrate that DeepPCC remarkably outperforms the state-of-the-art rule-based MPEG G-PCC and learning-based solutions both quantitatively and qualitatively, providing strong evidence that DeepPCC is a promising solution for emerging AI-based PCC.
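The progressive downscaling step described above can be illustrated in isolation. The following is a minimal sketch (not the paper's sparse-convolution implementation) of one dyadic downscaling stage: integer voxel coordinates are halved, and the attributes of points that merge into the same parent voxel are average-pooled.

```python
import numpy as np

def downscale_point_cloud(coords, attrs):
    """One dyadic downscaling step: halve the voxel resolution of the
    geometry and average-pool the attributes of points that merge."""
    parent = coords // 2                        # dyadic downsampling of coordinates
    uniq, inverse = np.unique(parent, axis=0, return_inverse=True)
    pooled = np.zeros((len(uniq), attrs.shape[1]))
    counts = np.bincount(inverse).astype(float)
    np.add.at(pooled, inverse, attrs)           # sum attributes per parent voxel
    pooled /= counts[:, None]                   # average pooling
    return uniq, pooled

coords = np.array([[0, 0, 0], [1, 0, 0], [2, 2, 2], [3, 3, 3]])
attrs  = np.array([[10.0], [20.0], [30.0], [50.0]])
c1, a1 = downscale_point_cloud(coords, attrs)
print(c1)   # two parent voxels remain
print(a1)   # averaged attributes per parent voxel
```

Stacking this step yields the multiscale pyramid over which the NIA blocks operate.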
Lossless compression of remote sensing images is critically important for minimizing storage requirements while preserving the complete integrity of the data. The main challenge in lossless compression lies in striking a good balance between reasonable compression durations and high compression ratios. In this article, we introduce an innovative lossless compression framework that uniquely utilizes lossy compression data as prior knowledge to enhance the compression process. Our framework employs a checkerboard segmentation technique to divide the original remote sensing image into various subimages. The main diagonal subimages are compressed using a traditional lossy method to obtain prior knowledge for facilitating the compression of all subimages. These subimages are then subjected to lossless compression using our newly developed lossy prior probability prediction network (LP3Net) and arithmetic coding in a specific order. The proposed LP3Net is an advanced network architecture, consisting of an image preprocessing module, a channel enhancement module, and a pixel probability transformer module, that learns the discrete probability distribution of each pixel within every subimage, enhancing the accuracy and efficiency of the compression process. Experiments on high-resolution remote sensing image datasets demonstrate the effectiveness and efficiency of the proposed LP3Net and lossless compression framework, achieving a minimum of 4.57% improvement over traditional compression methods and 1.86% improvement over deep learning-based compression methods.
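The checkerboard segmentation can be sketched as regular subsampling: a k×k checkerboard yields k² subimages, and the "main diagonal" subimages are those whose row and column offsets coincide. This is an illustrative interpretation of the abstract, not the paper's exact partitioning code.

```python
import numpy as np

def checkerboard_split(img, k=2):
    """Split an image into k*k subimages by regular subsampling; the
    'main diagonal' subimages are those with equal row/column offsets."""
    subs = {(i, j): img[i::k, j::k] for i in range(k) for j in range(k)}
    diagonal = [subs[(d, d)] for d in range(k)]
    return subs, diagonal

img = np.arange(16).reshape(4, 4)
subs, diag = checkerboard_split(img)
print(subs[(0, 0)])   # rows/cols with offset 0
print(diag[1])        # offset (1, 1) subimage
```

Under this scheme the diagonal subimages cover the image evenly, so their lossy reconstructions are informative priors for every other subimage.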
In this paper, a highly efficient hardware allocation framework for arbitrary transform blocks is proposed, which adapts to the prediction tree structure to improve the utilization ratio, together with a parallel hardware design that improves the data throughput. The method configures an appropriate combination of five inverse transform units of different sizes: fast IDST, 4×4 IDCT, 8×8 IDCT, 16×16 IDCT, and 32×32 IDCT. If the input video stream changes, it reconfigures the combination and reallocates the hardware resources to retain a high utilization ratio. Experiments show that the utilization ratio of the proposed method improves from 48.8% to 96.2% under various conditions. The proposed method can enhance the efficiency of an H.265 decoder.
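The reconfiguration idea can be sketched abstractly. The allocator below is purely hypothetical (the paper describes a hardware framework, not this algorithm): given a histogram of transform-block sizes seen in the incoming stream, it assigns parallel inverse-transform units proportionally, guaranteeing at least one unit per size in use.

```python
def allocate_units(block_histogram, total_units):
    """Hypothetical proportional allocator: give each transform size a
    number of parallel units proportional to its share of the blocks,
    with at least one unit for any size that actually occurs."""
    total = sum(block_histogram.values())
    return {size: max(1, round(total_units * n / total))
            for size, n in block_histogram.items() if n > 0}

# Stream dominated by 4x4 and 8x8 blocks
hist = {"IDST4": 10, "IDCT4": 40, "IDCT8": 30, "IDCT16": 15, "IDCT32": 5}
print(allocate_units(hist, 20))
```

When the block-size statistics of the stream shift, rerunning the allocator models the reconfiguration step that keeps utilization high.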
Existing JPEG encryption approaches pose a security risk due to the difficulty in changing all block-feature values while considering format compatibility and file size expansion. To address these concerns, this paper introduces a novel JPEG image encryption scheme. First, the security of sketch information against chosen-plaintext attacks is improved by increasing the change rate of block-feature values. Second, a classification global permutation approach is designed to encrypt the undivided run/size, value (RSV)-based AC groups to achieve larger changes in the block-feature values. Third, to reduce file size expansion while maintaining format compatibility, the DC coefficients are rotated based on the mapped DC differences in the same category, and the nonzero AC coefficients are mapped in the same category. Extensive experiments demonstrate that the proposed algorithm is superior to existing schemes in terms of security. Notably, the average change rate of block-feature values is increased by at least 20%. Furthermore, the proposed scheme reduces the file size by an average of 2.036% compared to existing JPEG image encryption methods.
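A keyed global permutation is the core invertible primitive behind the second step. The sketch below is a simplified stand-in for the paper's classification global permutation: it shuffles a list of (here hypothetical) RSV-based AC groups with a key-seeded PRNG and shows that the key holder can invert the shuffle. A real scheme would derive the permutation from a cryptographic key stream, not `random`.

```python
import random

def permute_groups(groups, key):
    """Keyed global permutation of AC groups (illustrative only)."""
    idx = list(range(len(groups)))
    random.Random(key).shuffle(idx)        # key-seeded permutation
    return [groups[i] for i in idx], idx

def invert(permuted, idx):
    """Undo the permutation given the same index order."""
    out = [None] * len(permuted)
    for pos, i in enumerate(idx):
        out[i] = permuted[pos]
    return out

groups = ["g0", "g1", "g2", "g3", "g4"]
enc, idx = permute_groups(groups, key=1234)
assert invert(enc, idx) == groups          # round-trip recovers the groups
```

Permuting whole groups rather than individual coefficients is what preserves JPEG format compatibility while still scrambling block features.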
With the wide application of image editing tools, forged images (splicing, copy-move, removal, etc.) have become a great public concern. Although existing image forgery localization methods can achieve fairly good results on several public datasets, most of them perform poorly when the forged images are JPEG compressed, as is usually the case on social networks. To tackle this issue, in this paper, a self-supervised domain adaptation network, which is composed of a backbone network with Siamese architecture and a compression approximation network (ComNet), is proposed for JPEG-resistant image forgery detection and localization. To improve the performance against JPEG compression, ComNet is customized to approximate the JPEG compression operation through self-supervised learning, generating JPEG-agent images with general JPEG compression characteristics. The backbone network is then trained with a domain adaptation strategy to localize the tampering boundary and region, and to alleviate the domain shift between uncompressed and JPEG-agent images. Extensive experimental results on several public datasets show that the proposed method outperforms or rivals other state-of-the-art methods in image forgery detection and localization, especially for JPEG compression with unknown quality factors (QFs).
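Why approximate JPEG with a network at all? Hard quantization, the lossy core of JPEG, has zero gradient almost everywhere, so it cannot be trained through directly. A common differentiable surrogate (shown here as a generic sketch, not the paper's learned ComNet) is soft rounding, which interpolates smoothly between identity and hard rounding:

```python
import numpy as np

def soft_round(x, alpha=8.0):
    """Differentiable rounding surrogate: for large alpha this tracks
    hard rounding closely away from the half-integer midpoints, while
    remaining smooth enough to backpropagate through."""
    m = np.floor(x) + 0.5                 # nearest midpoint below/at x
    return m + 0.5 * np.tanh(alpha * (x - m)) / np.tanh(alpha / 2)

x = np.linspace(0.0, 2.0, 5)
print(soft_round(x))   # stays close to hard rounding, but smooth
```

A learned approximation such as ComNet goes further by also imitating blockwise DCT-domain artifacts, which is what makes the JPEG-agent images useful for domain adaptation.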
The demand for interpretable models has driven the exploration of explainable approaches grounded in human-friendly case-based reasoning. Among these approaches, prototype-based methods have proven effective in performing case-based reasoning by utilizing prototypes and similarity scores. However, their interpretability is affected by degraded similarity in the input space and latent space. This semantic gap leads to inconsistent explanations for images that are perceived to be similar, undermining the reliability of the explanation. In this paper, we propose a distributional embedding framework in which the embedding is randomly sampled from a parameterized distribution in a regularized latent space. With a simple modification, our method significantly improves the reliability of the model's explanation by bridging the gap between similarity in human perception and explanation. To demonstrate this, we conduct experiments ranging from small-scale scenarios to direct explanation regarding similarity. Extensive comparisons with a real-world dataset and multiple backbone networks showcase the usability and efficacy of the proposed framework.
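Sampling an embedding from a parameterized distribution is typically done with reparameterization, so the draw stays differentiable with respect to the distribution's parameters. The sketch below assumes (as is common, though the abstract does not specify) a diagonal Gaussian predicted by the encoder:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_embedding(mu, log_var):
    """Reparameterized draw z = mu + sigma * eps: gradients flow to
    (mu, log_var) because the randomness is isolated in eps."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

mu = np.zeros(4)
log_var = np.full(4, -2.0)      # small variance: samples cluster near mu
z = sample_embedding(mu, log_var)
print(z)
```

Regularizing `log_var` (e.g., with a KL term) is what keeps the latent space smooth enough for similarity scores to align with human perception.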
Recently, learning-based light field (LF) image compression methods have achieved impressive progress, while end-to-end spatially scalable LF image compression (SS-LFIC) has not been explored. To tackle this problem, this paper proposes an end-to-end spatially scalable LF compression network (SSLFC-Net). In the SSLFC-Net, a spatial-angular domain-specific enhancement layer coding strategy is designed to boost the coding performance of the enhancement layers (ELs). Specifically, by referencing domain-specific features, the ELs compress spatial features by predictive coding in the spatial domain to effectively remove inter-layer spatial redundancy, and reconstruct angular features by a decoder-side generative method in the angular domain to strategically avoid angular compression. In particular, to produce accurate spatial predictions and reconstruct high-quality LF images, an inter-layer spatial prediction module and a spatial-angular context-aware reconstruction module are presented to collaboratively promote EL compression. Experiments show that the proposed SSLFC-Net effectively supports spatial scalability and achieves state-of-the-art rate-distortion performance.
This article develops a Scalable Point Cloud Attribute Compression solution, termed ScalablePCAC. In a two-layer example, ScalablePCAC uses the standard G-PCC at the base layer to directly encode the thumbnail point cloud that is downscaled from the original input, and a learning-based model at the enhancement layer to compress and restore the full-resolution input point cloud conditioned on the base layer reconstruction. As such, the base layer provides a coarse reconstruction of the input point cloud and the enhancement layer further improves the quality. We then adopt a cross-layer rate allocation strategy that flexibly determines the resolution downscaling factor, the quantization parameter of the base layer, and the quality controlling factor of the enhancement layer to adapt the bitrate of the two layers for approximately optimal Rate-Distortion (R-D) performance. We conduct extensive experiments on popular point clouds following the MPEG common test conditions. Results demonstrate that the proposed ScalablePCAC achieves >10% BD-BR reduction against the latest G-PCC version 22 (TMC13v22) on the Y component; it also significantly outperforms existing learning-based solutions for point cloud attribute compression, e.g., compared with a recent work showing state-of-the-art performance, it achieves >20% BD-BR reduction.
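The base/enhancement split can be shown on a toy 1D signal. This is a deliberately simplified sketch of the two-layer idea only (ScalablePCAC's actual layers are G-PCC and a learned network): the base layer carries a coarsely quantized, downscaled signal, and the enhancement layer carries the residual against the upsampled base reconstruction.

```python
import numpy as np

def two_layer_code(signal, q_base=8.0):
    """Toy two-layer scalable coder on a 1D signal."""
    base = signal[::2]                               # downscale for the base layer
    base_rec = np.round(base / q_base) * q_base      # coarse base-layer quantization
    upsampled = np.repeat(base_rec, 2)[:len(signal)] # decoder-side upscaling
    residual = signal - upsampled                    # enhancement-layer payload
    return base_rec, residual

sig = np.array([10.0, 12.0, 40.0, 41.0, 90.0, 88.0])
base_rec, residual = two_layer_code(sig)
rec = np.repeat(base_rec, 2)[:len(sig)] + residual
assert np.allclose(rec, sig)   # full enhancement restores the signal exactly
```

The cross-layer rate allocation then amounts to trading off `q_base`, the downscaling factor, and how finely the residual is coded.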
Crypto-space reversible data hiding (RDH) has emerged as an effective technique for transmitting secret information over the Internet. However, most existing schemes are designed for uncompressed images, while almost all images are processed and transmitted in compressed formats. There is an urgent need to develop methods for compressed images, such as Joint Photographic Experts Group (JPEG) images. In this article, we propose an RDH scheme for encrypted JPEG images, where the bitstreams of alternating current (AC) coefficients and the secret data are mapped to numbers over a Galois field. The obtained numbers are then used to construct a polynomial for secret sharing. By converting them into secret shares, the AC coefficients and the secret data are secured. In addition, a block sorting strategy is used to reduce image distortion under low data payload. Experimental results demonstrate that the proposed scheme outperforms state-of-the-art methods in embedding capacity while preserving the file size and conforming to the JPEG format.
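Polynomial secret sharing over a Galois field is the classic Shamir construction: the secret becomes the constant term of a random polynomial, shares are evaluations at distinct points, and any k shares recover the secret by Lagrange interpolation at zero. The sketch below works over the prime field GF(257) as an illustrative stand-in (the paper's exact field and coefficient choices are not specified here).

```python
P = 257   # prime modulus: arithmetic below is over GF(257)

def make_shares(secret, coeffs, n):
    """Shares (x, f(x)) of f(x) = secret + coeffs[0]*x + ... mod P.
    Threshold is len(coeffs) + 1; coeffs are fixed here for reproducibility."""
    poly = [secret] + coeffs
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(poly)) % P)
            for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = num * (-xm) % P
                den = den * (xj - xm) % P
        secret = (secret + yj * num * pow(den, P - 2, P)) % P   # den^-1 via Fermat
    return secret

shares = make_shares(secret=123, coeffs=[17, 42], n=5)
assert recover(shares[:3]) == 123   # any 3 of the 5 shares suffice
```

In the RDH setting, embedding the secret data as extra polynomial inputs is what lets the AC coefficients and the payload be secured and later separated reversibly.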
Multimedia file fragment classification (MFFC) aims to identify file fragment types, e.g., image/video, audio, and text, without system metadata. It is of vital importance in multimedia storage and communication. Existing MFFC methods typically treat fragments as 1D byte sequences and emphasize the relations between separate bytes (interbytes) for classification. However, the more informative relations inside bytes (intrabytes) are overlooked and seldom investigated. By looking inside bytes, the bit-level details of file fragments can be accessed, enabling a more accurate classification. Motivated by this, we first propose Byte2Image, a novel visual representation model that incorporates previously overlooked intrabyte information into file fragments and reinterprets these fragments as 2D grayscale images. This model involves a sliding byte window to reveal the intrabyte information and a rowwise stacking of intrabyte n-grams for embedding fragments into a 2D space. Thus, complex interbyte and intrabyte correlations can be mined simultaneously using powerful vision networks. Additionally, we propose an end-to-end dual-branch network, ByteNet, to enhance robust correlation mining and feature representation. ByteNet makes full use of the raw 1D byte sequence and the converted 2D image through a shallow byte branch feature extraction (BBFE) network and a deep image branch feature extraction (IBFE) network. In particular, the BBFE, composed of a single fully-connected layer, adaptively recognizes the co-occurrence of several specific bytes within the raw byte sequence, while the IBFE, built on a vision transformer, effectively mines the complex interbyte and intrabyte correlations from the converted image. Experiments on two representative benchmarks, covering 14 cases, validate that our proposed method outperforms state-of-the-art approaches on different cases by up to 12.2%.
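The byte-to-image conversion can be sketched directly from the description: slide a byte window over the fragment and unpack each window's bits into one image row, so intrabyte (bit-level) structure becomes visible as a 2D grayscale pattern. This is a minimal interpretation of the Byte2Image idea, not the paper's exact embedding.

```python
import numpy as np

def byte2image(fragment, window=4):
    """Sliding byte window -> rowwise bit unpacking -> 2D grayscale image."""
    arr = np.frombuffer(fragment, dtype=np.uint8)
    rows = [np.unpackbits(arr[i:i + window])   # one window becomes one bit row
            for i in range(len(arr) - window + 1)]
    return np.stack(rows) * 255                # scale bits to 0/255 grayscale

img = byte2image(b"\x0f\xf0\xaa\x55\xcc")
print(img.shape)   # (2, 32): 2 sliding windows of 4 bytes, 8 bits per byte
```

Because consecutive rows share `window - 1` bytes, local texture in the image encodes exactly the interbyte and intrabyte correlations that a vision network can then exploit.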