ISBN (digital): 9781665496209
ISBN (print): 9781665496209
We consider perceptual quality optimization in image coding through adaptive quantization. A differential contrast model is proposed to measure visual sensitivity to quantization distortions, from which a spatially adaptive quantization strategy is derived. A complementary quantitative approach is provided to compute the proposed differential contrast model efficiently. The resulting visual quality improvement is demonstrated experimentally.
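As a rough illustration of spatially adaptive quantization of this kind, the sketch below scales a base quantization step by a per-block contrast measure. The block standard-deviation-over-mean proxy is only a stand-in for the paper's differential contrast model, and all names and parameters are illustrative assumptions.

```python
# Hypothetical sketch: spatially adaptive quantization driven by a local
# contrast map. The contrast proxy (block std / block mean) stands in for
# the paper's differential contrast model, whose exact form is not given here.
import numpy as np

def block_contrast(img, block=8, eps=1e-6):
    """Per-block contrast proxy: standard deviation over mean luminance."""
    h, w = img.shape
    hb, wb = h // block, w // block
    blocks = img[:hb * block, :wb * block].reshape(hb, block, wb, block)
    return blocks.std(axis=(1, 3)) / (blocks.mean(axis=(1, 3)) + eps)

def adaptive_qstep(img, base_q=16.0, strength=0.5, block=8):
    """Coarser quantization where contrast (visual masking) is high,
    finer quantization in smooth, visually sensitive regions."""
    c = block_contrast(img, block)
    c_norm = c / (c.mean() + 1e-6)        # normalize around 1.0
    return base_q * c_norm ** strength    # per-block quantization step

if __name__ == "__main__":
    img = np.random.rand(64, 64) * 255.0
    q = adaptive_qstep(img)
    print(q.shape, float(q.min()), float(q.max()))
```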
Due to the huge data volume of high-resolution remote sensing imagery (RSI) and limited transmission bandwidth, RSIs are typically compressed for efficient transmission and storage. However, most existing compression algorithms are optimized for human perception, which makes them ill-suited to remote sensing applications where RSIs are mainly used for machine interpretation tasks, such as semantic segmentation for ground-object recognition. In this article, we propose an image coding for machines (ICM) paradigm based on contrastive learning in a fully supervised manner to boost semantic segmentation of compressed RSIs. Specifically, we build an end-to-end compression framework that makes full use of global semantic information by clustering intracategory projected embeddings and spacing intercategory embeddings apart, to compensate for the loss of feature discriminability during compression and to reconstruct the decision boundaries between different categories. Compared with state-of-the-art image compression methods, the proposed method significantly improves semantic segmentation performance on remote sensing labeling benchmark datasets.
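A minimal sketch of the supervised contrastive term described above: projected embeddings of the same semantic class are pulled together and other classes are pushed apart. The function name, temperature, and the weighting against the codec's rate-distortion objective are assumptions, not the authors' implementation.

```python
# Hypothetical supervised contrastive loss over projected pixel/region embeddings.
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """embeddings: (N, D) projected features; labels: (N,) semantic class ids."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature                       # (N, N) similarities
    mask_self = torch.eye(len(labels), dtype=torch.bool, device=z.device)
    mask_pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~mask_self

    # log-softmax over all other samples, then average over same-class positives
    logits = sim.masked_fill(mask_self, float('-inf'))
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_count = mask_pos.sum(dim=1).clamp(min=1)
    loss = -log_prob.masked_fill(~mask_pos, 0.0).sum(dim=1) / pos_count
    return loss.mean()

# usage sketch: total = rate + lam_d * distortion
#                     + lam_c * supervised_contrastive_loss(z, y)
```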
ISBN (print): 9781665492577
In recent years, there has been a sharp increase in the transmission of images to remote servers specifically for the purpose of computer vision. In many applications, such as surveillance, images are mostly transmitted for automated analysis and rarely seen by humans. Using traditional compression in this scenario has been shown to be inefficient in terms of bit-rate, likely due to its focus on human-oriented distortion metrics. It is therefore important to create image coding methods designed for joint use by humans and machines. One way to create the machine side of such a codec is to match features of some intermediate layer of a deep neural network performing the machine task. In this work, we explore the effect of the layer chosen when training a learnable codec for humans and machines. We prove, using the data processing inequality, that matching features from deeper layers is preferable in the rate-distortion sense. We then confirm our findings empirically by re-training an existing model for scalable human-machine coding. Our experiments show the trade-off between the human and machine sides of such a scalable model, and we discuss the benefit of using deeper layers for training in that regard.
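A sketch of the layer-choice feature-matching idea: the machine-side distortion is the discrepancy between features of a frozen task network, cut at a chosen depth, computed on the original and the decoded image. The Sequential task network, the layer index, and the MSE form of the match are placeholders, not the paper's exact training setup.

```python
# Hypothetical feature-matching distortion at a chosen layer of a frozen task net.
import torch
import torch.nn as nn

class FeatureMatchingLoss(nn.Module):
    def __init__(self, task_net: nn.Sequential, layer_idx: int):
        super().__init__()
        # keep only layers up to (and including) the chosen depth, frozen
        self.trunk = nn.Sequential(*list(task_net.children())[: layer_idx + 1])
        for p in self.trunk.parameters():
            p.requires_grad_(False)

    def forward(self, x_orig, x_decoded):
        with torch.no_grad():
            target = self.trunk(x_orig)       # reference features
        return nn.functional.mse_loss(self.trunk(x_decoded), target)

# A larger layer_idx corresponds to the deeper-layer matching argued for above;
# in a scalable human-machine codec this term would train the base (machine) layer.
```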
To accelerate the fractal decoding process, a fast fractal decoding method based on a minimum domain block set (MDBS) is proposed. In the fractal encoding process, it is found that there exists an MDBS that can provide the best-matched domain blocks for all range blocks. In the decoding process, the MDBS is first identified before the first iteration. Then, only the range blocks inside the MDBS are reconstructed in each of the first to penultimate iterations, so the computations for reconstructing the remaining range blocks outside the MDBS are saved, speeding up decoding. Finally, all range blocks are reconstructed in the last iteration to obtain the decoded image. Experimental results show that about 5%-17% of the total decoding computations can be saved.
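The iteration structure of the MDBS-accelerated decoder can be sketched as below. The fractal transform itself and the test for membership in the MDBS are abstracted behind callables, since the abstract does not specify their exact form.

```python
# Hypothetical skeleton of MDBS-accelerated fractal decoding.
def mdbs_fast_decode(range_params, image, n_iters, apply_transform, inside_mdbs):
    """range_params: per-range-block fractal parameters (domain index, scaling, offset).
    inside_mdbs(range_block) -> True if the block lies within the minimum
    domain block set identified before the first iteration."""
    # iterations 1 .. n_iters-1: refine only range blocks inside the MDBS,
    # since only those pixels feed the domain blocks used by the next pass
    for _ in range(n_iters - 1):
        for rb, params in range_params.items():
            if inside_mdbs(rb):
                image = apply_transform(image, rb, params)

    # last iteration: reconstruct every range block to obtain the decoded image
    for rb, params in range_params.items():
        image = apply_transform(image, rb, params)
    return image
```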
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
In the past years, learned image compression (LIC) has achieved remarkable performance. Recent LIC methods outperform VVC in both PSNR and MS-SSIM. However, low bit-rate reconstructions from LIC suffer from artifacts such as blurring, color drifting, and missing texture. Moreover, these varied artifacts make image quality metrics correlate poorly with human perceptual quality. In this paper, we propose PO-ELIC, i.e., Perception-Oriented Efficient Learned Image Coding. Specifically, we adapt ELIC, one of the state-of-the-art LIC models, with adversarial training techniques. We apply a mixture of losses, including a hinge-form adversarial loss, a Charbonnier loss, and a style loss, to finetune the model towards better perceptual quality. Experimental results demonstrate that our method achieves perceptual quality comparable to HiFiC at a much lower bitrate.
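A hedged sketch of such a loss mixture is given below: a Charbonnier distortion, a Gram-matrix style loss on externally extracted features, and the generator side of a hinge adversarial loss. The weights, the feature extractor, and the omitted rate term are assumptions rather than the paper's exact configuration.

```python
# Hypothetical perceptual finetuning objective mixing Charbonnier, style, and
# hinge-form adversarial terms; `feat` and `disc` are placeholder networks.
import torch

def charbonnier(x, y, eps=1e-6):
    return torch.sqrt((x - y) ** 2 + eps).mean()

def gram(feat):                      # (B, C, H, W) -> (B, C, C)
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_loss(feat_x, feat_y):
    return (gram(feat_x) - gram(feat_y)).pow(2).mean()

def hinge_g_loss(d_fake):            # generator side of the hinge GAN loss
    return -d_fake.mean()

def perceptual_total(x, x_hat, feat, disc,
                     lam_char=1.0, lam_style=40.0, lam_adv=0.01):
    # the codec's rate term would be added on top of this mixture
    return (lam_char * charbonnier(x_hat, x)
            + lam_style * style_loss(feat(x_hat), feat(x))
            + lam_adv * hinge_g_loss(disc(x_hat)))
```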
Resource-constrained, camera-integrated Visual Sensor Networks (VSNs) support numerous vision-aided services, from visual surveillance to habitat monitoring. A VSN is capable of sensing, processing, and communicating visual data wirelessly. These networks are built from inexpensive, low-power sensor motes with a lightweight processor, limited storage, and limited bandwidth. The large amount of redundancy present in images makes processing and communication consume more energy than expected, so the number of bits must be reduced with energy-efficient compression techniques for efficient transmission. Low computational and communication energy are always favored for an increased lifetime of the wireless sensor network, and the highly sensitive, self-descriptive nature of images makes security in a VSN even more critical. In this work, we propose an energy-efficient, low-bitrate, secured image coder for resource-constrained VSNs; lightweight design protocols are essential for secured image transmission over a VSN. We also propose a novel chaotic map based on Pascal's triangle. The system follows a unique interleaved compression-and-encryption process to consume fewer computational resources. A series of tests was carried out to validate the secured image coder's robustness and its suitability for VSNs, and its performance and strength are evaluated with compression-efficiency and cryptanalysis tests. Simulations were carried out on Atmel's ATmega128 processor for energy consumption analysis. The energy consumed by the proposed system for compression, encryption, and transmission of a 512 x 512 image is 109.364 mJ, which is only 4.57% of the energy consumed by raw image transmission. In addition, the system is implemented on a real-time image sensor platform based on an Arduino Due board integrated with an OV7670 camera module for real-time verification, and the experimental result
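The interleaved compression-and-encryption pattern can be sketched as follows. A standard logistic map serves here only as a stand-in for the paper's Pascal's-triangle-based chaotic map, and zlib stands in for the actual low-bitrate coder; seed, block size, and quantization of the map state are illustrative.

```python
# Hypothetical interleaved compress-then-encrypt loop for a sensor node.
import zlib

def compress_encrypt_blocks(image_bytes, block_size=1024, x0=0.3141, r=3.99):
    """Compress each block, then XOR it with bytes drawn from a chaotic map,
    keeping the map state across blocks so the two stages stay interleaved
    and per-block memory use stays small."""
    x, cipher_blocks = x0, []
    for i in range(0, len(image_bytes), block_size):
        comp = zlib.compress(image_bytes[i:i + block_size])
        ks = bytearray()
        for _ in range(len(comp)):
            x = r * x * (1.0 - x)             # chaotic iteration (logistic map)
            ks.append(int(x * 256) & 0xFF)    # quantize state to one keystream byte
        cipher_blocks.append(bytes(a ^ b for a, b in zip(comp, ks)))
    return cipher_blocks
```

A receiver would need the same seed and the per-block cipher lengths to regenerate the keystream, undo the XOR, and decompress each block.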
Fast-growing intelligent media processing applications demand efficient processing throughout the chain from the edge to the cloud, and the complexity bottleneck usually lies in the parallel decoding of multi-channel compressed bitstreams before analysis. This occurs because the traditional media coding scheme generates a binary stream without a semantic structure, which cannot be operated on directly at the bitstream level to support tasks such as classification, recognition, and detection. Therefore, in this article, we propose a learning-based semantically structured image coding (SSIC) framework to generate a semantically structured bitstream (SSB), where each part of the bitstream represents a specific object and can be used directly for the aforementioned intelligent tasks. Specifically, we integrate an object location extraction module into the compression framework to locate and align objects in the feature domain. Each object, together with the background, is then compressed separately and reorganized to form a structured bitstream, enabling the analysis or reconstruction of specific objects directly from a partial bitstream. Furthermore, in contrast to existing learning-based compression schemes that train a separate model for each bitrate, we share most of the model parameters among various bitrates to significantly reduce the model size for variable-rate compression. The experimental results demonstrate the effectiveness of the proposed coding scheme, whose compression performance is comparable to existing image coding schemes, while intelligent tasks such as classification and pose estimation can be performed directly on a partial bitstream without performance degradation, significantly reducing the complexity of analysis tasks.
Authors: Chen, Yihao; Tan, Bin; Wu, Jun; Zhang, Zhifeng; Ren, Haoqi
Affiliations: Tongji Univ, Coll Elect & Informat Engn, Shanghai 201804, Peoples R China; Tongji Univ, Key Lab Embedded Syst & Serv Comp, Minist Educ, Shanghai 201804, Peoples R China; Jinggangshan Univ, Coll Elect & Informat Engn, Jian 343009, Jiangxi, Peoples R China; Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China; Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Shanghai 200433, Peoples R China
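As a rough illustration of the semantically structured bitstream described in the SSIC abstract above, the sketch below packs per-object sub-streams behind a small index so that a single object (or the background) can be extracted without touching the rest. The header layout is invented for illustration and is not the SSIC format.

```python
# Hypothetical SSB-style container: [count][label,size]*count followed by payloads.
import struct

def pack_ssb(parts):
    """parts: list of (label_id, payload_bytes), e.g. background plus objects."""
    header = struct.pack('<I', len(parts))
    for label, payload in parts:
        header += struct.pack('<II', label, len(payload))
    return header + b''.join(p for _, p in parts)

def extract_object(ssb, wanted_label):
    """Return the sub-stream for one semantic label without parsing the rest."""
    n, = struct.unpack_from('<I', ssb, 0)
    offset_index, offset_payload = 4, 4 + 8 * n
    for _ in range(n):
        label, size = struct.unpack_from('<II', ssb, offset_index)
        if label == wanted_label:
            return ssb[offset_payload:offset_payload + size]
        offset_index += 8
        offset_payload += size
    return None
```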
This paper provides a method to build a deep learning image coding system based on solving an inverse problem: a suitable measurement operator is chosen to reduce the amount of information transmitted at the sender, and the original image is reconstructed by tackling the inverse problem at the receiver. Unlike most compressed sensing (CS) methods, the proposed coding scheme does not rely on sparsity but uses the structural priors of generative adversarial networks (GANs) to solve the inverse problem. The proposed model trains the GAN to learn a mapping from the latent space to the sample space formed by correlated images on the cloud. The measurements are then used to localize the optimal latent variable in the representation space that corresponds to the original image in the sample space. The proposed method encodes and transmits the measurements instead of the original image, which greatly reduces the transmission cost while ensuring the quality of the reconstructed image at high compression ratios. To the best of our knowledge, this is the first work to introduce GAN-based inverse-problem solving into deep image coding. The experimental results show that the visual quality of the images generated by the proposed scheme is better than that of the traditional coding scheme JPEG2000. Especially at extremely high compression ratios, the proposed scheme can still maintain good performance.
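The decoding step can be pictured as latent-space optimization: given the received measurements and a pretrained generator, search for the latent code whose generated image reproduces the measurements. The generator G, measurement operator A, latent size, and optimizer settings below are placeholders, not the paper's configuration.

```python
# Hypothetical GAN-prior decoding by latent optimization: min_z || A(G(z)) - y ||^2.
import torch

def decode_by_latent_search(y, G, A, latent_dim=128, steps=500, lr=0.05):
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(A(G(z)), y)   # fit the measurements
        loss.backward()
        opt.step()
    with torch.no_grad():
        return G(z)        # reconstructed image from the optimized latent code
```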
The Joint Photographic Experts Group (JPEG) committee has been standardizing next-generation image compression, called JPEG XL, to meet the specific needs of a responsive web, wide color gamut, and high dynamic range. JPEG XL supports lossy and lossless compression. A variable-sized discrete cosine transform (DCT) block is used for lossy compression, and the block partitioning method is a critical function for the performance of JPEG XL. The current DCT block partitioning method used in JPEG XL is highly dependent on the compression rate and tends to assign small DCT blocks to homogeneously textured regions (HTRs) having similar or regular patterns. We propose a region-adaptive DCT block partitioning method that assigns larger blocks to HTRs. The proposed method identifies HTRs using a combined metric employing a sum-modified Laplacian, zero-crossing, and colorfulness measure of region homogeneity. Objective, subjective, and visual comparison evaluations on the ten images recommended by the JPEG working group show the improvement in coding performance. The proposed method demonstrates its superiority in terms of compression efficiency evaluated with six objective metrics, subjective tests with 15 participants, visual comparison improvements in HTRs, and gains in execution time.
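The three homogeneity cues named above can be computed roughly as in the sketch below. The weighting and thresholds used by the proposed partitioner are not given in the abstract, so the combination shown is only illustrative.

```python
# Hypothetical per-region homogeneity cues: sum-modified Laplacian,
# zero-crossing rate of the Laplacian, and a Hasler-Suesstrunk-style colorfulness.
import numpy as np

def sum_modified_laplacian(gray):
    c = gray[1:-1, 1:-1]
    ml = (np.abs(2 * c - gray[:-2, 1:-1] - gray[2:, 1:-1])
          + np.abs(2 * c - gray[1:-1, :-2] - gray[1:-1, 2:]))
    return ml.mean()

def zero_crossing_rate(gray):
    lap = (gray[:-2, 1:-1] + gray[2:, 1:-1] + gray[1:-1, :-2]
           + gray[1:-1, 2:] - 4 * gray[1:-1, 1:-1])
    sign_h = np.signbit(lap[:, :-1]) != np.signbit(lap[:, 1:])
    sign_v = np.signbit(lap[:-1, :]) != np.signbit(lap[1:, :])
    return (sign_h.mean() + sign_v.mean()) / 2

def colorfulness(rgb):
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    rg, yb = r - g, 0.5 * (r + g) - b
    return np.hypot(rg.std(), yb.std()) + 0.3 * np.hypot(rg.mean(), yb.mean())

def homogeneity_score(rgb, w=(1.0, 1.0, 1.0)):
    gray = rgb.mean(axis=-1)
    # low SML, low zero-crossing rate, and low colorfulness -> homogeneous region,
    # i.e. a candidate for a larger DCT block
    return -(w[0] * sum_modified_laplacian(gray)
             + w[1] * zero_crossing_rate(gray)
             + w[2] * colorfulness(rgb))
```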
The past decades have witnessed the rapid development of image and video coding techniques in the era of big data. However, the signal-fidelity-driven design of existing image/video coding frameworks limits their ability to serve both machine and human vision. In this paper, we propose a novel face image coding framework that leverages both compressive and generative models to support machine vision and human perception tasks jointly. Given an input image, feature analysis is first applied, and a generative model is then employed to reconstruct the image from compact structure and color features, where sparse edges are extracted to connect both kinds of vision and a key reference pixel selection method determines the priorities of the reference color pixels for scalable coding. The compact edge map serves as the base layer for machine vision tasks, and the reference pixels act as an enhancement layer to guarantee signal fidelity for human vision. By introducing advanced generative models, we train a decoding network to reconstruct images from compact structure and color representations, which flexibly accepts inputs in a scalable way and controls the imagery effect of the outputs between signal fidelity and visual realism. Experimental results and comprehensive performance analysis over a face image dataset demonstrate the superiority of our framework in both human vision and machine vision tasks, providing useful evidence for the emerging standardization efforts on MPEG VCM (Video Coding for Machines).
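A minimal sketch of the scalable layering idea: the compact edge map alone serves as the machine-vision base layer, and sparse reference color pixels are concatenated as an enhancement layer to steer a generative decoder toward signal fidelity. The tensor layout and decoder_net are placeholders for the paper's trained decoding network.

```python
# Hypothetical scalable decode: base layer (edges) with an optional
# enhancement layer (sparse reference pixels plus validity mask).
import torch

def scalable_decode(edge_map, decoder_net, ref_pixels=None):
    """edge_map: (1, 1, H, W) binary edges (base layer).
    ref_pixels: (1, 4, H, W) sparse RGB values plus a validity mask
    (enhancement layer), or None to decode from the base layer alone."""
    if ref_pixels is None:
        ref_pixels = torch.zeros(1, 4, *edge_map.shape[-2:])   # empty guidance
    return decoder_net(torch.cat([edge_map, ref_pixels], dim=1))
```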