ISBN (print): 9798350349405; 9798350349399
Image coding for machines (ICM) aims to compress images for machine analysis using recognition models rather than human vision. Hence, in ICM, it is important for the encoder to recognize and compress the information necessary for the machine recognition task. There are two main approaches in learned ICM: optimization of the compression model based on task loss, and Region of Interest (ROI)-based bit allocation. These approaches provide the encoder with recognition capability. However, optimization with task loss becomes difficult when the recognition model is deep, and ROI-based methods often involve extra overhead during evaluation. In this study, we propose a novel training method for learned ICM models that applies an auxiliary loss to the encoder to improve its recognition capability and rate-distortion performance. Our method achieves Bjontegaard Delta rate improvements of 27.7% and 20.3% in object detection and semantic segmentation tasks, respectively, compared to the conventional training method.
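As a rough illustration of this kind of training objective, the sketch below combines a rate-distortion term with an auxiliary recognition loss applied to the encoder's latent. This is a minimal PyTorch sketch under assumptions of my own: the toy encoder/decoder, the rate proxy, the auxiliary classification head, and the weights `lmbda` and `alpha` are illustrative, not the paper's design.

```python
# Minimal sketch: rate-distortion loss plus an auxiliary recognition loss
# on the encoder's latent. All modules and weights are illustrative.
import torch
import torch.nn as nn

class ToyICM(nn.Module):
    def __init__(self, c=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, c, 5, 2, 2), nn.ReLU(),
                                     nn.Conv2d(c, c, 5, 2, 2))
        self.decoder = nn.Sequential(nn.ConvTranspose2d(c, c, 5, 2, 2, 1), nn.ReLU(),
                                     nn.ConvTranspose2d(c, 3, 5, 2, 2, 1))
        # Auxiliary head: predicts task labels directly from the latent,
        # giving the encoder an explicit recognition signal.
        self.aux_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                      nn.Linear(c, 10))

    def forward(self, x):
        y = self.encoder(x)
        return y, self.decoder(y), self.aux_head(y)

def training_loss(model, x, labels, lmbda=0.01, alpha=0.1):
    y, x_hat, logits = model(x)
    rate_proxy = y.abs().mean()                       # stand-in for an entropy-model rate term
    distortion = nn.functional.mse_loss(x_hat, x)
    aux = nn.functional.cross_entropy(logits, labels)  # auxiliary recognition loss
    return distortion + lmbda * rate_proxy + alpha * aux

x = torch.randn(2, 3, 64, 64)
labels = torch.randint(0, 10, (2,))
print(training_loss(ToyICM(), x, labels).item())
```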
ISBN (print): 9798350387261; 9798350387254
As image recognition models become more prevalent, scalable coding methods for machines and humans gain more importance. Applications of image recognition models include traffic monitoring and farm management. In these use cases, scalable coding proves effective because the tasks require occasional image checking by humans. Existing image compression methods for humans and machines meet these requirements to some extent. However, these compression methods are effective solely for specific image recognition models. We propose a learning-based scalable image coding method for humans and machines that is compatible with numerous image recognition models. We combine an image compression model for machines with a compression model that provides additional information to facilitate image decoding for humans. The features in these compression models are fused using a feature fusion network to achieve efficient image compression. The additional-information compression model is adjusted to reduce the number of parameters by allowing features of different sizes to be combined in the feature fusion network. Our approach confirms that the feature fusion network efficiently combines image compression models while reducing the number of parameters. Furthermore, we demonstrate the effectiveness of the proposed scalable coding method by evaluating the image compression performance in terms of decoded image quality and bitrate. Code is available at https://***/final-0/ICM-v1.
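To make the feature-fusion idea concrete, the sketch below shows one way a machine-oriented latent and a smaller additional-information latent could be combined before human-oriented decoding. This is a minimal PyTorch sketch; the channel counts and the upsample-concatenate-convolve design are assumptions for illustration, not the paper's network.

```python
# Minimal sketch of a feature fusion network that merges a machine-codec
# latent with a smaller "additional information" latent. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureFusion(nn.Module):
    def __init__(self, c_machine=192, c_extra=64, c_out=192):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(c_machine + c_extra, c_out, 3, padding=1), nn.ReLU(),
            nn.Conv2d(c_out, c_out, 3, padding=1))

    def forward(self, f_machine, f_extra):
        # Resize the smaller additional-information feature so that
        # features of different sizes can be combined.
        f_extra = F.interpolate(f_extra, size=f_machine.shape[-2:],
                                mode="bilinear", align_corners=False)
        return self.fuse(torch.cat([f_machine, f_extra], dim=1))

f_machine = torch.randn(1, 192, 32, 32)   # latent from the machine codec
f_extra = torch.randn(1, 64, 16, 16)      # smaller additional-info latent
print(FeatureFusion()(f_machine, f_extra).shape)  # torch.Size([1, 192, 32, 32])
```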
With the development of machine learning, advanced photography, and image transmission systems, images are increasingly processed by machines, and image coding for machines (ICM) has emerged in response. After the image codec compresses and transmits the image, the image is handed over to machine vision task networks. These vision tasks include image classification, semantic segmentation, and so on. We propose a side information-driven image coding for hybrid machine-human vision (SICMH) framework, not only for machine vision tasks but also for human vision-oriented image reconstruction. The proposed SICMH framework can perform image classification, semantic segmentation, and coarse image reconstruction using purely the side information. Moreover, SICMH can perform fine image reconstruction using the residue information. In particular, we propose a multi-scale feature fusion block to enhance the usage of side information, and a novel semantic segmentation network named modified TrSeg to generate better semantic segmentation maps. The experimental results demonstrate the effectiveness of our proposed framework. SICMH achieves the same image classification and semantic segmentation accuracy as existing traditional or learning-based multi-task ICM frameworks at the lowest bitrate. For the image reconstruction task, SICMH achieves the same PSNR as existing learning-based multi-task hybrid ICM frameworks and the traditional image codec BPG, again at the lowest bitrate.
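The sketch below illustrates the general shape of a multi-scale feature fusion block: side-information features at several resolutions are projected to a common width, aligned to one resolution, and merged. This is a minimal PyTorch sketch; the number of branches, channel widths, and fusion order are assumptions, not the SICMH specification.

```python
# Minimal sketch of a multi-scale feature fusion block. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFusionBlock(nn.Module):
    def __init__(self, channels=(64, 128, 256), c_out=128):
        super().__init__()
        # 1x1 convolutions project each scale to a common channel width.
        self.proj = nn.ModuleList(nn.Conv2d(c, c_out, 1) for c in channels)
        self.merge = nn.Conv2d(c_out * len(channels), c_out, 3, padding=1)

    def forward(self, feats):
        target = feats[0].shape[-2:]  # fuse at the finest resolution
        aligned = [F.interpolate(p(f), size=target, mode="nearest")
                   for p, f in zip(self.proj, feats)]
        return self.merge(torch.cat(aligned, dim=1))

feats = [torch.randn(1, 64, 64, 64),
         torch.randn(1, 128, 32, 32),
         torch.randn(1, 256, 16, 16)]
print(MultiScaleFusionBlock()(feats).shape)  # torch.Size([1, 128, 64, 64])
```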
ISBN (print): 9798331529543; 9798331529550
To achieve efficient compression for both human vision and machine perception, scalable coding methods have been proposed in recent years. However, existing methods do not fully eliminate the redundancy between features corresponding to different tasks, resulting in suboptimal coding performance. In this paper, we propose a frequency-aware hierarchical image compression framework designed for humans and machines. Specifically, we investigate task relationships from a frequency perspective, utilizing only high-frequency (HF) information for machine vision tasks and leveraging both HF and low-frequency (LF) features for image reconstruction. Besides, a residual block embedded octave convolution module is designed to enhance the information interaction between HF features and LF features. Additionally, a dual-frequency channel-wise entropy model is applied to reasonably exploit the correlation between different tasks, thereby improving multi-task performance. The experiments show that the proposed method offers -69.3% to -75.3% coding gains on machine vision tasks compared to the relevant benchmarks, and -19.1% gains over the state-of-the-art scalable image codec in terms of image reconstruction quality.
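For intuition on how an octave-style module lets HF and LF branches exchange information, the sketch below keeps a full-resolution HF path and a half-resolution LF path with cross connections and residual wiring. This is a minimal PyTorch sketch; the channel split and the exact wiring are assumptions, not the paper's residual block embedded octave convolution.

```python
# Minimal sketch of an octave-style residual block with HF/LF cross paths.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OctaveResBlock(nn.Module):
    def __init__(self, c_hf=48, c_lf=16):
        super().__init__()
        self.hf_to_hf = nn.Conv2d(c_hf, c_hf, 3, padding=1)
        self.hf_to_lf = nn.Conv2d(c_hf, c_lf, 3, padding=1)
        self.lf_to_lf = nn.Conv2d(c_lf, c_lf, 3, padding=1)
        self.lf_to_hf = nn.Conv2d(c_lf, c_hf, 3, padding=1)

    def forward(self, x_hf, x_lf):
        # Cross paths: HF contributes to LF after downsampling, LF
        # contributes to HF after upsampling; residuals keep each branch.
        hf = self.hf_to_hf(x_hf) + F.interpolate(self.lf_to_hf(x_lf),
                                                 scale_factor=2, mode="nearest")
        lf = self.lf_to_lf(x_lf) + self.hf_to_lf(F.avg_pool2d(x_hf, 2))
        return F.relu(x_hf + hf), F.relu(x_lf + lf)

x_hf, x_lf = torch.randn(1, 48, 64, 64), torch.randn(1, 16, 32, 32)
y_hf, y_lf = OctaveResBlock()(x_hf, x_lf)
print(y_hf.shape, y_lf.shape)  # (1, 48, 64, 64) (1, 16, 32, 32)
```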
ISBN (print): 9798350358483; 9798350358490
Semantic feature compression aims to compress image features for downstream machine vision tasks without reconstructing image pixels. Such a task is very challenging since it needs to learn features which are not only useful for machine vision tasks, but also easy to compress. While existing learnable feature coding models utilize downstream task networks as teacher networks to guide the learning and compression of semantic features, they use simple entropy models and do not effectively reduce information redundancy. In this work, we propose a transformer-based spatial-channel auto-regressive feature context model (SC-AR FCM) to assist the entropy coding of learnable features. Through extensive experimentation on object detection and segmentation tasks, we demonstrate that the rate-accuracy performance of our proposed method surpasses traditional image compression techniques and state-of-the-art learning-based feature compression techniques.
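The sketch below shows the general pattern behind a channel-wise autoregressive context model for entropy coding: the latent is split into channel groups, and the entropy parameters of each group are predicted from already-decoded groups plus a hyperprior. A plain convolutional predictor stands in for the transformer used in the paper, and the group sizes and channel widths are illustrative assumptions.

```python
# Minimal sketch of a channel-wise autoregressive entropy context model.
# A conv predictor stands in for the paper's transformer. Illustrative only.
import torch
import torch.nn as nn

class ChannelARContext(nn.Module):
    def __init__(self, c_latent=192, c_hyper=64, groups=4):
        super().__init__()
        self.gsize = c_latent // groups
        self.predictors = nn.ModuleList(
            nn.Conv2d(c_hyper + i * self.gsize, 2 * self.gsize, 3, padding=1)
            for i in range(groups))

    def forward(self, y, hyper):
        means, scales = [], []
        for i, pred in enumerate(self.predictors):
            # Context = hyperprior + channel groups decoded so far.
            ctx = torch.cat([hyper, y[:, : i * self.gsize]], dim=1)
            m, s = pred(ctx).chunk(2, dim=1)
            means.append(m)
            scales.append(nn.functional.softplus(s))
        return torch.cat(means, dim=1), torch.cat(scales, dim=1)

y = torch.randn(1, 192, 16, 16)      # quantized latent
hyper = torch.randn(1, 64, 16, 16)   # hyperprior features
mu, sigma = ChannelARContext()(y, hyper)
print(mu.shape, sigma.shape)  # both torch.Size([1, 192, 16, 16])
```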
ISBN (print): 9798350349405; 9798350349399
Image coding for machines (ICM) is developed to compress images with a focus on machine vision tasks rather than human perception. For ICM, it is very important to develop a universal codec adaptable to different machine tasks. In this paper, we propose novel parallel task-prompts that can be easily adapted to various machine vision tasks without requiring new networks or training from scratch. Moreover, our parallel prompts are compatible with mainstream backbones such as transformers and convolutional neural networks, making them widely applicable across different model architectures. To fine-tune our task-prompts, we leverage a machine task network as the teacher network, guiding our student ICM network to efficiently compress feature maps for downstream machine tasks. Through extensive experimentation on object detection and segmentation, we demonstrate that our proposed method surpasses traditional image compression techniques and state-of-the-art learning-based feature compression techniques in terms of rate-accuracy performance.
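To illustrate the prompt-tuning pattern for a transformer backbone, the sketch below prepends a small set of learnable per-task tokens to the patch tokens and trains only the prompts while the shared backbone stays frozen. This is a minimal PyTorch sketch; the token counts, dimensions, and the toy encoder are assumptions, not the paper's architecture.

```python
# Minimal sketch of per-task prompt tokens on a frozen transformer backbone.
import torch
import torch.nn as nn

class PromptedBackbone(nn.Module):
    def __init__(self, dim=256, n_prompts=8, tasks=("detection", "segmentation")):
        super().__init__()
        self.prompts = nn.ParameterDict(
            {t: nn.Parameter(torch.zeros(1, n_prompts, dim)) for t in tasks})
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        for p in self.backbone.parameters():   # freeze the shared backbone
            p.requires_grad_(False)

    def forward(self, tokens, task):
        n = self.prompts[task].shape[1]
        prompts = self.prompts[task].expand(tokens.shape[0], -1, -1)
        out = self.backbone(torch.cat([prompts, tokens], dim=1))
        return out[:, n:]   # drop prompt positions, keep patch tokens

tokens = torch.randn(2, 196, 256)   # e.g. 14x14 patch tokens
feats = PromptedBackbone()(tokens, "detection")
print(feats.shape)  # torch.Size([2, 196, 256])
```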
ISBN (print): 9789464593617; 9798331519773
In goal-oriented communications, the objective of the receiver is often to apply a Deep-Learning model rather than reconstruct the original data. In this context, direct learning over compressed data, without any prior decoding, holds promise for enhancing the time-efficient execution of inference models at the receiver. However, conventional entropic-coding methods such as Huffman and Arithmetic coding break the data structure, rendering them unsuitable for learning without decoding. In this paper, we propose an alternative approach in which entropic coding is realized with Low-Density Parity Check (LDPC) codes. We hypothesize that Deep Learning models can more effectively exploit the internal structure of LDPC codes. At the receiver, we leverage a Gated Recurrent Unit (GRU), a class of Recurrent Neural Network (RNN), trained for image classification. Our numerical results indicate that classification based on LDPC-coded bit-planes surpasses Huffman and Arithmetic coding while necessitating a significantly smaller learning model. This demonstrates the efficiency of classification directly from LDPC-coded data, eliminating the need for any form of decompression, even partial, prior to applying the learning model.
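The sketch below shows the receiver-side pattern of classifying directly from coded bit-planes with a GRU, without any decoding step. This is a minimal PyTorch sketch; the LDPC encoding itself is abstracted away as a precomputed binary sequence, and the chunking, hidden size, and class count are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch: GRU classification over coded bit-planes, no decoding.
import torch
import torch.nn as nn

class BitPlaneGRUClassifier(nn.Module):
    def __init__(self, chunk=64, hidden=128, n_classes=10):
        super().__init__()
        # Each time step consumes one chunk of coded bits.
        self.gru = nn.GRU(input_size=chunk, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, coded_bits):
        # coded_bits: (batch, steps, chunk) tensor of {0, 1} values
        _, h_n = self.gru(coded_bits.float())
        return self.head(h_n[-1])

# Stand-in for LDPC-coded bit-planes of a batch of images (no decompression).
coded_bits = torch.randint(0, 2, (4, 32, 64))
logits = BitPlaneGRUClassifier()(coded_bits)
print(logits.shape)  # torch.Size([4, 10])
```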