Search Results - Inner Mongolia University Library

arXiv, 2024

Authors: Dong, Cunhui; Ma, Haichuan; Zhang, Haotian; Gao, Changsheng; Li, Li; Liu, Dong. The CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

Neural network-based image coding has developed rapidly since its inception. By 2022, its performance had surpassed that of the best-performing traditional image coding framework, H.266/VVC. Witnessing this success, the IEEE 1857.11 working subgroup initiated a neural network-based image coding standard project and issued a corresponding call for proposals (CfP). In response to the CfP, this paper introduces a novel wavelet-like transform-based end-to-end image coding framework, iWaveV3. iWaveV3 incorporates many new features, such as an affine wavelet-like transform, a perceptual-friendly quality metric, and more advanced training and online optimization strategies, into our previous wavelet-like transform-based framework iWave++. While preserving simultaneous support for lossy and lossless compression, iWaveV3 also achieves state-of-the-art compression efficiency for objective quality and is very competitive for perceptual quality. As a result, iWaveV3 has been adopted as a candidate scheme for developing the IEEE Standard for neural-network-based image coding. Copyright © 2024, The Authors. All rights reserved.

Keywords: Image coding

Overfitted Image Coding at Reduced Complexity

European Signal Processing Conference (EUSIPCO)

Authors: Théophile Blard; Théo Ladune; Pierrick Philippe; Gordon Clare; Xiaoran Jiang; Olivier Déforges. Orange Innovation, France; IETR, France

ISBN (digital): 9789464593617

ISBN (print): 9798331519773

Overfitted image codecs offer compelling compression performance and low decoder complexity, through the overfitting of a lightweight decoder for each image. Such codecs include Cool-chic, which presents image coding performance on par with VVC while requiring around 2000 multiplications per decoded pixel. This paper proposes to decrease Cool-chic encoding and decoding complexity. The encoding complexity is reduced by shortening Cool-chic training, up to the point where no overfitting is performed at all. It is also shown that a tiny neural decoder with 300 multiplications per pixel still outperforms HEVC. A near real-time CPU implementation of this decoder is made available at https://***/Cool-Chic/.

Keywords: Training; Image coding; Codecs; Europe; Signal processing; Encoding; Real-time systems; Hardware; Decoding; Complexity theory

Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs

IEEE Visual Communications and Image Processing (VCIP)

Authors: Jinming Liu; Yuntao Wei; Junyan Lin; Shengyang Zhao; Heming Sun; Zhibo Chen; Wenjun Zeng; Xin Jin. Shanghai Jiao Tong University; Ningbo Institute of Digital Twin, Eastern Institute of Technology, Ningbo, China; Yokohama National University; University of Science and Technology of China

ISBN (digital): 9798331529543

ISBN (print): 9798331529550

We present a new image compression paradigm to achieve "intelligently coding for machine" by cleverly leveraging the common sense of Large Multimodal Models (LMMs). We are motivated by the evidence that large language/multimodal models are powerful general-purpose semantics predictors for understanding the real world. Unlike traditional image compression, which is typically optimized for human eyes, the image coding for machines (ICM) framework we focus on requires the compressed bitstream to comply more closely with different downstream intelligent analysis tasks. To this end, we employ an LMM to "tell codec what to compress": 1) we first utilize the powerful semantic understanding capability of LMMs w.r.t. object grounding, identification, and importance ranking via prompts to disentangle image content before compression; 2) then, based on these semantic priors, we encode and transmit objects of the image in order with a structured bitstream. In this way, diverse vision benchmarks including image classification, object detection, instance segmentation, etc., can be well supported with such a semantically structured bitstream. We dub our method "SDComp" for "Semantically Disentangled Compression", and compare it with state-of-the-art codecs on a wide variety of different vision tasks. The SDComp codec leads to more flexible reconstruction results, promised decoded visual quality, and a more generic/satisfactory intelligent task-supporting ability.

Keywords: Instance segmentation; Visualization; Image coding; Codecs; Grounding; Visual communication; Semantics; Object detection; Predictive models; Object recognition

Scalable Image Coding for Humans and Machines Using Feature Fusion Network

IEEE Workshop on Multimedia Signal Processing

Authors: Takahiro Shindo; Taiju Watanabe; Yui Tatsumi; Hiroshi Watanabe. Waseda University, Tokyo, Japan

ISBN (digital): 9798350387254

ISBN (print): 9798350387261

As image recognition models become more prevalent, scalable coding methods for machines and humans gain more importance. Applications of image recognition models include traffic monitoring and farm management. In these use cases, the scalable coding method proves effective because the tasks require occasional image checking by humans. Existing image compression methods for humans and machines meet these requirements to some extent. However, these compression methods are effective solely for specific image recognition models. We propose a learning-based scalable image coding method for humans and machines that is compatible with numerous image recognition models. We combine an image compression model for machines with a compression model, providing additional information to facilitate image decoding for humans. The features in these compression models are fused using a feature fusion network to achieve efficient image compression. Our method's additional information compression model is adjusted to reduce the number of parameters by enabling combinations of features of different sizes in the feature fusion network. Our approach confirms that the feature fusion network efficiently combines image compression models while reducing the number of parameters. Furthermore, we demonstrate the effectiveness of the proposed scalable coding method by evaluating the image compression performance in terms of decoded image quality and bitrate. Code is available at https://***/final-0/ICM-v1.

Keywords: Performance evaluation; Image quality; Image coding; Codes; Image edge detection; Conferences; Signal processing; Decoding; Monitoring; Information rates

Non-Expansive Implementation of Discrete Wavelet Transform for Real Time Image Coding System

Computer, Electronics & Electrical Engineering & their Applications (IC2E3), International Conference on

Authors: Shalini Singh; Irfanul Hasan. Department of Electronics and Communication Engineering, Graphic Era (Deemed to be University), Dehradun, India

ISBN (digital): 9798350388534

ISBN (print): 9798350388541

The filter bank implementation of the standard Discrete Wavelet Transform (DWT) suffers from the coefficient expansion problem. This paper proposes a methodology to address this problem. With this methodology, a non-expansive implementation is carried out without increasing computational load or memory requirements. However, this non-expansive filter bank realization of the discrete wavelet transform suffers from boundary artifacts, although these are restricted to a few coefficients at the boundaries. An optimal filter is required to reduce this boundary artifact problem. The proposed non-expansive DWT is well suited for real-time wavelet-based image coding systems.

Keywords: Image coding; Memory management; Filter banks; Low-pass filters; Real-time systems; Discrete wavelet transforms; Computational efficiency; Complexity theory; Standards; Image reconstruction
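The coefficient expansion mentioned in the abstract above can be illustrated with a small sketch. It assumes a 4-tap Daubechies filter pair and uses periodic (wrap-around) extension as one well-known way to keep the coefficient count equal to the signal length. The abstract does not say which scheme the authors use, so the filter choice, the function names, and the sample signal are all illustrative assumptions.

```python
import math

# Daubechies 4-tap orthonormal low-pass filter and its high-pass mirror
# (an assumed filter choice; the record above does not name one).
S3 = math.sqrt(3.0)
LOW = [c / (4 * math.sqrt(2.0)) for c in (1 + S3, 3 + S3, 3 - S3, 1 - S3)]
HIGH = [(-1) ** n * LOW[len(LOW) - 1 - n] for n in range(len(LOW))]

def analyze_linear(x, taps):
    """Full linear convolution, then keep every second sample (expansive)."""
    full = [0.0] * (len(x) + len(taps) - 1)
    for i, xi in enumerate(x):
        for j, tj in enumerate(taps):
            full[i + j] += xi * tj
    return full[::2]

def analyze_periodic(x, taps):
    """Inner products with the filter at even shifts, wrapping around the
    ends of the signal, so each band holds exactly len(x) // 2 values."""
    n = len(x)
    return [sum(t * x[(2 * k + j) % n] for j, t in enumerate(taps))
            for k in range(n // 2)]

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]  # hypothetical samples

expansive = analyze_linear(signal, LOW) + analyze_linear(signal, HIGH)
compact = analyze_periodic(signal, LOW) + analyze_periodic(signal, HIGH)

energy_in = sum(v * v for v in signal)
energy_out = sum(v * v for v in compact)

print(len(signal), len(expansive), len(compact))  # 8 12 8
print(abs(energy_in - energy_out) < 1e-9)         # True
```

With linear convolution the two bands together hold 12 coefficients for 8 input samples, while the periodic variant holds exactly 8 and preserves signal energy. The wrap-around joins the two ends of the signal, which is one common source of boundary effects in such schemes.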

A Comparison of Charging Voltage Image Coding Methods for Lithium-Ion Battery State of Health Estimation

Sensing, Measurement & Data Analytics in the era of Artificial Intelligence (ICSMD), International Conference on

Authors: Hang Wang; Yuanyuan Zhou; Zhongding Fan; Zhiyong Hu; Lei Mao; Yongbin Liu. School of Electrical Engineering and Automation, Anhui University, Smart Grid Digital Collaborative Technology Joint Laboratory of Anhui Province, Hefei, China; School of Electrical Engineering and Automation, Anhui University, Hefei, China; School of Engineering Science, University of Science and Technology of China, Hefei, China

ISBN (digital): 9798331529192

ISBN (print): 9798331529208

Accurate estimation of the state of health (SOH) of lithium-ion batteries is key to guaranteeing their service reliability in complex operating environments. Transforming one-dimensional time series data into two-dimensional images for battery degradation feature extraction can improve the accuracy of battery SOH evaluation and reduce both the complexity of the evaluation model and the amount of test data required. Although existing studies have attempted to apply image coding techniques to enhance the degradation features of original data, the advantages and disadvantages of different image coding methods have not been systematically compared. Therefore, in this work, five commonly used image coding methods (recurrence plots, Gramian angular summation field, Gramian angular difference field, relative position matrix, and time series data folding) are selected and comprehensively compared. First, the original one-dimensional voltage signal is encoded into a two-dimensional image, which is then fed into the CNN-GRU-based SOH prediction model, which finally outputs the future battery SOH value. The experimental results show that the coding methods differ in their applicable stages and conditions, so they need to be adapted to specific application scenarios, which is the next research direction.

Keywords: Lithium-ion batteries; Degradation; Image coding; Accuracy; Time series analysis; Evaluation models; Estimation; Voltage; Feature extraction; Encoding
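Three of the encodings named in the abstract above can be sketched in a few lines. This is a minimal illustration using the commonly cited definitions of the Gramian angular summation and difference fields and of a thresholded recurrence plot. The voltage samples, the threshold `eps`, and the function names are hypothetical, and the relative position matrix and time series folding are not covered.

```python
import math

def rescale(series):
    """Map the series linearly onto [-1, 1] so that arccos is defined."""
    lo, hi = min(series), max(series)
    scaled = [(2 * v - hi - lo) / (hi - lo) for v in series]
    # Clamp to guard against floating-point overshoot at the end points.
    return [max(-1.0, min(1.0, v)) for v in scaled]

def gramian_angular_fields(series):
    """Return (GASF, GADF): cos(phi_i + phi_j) and sin(phi_i - phi_j)."""
    phi = [math.acos(v) for v in rescale(series)]
    gasf = [[math.cos(a + b) for b in phi] for a in phi]
    gadf = [[math.sin(a - b) for b in phi] for a in phi]
    return gasf, gadf

def recurrence_plot(series, eps):
    """Binary image: 1 where two samples lie within eps of each other."""
    return [[1 if abs(a - b) <= eps else 0 for b in series] for a in series]

# Hypothetical charging-voltage samples in volts (not data from the paper).
voltage = [3.60, 3.72, 3.85, 3.97, 4.08, 4.20]

gasf, gadf = gramian_angular_fields(voltage)
rp = recurrence_plot(voltage, eps=0.15)

for row in rp:
    print(row)
```

Each function turns a series of length n into an n-by-n matrix that can be treated as a single-channel image. In the pipeline the abstract describes, an image of this kind is what would be passed to the CNN-GRU model.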

UNIFYING GENERATION AND COMPRESSION: ULTRA-LOW BITRATE IMAGE CODING VIA MULTI-STAGE TRANSFORMER

arXiv, 2024

Authors: Xue, Naifu; Mao, Qi; Wang, Zijian; Zhang, Yuan; Ma, Siwei. Communication University of China, China; Peking University, China

Recent progress in generative compression technology has significantly improved the perceptual quality of compressed data. However, these advancements primarily focus on producing high-frequency details, often overlooking the ability of generative models to capture the prior distribution of image content, thus impeding further bitrate reduction in extreme compression scenarios (… © 2024, CC BY-NC-ND.

Keywords: Image coding

Feature-Preserving Rate-Distortion Optimization in Image Coding for Machines

arXiv, 2024

Authors: Fernández-Menduiña, Samuel; Pavez, Eduardo; Ortega, Antonio. Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, United States

With the increasing number of images and videos consumed by computer vision algorithms, compression methods are evolving to consider both perceptual quality and performance in downstream tasks. Traditional codecs can tackle this problem by performing rate-distortion optimization (RDO) to minimize the distance at the output of a feature extractor. However, neural network non-linearities can make the rate-distortion landscape irregular, leading to reconstructions with poor visual quality even for high bit rates. Moreover, RDO decisions are made block-wise, while the feature extractor requires the whole image to exploit global information. In this paper, we address these limitations in three steps. First, we apply Taylor's expansion to the feature extractor, recasting the metric as an input-dependent squared error involving the Jacobian matrix of the neural network. Second, we make a localization assumption to compute the metric block-wise. Finally, we use randomized dimensionality reduction techniques to approximate the Jacobian. The resulting expression is monotonic with the rate and can be evaluated in the transform domain. Simulations with AVC show that our approach provides bit-rate savings while preserving accuracy in downstream tasks with less complexity than using the feature distance directly. © 2024, CC BY.

Keywords: Image coding
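The three steps listed in the abstract above can be written out as a short derivation. This is a sketch under assumed notation (none of the symbols appear in the record), and the block-wise and sketching lines show only one plausible reading of the "localization assumption" and the "randomized dimensionality reduction"; the paper's exact formulation may differ.

```latex
% Assumed notation: f is the feature extractor, x the original image,
% \hat{x} the decoded image, J_f(x) the Jacobian of f at x, b a block
% index, J_b(x) the columns of J_f(x) for block b, e_b = \hat{x}_b - x_b.
% Requires amsmath and amssymb.
\begin{align}
f(\hat{x}) &\approx f(x) + J_f(x)\,(\hat{x} - x)
  && \text{(first-order Taylor expansion)} \\
\lVert f(\hat{x}) - f(x) \rVert_2^2
  &\approx (\hat{x} - x)^{\top} J_f(x)^{\top} J_f(x)\,(\hat{x} - x)
  && \text{(input-dependent squared error)} \\
  &\approx \sum_{b} \lVert J_b(x)\, e_b \rVert_2^2
  && \text{(block-wise form, cross-block terms dropped)} \\
\lVert J_b(x)\, e_b \rVert_2^2
  &\approx \lVert S\, J_b(x)\, e_b \rVert_2^2,
  \quad \mathbb{E}\!\left[ S^{\top} S \right] = I
  && \text{(random sketch } S \in \mathbb{R}^{k \times m},\ k \ll m \text{)}
\end{align}
```

Here m is the feature dimension and k the sketch size. Because every line is a quadratic form in the pixel-domain error, it can be carried into a linear transform domain, which is consistent with the abstract's remark that the expression can be evaluated in the transform domain.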

BRIDGING THE GAP BETWEEN IMAGE CODING FOR MACHINES AND HUMANS

arXiv, 2024

Authors: Le, Nam; Zhang, Honglei; Cricri, Francesco; Youvalari, Ramin G.; Tavakoli, Hamed Rezazadegan; Aksu, Emre; Hannuksela, Miska M.; Rahtu, Esa. Nokia Technologies, Finland; Tampere University, Finland

Image coding for machines (ICM) aims at reducing the bitrate required to represent an image while minimizing the drop in machine vision analysis accuracy. In many use cases, such as surveillance, it is also important that the visual quality is not drastically deteriorated by the compression process. Recent works using neural network (NN) based ICM codecs have shown significant coding gains over traditional methods; however, the decompressed images, especially at low bitrates, often contain checkerboard artifacts. We propose an effective decoder finetuning scheme based on adversarial training to significantly enhance the visual quality of ICM codecs while preserving the machine analysis accuracy, without adding extra bitcost or parameters at the inference phase. The results show complete removal of the checkerboard artifacts at the negligible cost of −1.6% relative change in task performance score. In cases where some amount of artifacts is tolerable, such as when machine consumption is the primary target, this technique can enhance both pixel-fidelity and feature-fidelity scores without losing task performance. Copyright © 2024, The Authors. All rights reserved.

Keywords: Image coding

Improving Image Coding for Machines Through Optimizing Encoder Via Auxiliary Loss

IEEE International Conference on Image Processing

Authors: Kei Iino; Shunsuke Akamatsu; Hiroshi Watanabe; Shohei Enomoto; Akira Sakamoto; Takeharu Eda. Graduate School of Fundamental Science and Engineering, Waseda University, Tokyo, Japan; NTT Software Innovation Center, Tokyo, Japan

ISBN (digital): 9798350349399

ISBN (print): 9798350349405

Image coding for machines (ICM) aims to compress images for machine analysis using recognition models rather than human vision. Hence, in ICM, it is important for the encoder to recognize and compress the information necessary for the machine recognition task. There are two main approaches in learned ICM: optimization of the compression model based on task loss, and Region of Interest (ROI) based bit allocation. These approaches provide the encoder with the recognition capability. However, optimization with task loss becomes difficult when the recognition model is deep, and ROI-based methods often involve extra overhead during evaluation. In this study, we propose a novel training method for learned ICM models that applies auxiliary loss to the encoder to improve its recognition capability and rate-distortion performance. Our method achieves Bjøntegaard Delta rate improvements of 27.7% and 20.3% in object detection and semantic segmentation tasks, respectively, compared to the conventional training method.

Keywords: Training; Analytical models; Image coding; Image recognition; Semantic segmentation; Bit rate; Rate-distortion
