Image Coding for Machines (ICM) aims to develop systems and frameworks for image compression and transmission tailored to computer vision tasks such as object detection and instance segmentation. This paper focuses on multi-task ICM that serves not only object detection and instance segmentation but also image compression. In recent years, Contrastive Language-Image Pre-training (CLIP) has demonstrated a powerful capability for extracting text and image features and has been applied to various vision tasks. Inspired by CLIP, the channel-wise context model, and the masked convolutional neural network (PixelCNN), we propose a Blur CLIP context model (BCcm) for reducing bitrate and a two-hyperprior multitask framework augmented by BCcm (TMFBC). Through experiments, we observed that down- and up-sample resize pairs can significantly reduce the bitrate, so we integrate a plug-and-play down- and up-sample resize pair at both ends of TMFBC. Additionally, we propose a ResNet Pyramid Hierarchical Feature Extractor (RPHFE) based on down- and up-sample resize pairs to endow decoded images with richer multi-scale features. We term this pipeline TMFBC with down and up Sample Pairs (TMFBC-SP). We compare the proposed TMFBC-SP with state-of-the-art (SOTA) methods and demonstrate higher object detection and instance segmentation accuracy, measured by mean average precision (mAP), at a lower bitrate than existing approaches. We also compare our method with coarse-to-fine and context models on image compression, where it achieves higher PSNR at a lower bitrate.
Ultrasonic guided wave (UGW)-based damage detection is regarded as a leading technology in structural health monitoring (SHM) for assessing the integrity of composite structures. However, achieving accurate and effective real-time damage detection remains a challenge. To address this issue, a novel UGW-based damage detection approach is proposed for real-time damage localization and quantification in composite plates. First, because processing multipath UGW signals is computationally expensive, an efficient UGW signal compression method is built on the differential-driven piecewise aggregate approximation (DPAA) algorithm to improve computational efficiency. Next, the Gramian angular field (GAF) image encoding method is used to transform the concatenated 1-D guided wave signal into a 2-D image, which preserves the original time information and captures the temporal correlation between different timestamps in the guided wave signal. Then, by incorporating a specially designed partial group convolution (PGC) block and a dynamic multiscale residual channel attention (DMRCA) mechanism, the proposed lightweight PGC-DMRCA network detects damage in real time with high accuracy and low computational complexity. The performance of the proposed lightweight network is verified using two real-world datasets and a publicly available dataset. Experimental results demonstrate that the proposed lightweight approach delivers exceptional performance in locating and quantifying damage, surpassing mainstream end-to-end damage detection methodologies in both accuracy and efficiency.
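The Gramian angular field step mentioned in this abstract can be illustrated with a minimal sketch. It uses the commonly cited summation form (GASF): samples are rescaled to [-1, 1], mapped to angles with arccos, and entry (i, j) of the image is cos(phi_i + phi_j). The abstract does not say whether the authors use the summation or the difference variant, nor how the signal is normalized or concatenated, so the function name and the toy signal below are assumptions for illustration only, and the DPAA compression stage is not modeled.

```python
import math

def gramian_angular_summation_field(signal):
    """Return the GASF image of a 1-D signal as a list of rows."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("signal must not be constant")
    # Rescale samples to [-1, 1] so that arccos is defined.
    scaled = [2.0 * (x - lo) / (hi - lo) - 1.0 for x in signal]
    # Clamp to guard against floating-point overshoot.
    scaled = [max(-1.0, min(1.0, x)) for x in scaled]
    # Polar encoding: each sample becomes an angle.
    phi = [math.acos(x) for x in scaled]
    # Entry (i, j) relates timestamps i and j.
    return [[math.cos(a + b) for b in phi] for a in phi]

# Hypothetical 4-sample signal standing in for a guided wave record.
gasf = gramian_angular_summation_field([0.0, 1.0, 2.0, 1.0])
```

The resulting matrix is symmetric, and its diagonal equals 2x² − 1 for each rescaled sample x, which is why the rescaled signal values can be recovered from the image (up to the original offset and scale).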
Lossy image coding is the art of computing that is principally bounded by the image’s rate-distortion function. This bound, though never accurately characterized, has been approached practically via deep learning tec...
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
We propose a semantic prior-guided scalable image coding method that simultaneously supports fast machine vision analysis and a high-quality human visual experience. To obtain high-performance machine vision analysis results with a more compact base layer bitrate, our base layer directly encodes intermediate semantic features of the pre-trained machine vision task network, which effectively reduces the impact of the information needed for the human vision task. To improve model performance while quickly supporting machine vision tasks, we use structural re-parameterization to optimize the model. Because the base layer's semantic features effectively reflect the regions of important image content, we use the semantic prior provided by the base layer to guide enhancement layer encoding and decoding, which lets the codec pay more attention to reconstructing semantically important regions. In addition, we use the base layer features to predict the enhancement layer features and perform feature-domain residual coding, which further reduces the bitrate and also reduces the effect of noise compared with pixel-domain residual coding. Extensive experimental results show that our method achieves significant advantages in both object detection and image reconstruction compared with BPG and state-of-the-art deep learning-based scalable image coding methods.
A review of vector quantization techniques used for encoding digital images is presented. First, the concept of vector quantization is introduced; then its application to digital images is explained. Spatial, predictive, transform, hybrid, binary, and subband vector quantizers are reviewed. The emphasis is on the usefulness of vector quantization when it is combined with conventional image coding techniques or when it is used in different domains.
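As a rough illustration of the basic concept this review introduces, the sketch below encodes image blocks by replacing each one with the index of its nearest codeword and decodes by table lookup. The codebook, the block values, and the function names are made up for the example; codebook training (for instance with a clustering algorithm) and the predictive, transform, or subband variants covered by the review are not shown.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def vq_encode(vectors, codebook):
    """Map each input vector to the index of its nearest codeword."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(v, codebook[k]))
        for v in vectors
    ]

def vq_decode(indices, codebook):
    """Reconstruct vectors by looking up each index in the codebook."""
    return [list(codebook[k]) for k in indices]

# Hypothetical codebook of three codewords for 2x2 blocks flattened to 4-D.
codebook = [[0, 0, 0, 0], [128, 128, 128, 128], [255, 255, 255, 255]]
# Hypothetical image blocks (dark, mid-gray, bright).
blocks = [[10, 5, 0, 8], [120, 130, 125, 140], [250, 255, 240, 251]]

indices = vq_encode(blocks, codebook)          # only these are transmitted
reconstructed = vq_decode(indices, codebook)   # lossy reconstruction
```

Compression comes from sending one small index per block instead of the block's pixel values; the reconstruction error depends on how well the codebook matches the image statistics.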
This paper deals with information compression for digitized image encoding. Among the different techniques, we discuss the Linear Predictive Coding (LPC) technique. Because of the non-homogeneous nature of images, a space-varying model would yield better results. Therefore, we propose a linear predictive image coding technique using a space-varying 2-D AR filter. Compression is achieved by approximating the AR model input (the prediction residual) with a limited number of pulses. The multipulse techniques of Atal and Depreterre are used for the input estimation. First, the space-varying 2-D AR filter is presented for image modeling. Using space basis functions, the LPC parameters are estimated. Then, the problem of estimating the synthetic input (location and amplitude of pulses) is stated. Two solutions are given to overcome the computational difficulties that arise when the excitation estimation methods (Atal and Depreterre) are applied with a space-varying image model. The efficiency of the proposed solutions is shown by experimental results.
In this article, an efficient image coding scheme that takes advantage of feature vectors in the wavelet domain is proposed. First, a multi-stage discrete wavelet transform is applied to the image. Then, the wavelet feature vectors are extracted from the wavelet-decomposed subimages by collecting the corresponding wavelet coefficients. Finally, the image is coded into a bit-stream by applying vector quantization (VQ) to the extracted wavelet feature vectors. In the encoder, the wavelet feature vectors are encoded with a codebook in which the codeword dimension is smaller than that of the wavelet feature vector. In this way, the coding system can greatly improve its efficiency. However, to fully reconstruct the image, the indexes received in the decoder are decoded with a codebook in which the codeword dimension is the same as that of the wavelet feature vector. Therefore, the quality of the reconstructed images is well preserved. The proposed scheme achieves good compression efficiency in three ways: (1) using the correlation among wavelet coefficients; (2) placing different emphasis on wavelet coefficients at different decomposition levels; and (3) preserving the most important information of the image by coding the lowest-pass subimage individually. In our experiments, simulation results show that the proposed scheme outperforms both recent VQ-based image coding schemes and wavelet-based image coding techniques. Moreover, the proposed scheme is also suitable for very low bit rate image coding. (c) 2005 Wiley Periodicals, Inc.
This paper describes a new and efficient method for low bit-rate image coding based on recent developments in the theory of multivariate nonlinear piecewise polynomial approximation. It combines a binary space partition scheme with geometric wavelet (GW) tree approximation so as to efficiently capture curve singularities and provide a sparse representation of the image. The GW method successfully competes with state-of-the-art wavelet methods such as the EZW, SPIHT, and EBCOT algorithms. We report a gain of about 0.4 dB over the SPIHT and EBCOT algorithms at a bit rate of 0.0625 bits per pixel (bpp). It also outperforms other recent methods that are based on "sparse geometric representation." For example, we report a gain of 0.27 dB over the Bandelets algorithm at 0.1 bpp. Although the algorithm is computationally intensive, its time complexity can be significantly reduced by collecting a "global" GW n-term approximation to the image from a collection of GW trees, each constructed separately over tiles of the image.
Vector quantisation (VQ) performs well for image coding at high compression ratios. However, image coding with VQ faces several difficulties, especially edge degradation and high computational complexity. To resolve these two problems, the authors propose a new coding method based on edge orientation patterns (EOPs) that classifies image blocks into nine classes according to their edge orientations. For colour image coding, 27 codebooks (nine for each colour component) are pre-designed based on a series of training images. In the encoding stage, an input colour image is decomposed into Y, Cb, and Cr components, and each component image is divided into non-overlapping 4 x 4 blocks. For each block, eight edge orientation templates of size 4 x 4 are applied to determine its edge orientation. According to the edge orientation, each block is compressed using the corresponding codebook. Essentially, the authors' scheme is a kind of classified VQ (CVQ). Simulation results show that the EOP-based CVQ substantially improves compression efficiency and speeds up the encoding process, establishing the effectiveness of the authors' algorithm compared with existing techniques.
This paper presents the study carried out for the international planetary exploration mission Phobos II (1988) concerning the on-board coding of Phobos images. The software developed, based on the Discrete Cosine Transform, was successfully implemented on an on-board computer, taking into account the several constraints of this space mission.
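For readers unfamiliar with the transform named in this abstract, the following is a minimal sketch of an orthonormal 2-D DCT-II applied to one image block. The abstract gives no details of the mission software, so the 8 x 8 block size, the function names, and the test block are assumptions chosen for illustration; quantization and entropy coding, which a complete coder would need, are omitted.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # cols[v][u] holds the coefficient for row u, column v; transpose back.
    return [list(r) for r in zip(*cols)]

# Hypothetical flat 8x8 block: all energy should land in the DC coefficient.
flat = [[100.0] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
```

For smooth image regions most of the energy concentrates in a few low-frequency coefficients, which is what makes DCT-based coding attractive when transmission capacity is limited.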