ISBN:
(Print) 9781728185514
Synthetic DNA has received much attention recently as a long-term archival medium alternative due to its high density and durability characteristics. However, most current work has primarily focused on using DNA as a precise storage medium. In this work, we take an alternate view of DNA. Using neural-network-based compression techniques, we transform images into a latent-space representation, which we then store on DNA. By doing so, we transform DNA into an approximate image storage medium, as images generated back from DNA are only approximate representations of the original images. Using several datasets, we investigate the storage benefits of approximation, and study the impact of DNA storage errors (substitutions, indels, bias) on the quality of approximation. In doing so, we demonstrate the feasibility and potential of viewing DNA as an approximate storage medium.
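As a minimal sketch of the idea, the snippet below quantizes a latent vector to 2 bits per dimension and maps each symbol to a nucleotide; the round trip is lossy by construction, mirroring the approximate nature of the medium. The quantization range, number of levels, and base ordering are illustrative assumptions, not the paper's actual codec.

```python
import numpy as np

BASES = "ACGT"  # 2 bits per nucleotide (assumed ordering)

def latent_to_dna(latent, lo=-1.0, hi=1.0):
    """Quantize each latent value to 2 bits and map it to one base."""
    q = np.clip((latent - lo) / (hi - lo), 0.0, 1.0)   # normalize to [0, 1]
    symbols = np.minimum((q * 4).astype(int), 3)       # 4 levels -> {0, 1, 2, 3}
    return "".join(BASES[s] for s in symbols)

def dna_to_latent(strand, lo=-1.0, hi=1.0):
    """Invert the mapping; values come back only approximately (quantized)."""
    symbols = np.array([BASES.index(b) for b in strand])
    centers = (symbols + 0.5) / 4                      # mid-point of each bin
    return lo + centers * (hi - lo)

z = np.array([-0.9, -0.2, 0.3, 0.8])   # toy latent vector
strand = latent_to_dna(z)              # -> "ACGT"
z_hat = dna_to_latent(strand)          # approximate reconstruction
```

With half a bin of worst-case quantization error, the reconstruction stays within 0.25 of the original here; real DNA channels add substitutions and indels on top of this quantization loss.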
ISBN:
(Print) 9781728185514
We have witnessed the rapid development of learned image compression (LIC). The latest LIC models have outperformed almost all traditional image compression standards in terms of rate-distortion (RD) performance. However, the time complexity of LIC models remains underexplored, limiting their practical application in industry. Even with GPU acceleration, LIC models still struggle with long coding times, especially on the decoder side. In this paper, we analyze and test a few prevailing and representative LIC models, and compare their complexity with traditional codecs including H.265/HEVC intra and H.266/VVC intra. We provide a comprehensive analysis of every module in the LIC models, and investigate how bitrate changes affect coding time. We observe that the time complexity bottleneck mainly lies in entropy coding and context modelling. Although this paper focuses on experimental statistics, our analysis reveals some insights for further acceleration of LIC models, such as model modification for parallel computing, model pruning, and more parallel context models.
ISBN:
(Print) 0819407429
A model-based coding system has come under serious consideration for the next generation of image coding schemes, aimed at greater efficiency in TV-telephone and TV-conference systems [1]-[5]. In this model-based coding system, the sender's model image is transmitted and stored at the receiving side before the start of conversation. During the conversation, feature points are extracted from the facial image of the sender and transmitted to the receiver. The facial expression of the sender is reconstructed from the received feature points and a wireframe model constructed at the receiving side.
ISBN:
(Print) 9798331529543; 9798331529550
The rapid advancements in medical imaging have led to a growing demand for high-performance lossless compression of large 3D medical image datasets. Unlike natural images, medical images typically feature three-dimensional structure and high bit depth, necessitating specialized compression techniques. Based on a decoder-only transformer, we propose a learnable dual-decoder model for lossless compression of 3D medical images. Our approach packs voxels into patches, which are processed by a patch-level decoder to extract a patch feature. The voxels, along with the patch feature, are subsequently fed into a voxel-level decoder to model each voxel. This coarse-to-fine modeling strategy reduces the computational time per voxel and enables modeling of long-range dependencies. Experimental results demonstrate that our proposed model achieves state-of-the-art compression performance, with an approximately 15% improvement over the traditional JP3D benchmark on various datasets.
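The coarse-to-fine idea of grouping voxels into patch tokens can be illustrated with a simple packing step; the patch size and layout below are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def pack_voxels(volume, patch=4):
    """Split a (D, H, W) volume into non-overlapping patch**3 blocks,
    flattened so a patch-level decoder can process one token per block."""
    d, h, w = volume.shape
    assert d % patch == 0 and h % patch == 0 and w % patch == 0
    blocks = (volume
              .reshape(d // patch, patch, h // patch, patch, w // patch, patch)
              .transpose(0, 2, 4, 1, 3, 5)   # gather the in-patch axes last
              .reshape(-1, patch ** 3))      # (n_patches, voxels_per_patch)
    return blocks

# Toy high-bit-depth volume: 8x8x8 voxels -> 8 patches of 64 voxels each.
vol = np.arange(8 ** 3, dtype=np.uint16).reshape(8, 8, 8)
patches = pack_voxels(vol, patch=4)
```

A patch-level decoder would then summarize each 64-voxel row into a feature, and a voxel-level decoder would condition on that feature while modeling voxels one at a time.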
ISBN:
(Print) 9798331529543; 9798331529550
Traffic sign recognition plays a crucial role in self-driving cars, but unfortunately, it is vulnerable to adversarial patches (AP). Although APs have been shown to efficiently fool DNN-based models in previous studies, the connection between image forensics and AP detection still needs to be explored. From a high-level point of view, their goals are the same: to find tampered regions while preventing false positives. A natural question arises: "Is achieving application-agnostic anomaly detection possible?" In this paper, we propose image Forensics Defense Against Adversarial Patch (IDAP), a framework to defend against adversarial patches via generalizable features learned from tampered images. In addition, we incorporate the Hausdorff erosion loss into our network model for joint training to complete the shape of a predicted mask. Extensive experimental comparisons on three datasets, including COCO, DFG, and APRICOT, demonstrate that IDAP outperforms state-of-the-art AP detection methods.
ISBN:
(Print) 9781728185514
Scale-Invariant Feature Transform (SIFT) is one of the most well-known image matching methods and has been widely applied in various visual fields. Because it adopts a difference of Gaussians (DoG) pyramid for extrema detection and Gaussian gradient information for description, SIFT achieves accurate key points and excellent matching results, except under adverse weather conditions such as rain. To address this issue, in this paper we propose a divide-and-conquer SIFT key point recovery algorithm from a single rainy image. In the proposed algorithm, we do not aim to improve the quality of a derained image; instead, we divide the key point recovery problem into two sub-problems: recovering the DoG pyramid of the derained image, and recovering the gradients of the derained Gaussian images at multiple scales. We also propose two separate deep learning networks with different losses and structures to recover each of them. This divide-and-conquer scheme, which sets different objectives for SIFT extrema detection and description, leads to very robust performance. Experimental results show that our proposed algorithm achieves state-of-the-art performance on widely used image datasets in both quantitative and qualitative tests.
ISBN:
(Print) 9780819469946
In this paper, we restate the model of spatio-chromatic sampling in single-chip digital cameras covered by a Color Filter Array (CFA) [1]. The model shows that a periodic arrangement of chromatic samples in the CFA gives luminance and chrominance information that is localized in the Fourier domain. This representation allows us to define a space-invariant uniform demosaicking method based on frequency selection of the luminance and chrominance information. We then show two extended methods which use the frequency representation of the Bayer CFA [2,3] to derive adaptive demosaicking. Finally, we show the application of the model to CFAs with a random arrangement of chromatic samples, either using a linear method based on Wiener estimation [4] or an adaptive method [5].
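The frequency localization the model relies on can be demonstrated with a toy Bayer sampling: an achromatic patch produces no energy at the chrominance carrier frequencies, while a saturated color does. The pattern layout below (R at even/even, B at odd/odd) is one common Bayer convention, assumed for illustration.

```python
import numpy as np

def bayer_mosaic(rgb):
    """Sample an (H, W, 3) image on a Bayer pattern:
    R at even/even, B at odd/odd, G elsewhere."""
    h, w, _ = rgb.shape
    mosaic = np.zeros((h, w))
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]   # R
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]   # G
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]   # G
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]   # B
    return mosaic

gray = np.ones((4, 4, 3))                     # achromatic: no chrominance carrier
red = np.zeros((4, 4, 3)); red[..., 0] = 1.0  # saturated color: carrier appears
spec_gray = np.abs(np.fft.fft2(bayer_mosaic(gray)))
spec_red = np.abs(np.fft.fft2(bayer_mosaic(red)))
```

The bin at (2, 2), the Nyquist corner of this 4x4 spectrum, is empty for the gray patch but carries energy for the red one: chrominance is modulated away from the luminance baseband, which is what frequency-selection demosaicking exploits.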
ISBN:
(Print) 9780819469946
JPEG 2000 is the new standard for image compression. The features of this standard make it suitable for imaging and multimedia applications in this era of wireless and Internet communications. The Discrete Wavelet Transform (DWT) and embedded bit-plane coding are the two key building blocks of the JPEG 2000 encoder. The JPEG 2000 architecture also makes high-quality compression possible in video mode, i.e. Motion JPEG 2000. In this paper, we present a study of the compression impact of using variable code block sizes at different levels of the DWT, instead of the fixed code block size specified in the original standard. We also discuss the advantages of using variable code block sizes and their VLSI implementation.
ISBN:
(Print) 9798331529543; 9798331529550
When training a Learned Image Compression (LIC) model, the loss function is minimized such that the encoder and the decoder attain a target rate-distortion trade-off. Therefore, a distinct model must be trained and stored at the transmitter and receiver for each target rate, fostering the quest for efficient variable-bitrate compression schemes. This paper proposes plugging Low-Rank Adapters into a transformer-based pre-trained LIC model and training them to meet different target rates. With our method, encoding an image at a variable rate is as simple as training the corresponding adapters and plugging them into the frozen pre-trained model. Our experiments show performance comparable with state-of-the-art fixed-rate LIC models at a fraction of the training and deployment cost. We publicly released the code at https://***/EIDOSLAB/ALICE.
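A minimal sketch of a Low-Rank Adapter on a frozen linear layer is shown below; the rank, scaling, and initialization choices are illustrative assumptions, not the ALICE configuration.

```python
import numpy as np

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update B @ A (rank r).
    Only A and B would be trained per target rate; W is shared."""
    def __init__(self, W, r=4, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W = W                                   # frozen, shape (out, in)
        self.A = rng.normal(0.0, 0.01, (r, W.shape[1]))
        self.B = np.zeros((W.shape[0], r))           # zero init: adapter starts as a no-op
        self.scale = alpha / r

    def __call__(self, x):
        return x @ self.W.T + self.scale * (x @ self.A.T @ self.B.T)

W = np.eye(8)                 # stand-in for a pre-trained layer
layer = LoRALinear(W, r=2)
x = np.ones((1, 8))
y = layer(x)                  # equals x @ W.T until B is trained
```

Storing one (A, B) pair per target rate costs 2 * r * d parameters per layer instead of d * d for a full model, which is where the deployment saving comes from.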
ISBN:
(Print) 9781424437092
To address the drawback that visual features are often ignored in image retrieval, a novel method based on mutual information is presented that incorporates the human visual system's different sensitivity along different directions. First, combining this visual characteristic with the principle of block truncation coding (BTC), a significant image feature is extracted and a novel descriptor is defined. Based on an analysis of the characteristics of the descriptors, mutual information is chosen as the similarity measure for image retrieval. Compared with other algorithms, experiments show the proposed approach has good performance.
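Classic BTC, which the descriptor builds on, encodes each block with its mean, standard deviation, and a bitmap of above-mean pixels; the sketch below shows the standard moment-preserving reconstruction, not the paper's modified descriptor.

```python
import numpy as np

def btc_encode(block):
    """Encode one block as (low level, high level, bitmap),
    choosing the two levels so that mean and variance are preserved."""
    mean, std = block.mean(), block.std()
    bitmap = block > mean
    q, n = bitmap.sum(), block.size
    if q in (0, n):                       # flat block: a single level suffices
        return mean, mean, bitmap
    lo = mean - std * np.sqrt(q / (n - q))
    hi = mean + std * np.sqrt((n - q) / q)
    return lo, hi, bitmap

def btc_decode(lo, hi, bitmap):
    """Reconstruct the block from the two levels and the bitmap."""
    return np.where(bitmap, hi, lo)

block = np.array([[10, 10, 200, 200]] * 4, dtype=float)  # toy two-level block
lo, hi, bm = btc_encode(block)
rec = btc_decode(lo, hi, bm)
```

For this two-valued toy block the reconstruction is exact (lo = 10, hi = 200); on natural blocks BTC is lossy, and the bitmap is what the retrieval descriptor exploits.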