Exploiting spatial redundancy in images accounts for a large share of the performance of image and video compression. The main tool for this is intra-frame prediction. In most state-of-the-art video coders, intra prediction is applied block-wise. Until now, angular prediction has been dominant, providing a low-complexity method that covers a large variety of content. With deep learning, however, it is possible to create prediction methods that cover a wider range of content and predict structures that traditional modes cannot predict accurately. Using the conditional autoencoder structure, we are able to train a single artificial neural network that performs multi-mode prediction. In this paper, we derive the approach from the general formulation of the intra-prediction problem and introduce two extensions, one for spatial mode prediction and one for chroma prediction support. Moreover, we propose a novel latent-space-based cross-component prediction. We show the power of our prediction scheme with visual examples and report average gains of 1.13% in Bjøntegaard delta rate in the luma component and 1.21% in the chroma component compared to VTM using only traditional modes.
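The data flow of a conditional autoencoder used for multi-mode prediction can be sketched as follows. This is a minimal illustration, not the authors' network: the block size, layer sizes, and the names `encode` and `decode` are hypothetical, and the weights are random and untrained. The sketch assumes that the encoder sees the original block together with the reference samples and produces a small latent vector that plays the role of the mode, while the decoder predicts the block from that latent vector and the reference samples alone.

```python
import numpy as np

rng = np.random.default_rng(0)

BLOCK = 4 * 4      # flattened 4x4 block to be predicted (hypothetical size)
REF = 2 * 4 + 1    # reference samples: row above, column left, corner
LATENT = 2         # small latent vector acting as the prediction "mode"
HIDDEN = 16        # hidden layer width (hypothetical)

# Random, untrained weights: the sketch shows data flow only.
W_enc1 = rng.normal(size=(BLOCK + REF, HIDDEN))
W_enc2 = rng.normal(size=(HIDDEN, LATENT))
W_dec1 = rng.normal(size=(LATENT + REF, HIDDEN))
W_dec2 = rng.normal(size=(HIDDEN, BLOCK))

def encode(block, ref):
    """Encoder side: map (original block, reference) to a latent mode."""
    h = np.tanh(np.concatenate([block, ref]) @ W_enc1)
    return np.tanh(h @ W_enc2)

def decode(latent, ref):
    """Decoder side: predict the block from the latent mode and reference."""
    h = np.tanh(np.concatenate([latent, ref]) @ W_dec1)
    return h @ W_dec2

block = rng.uniform(size=BLOCK)
ref = rng.uniform(size=REF)

mode = encode(block, ref)               # would be signalled to the decoder
prediction = decode(mode, ref)          # prediction for this mode
other_prediction = decode(-mode, ref)   # a different latent gives another mode
```

Because the mode is a point in a continuous latent space, a single decoder can in principle produce many different predictions from the same reference samples, which is the property the abstract refers to as multi-mode prediction.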
In real-world industrial scenarios, fault detection faces the widely recognized challenge of data imbalance, which not only refers to the scarcity of fault data but also includes the imbalance in healthy data. This ar...
ISBN: 9781538658758 (print)
In criminal investigations, writer identification is frequently performed to determine which writer wrote a given letter. To compare the handwriting of two documents, it is necessary to extract a person's characteristic writing style. Although the text-dependent methods studied so far show high discrimination performance for writer identification, they do not support the situation encountered in actual appraisal, in which the documents share no characters of the same class. In this research, we propose a writer identification method that takes the actual appraisal situation into account. We define handwriting features that do not depend on character class as "personal writing style" and construct an autoencoder conditioned on character class to extract personal writing style from a single character sample. In the latent space learned by the conditional autoencoder, similar personal writing styles are mapped onto neighboring points independently of the character class. At identification time, the similarity between the handwriting features of the unknown writer and those of a reference writer is evaluated in the latent space. To confirm the effectiveness of the proposed method, we conducted writer identification experiments using the ETL-1 Character Database and NIST Special Database 19 2nd Edition. The results indicate that it is possible to extract personal writing style, which is an effective feature for writer identification, even under conditions close to the practical situation.
We propose a conditional latent factor asset pricing model for energy commodities (CAE) that uses a modified conditional autoencoder neural network to capture the non-linear relationship between latent factors and factor loadings. In addition to spot prices, we incorporate 127 macroeconomic and 598 energy information characteristics to extract the factor loadings. The empirical results demonstrate the high-quality performance of the model in out-of-sample testing. Furthermore, by analyzing characteristic importance, we find that energy information characteristics, particularly coal, electricity, and crude oil and natural gas resource development, play a dominant role in explaining the excess returns of energy commodities.
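For orientation, conditional latent factor models are commonly written in the following general form. This is an assumption for illustration only: the abstract does not state its equation and describes its network as modified, and the symbols below do not come from the source.

```latex
% r_{i,t}      : excess return of commodity i at time t
% z_{i,t-1}    : characteristics observed at time t-1
% \beta(\cdot) : factor loadings, a non-linear (neural network) function of the characteristics
% f_t          : latent factors
r_{i,t} = \beta(z_{i,t-1})^{\top} f_t + \varepsilon_{i,t}
```

Under this reading, the macroeconomic and energy information characteristics enter only through the loadings $\beta(\cdot)$.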
ISBN: 9781728185514 (print)
In this paper, we present CAESR, a hybrid learning-based coding approach for spatial scalability based on the Versatile Video Coding (VVC) standard. Our framework considers a low-resolution signal encoded with VVC intra-mode as a base layer (BL), and a deep conditional autoencoder with hyperprior (AE-HP) as an enhancement-layer (EL) model. The EL encoder takes as inputs both the upscaled BL reconstruction and the original image. Our approach relies on conditional coding that learns the optimal mixture of the source and the upscaled BL image, enabling better performance than residual coding. On the decoder side, a super-resolution (SR) module is used to recover high-resolution details and invert the conditional coding process. Experimental results show that our solution is competitive with VVC full-resolution intra coding while being scalable.
ISBN: 9781728159225 (print)
To address the problem of missing samples in magnetic flux leakage (MFL) data, a data reconstruction method based on a conditional variational autoencoder (CVAE) and generative adversarial networks (GAN) is proposed. This method combines the advantages of CVAE and GAN and generates high-quality samples stably. The proposed CVAE-GAN method can not only reconstruct the missing MFL samples but also generate a large number of realistic and diverse defect samples, which addresses the low accuracy of defect detection models caused by insufficient samples and a lack of sample diversity. In the experiments, the defect samples are collected from domestic in-service oil pipelines. The experimental results illustrate that the proposed method can effectively generate high-quality samples.
Discovering and addressing unknown, including unanticipated, part-to-part variation sources is an important yet challenging problem in manufacturing variation reduction. The state-of-the-art methods for solving this problem have focused solely on traditional mass manufacturing settings, in which abundant measurement data of parts with the same design are available. Applying these methods to custom manufacturing processes is problematic because the number of parts with the same design in custom manufacturing is often small. This paper proposes a new variation model that considers custom manufacturing parameters to aggregate measurement data across all custom parts. We also propose to estimate this model via a conditional autoencoder. The advantages of the proposed approach are demonstrated with a simulated toy-building brick example and a real cylindrical machining example. The approach successfully reveals unknown variation patterns even with a relatively small number of parts in these examples. Our approach is also generally applicable to any mainstream manufacturing process that produces multiple part designs.
ISBN: 9789819617098; 9789819617104 (print)
Dense retrievers utilize pretrained language models to encode queries and documents as high-dimensional embeddings for retrieval. However, these high-dimensional embeddings usually result in more expensive index storage and higher retrieval latency. In this paper, we further explore the potential of building a lightweight dense retrieval system by adding dimension reduction to the encoding-indexing pipeline. Our experiments demonstrate that the encoding-compression-indexing-retrieval method yields an efficient dense retrieval system, reducing retrieval latency by 96% while maintaining comparable retrieval effectiveness. Our further analyses illustrate that the dimension reduction method retains its retrieval effectiveness across different domains and works with different index-building methods.
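The encoding-compression-indexing-retrieval pipeline can be illustrated with a small sketch. The abstract does not name its dimension reduction method, so plain PCA is swapped in here as a stand-in; the embeddings are synthetic, and the sizes and the names `compress` and `search` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(200, 64))   # synthetic stand-in for document embeddings
k = 8                               # reduced dimension (hypothetical)

# Compression step: PCA via SVD of the centered embedding matrix.
mean = docs.mean(axis=0)
_, _, vt = np.linalg.svd(docs - mean, full_matrices=False)
components = vt[:k]                 # top-k principal directions, shape (k, 64)

def compress(x):
    """Project embeddings to k dimensions and L2-normalize them."""
    z = (x - mean) @ components.T
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

# Indexing step: store only the compressed document vectors.
index = compress(docs)

def search(query, top_n=5):
    """Return the indices of the top_n documents by cosine similarity."""
    scores = index @ compress(query)
    return np.argsort(-scores)[:top_n]

hits = search(docs[17])             # query with a known document embedding
```

In this sketch the index stores 8 values per document instead of 64, which reduces both index storage and the cost of each similarity computation. How much retrieval effectiveness survives depends on the data and the reduction method.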
ISBN: 9781728180687 (print)
Intra prediction has been an integral part of image and video coders for a long time. A predominant method is angular prediction, which extends the reference area into the block at a certain angle. Recently, many deep-learning-based methods have been proposed. Since intra prediction uses multiple modes, this usually requires training a large number of networks. With a conditional autoencoder, we are able to generate an arbitrary number of modes with only one network. In this paper, we introduce a novel loss function enforcing a spatially correlated latent space and extend the network structure to the same end. This allows us to propose a simple spatial mode prediction scheme using most-probable-mode lists. By replacing matrix-based intra prediction in VVC with our method, we obtain average rate savings of 0.84% with peak gains of 2.37%.
Conditional coding is a new video coding paradigm enabled by neural-network-based compression. It can be shown that conditional coding is in theory better than traditional residual coding, which is widely used in video compression standards such as HEVC and VVC. However, on closer inspection, it becomes clear that conditional coders can suffer from information bottlenecks in the prediction path: due to the data processing inequality, not all information from the prediction signal can be passed to the reconstructed signal, which impairs coder performance. In this paper, we propose the conditional residual coding concept, which we derive from information-theoretic properties of the conditional coder. This coder significantly reduces the influence of bottlenecks while maintaining the theoretical performance of the conditional coder. We provide a theoretical analysis of the coding paradigm and demonstrate the performance of the conditional residual coder in a practical example. We show that conditional residual coders alleviate the disadvantages of conditional coders while maintaining their advantages over residual coders. In the spectrum of residual and conditional coding, we can therefore consider them as "the best of both worlds."
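The theoretical advantage of conditional coding over residual coding follows from a standard entropy argument, sketched below. The symbols are assumptions introduced for illustration: $x$ denotes the signal to be coded and $\hat{x}$ its prediction, which is available at both encoder and decoder.

```latex
% Equality: once \hat{x} is known, x and x - \hat{x} determine each other.
% Inequality: conditioning never increases entropy.
H(x \mid \hat{x}) \;=\; H(x - \hat{x} \mid \hat{x}) \;\le\; H(x - \hat{x})
```

The left-hand side is the ideal rate of a conditional coder and the right-hand side that of a residual coder. Equality holds only when the residual $x - \hat{x}$ is statistically independent of the prediction $\hat{x}$.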