Recently, convolutional neural networks (CNNs) have achieved great improvements in single image dehazing and attracted considerable research attention. Most existing learning-based dehazing methods are not fully end-to-end,...
ISBN (digital): 9781728171685
ISBN (print): 9781728171692
Texture synthesis using deep neural networks can generate high-quality and diversified textures. However, it usually requires a heavy optimization process. Subsequent works accelerate the process by using feed-forward networks, but at the cost of scalability, diversity, or quality. We propose a new efficient method that aims to simulate the optimization process while retaining most of its properties. Our method takes a noise image and the gradients from a descriptor network as inputs, and synthesizes a refined image with respect to the target image. The proposed method can synthesize images with better quality and diversity than other fast synthesis methods. Moreover, when trained on a large-scale dataset, our method can generalize to synthesize unseen textures.
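The abstract gives no implementation details, so the following is only a toy sketch of the kind of optimization process it refers to: a noise image refined step by step using the gradient of a descriptor loss. Everything concrete here is an assumption made for illustration and not the authors' method: the descriptor is a Gram matrix computed directly on raw channels (the paper uses a descriptor network), the update is plain gradient descent (the paper learns a feed-forward refinement), and the names `gram`, `descriptor_loss`, `descriptor_grad`, and `synthesize` are hypothetical.

```python
import random


def gram(x):
    """Gram matrix of C channels, each a flat list of N pixel values."""
    n = len(x[0])
    return [[sum(a * b for a, b in zip(xi, xj)) / n for xj in x] for xi in x]


def descriptor_loss(x, target_gram):
    """Sum of squared differences between the Gram entries of x and the target."""
    g = gram(x)
    c = len(g)
    return sum((g[i][j] - target_gram[i][j]) ** 2
               for i in range(c) for j in range(c))


def descriptor_grad(x, target_gram):
    """Analytic gradient of descriptor_loss with respect to every pixel of x."""
    g = gram(x)
    c, n = len(x), len(x[0])
    diff = [[g[i][j] - target_gram[i][j] for j in range(c)] for i in range(c)]
    # d(loss)/d(x[i][k]) = (4 / n) * sum_j diff[i][j] * x[j][k], since diff is symmetric
    return [[4.0 / n * sum(diff[i][j] * x[j][k] for j in range(c))
             for k in range(n)] for i in range(c)]


def synthesize(target, steps=200, lr=1.0, seed=0):
    """Refine a noise image by gradient descent so its Gram matrix matches the target's."""
    rng = random.Random(seed)
    c, n = len(target), len(target[0])
    target_gram = gram(target)
    # Start from Gaussian noise with the same shape as the target.
    x = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(c)]
    initial = descriptor_loss(x, target_gram)
    for _ in range(steps):
        grad = descriptor_grad(x, target_gram)
        x = [[x[i][k] - lr * grad[i][k] for k in range(n)] for i in range(c)]
    return x, initial, descriptor_loss(x, target_gram)
```

Calling `synthesize(target)` on a small list of channels returns the refined image together with the descriptor loss before and after refinement. The many small gradient steps in the loop are the cost that a learned refinement network would try to avoid.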
Audio-video based emotion recognition aims to classify a given video into basic emotions. In this paper, we describe our approaches in EmotiW 2019, which mainly explore emotion features and feature fusion strateg...
Generative Adversarial Networks (GAN) have demonstrated the potential to recover realistic details for single image super-resolution (SISR). To further improve the visual quality of super-resolved results, PIRM2018-SR...
In recent years, scene text recognition has improved significantly, and various state-of-the-art recognition approaches have been proposed. This paper focuses on recognizing text in natural photos of equipment nameplates, which has wide applications in industrial automation. This task has received little attention in previous work. The challenge comes from multi-oriented, curved, noisy, and blurry text patches in equipment nameplates. To address this problem, we propose a deep model for text recognition in multi-oriented nameplates, namely Orientation Robust Scene Text recognition (ORSTR). Specifically, our model employs a carefully designed rectification module to transform curved, distorted, or multi-oriented text into near-horizontal text. Once the near-horizontal text has been generated, the recognition network outputs predictions for the text patches. Our scene text recognition model achieves 90.8% recognition accuracy on an equipment nameplate dataset, outperforming a previous scene text recognition model (CRNN) by about 0.8%. Extensive experiments have been conducted to verify the effectiveness of our model.
Fine-grained classification is a challenging problem, due to subtle differences among highly-confused categories. Most approaches address this difficulty by learning discriminative representation of individual input i...
Temporal convolution has been widely used for video classification. However, it is performed on spatio-temporal contexts in a limited view, which often weakens its capacity of learning video representation. To allevia...
Multi-scale techniques have achieved great success in a wide range of computer vision tasks. However, while this technique is incorporated in existing works, there still lacks a comprehensive investigation on variants...
Recent studies have witnessed that self-supervised methods based on view synthesis obtain clear progress on multiview stereo (MVS). However, existing methods rely on the assumption that the corresponding points among ...
Most of the existing video face super-resolution (VFSR) methods are trained and evaluated on VoxCeleb1, which is designed specifically for speaker identification and the frames in this dataset are of low quality. As a...