ISBN (digital): 9781728180687
ISBN (print): 9781728180694
Inverse tone mapping (iTM) transforms low-dynamic-range (LDR) content into high-dynamic-range (HDR) content and is an effective technique for improving the visual experience. iTM has developed rapidly with deep learning algorithms in recent years. However, the great majority of deep-learning-based iTM methods target images and ignore the temporal correlations between consecutive frames in videos. In this paper, we propose a multi-scale video iTM network with deformable alignment, which increases temporal consistency in videos. We first align the consecutive input LDR frames at the feature level using deformable convolutions and then use the multi-frame information jointly to generate the HDR frame. Additionally, we adopt a multi-scale iTM architecture with a pyramid pooling module, which enables our network to reconstruct details as well as global features. The proposed network outperforms other iTM methods on quantitative metrics and gains a significant visual improvement.
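To illustrate the pyramid pooling idea mentioned in the abstract, here is a minimal NumPy sketch. It is not the authors' network: the function names, the bin sizes `(1, 2, 4)`, and nearest-neighbour upsampling are assumptions for illustration, and the learned convolutions that normally follow each pooled branch are omitted.

```python
import numpy as np

def adaptive_avg_pool(feat, bins):
    """Average-pool a (C, H, W) feature map down to (C, bins, bins)."""
    c, h, w = feat.shape
    out = np.empty((c, bins, bins), dtype=np.float64)
    for i in range(bins):
        r0 = (i * h) // bins                      # floor of the row start
        r1 = ((i + 1) * h + bins - 1) // bins     # ceil of the row end
        for j in range(bins):
            c0 = (j * w) // bins
            c1 = ((j + 1) * w + bins - 1) // bins
            out[:, i, j] = feat[:, r0:r1, c0:c1].mean(axis=(1, 2))
    return out

def upsample_nearest(feat, h, w):
    """Nearest-neighbour upsampling of (C, bh, bw) to (C, h, w)."""
    _, bh, bw = feat.shape
    rows = (np.arange(h) * bh) // h
    cols = (np.arange(w) * bw) // w
    return feat[:, rows[:, None], cols[None, :]]

def pyramid_pooling(feat, bin_sizes=(1, 2, 4)):
    """Concatenate the input with pooled-and-upsampled copies of itself.

    Coarse bins carry global context; the untouched input keeps local detail.
    The bin sizes are hypothetical, not taken from the paper.
    """
    _, h, w = feat.shape
    branches = [feat.astype(np.float64)]
    for bins in bin_sizes:
        pooled = adaptive_avg_pool(feat, bins)
        branches.append(upsample_nearest(pooled, h, w))
    return np.concatenate(branches, axis=0)

feat = np.arange(2 * 8 * 8, dtype=np.float64).reshape(2, 8, 8)
fused = pyramid_pooling(feat)
print(fused.shape)
```

The single-bin branch gives every pixel the global channel mean, which is how a module of this kind exposes scene-level statistics alongside fine detail.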
Image object co-segmentation aims to segment common objects in a group of images. This paper proposes a novel neural network, which extracts multi-scale convolutional features at multiple layers via a modified VGG net...
ISBN (digital): 9781728180687
ISBN (print): 9781728180694
Virtual reality (VR), a computer-generated interactive environment, is provided to a user by projecting a peripheral image onto environmental surfaces. VR has the advantage of enhancing the immersive experience and is now widely applied in tourism and cultural experiences. In addition, the recent integration of electroencephalography-based (EEG-based) brain-computer interfaces (BCI) with VR can further promote the immersive virtual experience. Our study therefore proposes an integrative framework that implements an EEG-based BCI in a VR game to advance the cultural experience. A room escape game set in a Tainan temple is created. EEG signals are recorded while users are playing the game, and online analyses of the EEG signals are used to interact with the VR display. This integrative framework can result in a better experience than the conventional setup.
ISBN (digital): 9781728180687
ISBN (print): 9781728180694
Synthesizing images from text is an important problem with various applications. Most existing studies of text-to-image generation use supervised methods and rely on a fully labeled dataset, but detailed and accurate descriptions of images are onerous to obtain. In this paper, we introduce a simple but effective semi-supervised approach that treats the feature of an unlabeled image as a "Pseudo Text Feature", so that the unlabeled data can participate in the subsequent training process. To achieve this, we design a Modality-invariant Semantic-consistent Module, which aims to make the image feature and the text feature indistinguishable while maintaining their semantic information. Extensive qualitative and quantitative experiments on the MNIST and Oxford-102 flower datasets demonstrate the effectiveness of our semi-supervised method in comparison to supervised ones. We also show that the proposed method can be easily plugged into other visual generation models, such as image translation, and performs well.
ISBN (digital): 9781728149882
ISBN (print): 9781728149899
Visual Secret Sharing (VSS) aims at sharing a secret among many shadows, depending on the number of dealers. Individually, these encrypted shadows cannot reveal the secret. A group of a pre-qualified number of shadows is sufficient to obtain the secret image. The quality of the recovered secret usually approximates that of the original confidential image. This technique has evolved, offering solutions to the demands of the matured Internet domain. This paper analyzes the feature advancements in the VSS field. VSS approaches remain distinct from conventional cryptography, thereby offering new security solutions in many applications. Since security is the central and challenging demand in contemporary information systems, the analysis of state-of-the-art VSS is significant.
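As a concrete illustration of shadows that reveal nothing individually but recover the secret together, the following sketch implements a simple XOR-based (n, n) scheme on a binary image. The abstract does not name a specific scheme, so this choice is an assumption, and `make_shadows` and `recover` are hypothetical names. Unlike stacking-based schemes, where the recovered image only approximates the original, XOR recovery here is exact but needs computation rather than physical overlay.

```python
import numpy as np

def make_shadows(secret, n, rng):
    """Split a binary uint8 image into n shadows; all n are needed to recover it."""
    # The first n-1 shadows are uniformly random bit images.
    shadows = [rng.integers(0, 2, size=secret.shape, dtype=np.uint8)
               for _ in range(n - 1)]
    # The last shadow is the secret XORed with every random shadow,
    # so that XORing all n shadows cancels the randomness.
    last = secret.copy()
    for shadow in shadows:
        last ^= shadow
    shadows.append(last)
    return shadows

def recover(shadows):
    """XOR all shadows together to reconstruct the secret image."""
    out = np.zeros_like(shadows[0])
    for shadow in shadows:
        out ^= shadow
    return out

rng = np.random.default_rng(0)
secret = rng.integers(0, 2, size=(4, 4), dtype=np.uint8)
shadows = make_shadows(secret, 3, rng)
restored = recover(shadows)
print(np.array_equal(restored, secret))
```

Threshold (k, n) schemes, in which any k of n shadows suffice, need a different construction and are not covered by this sketch.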
ISBN (digital): 9783030368029
ISBN (print): 9783030368029; 9783030368012
Recently, a great variety of CNN-based methods have been proposed for single image super-resolution, but how to restore more high-frequency details is still an unsolved issue. The low-frequency information is similar in a pair of low-resolution and high-resolution images, so the model only needs to pay more attention to the high-frequency information to restore more realistic images that have abundant details and better suit the human visual system. In this paper, we propose a deep residual-dense attention network (RDAN) for image super-resolution. Specifically, we propose a channel attention module to change the weight of each channel and a spatial attention module to rescale the region weights in a channel map, which makes the model focus more on the high-frequency information. Experimental results on five benchmark datasets show that RDAN is superior to state-of-the-art methods in both accuracy and visual performance.
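The two attention modules can be sketched as follows. This is a generic illustration under stated assumptions, not the RDAN implementation: the channel branch follows the common squeeze-and-excitation pattern (global average pooling, a two-layer bottleneck, a sigmoid gate), and the spatial branch uses a single 1x1 channel mixing step to produce a per-pixel gate. All function names and weight shapes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Rescale each channel of a (C, H, W) map by a learned weight in (0, 1).

    w1 has shape (C // r, C) and w2 has shape (C, C // r), where r is an
    assumed reduction ratio.
    """
    squeezed = feat.mean(axis=(1, 2))            # (C,) global descriptor
    hidden = np.maximum(w1 @ squeezed, 0.0)      # bottleneck with ReLU
    weights = sigmoid(w2 @ hidden)               # (C,) channel gates
    return feat * weights[:, None, None]

def spatial_attention(feat, w):
    """Rescale each spatial location by a gate computed across channels.

    w has shape (C,) and acts like a 1x1 convolution producing one map.
    """
    gate = sigmoid(np.einsum("c,chw->hw", w, feat))  # (H, W) spatial gates
    return feat * gate[None, :, :]

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
w1 = rng.standard_normal((2, 8)) * 0.1   # reduction ratio r = 4 (assumed)
w2 = rng.standard_normal((8, 2)) * 0.1
w_sp = rng.standard_normal(8) * 0.1
attended = spatial_attention(channel_attention(feat, w1, w2), w_sp)
print(attended.shape)
```

Because both gates lie in (0, 1), the modules can only attenuate features; in a trained network, the weights would learn to suppress smooth low-frequency regions relative to edges and textures.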
ISBN (digital): 9781728180687
ISBN (print): 9781728180694
In this paper, a three-channel convolutional neural network (CNN) constrained by multiple loss functions is designed for stereoscopic image quality assessment (SIQA). Given that both monocular and binocular information are crucial for SIQA, we take patches of the left images, right images, and difference images as the inputs of the three channels, respectively. Since using the ground truth of the whole image as the label of every image patch cannot accurately characterize patch quality, we propose to label each image patch individually to preserve the quality differences among regions and views. Moreover, the proposed method adopts a multi-loss structure that considers local and global features simultaneously, which constrains the feature learning from multiple perspectives. In addition, adaptive loss weights make the multi-loss network more flexible and universal. The experimental results show that the proposed method is superior to other existing SIQA methods, achieving state-of-the-art performance.
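The input preparation and loss combination described above might look like the sketch below. Several details are assumptions, since the abstract does not specify them: the difference image is taken as the signed left-minus-right difference, patches are cut on a regular grid, and the adaptive loss weights are modelled as a softmax over learnable logits. The function names are hypothetical.

```python
import numpy as np

def three_channel_patches(left, right, patch=32, stride=32):
    """Return (left, right, difference) patch triples from a stereo pair."""
    left = left.astype(np.float64)
    right = right.astype(np.float64)
    diff = left - right                          # assumed signed difference
    h, w = left.shape[:2]
    triples = []
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            window = (slice(y, y + patch), slice(x, x + patch))
            triples.append((left[window], right[window], diff[window]))
    return triples

def weighted_multi_loss(losses, logits):
    """Combine several losses with softmax-normalised weights.

    In training, the logits would be learnable parameters, which is one
    possible reading of "adaptive loss weights".
    """
    losses = np.asarray(losses, dtype=np.float64)
    shifted = np.exp(logits - np.max(logits))    # numerically stable softmax
    weights = shifted / shifted.sum()
    return float(np.sum(weights * losses))

left = np.ones((64, 96))
right = np.zeros((64, 96))
triples = three_channel_patches(left, right)
total = weighted_multi_loss([1.0, 3.0], np.zeros(2))
print(len(triples), total)
```

Each triple would feed one forward pass of the three branches, and the local (patch-level) and global (image-level) losses would be passed to `weighted_multi_loss`.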
Inter-frame prediction plays an important role in video coding by predicting the current frame from previously encoded pictures, called reference pictures. In the case of camera motion, the content of a current frame ...
In this paper, we build up a new block-based color image compression scheme based on our proposed transform domain down-sampling method and deep convolutional reconstruction algorithm. Specifically, our proposed down-...
Convolutional neural networks (CNNs) show powerful performance on image recognition tasks. However, when a CNN is applied to mobile devices with limited computing and memory resources, it requires more compact desi...