The spherical domain representation of 360° video/image presents many challenges related to the storage, processing, transmission and rendering of omnidirectional videos (ODV). Models of human visual attention can b...
ISBN: (print) 9789811365041; 9789811365034
Obtaining high-frequency information and texture details is essential in image reconstruction applications such as image super-resolution. This paper therefore proposes a multi-scale fusion network (MCFN). The network has three pathways designed for different receptive fields and scales, which are expected to capture more texture details. Local and global residual learning strategies are also employed to prevent overfitting and to improve reconstruction quality. Compared with classic convolutional neural network-based algorithms, the proposed method achieves better numerical and visual results.
Image captioning is a technology that generates textual descriptions of images by integrating computer vision and natural language processing. This review aims to provide a comprehensive overview of current state-of-t...
Privacy protection is an important research area, and it is especially critical in the big data era. To a large extent, the privacy of visual classification data lies in the mapping between an image and its corresponding label, since this relation carries a great amount of information and can be used in other scenarios. In this paper, we propose mapping distortion based protection (MDP) and its augmentation-based extension (AugMDP) to protect data privacy by modifying the original dataset. In the modified dataset generated by MDP, the image and its label are not consistent (e.g., a cat-like image is labeled as a dog), yet DNNs trained on it can still achieve good performance on a benign test set. As such, this method can protect privacy when the dataset is leaked. Extensive experiments verify the effectiveness and feasibility of our method.
ISBN: (print) 9781728162126
We propose an efficient edge-based stereo visual odometry (VO) using multiple quadtrees created according to image gradient orientations. To characterize edges, we classify them into eight orientation groups according to their image gradient directions. Using the edge groups, we construct eight quadtrees and set overlapping areas belonging to adjacent quadtrees for robust and efficient matching. For further acceleration, previously visited tree nodes are stored and reused to warm-start the next iteration. We propose an edge culling method to extract prominent edgelets and prune redundant edges. The camera motion is estimated by minimizing point-to-edge distances within a re-weighted iterative closest points (ICP) framework, and simultaneously, 3-D structures are recovered by static and temporal stereo settings. To analyze the effects of the proposed methods, we conduct extensive simulations with various settings. Quantitative results on public datasets confirm that our approach performs competitively with state-of-the-art stereo methods. In addition, we demonstrate the practical value of our system in author-collected modern building scenes with curved edges only.
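The orientation-grouping step described in this abstract can be illustrated with a minimal sketch. It assumes that the eight groups are equal 45° bins over the full 360° of gradient direction, with bin 0 centred on the positive x-axis; the abstract does not state the bin layout, and the names `orientation_group` and `group_edges` are hypothetical, not taken from the paper. Quadtree construction and the overlap between adjacent groups are not shown.

```python
import math


def orientation_group(gx, gy, n_groups=8):
    """Map an image gradient (gx, gy) to one of n_groups direction bins.

    Assumption: equal-width bins over [0, 2*pi), bin 0 centred on 0 rad.
    """
    angle = math.atan2(gy, gx) % (2 * math.pi)  # wrap direction into [0, 2*pi)
    width = 2 * math.pi / n_groups              # angular width of one bin
    # Shift by half a bin so that each bin is centred on a multiple of `width`.
    return int((angle + width / 2) // width) % n_groups


def group_edges(edgels, n_groups=8):
    """Split edge pixels (x, y, gx, gy) into per-orientation lists of (x, y)."""
    groups = {k: [] for k in range(n_groups)}
    for x, y, gx, gy in edgels:
        groups[orientation_group(gx, gy, n_groups)].append((x, y))
    return groups
```

Each of the resulting lists could then feed its own spatial index, which is where a per-group quadtree would come in.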
ISBN: (print) 9781450375160
While people primarily communicate with text in mobile chat applications, they are increasingly using visual elements such as images, emojis, and memes. Using such visual elements could help users communicate clearly and make the chatting experience enjoyable. However, finding and inserting contextually appropriate images during the chat can be both tedious and distracting. We introduce MilliCat, a real-time image suggestion system that recommends images that match the chat content within a mobile chat application (i.e., autocomplete with images). MilliCat combines natural language processing (e.g., keyword extraction, dependency parsing) and mobile computing (e.g., resource and energy-efficiency) techniques to autonomously make image suggestions when users might want to use images. Through multiple user studies, we investigated the effectiveness of our design choices, the frequency and motivation of image usage by the participants, and the impact of MilliCat on mobile chat experiences. Our results indicate that MilliCat's real-time image suggestion enables users to quickly and conveniently select and display images in mobile chat: it significantly reduces the latency of the image selection process (3.19x improvement) and consequently leads to more frequent image usage (1.8x) than existing solutions. Our study participants reported that they used images more often with MilliCat because the images helped them convey information more effectively, emphasize their opinions, express emotions, and have a fun chatting experience.
A significant expansion of the scope of computer vision, in particular in real-time systems, places very high demands on them in terms of productivity and efficiency of information processing, and in feedback systems,...
Stereoscopic image quality assessment (SIQA) has always been challenging due to the remarkable distinction between human monocular and binocular vision. This paper proposes a novel gradient-based dictionary learning m...
ISBN: (print) 9781665409513
Nonlocal self-similarity (NSS) is a remarkable prior that has been successfully applied to image processing tasks, including image denoising. For each local patch in an image, we can search for many nonlocally similar patches under the NSS prior and stack them into a group, and then process the group instead of a single patch to better capture the nonlocal structure. Most existing methods consider the NSS either of the internal degraded image or of an external clean corpus, which may limit the denoising performance. In this paper, we propose a hybrid model to capture both the external and internal NSS. Specifically, we develop a model with two regularizers: the internal NSS is exploited by a low-rankness regularizer, and the external NSS is exploited by a sparse regularizer. An alternating minimization method is developed to solve our model. Experimental results demonstrate that our algorithm achieves better results than several state-of-the-art denoising methods.
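The patch-grouping step that the NSS prior relies on can be sketched as follows. This is only an illustration under stated assumptions: it uses an exhaustive search over the whole image and squared Euclidean distance as the similarity measure, neither of which is specified in the abstract, and the function names are hypothetical. The low-rank and sparse regularizers and the alternating minimization are not shown.

```python
def extract_patch(img, r, c, size):
    """Return the size x size patch with top-left corner (r, c) as a flat list."""
    return [img[r + i][c + j] for i in range(size) for j in range(size)]


def group_similar_patches(img, ref_rc, size=2, k=3):
    """Find the k patches most similar to the reference patch and stack them.

    Assumptions: exhaustive search over every patch position, squared
    Euclidean distance as the similarity measure. Ties are broken by position.
    Returns a list of ((row, col), flat_patch) pairs, nearest first.
    """
    rows, cols = len(img), len(img[0])
    ref = extract_patch(img, ref_rc[0], ref_rc[1], size)
    scored = []
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            patch = extract_patch(img, r, c, size)
            dist = sum((a - b) ** 2 for a, b in zip(ref, patch))
            scored.append((dist, (r, c), patch))
    scored.sort(key=lambda t: (t[0], t[1]))
    return [(rc, patch) for _, rc, patch in scored[:k]]
```

The stacked patches form the group that a denoiser would then process jointly rather than patch by patch.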
Numerous image processing methods have been proposed to help low vision people, often relying on contrast enhancement algorithms. Their assessment is usually performed by tests on low vision subjects, which are expensi...