Self-supervised depth estimators have recently shown results comparable to the supervised methods on the challenging single image depth estimation (SIDE) task, by exploiting the geometrical relations between target an...
ISBN (print): 9783030399054; 9783030399047
The use of graphic symbols in documentary records from the 5th to the 9th century has so far received scant attention. By graphic symbols we mean graphic signs (including alphabetical ones) drawn as a visual unit in a written text and representing something other or something more than a word of that text. The NOTAE project represents the first attempt to investigate these graphic entities as a historical phenomenon from Late Antiquity to early medieval Europe in any written sources containing texts generated for pragmatic purposes (contracts, petitions, official and private letters, lists, etc.). Identifying and classifying graphic symbols in such documents requires experience and knowledge of the field, but software applications can help by learning to recognize symbols from previously annotated documents and by suggesting to experts potential symbols, with their likely classification, in newly acquired documents for validation, thus easing the task. This contribution introduces the NOTAE system, which, in addition to supporting the aforementioned task, provides non-expert users with tools to explore the documents annotated by experts.
ISBN (print): 9783030354305; 9783030354299
Temporal convolutional networks have recently shown excellent performance in sequence modeling tasks [1]. Taking this into account, in this paper we investigate the possibility of replacing recurrent networks in architectures targeted specifically at image captioning. We evaluate the solution on the Visual Genome dataset [2], which provides an extensive set of labels and descriptions that thoroughly ground visual concepts in natural language.
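To make the building block concrete, the following is a minimal sketch of a causal dilated 1-D convolution, the operation temporal convolutional networks are commonly built from. It is not the architecture evaluated in the paper; the function name `causal_dilated_conv1d` and the zero-padding choice are assumptions made for illustration.

```python
from typing import List, Sequence


def causal_dilated_conv1d(x: Sequence[float],
                          weights: Sequence[float],
                          dilation: int = 1) -> List[float]:
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Inputs before the start of the sequence are treated as zero, so
    y[t] depends only on x[0..t] (the causality property).
    Illustrative sketch only, not the paper's captioning model.
    """
    if dilation < 1:
        raise ValueError("dilation must be a positive integer")
    out: List[float] = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # skip taps that would read before the sequence start
                acc += w * x[idx]
        out.append(acc)
    return out
```

Stacking such layers with growing dilation (1, 2, 4, ...) widens the receptive field exponentially with depth, which is what lets a convolutional decoder cover long word sequences without recurrence.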
ISBN (digital): 9781728180687
ISBN (print): 9781728180694
Previous research has shown that the decoding energy demand of several video codecs can be estimated accurately using bit stream feature-based models. Building on this, we show in this paper that visualization with the Decoding Energy Estimation Tool (DENESTO) can help to improve the understanding of the decoder's energy demand.
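As a rough illustration of what a bit stream feature-based model computes, the sketch below assumes the common linear form: each feature's occurrence count is multiplied by a per-occurrence energy and the products are summed. The feature names and energy values are hypothetical and are not taken from DENESTO or from the paper.

```python
from typing import Mapping


def estimate_decoding_energy(feature_counts: Mapping[str, float],
                             specific_energies: Mapping[str, float]) -> float:
    """Return sum over features of count * per-occurrence energy (joules).

    Assumes a linear feature-based model; raises KeyError if a counted
    feature has no associated specific energy.
    """
    missing = set(feature_counts) - set(specific_energies)
    if missing:
        raise KeyError(f"no specific energy for features: {sorted(missing)}")
    return sum(n * specific_energies[name]
               for name, n in feature_counts.items())


# Hypothetical feature names and values, for illustration only.
counts = {"intra_block": 1200, "inter_block": 3400, "coefficient": 50000}
energies_joule = {"intra_block": 2.0e-6,
                  "inter_block": 1.5e-6,
                  "coefficient": 4.0e-8}
total = estimate_decoding_energy(counts, energies_joule)
```

A visualization tool can then show the individual products `count * energy` per feature, which indicates where the decoder spends most of its energy.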
ISBN (print): 9789811331404; 9789811331398
Image processing methods are widely used to improve the quality of an image and to extract the hidden information in it. Scattering and atmospheric absorption result in haze, smoke, and fog. Such weather conditions strongly affect visual systems, hamper the detection and identification of targets, and degrade picture quality. In previous years, researchers have focused on enhancing images and videos to high quality as well as on detecting objects. In this paper, we review previous papers and compare them based on the techniques used and the performance parameters.
ISBN (digital): 9781728180687
ISBN (print): 9781728180694
Fisheye lenses provide major benefits for many applications due to their large field of view. However, they come at the cost of strong radial distortions, leading to problems in a variety of signal processing tasks which have been developed with perspective lenses in mind. As such, while state-of-the-art image and video codecs excel in reducing redundancy and irrelevance in content captured with perspective lenses, the coding gain drops significantly when fisheye lenses are used. To improve the understanding of distortions introduced by fisheye lenses with respect to perspective lenses, we provide an interactive user interface for the visualization of fisheye block distortions.
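The radial distortion mentioned above can be sketched by comparing where a ray at incident angle theta lands on the sensor under two projection models. The abstract does not name a fisheye model, so the equidistant projection (r = f * theta) used here is an assumption; the function names are illustrative.

```python
import math


def perspective_radius(theta: float, focal: float) -> float:
    """Pinhole (perspective) model: r = f * tan(theta), theta in radians."""
    if not 0.0 <= theta < math.pi / 2:
        raise ValueError("perspective projection needs 0 <= theta < pi/2")
    return focal * math.tan(theta)


def equidistant_fisheye_radius(theta: float, focal: float) -> float:
    """Equidistant fisheye model (assumed here): r = f * theta."""
    return focal * theta


def radial_compression(theta: float, focal: float = 1.0) -> float:
    """Ratio of fisheye radius to perspective radius for the same ray.

    Equals 1.0 at the image center and shrinks toward the periphery,
    which is why image blocks appear increasingly squeezed there.
    """
    if theta == 0.0:
        return 1.0
    return (equidistant_fisheye_radius(theta, focal)
            / perspective_radius(theta, focal))
```

Because the ratio changes with the distance from the image center, a block that moves across a fisheye image changes shape as it moves, which breaks the translational motion assumption that conventional codecs rely on.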
ISBN (digital): 9781728180687
ISBN (print): 9781728180694
This paper addresses the problem of image-based localization. The goal is to find, quickly and accurately, the relative pose between a query taken with a stereo camera and a map obtained using visual SLAM, which contains poses and 3D points associated with descriptors. We introduce a new method that leverages stereo vision by adding geometric information to visual descriptors. This method can be used when the vertical direction of the camera is known (for example, on a wheeled robot). The new geometric-visual descriptor can be used with several image-based localization algorithms based on visual words. We test the approach on different datasets (indoor and outdoor) and show experimentally that the new geometric-visual descriptor improves standard image-based localization approaches.
ISBN (digital): 9781728158358
ISBN (print): 9781728158365
Feature extraction and representation are crucial stages for image classification. However, the derived features might not be robust discriminators, which can radically alter image classification results. In this paper, we present an enhanced Bag of Visual Words representation for better image classification. Inspired by the Gestalt laws of grouping, our design goals are to: (1) introduce spatial information into the Bag of Words representation by grouping the aligned visual words (high-order features), (2) preserve the spatial relationships among the visual word groups against different geometric transformations, and (3) achieve a high-level representation of visual words by extracting visual phrases. These phrases convey the semantic meaning of object parts. The proposed approach has been evaluated, in terms of classification accuracy, on the Caltech101 image database. Experimental results indicate that classification performance is improved by using visual phrases.
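For orientation, the sketch below shows the plain Bag of Visual Words baseline that the paper sets out to enhance: each local descriptor is assigned to its nearest codebook entry and the assignments are counted into a normalized histogram. It deliberately omits the spatial grouping and visual phrases proposed by the authors; the toy codebook and the function names are assumptions.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the codebook entry (visual word) closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def bovw_histogram(descriptors: Sequence[Vector],
                   codebook: Sequence[Vector]) -> List[float]:
    """Normalized histogram of visual-word assignments for one image.

    Baseline only: all spatial information is discarded, which is the
    limitation that grouping aligned words into phrases aims to address.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]
```

In practice the codebook would be learned by clustering descriptors from training images, and the histograms would be fed to a classifier.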
In this paper, a sparse binocular fusion convolutional neural network is proposed to evaluate the quality of stereo images. In order to simulate the long-term fusion and processing of the left and right views in the brai...
ISBN (digital): 9781728180687
ISBN (print): 9781728180694
With the rapid development of three-dimensional (3D) technology, effective stereoscopic image quality assessment (SIQA) methods are in great demand. Stereoscopic images contain depth information, which makes it much more challenging to find a reliable SIQA model that fits the human visual system. In this paper, a no-reference SIQA method is proposed that better simulates binocular fusion and binocular rivalry. The proposed method applies a convolutional neural network to build a dual-channel model and to achieve a long-term process of feature extraction, fusion, and processing. Moreover, both high- and low-frequency information are used effectively. Experimental results demonstrate that the proposed model outperforms state-of-the-art no-reference SIQA methods and has promising generalization ability.