ISBN:
(digital) 9781728150529
ISBN:
(print) 9781728150529
Shadows can degrade the performance of computer vision techniques for object detection, tracking, and recognition. The ability to remove shadows and other byproducts of illumination is therefore important for effective object recognition. As applications demand more extracted information and higher processing speeds, efficient and sophisticated shadow detection and removal becomes even more necessary. In this study we propose a shadow removal method, parallelize it on a Tesla P100 GPU, and achieve a speedup of 21.67x on an 18 megapixel (MP) image compared to the same method implemented in Matlab.
ISBN:
(print) 9781728132327
We introduce our online panorama generation system for armored rescue vehicles operating in extreme disasters. To make the whole system fireproof and heatproof, all the cameras are embedded around the vehicle body, which causes the panorama stitching to suffer from binocular parallax. Because both vehicle and remote operators need to observe the vehicle's current surroundings, the panorama image must be generated online within an acceptable time. We explain how to calculate in advance the initial alignment of the input camera images on the cylindrical canvas, and how to merge the overlapping images by removing the binocular parallax. Structures for the data transfer are also discussed. Experiments show that the current implementation can generate the panoramic image and share it with remote operators successfully.
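As a rough illustration of what precomputing an alignment on a cylindrical canvas can involve, the sketch below uses the textbook cylindrical projection of a pinhole image. It is not the authors' implementation: the function names, the assumption that the principal point is the image centre, and the single `focal` parameter (in pixels) are all hypothetical.

```python
import math

def cylindrical_coords(x, y, width, height, focal):
    """Map pixel (x, y) of a pinhole image to cylindrical canvas coordinates.

    Assumes the principal point is the image centre and `focal` is in pixels.
    """
    # Shift the origin to the image centre.
    xc = x - width / 2.0
    yc = y - height / 2.0
    theta = math.atan2(xc, focal)      # angle around the cylinder axis
    h = yc / math.hypot(xc, focal)     # height on the unit cylinder
    # Scale back to pixels and restore the origin.
    return focal * theta + width / 2.0, focal * h + height / 2.0

def build_lookup(width, height, focal):
    """Precompute the mapping once so per-frame warping is a table lookup."""
    return [[cylindrical_coords(x, y, width, height, focal)
             for x in range(width)]
            for y in range(height)]
```

Because the table depends only on the camera geometry, it can be built once at start-up and reused for every frame, which is one plausible way to keep online generation within an acceptable time.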
Facial landmark detection aims at locating a sparse set of fiducial facial key-points. Two significant issues, Intra-Dataset Variation and Inter-Dataset Variation, remain in existing datasets and lead to severe performance degradation. Specifically, dataset variations easily cause severe over-fitting and poor generalization on recent in-the-wild datasets, which severely harms the robustness of facial landmark detection algorithms. In this study, we show that model robustness can be significantly improved by leveraging rich variations within and between different datasets. This is non-trivial because of the serious data bias within any one dataset and the inconsistent landmark definitions between different datasets. To address these problems, we propose a novel Deep Coupling Neural Network (DCNN), which consists of two strongly coupled sub-networks: the Dataset-Across Network (DA-Net) and the Candidate-Decision Network (CD-Net). In particular, DA-Net takes advantage of the different characteristics and distributions across datasets, while CD-Net makes a final decision on the candidate hypotheses given by DA-Net to leverage variations within one dataset. Extensive evaluations show that our approach dramatically outperforms state-of-the-art methods on the challenging 300-W and WFLW datasets. (C) 2019 Elsevier Ltd. All rights reserved.
Histogram-based and correlation-based algorithms for computer analysis and restoration of motion-blurred images have been proposed. The histogram algorithm uses ridge detection method and direction distribution histog...
ISBN:
(digital) 9783030225148
ISBN:
(print) 9783030225148; 9783030225131
Intrinsic image decomposition, which refers to extracting albedo and shading from an image, is a highly ill-posed problem in computer vision. In this paper, we regard it as an image-to-image translation problem and propose a novel approach that uses parallel convolutional neural networks (ParCNN) to learn albedo and shading, which have different spatial features and data distributions, respectively. At the same time, the energy is preserved as much as possible under the constraint of an image reconstruction loss shared by the two networks. Moreover, we add a gradient prior based on the traditional image formation process to the loss function, which improves the performance of our basic learning model by combining the advantages of the physically-based method and the data-driven method. We choose the MPI Sintel dataset for model training and testing. In quantitative and qualitative evaluations, our results outperform the state-of-the-art methods.
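The abstract does not give the form of the shared reconstruction loss. A minimal sketch, assuming the common multiplicative image formation model (image = albedo × shading, per pixel) and a mean squared error, might look like the following. The function names and the choice of error measure are assumptions for illustration only, and single-channel images are represented as nested lists.

```python
def reconstruct(albedo, shading):
    """Recombine predicted albedo and shading under the assumed model I = A * S."""
    return [[a * s for a, s in zip(row_a, row_s)]
            for row_a, row_s in zip(albedo, shading)]

def reconstruction_loss(image, albedo, shading):
    """Mean squared error between the input image and its reconstruction."""
    recon = reconstruct(albedo, shading)
    squared_errors = [(r - i) ** 2
                      for row_r, row_i in zip(recon, image)
                      for r, i in zip(row_r, row_i)]
    return sum(squared_errors) / len(squared_errors)
```

A loss of this kind couples the two predictions: neither network can change its output freely without the other compensating, which is one way a shared constraint can tie parallel networks together.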
ISBN:
(print) 9783319942766
The proceedings contain 99 papers. The special focus in this conference is on Computers Helping People with Special Needs. The topics include: Stereo vision based distance estimation and logo recognition for the visually impaired; Intersection navigation for people with visual impairment; Indoor localization using computer vision and visual-inertial odometry; HapticRein: Design and development of an interactive haptic rein for a guidance robot; EchoVis: Training echolocation using binaural recordings – initial benchmark results; TactiBelt: Integrating spatial cognition and mobility theories into the design of a novel orientation and mobility assistive device for the blind; Virtual navigation environment for blind and low vision people; Visual shoreline detection for blind and partially sighted people; 3D-printing of personalized assistive technology; Camassia: Monocular interactive mobile way sonification; A proposed method for producing embossed dots graphics with a 3D printer; Accessibility as prerequisite for the production of individualized aids through inclusive maker spaces; Hackability: A methodology to encourage the development of DIY assistive devices; Universal design tactile graphics production system BPLOT4 for blind teachers and blind staffs to produce tactile graphics and ink print graphics of high quality; A user study to evaluate tactile charts with blind and visually impaired people; Concept-building in blind readers with thematic tactile volumes; Augmented reality for people with visual impairments: Designing and creating audio-tactile content from existing objects; Recording of fingertip position on tactile picture by the visually impaired and analysis of tactile information; Designing an interactive tactile relief of the Meissen table fountain; One-handed braille in the air.
ISBN:
(print) 9783319942735
The proceedings contain 179 papers. The special focus in this conference is on Computers Helping People with Special Needs. The topics include: Stereo vision based distance estimation and logo recognition for the visually impaired; Intersection navigation for people with visual impairment; Indoor localization using computer vision and visual-inertial odometry; HapticRein: Design and development of an interactive haptic rein for a guidance robot; EchoVis: Training echolocation using binaural recordings – initial benchmark results; TactiBelt: Integrating spatial cognition and mobility theories into the design of a novel orientation and mobility assistive device for the blind; Virtual navigation environment for blind and low vision people; Visual shoreline detection for blind and partially sighted people; 3D-printing of personalized assistive technology; Camassia: Monocular interactive mobile way sonification; A proposed method for producing embossed dots graphics with a 3D printer; Accessibility as prerequisite for the production of individualized aids through inclusive maker spaces; Hackability: A methodology to encourage the development of DIY assistive devices; Universal design tactile graphics production system BPLOT4 for blind teachers and blind staffs to produce tactile graphics and ink print graphics of high quality; A user study to evaluate tactile charts with blind and visually impaired people; Concept-building in blind readers with thematic tactile volumes; Augmented reality for people with visual impairments: Designing and creating audio-tactile content from existing objects; Recording of fingertip position on tactile picture by the visually impaired and analysis of tactile information; Designing an interactive tactile relief of the Meissen table fountain; One-handed braille in the air.
The work focuses on human reidentification, i.e. identifying an unknown person using a photo from a surveillance camera. A base method that involves modifying the VGG16 neural network algorithm is proposed. Experiment...
ISBN:
(print) 9781538610343
Modern data-driven computer vision algorithms require large volumes of varied data for validation and evaluation. We utilize computer graphics techniques to generate a large-volume foggy image dataset of road scenes with different levels of fog. We compare it with other popular synthesized datasets, including data collected from both the virtual world and the real world. In addition, we benchmark recent popular dehazing methods and evaluate their performance on different datasets, which provides an objective comparison of their limitations and strengths. To our knowledge, this is the first large-volume foggy and hazy dataset that can be helpful for computer vision research in autonomous driving.
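The abstract does not say how the fog is rendered. One widely used way to synthesize fog at different levels is the atmospheric scattering model, in which a clear pixel is blended with a constant airlight according to scene depth. The sketch below assumes that model; `add_fog`, `beta` (the scattering coefficient that sets fog density), and `airlight` are illustrative names, not the authors' pipeline.

```python
import math

def add_fog(pixel, depth, beta, airlight=1.0):
    """Apply the atmospheric scattering model to one pixel intensity.

    pixel:    clear-scene intensity in [0, 1]
    depth:    distance from the camera to the scene point
    beta:     scattering coefficient; larger values mean denser fog
    airlight: intensity of the fog itself (assumed constant)
    """
    transmission = math.exp(-beta * depth)
    return pixel * transmission + airlight * (1.0 - transmission)

def add_fog_image(image, depth_map, beta, airlight=1.0):
    """Apply add_fog to every pixel of a single-channel image (nested lists)."""
    return [[add_fog(p, d, beta, airlight) for p, d in zip(row_p, row_d)]
            for row_p, row_d in zip(image, depth_map)]
```

Under this model, sweeping `beta` over a few values for the same scene and depth map yields the same image at several fog levels, and distant pixels converge to the airlight while nearby pixels stay close to their clear values.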
ISBN:
(print) 9781538610343
In this paper we show how a differentiable, physics-based renderer suitable for photometric vision tasks can be implemented as layers in a deep neural network. The layers include geometric operations for representation transformations, reflectance evaluations with arbitrary numbers of light sources, and statistical bidirectional reflectance distribution function (BRDF) models. We make an implementation of these layers available as a neural network library (PVNN) for Theano. The layers can be incorporated into any neural network architecture, allowing parts of the photometric image formation process to be explicitly modelled in a network that is trained end to end via backpropagation. As an exemplar application, we show how to train a network with an encoder-decoder architecture that learns to estimate BRDF parameters from a single image in an unsupervised manner.
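To give a sense of the kind of computation a reflectance layer performs, the sketch below evaluates Lambertian (purely diffuse) shading for one surface point under an arbitrary number of directional lights. It does not use the PVNN API and stands in the simplest possible reflectance model for the statistical BRDF models mentioned above. It shows only the forward pass in plain Python; in a real layer the same arithmetic would be written in a framework that supplies gradients. All names are hypothetical.

```python
def lambertian(normal, albedo, lights):
    """Diffuse intensity at one surface point.

    normal: unit surface normal as an (x, y, z) tuple
    albedo: diffuse reflectance in [0, 1]
    lights: list of (direction, intensity) pairs, where direction is a
            unit vector pointing from the surface towards the light
    """
    total = 0.0
    for direction, intensity in lights:
        cos_theta = sum(n * l for n, l in zip(normal, direction))
        # Lights behind the surface contribute nothing.
        total += intensity * max(0.0, cos_theta)
    return albedo * total
```

Every operation here (dot product, clamp, sum, multiply) is differentiable almost everywhere, which is what would allow an expression like this to sit inside a network trained by backpropagation.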