This paper proposes a novel deep-learning-based approach for monocular visual odometry (VO) called FlowVO-Net. Our approach utilizes a CNN to extract motion information between two consecutive frames and employs Bi-dire...
ISBN (print): 9783030314569; 9783030314552
Real-time face occlusion recognition is an important computer vision problem, especially in the field of public safety. To construct a real-time face occlusion recognition system, this paper first establishes a large occluded-face database. It then proposes a face occlusion recognition algorithm based on the fusion of histogram of oriented gradients (HOG) and local binary pattern (LBP) features. Experimental results show that the recall rates for occluded and unobstructed faces are 92.03% and 93.58% respectively, at a speed of about 12.26 ms. Finally, taking the time factor into account, this paper establishes a lightweight deep neural network based on AlexNet, with recall rates for occluded and unobstructed faces of 91.79% and 91.42% respectively and a speed of approximately 22.92 ms. The experimental results show that the face occlusion recognition method based on HOG+LBP features not only improves the recognition rate for occluded faces but also reduces the time complexity, illustrating the effectiveness of the algorithm.
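As a rough illustration of the LBP half of the HOG+LBP fusion mentioned above, the following sketch computes the basic 3x3 LBP code and a 256-bin histogram in plain Python. It is not the authors' implementation: the bit ordering, the `>=` threshold convention, and fusion by simple concatenation (`fuse_features`) are assumptions, and the HOG vector is taken as a given input rather than computed.

```python
def lbp_code(img, r, c):
    """Basic 3x3 LBP code of the pixel at (r, c); img is a list of rows."""
    center = img[r][c]
    # Neighbours visited clockwise from the top-left (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


def fuse_features(hog_vec, lbp_hist):
    """Hypothetical fusion step: concatenate the two feature vectors."""
    return list(hog_vec) + list(lbp_hist)
```

The fused vector would then be passed to a classifier; the abstract does not say which one, so none is shown here.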
ISBN (print): 9783030057923; 9783030057916
With the miniaturisation of underwater cameras, the volume of available underwater images has been increasing considerably. However, underwater images are degraded by the absorption and scattering of light in water. Image processing methods exist that aim to compensate for these degradations, but there are no standard quality evaluation measures or testing datasets for a systematic empirical comparison. For this reason, we propose PUIQE, an online platform for underwater image quality evaluation, which is inspired by other computer vision areas whose progress has been accelerated by evaluation platforms. PUIQE supports the comparison of methods through standard datasets and objective evaluation measures: quality scores for images uploaded to the platform are automatically computed and published in a leaderboard, which enables the ranking of methods. We hope that PUIQE will stimulate and facilitate the development of underwater image processing algorithms to improve underwater images.
ISBN (print): 9781728107882
Computer vision techniques applied to images opportunistically captured from body-worn cameras or mobile phones offer tremendous potential for vision-based context awareness. In this paper, we evaluate the potential to recognise the modes of locomotion and transportation of mobile users by analysing single images captured by body-worn cameras. We evaluate this with the publicly available Sussex-Huawei Locomotion and Transportation Dataset, which includes 8 transportation and locomotion modes performed over 7 months by 3 users. We present a baseline performance obtained through crowdsourcing using Amazon Mechanical Turk. Humans inferred the correct modes of transportation from images with an F1-score of 52%. The performance obtained by five state-of-the-art Deep Neural Networks (VGG16, VGG19, ResNet50, MobileNet and DenseNet169) on the same task was always above 71.3% F1-score. We characterise the effect of partitioning the training data to fine-tune different numbers of blocks of the deep networks and provide recommendations for mobile implementations.
ISBN (digital): 9781728131290
ISBN (print): 9781728131306
Computer vision has attracted more and more attention with the fast development of deep learning. Instance segmentation, which extends object detection, can better help us comprehend the surrounding environment. In this paper, we combine tricks that can strengthen model performance for instance segmentation. We conduct ablation experiments on the MS-COCO and LVIS datasets. The results demonstrate that the selected tricks can greatly boost performance. With our tricks, our model ranks 7th in the LVIS Challenge track of the ICCV 2019 workshop.
Research on the scale of landscape patterns is an important basis for studying the spatiotemporal evolution of landscape patterns and for their scientific and reasonable allocation. This paper takes the midd...
Most existing object detectors adopt a small training batch size (e.g. 16), which severely hinders the whole community from exploring large-scale datasets due to the extremely long training procedure. ...
ISBN (print): 9781728148847
The field of computer vision is continuously growing and becoming more complex and power demanding. Using feature detection and description allows fast object detection without needing large databases. FPGAs are well suited to requirements such as real-time and power constraints, which are important in many application areas. This work proposes a new pattern recognition algorithm based on an improved Accelerated KAZE (AKAZE) detector and Fast Retina Keypoint (FREAK) descriptor. Our software implementation increased the repeatability in comparison to the original algorithm using optimized configurations. The percentage of correctly matched features between two images (repeatability) increased from 85.7% to 91.4%, while the computation time decreased from 70.3 ms to 24.9 ms. Furthermore, we present an efficient FPGA implementation of the FREAK descriptor. The accelerator processes 2048 features at 73.4 frames per second, achieving a repeatability of 90.9%, while being optimized for resource utilization and memory bandwidth consumption. Additionally, we show an efficient Integral Image implementation that processes four image pixels per clock cycle at a high frequency (204 MHz on xc7z020clg484-1) while consuming minimal resources.
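For readers unfamiliar with the integral image mentioned above, the following plain-Python sketch shows the general technique: a summed-area table that lets the sum over any rectangle be read with four lookups. It is a software illustration only and does not reflect the paper's FPGA design or its four-pixels-per-cycle pipeline; the function names and the padded (h+1) x (w+1) layout are assumptions chosen for clarity.

```python
def integral_image(img):
    """Summed-area table with one row and column of zero padding.

    ii[y][x] holds the sum of img[0:y][0:x], so ii has shape (h+1) x (w+1).
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom][left:right] (bottom and right exclusive)."""
    return (ii[bottom][right] - ii[top][right]
            - ii[bottom][left] + ii[top][left])
```

For example, with `img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]`, `box_sum(integral_image(img), 1, 1, 3, 3)` adds the lower-right 2x2 block (5 + 6 + 8 + 9).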