ISBN: (print) 9781479983407
Background modeling is a fundamental problem in computer vision and usually serves as the first step for high-level applications. Pixel-based approaches usually ignore spatial coherence, while region-based approaches are sensitive to region size and scene complexity. In this paper, we propose a robust background subtraction approach via multiple-feature-based shared models. Each shared model is represented by a sequence of samples based on sample consensus. Each pixel dynamically searches its neighborhood for a matching model. This sharing mechanism not only enhances robustness to background noise and jitter but also significantly reduces the number of models and the number of samples per model. In addition, we concatenate color and texture features into a multiple-feature descriptor, chosen for their discriminability and complementarity, so that each pixel can find a proper model more easily. Finally, each shared model is updated by randomly selecting a pixel that matches the model, with an adaptive update rate. Experiments on the ChangeDetection 2014 benchmark show that the proposed approach outperforms state-of-the-art methods.
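The matching and update steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the L1 distance, the `radius`, `min_matches` and `search_radius` values, the dictionary of models keyed by position, and the fixed (non-adaptive) `update_rate` are all assumptions made for illustration.

```python
import random

def matches_model(feature, samples, radius=20.0, min_matches=2):
    """Sample-consensus test: True if enough stored samples lie within `radius`.
    The L1 distance and both thresholds are illustrative assumptions."""
    close = 0
    for sample in samples:
        dist = sum(abs(a - b) for a, b in zip(feature, sample))
        if dist <= radius:
            close += 1
            if close >= min_matches:
                return True
    return False

def find_shared_model(feature, pos, models, search_radius=1):
    """Search the neighborhood of `pos` for a model the pixel matches.
    Returns the model's position, or None (pixel treated as foreground)."""
    y, x = pos
    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            samples = models.get((y + dy, x + dx))
            if samples is not None and matches_model(feature, samples):
                return (y + dy, x + dx)
    return None

def update_model(samples, feature, update_rate=0.1, rng=random):
    """With probability `update_rate`, overwrite one randomly chosen sample.
    A fixed rate stands in here for the adaptive rate mentioned in the abstract."""
    if rng.random() < update_rate:
        samples[rng.randrange(len(samples))] = feature

# Hypothetical features: three color values concatenated with one texture value.
models = {(0, 0): [(100, 100, 100, 5), (102, 99, 101, 5), (98, 101, 100, 6)]}
print(find_shared_model((101, 100, 100, 5), (0, 1), models))  # (0, 0)
print(find_shared_model((200, 30, 30, 9), (0, 1), models))    # None
```

Because the pixel at (0, 1) has no model of its own yet still matches the model stored at (0, 0), the sketch shows how sharing lets fewer models cover neighboring pixels.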
Recent work in monocular pedestrian detection aims to improve execution time while keeping accuracy as high as possible. A popular and successful approach for monocular intensity pedestrian detection is based on the approximation (instead of computation) of image features for multiple scales based on the features computed on a set of predefined *** port this idea to the infrared *** contributions reside in the combination of four channel features, namely infrared, histogram of gradient orientations, normalized gradient magnitude and local binary patterns, with the objective of detecting pedestrians for night vision applications dealing with far infrared *** scale feature computation is done by feature *** contribution is the study of different formulations of local binary patterns, such as uniform patterns and rotation-invariant patterns, and their effect on detection *** detection speed is also boosted by a fast morphology-based region of interest *** vary the number of approximated scales per octave and study the impact on execution time and accuracy. A reasonable result reaches a speed of 18 fps with a log-average miss rate of 39%.
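The local binary pattern variants named in this abstract can be sketched with the textbook 8-neighbor definitions. This is an illustration only, not the paper's code: the 3x3 neighborhood, the clockwise bit order, and the `>=` comparison against the center are assumed conventions.

```python
def lbp_code(patch):
    """8-bit LBP code of a 3x3 patch (list of 3 rows).
    Neighbors are read clockwise from the top-left (an assumed convention)."""
    center = patch[1][1]
    ring = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
            patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for bit, value in enumerate(ring):
        if value >= center:
            code |= 1 << bit
    return code

def transitions(code, bits=8):
    """Number of 0/1 changes when the code is read circularly."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))
    return bin(code ^ rotated).count("1")

def is_uniform(code, bits=8):
    """Uniform pattern: at most two circular 0/1 transitions."""
    return transitions(code, bits) <= 2

def rotation_invariant(code, bits=8):
    """Smallest value over all circular bit rotations of the code."""
    mask = (1 << bits) - 1
    return min(((code >> r) | (code << (bits - r))) & mask for r in range(bits))

patch = [[9, 9, 9],
         [1, 5, 1],
         [1, 1, 1]]
code = lbp_code(patch)
print(code, is_uniform(code), rotation_invariant(0b11110000))  # 7 True 15
```

Uniform patterns shrink the histogram by pooling all non-uniform codes into one bin, while the rotation-invariant mapping merges codes that differ only by a rotation of the neighborhood.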
ISBN: (digital) 9781538661000
ISBN: (print) 9781538661017
This paper reviews the 2nd NTIRE challenge on single image super-resolution (restoration of rich details in a low-resolution image), with a focus on proposed solutions and results. The challenge had 4 tracks. Track 1 employed the standard bicubic downscaling setup, while Tracks 2, 3 and 4 had realistic unknown downgrading operators simulating the camera image acquisition pipeline. The operators were learnable through provided pairs of low- and high-resolution training images. The tracks had 145, 114, 101, and 113 registered participants, respectively, and 31 teams competed in the final testing phase. The results gauge the state of the art in single image super-resolution.
ISBN: (print) 9781509060689
Foreground segmentation in video sequences is a classic topic in computer vision. Due to the lack of semantic and prior knowledge, existing methods have difficulty handling complex scenes. Therefore, in this paper, we propose an end-to-end two-stage deep convolutional neural network (CNN) framework for foreground segmentation in video sequences. In the first stage, a convolutional encoder-decoder sub-network is employed to reconstruct the background images and encode rich prior knowledge of background scenes. In the second stage, the reconstructed background and the current frame are fed into a multi-channel fully-convolutional sub-network (MCFCN) for accurate foreground segmentation. In the two-stage CNN, the reconstruction loss and segmentation loss are jointly optimized. The background images and foreground objects are output simultaneously in an end-to-end way. Moreover, by incorporating prior semantic knowledge of foreground and background in the pre-training process, our method can suppress background noise while preserving the integrity of foreground objects. Experiments on CDNet 2014 show that our method outperforms the state of the art by 4.9%.
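The abstract states that the reconstruction and segmentation losses are jointly optimized but does not give the objective. One common way to write such a joint objective, assuming a simple weighted sum with a hypothetical balancing weight $\lambda$ (neither the form nor the weight is taken from the paper), is:

```latex
\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{rec}} + \lambda \, \mathcal{L}_{\text{seg}}
```

Here $\mathcal{L}_{\text{rec}}$ would penalize errors in the reconstructed background and $\mathcal{L}_{\text{seg}}$ errors in the predicted foreground mask, so gradients from both stages update the network together.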
Moving object detection (separating foreground from background) is an important problem in computer vision. Most work on this problem is based on background subtraction. However, these approaches cannot handle scenarios with infrequent object motion, illumination changes, shadows, camouflage, etc. To overcome these limitations, a robust and compact two-stage method for moving object detection (MOD) is proposed. In the first stage, to generate the saliency map, the background image is estimated from several input frames using a temporal histogram technique. In the second stage, a multiscale encoder-decoder network is used to learn multiscale semantic features of the estimated saliency for foreground extraction. The encoder extracts multiscale features from the multiscale saliency map. The decoder is designed to learn the mapping from low-resolution multiscale features to a high-resolution output frame. To assess the efficacy of the proposed MsEDNet, experiments are conducted on two benchmark datasets for MOD: change detection (CDnet-2014) and Wallflower. Precision, recall and F-measure are used as performance measures for comparison with existing state-of-the-art methods. Experimental results show a significant improvement in detection accuracy and a reduction in execution time compared to state-of-the-art methods for MOD.
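The first stage and the evaluation measures can be sketched as follows. The abstract does not specify the temporal histogram technique, so this sketch assumes one plausible reading: the background value of each pixel is the center of the most populated intensity bin across frames. The `bin_width` parameter and the function names are hypothetical; the precision, recall and F-measure formulas are the standard ones.

```python
from collections import Counter

def estimate_background(frames, bin_width=8):
    """Per-pixel mode of the temporal intensity histogram.
    `frames` is a list of grayscale images given as lists of rows."""
    height, width = len(frames[0]), len(frames[0][0])
    background = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            hist = Counter(frame[y][x] // bin_width for frame in frames)
            top_bin = hist.most_common(1)[0][0]
            background[y][x] = top_bin * bin_width + bin_width // 2  # bin center
    return background

def precision_recall_f(pred, truth):
    """Precision, recall and F-measure for binary masks (1 = foreground)."""
    pairs = [(p, t) for pr, tr in zip(pred, truth) for p, t in zip(pr, tr)]
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

# Four 1x2 frames; an object briefly covers each pixel once.
frames = [[[10, 200]], [[12, 201]], [[250, 203]], [[11, 90]]]
print(estimate_background(frames))                           # [[12, 204]]
print(precision_recall_f([[1, 1, 0, 0]], [[1, 0, 1, 0]]))    # (0.5, 0.5, 0.5)
```

Taking the histogram mode rather than the mean keeps a briefly passing object from contaminating the background estimate, which is one reason histogram-style estimators suit scenes with intermittent motion.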