ISBN: (print) 9789811030024; 9789811030017
Fire flame detection using color information is an important problem for public security and has many applications in computer vision and other domains. Color-model-based methods for fire flame detection have several advantages over conventional methods: they are simple, feasible, and easy to understand. To improve the performance of video-based fire flame detection, we propose an effective color-model-based method and build a corresponding fire flame detection system. First, candidate fire flame regions are detected using chromatic and dynamic measurements. Second, the fire flame regions are determined based on the area of the candidate regions. Finally, the system sounds an audible alarm when the number of successive fire frames exceeds a threshold. Experimental results show the effectiveness of our system on various fire-detection tasks in real-world environments.
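The three stages named in this abstract (candidate detection, area check, successive-frame alarm) can be sketched as follows. This is a minimal illustration, not the authors' system: the chromatic rule (R > G > B with R above a threshold), the red-channel frame difference used as the dynamic measurement, and every threshold value are assumptions chosen for the example.

```python
def is_fire_colored(pixel, r_min=180):
    """Assumed chromatic rule: red is bright and dominant (R >= r_min, R > G > B)."""
    r, g, b = pixel
    return r >= r_min and r > g > b


def candidate_area(prev_frame, frame, motion_min=20):
    """Count pixels that are fire-colored AND changed since the previous frame.

    Frames are lists of rows; each pixel is an (R, G, B) tuple.
    The motion test (red-channel difference) is an assumption.
    """
    area = 0
    for prev_row, row in zip(prev_frame, frame):
        for prev_px, px in zip(prev_row, row):
            moved = abs(px[0] - prev_px[0]) >= motion_min
            if moved and is_fire_colored(px):
                area += 1
    return area


def should_alarm(frames, area_min=2, successive_min=3):
    """Return True once `successive_min` consecutive frames each contain
    a candidate region of at least `area_min` pixels."""
    streak = 0
    for prev_frame, frame in zip(frames, frames[1:]):
        if candidate_area(prev_frame, frame) >= area_min:
            streak += 1
            if streak >= successive_min:
                return True
        else:
            streak = 0  # a non-fire frame resets the count
    return False
```

Resetting the streak on any non-fire frame is one way to read "successive fire frames"; it trades a slower alarm for fewer false positives from brief flashes.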
ISBN: (print) 9781538604571
Recently, skeleton-based action recognition has gained popularity thanks to cost-effective depth sensors coupled with real-time skeleton estimation algorithms. Traditional approaches based on handcrafted features are limited in their ability to represent the complexity of motion patterns. Recent methods that use Recurrent Neural Networks (RNNs) to handle raw skeletons focus only on the contextual dependency in the temporal domain and neglect the spatial configurations of articulated skeletons. In this paper, we propose a novel two-stream RNN architecture to model both temporal dynamics and spatial configurations for skeleton-based action recognition. We explore two different structures for the temporal stream: stacked RNN and hierarchical RNN. The hierarchical RNN is designed according to human body kinematics. We also propose two effective methods to model the spatial structure by converting the spatial graph into a sequence of joints. To improve the generalization of our model, we further exploit 3D-transformation-based data augmentation techniques, including rotation and scaling, to transform the 3D coordinates of skeletons during training. Experiments on 3D action recognition benchmark datasets show that our method brings a considerable improvement for a variety of actions, namely generic actions, interaction activities, and gestures.
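The rotation and scaling augmentation mentioned in this abstract might look like the sketch below. The choice of the vertical y-axis, the angle range, and the scale range are assumptions for illustration; the abstract does not state which axes or ranges the authors use.

```python
import math
import random


def rotate_y(joint, angle):
    """Rotate one (x, y, z) joint about the vertical y-axis by `angle` radians."""
    x, y, z = joint
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def augment_skeleton(joints, max_angle=math.pi / 6, scale_range=(0.9, 1.1), rng=random):
    """Apply one random rotation and one random uniform scale to every joint.

    `max_angle` and `scale_range` are hypothetical defaults, not values
    taken from the paper. `rng` can be a seeded random.Random instance.
    """
    angle = rng.uniform(-max_angle, max_angle)
    scale = rng.uniform(*scale_range)
    return [tuple(scale * v for v in rotate_y(j, angle)) for j in joints]
```

Drawing a single angle and scale per skeleton (rather than per joint) keeps the bone-length ratios intact, so the augmented sample is still a plausible body.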
ISBN: (print) 9783642137716
Much recent research supports the idea that the human perception of attractiveness is data-driven and largely independent of the perceiver. This suggests using pattern analysis techniques for beauty analysis. Several scientific papers on this subject are appearing in image processing, computer vision, and pattern analysis contexts, or use techniques from these areas. In this paper, we survey recent studies on the automatic analysis of facial beauty and discuss research lines and practical applications.
ISBN: (print) 9789897582769
In this paper we consider a mobile platform with two floor-facing cameras mounted at the same distance from the ground, assuming planar motion and constant internal parameters. Earlier work on this specific problem geometry has been carried out for monocular systems. The main contribution of this paper is the generalization to a binocular system and the recovery of the relative translation and orientation between the cameras. The method builds on previous work on monocular systems and uses sequences of inter-image homographies. Experiments are conducted using synthetic data, and the results demonstrate a robust method for determining the relative parameters.
ISBN: (print) 3540312447
Moving object detection is a challenging task for night security because of poor video quality. In this paper, we propose a robust real-time object detection method for night visual surveillance based on the human visual system. By measuring the variation of contrast information over multiple successive frames, a spatio-temporal contrast change image (CCI) is formed. Then a multi-frame correspondence technique is employed to robustly extract salient motions or moving objects from the CCI. Since the CCI is a statistical measurement of variation based on the human visual system, the proposed method is effective at night and performs better than traditional detection methods. Experiments on real scenes show that the contrast-based method is effective for night object detection and tracking. Our approach is also robust to camera scale variation and has a low computation cost.
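One way to picture a contrast change image is sketched below. The abstract does not define the contrast measure or the statistic, so both are assumptions here: local contrast is taken as (max - min) / (max + min + 1) over a 3x3 neighborhood, and the CCI value is the spread of that contrast across the frames.

```python
def local_contrast(frame, y, x):
    """Assumed contrast measure over the 3x3 neighborhood of (y, x),
    clipped at the image border. `frame` is a grid of gray levels."""
    h, w = len(frame), len(frame[0])
    window = [frame[j][i]
              for j in range(max(0, y - 1), min(h, y + 2))
              for i in range(max(0, x - 1), min(w, x + 2))]
    return (max(window) - min(window)) / (max(window) + min(window) + 1)


def contrast_change_image(frames):
    """Per-pixel spread (max - min) of local contrast across successive frames.

    This statistic is a stand-in; the paper's actual CCI may differ.
    """
    h, w = len(frames[0]), len(frames[0][0])
    cci = []
    for y in range(h):
        row = []
        for x in range(w):
            values = [local_contrast(f, y, x) for f in frames]
            row.append(max(values) - min(values))
        cci.append(row)
    return cci
```

A static scene gives a CCI of all zeros regardless of how dark it is, which is the property that makes a contrast-based measure attractive for night footage.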
ISBN: (print) 9781538604571
Surveillance cameras are widely used in many different scenes. Accordingly, there is a pressing need to recognize a person across different cameras, a task called person re-identification. This topic has attracted increasing interest in computer vision recently. However, video-based approaches have received less attention than image-based ones. Previous approaches usually involve two steps, namely feature learning and metric learning, but most existing approaches focus on only one of the two. Moreover, many of them do not make full use of the temporal and spatial information. In this paper, we concentrate on video-based person re-identification and build an end-to-end deep neural network architecture to jointly learn features and metrics. The proposed method can automatically pick out the most discriminative frames in a given video using a temporal attention model. Moreover, it integrates the surrounding information at each location using a spatial recurrent model when measuring the similarity with another pedestrian video. That is, our method handles spatial and temporal information simultaneously in a unified manner. Carefully designed experiments on three public datasets show the effectiveness of each component of the proposed deep network, which performs better than the state-of-the-art methods.
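The idea of picking out discriminative frames with temporal attention can be illustrated with softmax-weighted pooling of per-frame features. This is a generic sketch, not the paper's architecture: here the per-frame relevance scores are simply given as input, whereas in the paper they would be produced by the learned attention model.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frame_features, frame_scores):
    """Weighted average of per-frame feature vectors.

    `frame_scores` are hypothetical relevance scores, one per frame;
    higher-scoring frames contribute more to the pooled video feature.
    """
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    return [sum(w * f[d] for w, f in zip(weights, frame_features))
            for d in range(dim)]
```

With equal scores this reduces to plain average pooling, so the attention weights can be seen as a learned departure from treating all frames alike.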
ISBN: (print) 9781479951178
Several decades of research in computer and primate vision have resulted in many models (some specialized for one problem, others more general) and invaluable experimental data. Here, to help focus research efforts on the hardest unsolved problems and to bridge computer and human vision, we define a battery of 5 tests that measure the gap between human and machine performance along several dimensions (generalization across scene categories, generalization from images to edge maps and line drawings, invariance to rotation and scaling, local/global information with jumbled images, and object recognition performance). We measure model accuracy and the correlation between model and human error patterns. Experimenting over 7 datasets for which human data is available, and gauging 14 well-established models, we find that none fully resembles humans in all aspects, and we learn from each test which models and features are more promising in approaching humans in the tested dimension. Across all tests, we find that models based on local edge histograms consistently resemble humans more, while several scene statistics or "gist" models perform well with both scenes and objects. While computer vision has long been inspired by human vision, we believe systematic efforts such as this one will help better identify the shortcomings of models and find new paths forward.
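The two quantities this abstract measures, model accuracy and the correlation between model and human error patterns, can be computed as in the sketch below. The use of Pearson correlation over per-category error rates is an assumption; the abstract does not say which correlation statistic or error representation the authors use.

```python
import math


def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences.

    Undefined (division by zero) if either sequence is constant.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical per-category error rates, invented for illustration only.
human_errors = [0.05, 0.10, 0.30, 0.20]
model_errors = [0.15, 0.25, 0.60, 0.35]
similarity = pearson(human_errors, model_errors)
```

A model can score high accuracy yet correlate poorly with human errors, which is why the two numbers are reported separately.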
ISBN: (print) 9781479943098
Brain-inspired computer vision (BICV) has evolved rapidly in recent years and is now competitive with traditional CV approaches. However, most BICV algorithms have been developed on high-power, high-performance platforms (e.g., workstations) or special-purpose hardware. We propose two different algorithms for counting people in a classroom, both based on Convolutional Neural Networks (CNNs), a state-of-the-art deep learning model inspired by the structure of the human visual cortex. Furthermore, we provide a standalone parallel C library that implements CNNs and use it to deploy our algorithms on the embedded mobile ARM big.LITTLE-based Odroid-XU platform. Our performance and power measurements show that neuromorphic vision is feasible on off-the-shelf embedded mobile platforms and that it can reach very good energy efficiency for non-time-critical tasks such as people counting.
ISBN: (print) 9781538604571
Convolutional Neural Networks (ConvNets) have become the state of the art for many classification and regression problems in computer vision. For regression, approaches such as measuring the Euclidean distance between targets and predictions are often employed as the output layer. In this paper, we propose coupling a Gaussian mixture of linear inverse regressions with a ConvNet, and we describe the methodological foundations and the associated algorithm to jointly train the deep network and the regression function. We test our model on the head-pose estimation problem. For this particular problem, we show that inverse regression outperforms the regression models currently used by state-of-the-art computer vision methods. Our method does not require additional data, as is often proposed in the literature, and is thus able to work well on relatively small training datasets. Finally, it outperforms state-of-the-art methods in head-pose estimation on a widely used head-pose dataset. To the best of our knowledge, we are the first to incorporate inverse regression into deep learning for computer vision applications.
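For reference, the Euclidean-distance output that this abstract names as the common baseline can be written in a few lines. This sketch shows only that baseline, not the proposed Gaussian mixture of inverse regressions; the three-angle (yaw, pitch, roll) pose representation is an assumption for the example.

```python
import math


def euclidean_loss(predicted, target):
    """Euclidean distance between a predicted and a target pose vector."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)))


# Hypothetical head poses as (yaw, pitch, roll) in degrees.
loss = euclidean_loss((13.0, 4.0, 0.0), (10.0, 0.0, 0.0))
```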
ISBN: (print) 7121002159
Generalizing object recognition to be invariant to geometric transformations is a traditional challenge in vision. Inspired by the information processing mechanism of biological vision, we develop an invariant recognition model by exploiting temporal correlation. Maximizing spatial independence leads to the emergence of simple cell properties. Subsequently, minimizing the variation of simple cell outputs over time leads to the emergence of invariant features typical of complex cells. Experiments on character images confirm that recognition is invariant to rotation and translation. Our model's plausibility from a neurobiological point of view is also discussed.
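The idea that minimizing variation over time yields invariant, complex-cell-like features can be illustrated with a toy example. The sketch below is not the authors' model: it assumes a quadrature pair of simple cells (cosine and sine responses to a drifting phase) and the textbook energy pooling of their squared outputs, then compares how much each signal varies over time.

```python
import math


def temporal_variation(outputs):
    """Mean squared difference between consecutive output vectors."""
    diffs = [sum((b - a) ** 2 for a, b in zip(prev, cur))
             for prev, cur in zip(outputs, outputs[1:])]
    return sum(diffs) / len(diffs)


# A stimulus whose phase drifts over time (hypothetical values).
phases = [0.0, 0.5, 1.0, 1.5]

# Assumed simple-cell pair: responses change as the phase shifts.
simple = [[math.cos(p), math.sin(p)] for p in phases]

# Assumed complex-cell pooling: the energy cos^2 + sin^2 ignores phase.
pooled = [[s[0] ** 2 + s[1] ** 2] for s in simple]

simple_variation = temporal_variation(simple)
pooled_variation = temporal_variation(pooled)
```

The pooled response has essentially zero temporal variation while the simple-cell responses do not, which is the sense in which a slowness objective favors invariant features.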