The current state-of-the-art for egomotion estimation with omnidirectional cameras is to map the optical flow to the sphere and then apply egomotion algorithms for spherical projection. In this paper, we propose to back-project image points to a virtual curved retina that is intrinsic to the geometry of the central panoramic camera, and compute the optical flow on this retina: the so-called back-projection flow. We show that well-known egomotion algorithms can be easily adapted to work with the back-projection flow. We present extensive simulation results showing that in the presence of noise, egomotion algorithms perform better by using back-projection flow when the camera translation is in the X-Y plane. Thus, the proposed method is preferable in applications where there is no Z-axis translation, such as ground robot navigation.
Motion estimation in image sequences is a significant problem in image processing. Much research has been carried out on this subject for image sequences from traditional cameras, and these techniques have also been applied to omnidirectional image sequences. However, most of these methods are not adapted to this kind of sequence: they assume that the flow is locally constant, but the omnidirectional sensor introduces distortions that contradict this assumption. In this paper, we propose a fast method to compute the optical flow in omnidirectional image sequences. The method is based on a decomposition of the Brightness Change Constraint Equation on a wavelet basis. To account for the distortions created by the sensor, we replace the locally constant flow assumption used for traditional images with a more appropriate hypothesis.
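For reference, the brightness change constraint equation in its standard differential form is sketched below. The notation (I for image intensity, (u, v) for the flow components) is assumed here rather than taken from the paper, and neither the paper's wavelet decomposition nor its replacement hypothesis is reproduced.

```latex
\frac{\partial I}{\partial x}\,u + \frac{\partial I}{\partial y}\,v + \frac{\partial I}{\partial t} = 0
```

This gives one equation in two unknowns per pixel, which is why classical methods add an extra assumption such as locally constant flow over a neighborhood.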
The calibration of a line-based panoramic camera can be split into two independent subtasks: first, calibrate the effective focal length and the principal row; second, calibrate the off-axis distance and the principal angle. The paper provides solutions for three different methods and compares them in experiments with a super-high-resolution line-based panoramic camera. It turns out that the second subtask is solved best with a straight-segment-based approach, compared to point-based or correspondence-based calibration methods. All three are already known for traditional (planar) pinhole cameras but have not previously been discussed for panoramic cameras.
Electroencephalogram (EEG) signals recorded from a person's scalp have been used to control binary cursor movements. Multiple-choice paradigms will require more sophisticated protocols involving multiple mental tasks and signal representations that capture discriminatory characteristics of the EEG signals. In this study, six-channel EEG is recorded from a subject performing two mental tasks. The signals are transformed via the Karhunen-Loève or maximum noise fraction transformations and classified by quadratic discriminant analysis. In addition, classification accuracy is tested for all subsets of the six EEG channels. The best results are approximately 90% correct when training and testing data are recorded on the same day and 75% correct when they are recorded on different days.
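The exhaustive test over channel subsets can be sketched as follows. This is only an illustration of the enumeration step: the channel labels and the `score` callback are hypothetical placeholders, not taken from the study, and the empty subset is assumed to be excluded. The transform and classifier themselves are not implemented here.

```python
from itertools import combinations


def channel_subsets(channels):
    """Yield every non-empty subset of the given channels as a tuple."""
    for size in range(1, len(channels) + 1):
        for subset in combinations(channels, size):
            yield subset


def best_subset(channels, score):
    """Return the subset with the highest score.

    `score` is a hypothetical callback that maps a tuple of channels to a
    classification accuracy (for example, from a train/test run).
    """
    return max(channel_subsets(channels), key=score)


# Hypothetical labels for the six channels; the study's names may differ.
CHANNELS = ("ch1", "ch2", "ch3", "ch4", "ch5", "ch6")
```

With six channels this enumerates 2^6 - 1 = 63 subsets, so an exhaustive search stays cheap even when each evaluation involves training a classifier.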
Development of OCRs for Indian scripts is an active area of research today. Indian scripts present great challenges to an OCR designer due to the large number of letters in the alphabet, the sophisticated ways in which they combine, and the complicated graphemes that result. The problem is compounded by the unstructured manner in which popular fonts are designed. At the same time, the different Indian scripts share a great deal of common structure. In this paper, we argue that a number of automatic and semi-automatic tools can ease the development of recognizers for new font styles and new scripts. We briefly discuss three such tools that we developed and show how they have helped build new OCRs. An integrated approach to the design of OCRs for all Indian scripts has great benefits. Following this approach, we are building OCRs for many Indian languages as part of a system that provides tools for creating content in them.
Preserving cultural heritage and historic sites is an important problem. These sites are subject to erosion and vandalism, and as long-lived artifacts they have gone through many phases of construction, damage and repair. It is important to keep an accurate record of these sites as they currently are, using 3-D model building technology, so that preservationists can track changes and foresee structural problems, and so that a wider audience can "virtually" see and tour these sites. Due to the complexity of these sites, building 3-D models is time-consuming and difficult, and usually involves much manual effort. This paper discusses new automatic methods that can reduce the time needed to build a model. Examples of these methods are shown in reconstructing a model of the Cathedral of Saint-Pierre in Beauvais, France.
In this paper, we devise a Propagation Net (P-Net) as a new mechanism for the representation and recognition of multi-stream activity. Most daily activities can be represented by temporally partially ordered intervals, where each interval has not only temporal constraints (before, after, duration) but also logical relationships, such as "a and b must both happen". P-Net associates with each interval a node that is triggered probabilistically, depending on the state of its parent nodes. Each node is also associated with an observation distribution function that incorporates perceptual evidence. This evidence, generated by lower-level vision modules, is a positive indicator of the elemental action. Using this architecture, we devise an iterative temporal sequencing algorithm that interprets a multi-dimensional observation sequence of visual evidence as a multi-stream propagation through the P-Net. Simple experiments on vision and motion-capture data demonstrate the capabilities of our algorithm.
This paper presents a system designed to cooperatively track moving objects and share information about them using a multi-robot team. Every robot in the team is fitted with a different omnidirectional vision system running at a different frame rate. The information gathered by each robot is broadcast to all the other robots, and every robot fuses its own measurements with the information received from its teammates, building its own "vision of the world". The cooperation of the vision sensors enhances the capabilities of any single vision sensor. This work was implemented in the RoboCup domain, using our team of heterogeneous robots, but the approach is very general and can be used in any application where a team of robots has to track multiple objects. The system is designed to work with vision systems that are heterogeneous both in camera design and in computational resources. Experiments in real game scenarios are presented.
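The abstract does not state how each robot fuses its own measurement with those of its teammates. As a minimal sketch of the idea only, the example below assumes that each robot reports a 2-D position estimate with a scalar variance and that the estimates are combined by inverse-variance weighting. The function name and data layout are hypothetical and are not the authors' method.

```python
def fuse_estimates(estimates):
    """Fuse 2-D position estimates by inverse-variance weighting.

    `estimates` is a list of ((x, y), variance) tuples, one per robot.
    Returns the fused (x, y) position and the fused variance.
    This weighting rule is an assumption made for illustration.
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    # More certain measurements (smaller variance) get larger weights.
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    x = sum(w * pos[0] for w, (pos, _) in zip(weights, estimates)) / total
    y = sum(w * pos[1] for w, (pos, _) in zip(weights, estimates)) / total
    return (x, y), 1.0 / total


# Usage: a robot combines its own estimate with one received from a teammate.
own = ((0.0, 0.0), 1.0)
teammate = ((3.0, 0.0), 0.5)
fused_position, fused_variance = fuse_estimates([own, teammate])
```

Under this rule the fused variance is never larger than the smallest input variance, which is one simple way to express how cooperating sensors can outperform a single one.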
In this paper, we propose a novel framework for semantic medical event characterization and detection using principal video shots and semantic principal video shot classification. Specifically, the framework includes: (a) a semantic medical event characterization technique that uses principal video shots in a specific surgery education video domain; (b) an automatic principal video shot detection algorithm that determines the domain-dependent and event-driven salient objects; (c) a semantic medical event detection technique that uses a Bayesian classifier, where the classifier parameters and structure are determined automatically by an adaptive Expectation-Maximization (EM) algorithm. For semantic medical event detection in a specific surgery education video domain, our technique achieves an overall accuracy of approximately 87.3% for four pre-defined semantic medical events.
This paper presents real-time, or near real-time, probabilistic event detection methods for broadcast sports video using cinematic features, such as shot classes and slow-motion replays. Novel algorithms have been developed for probabilistic detection of soccer goal events and basketball play-break events in a generic framework. The proposed framework includes generic algorithms for automatic dominant (field) color region detection and shot boundary detection, and domain-specific shot classification algorithms for soccer and basketball. Finally, the detected goal events in soccer and play events in basketball are employed to generate summaries of long games. The efficiency and effectiveness of the proposed system and the algorithms have been shown over more than 13 hours of sports video captured by the broadcasters from different regions around the world.