ISBN: 9781424469840 (print)
We present a method that unifies tracking and video content recognition with applications to Mobile Augmented Reality (MAR). We introduce the Radial Gradient Transform (RGT) and an approximate RGT, yielding the Rotation-Invariant, Fast Feature (RIFF) descriptor. We demonstrate that RIFF is fast enough for real-time tracking, while robust enough for large-scale retrieval tasks. At 26x the speed, our tracking scheme obtains a more accurate global affine motion model than the Kanade-Lucas-Tomasi (KLT) tracker. The same descriptors can achieve 94% retrieval accuracy from a database of 10^4 images.
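The idea behind a radial gradient transform can be illustrated with a minimal sketch. It assumes the transform expresses each gradient vector in a local frame made of the radial direction (pointing from the patch center to the pixel) and the tangential direction perpendicular to it; the function names below are hypothetical and are not taken from the paper. Under that assumption, rotating a patch about its center leaves the transformed gradient unchanged, which is the property a rotation-invariant descriptor needs.

```python
import math

def radial_gradient_transform(point, center, gradient):
    """Express a gradient in the radial/tangential frame at `point` (illustrative sketch)."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("the radial direction is undefined at the center")
    r = (dx / norm, dy / norm)   # unit radial direction
    t = (-r[1], r[0])            # unit tangential direction (r rotated by 90 degrees)
    gx, gy = gradient
    # Project the gradient onto the two local basis vectors.
    return (gx * r[0] + gy * r[1], gx * t[0] + gy * t[1])

def rotate(v, theta):
    """Rotate a 2D vector counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

# Rotating the pixel position and its gradient about the center by the same
# angle should give the same radial/tangential components.
before = radial_gradient_transform((3.0, 4.0), (0.0, 0.0), (1.0, 2.0))
after = radial_gradient_transform(rotate((3.0, 4.0), 0.7), (0.0, 0.0), rotate((1.0, 2.0), 0.7))
```

Here `before` and `after` agree up to floating-point error. The approximate RGT mentioned in the abstract is not reproduced in this sketch.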
ISBN: 9781424469840 (print)
Identifying handled objects, i.e. objects being manipulated by a user, is essential for recognizing the person's activities. An egocentric camera worn on the body enjoys many advantages, such as having a natural first-person view and not needing to instrument the environment. It is also a challenging setting, where background clutter is known to be a major source of problems and is difficult to handle with the camera constantly and arbitrarily moving. In this work we develop a bottom-up motion-based approach to robustly segment out foreground objects in egocentric video and show that it greatly improves object recognition accuracy. Our key insight is that egocentric video of object manipulation is a special domain and many domain-specific cues can readily help. We compute dense optical flow and fit it into multiple affine layers. We then use a max-margin classifier to combine motion with empirical knowledge of object location and background movement as well as temporal cues of support region and color appearance. We evaluate our segmentation algorithm on the large Intel Egocentric Object Recognition dataset with 42 objects and 100K frames. We show that, when combined with temporal integration, figure-ground segmentation improves the accuracy of a SIFT-based recognition system from 33% to 60%, and that of a latent-HOG system from 64% to 86%.
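As a rough illustration of fitting an affine motion model to optical flow, the sketch below solves the least-squares problem for a single layer. It assumes the flow (u, v) at pixel (x, y) is modeled as u = a*x + b*y + c and v = d*x + e*y + f; the function names are hypothetical, and the step of assigning pixels to multiple layers, which the abstract does not describe, is not shown.

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented copy, inputs stay untouched
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration (e.g. collinear points)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(3):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]

def fit_affine_flow(points, flows):
    """Least-squares fit of one affine motion layer to flow vectors (illustrative sketch)."""
    # Both flow components share the same 3x3 normal-equation matrix.
    AtA = [[0.0] * 3 for _ in range(3)]
    Atu = [0.0] * 3
    Atv = [0.0] * 3
    for (x, y), (u, v) in zip(points, flows):
        row = (x, y, 1.0)
        for i in range(3):
            Atu[i] += row[i] * u
            Atv[i] += row[i] * v
            for j in range(3):
                AtA[i][j] += row[i] * row[j]
    return solve3(AtA, Atu), solve3(AtA, Atv)

# Synthetic check: generate flow from known parameters and recover them.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
flows = [(0.1 * x - 0.2 * y + 1.5, 0.3 * x + 0.05 * y - 0.5) for x, y in points]
params_u, params_v = fit_affine_flow(points, flows)
```

On noise-free synthetic flow the recovered parameters match the generating ones up to floating-point error. With real dense flow, one such fit per candidate layer would be one plausible building block, though the paper may proceed differently.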
Staff removal is an important preprocessing step in Optical Music Recognition (OMR). The process aims to remove the staff lines from a musical document and retain only the musical symbols; later these symbols are u...
The checkerboard pattern is widely used in computer vision techniques for camera calibration and simple geometry acquisition, both in practical use and research. However, most of the current techniques fail to recogni...
We present a new form of least squares (LS), called "hyperLS", for geometric problems that frequently appear in computer vision applications. Through rigorous error analysis, we maximize the accuracy by introd...
Visual object categorisation (VOC) has become one of the most actively investigated topics in computer vision. In mainstream studies, the topic is treated as a supervised problem, but recently, the ultimate chal...
This paper presents an efficient approach for segmenting Persian off-line handwritten text lines into characters. The proposed algorithm first traces the baseline of the input text-line image and straightens it. Su...
We consider the problem of classifying documents containing multiple unordered pages. For this purpose, we propose a novel bag-of-pages document representation. To represent a document, one assigns every page to a pro...
ISBN: 9781424486809 (print)
The proceedings contain 229 papers. The topics discussed include: performance of suboptimal beamforming with full knowledge of part of the channel matrix; fuzzy model of control for quantum-controlled mobile robots; multilevel inverter modulation method to reduce common-mode voltage and overvoltage at the motor terminals; thirteen level cascaded NPC inverter; anomaly detection in sonar images based on wavelet domain noncausal AR-ARCH random field modeling; waveguide phenomena in wideband indoor radio channel; generic phase shifted PWM algorithm for thirteen level cascaded H-bridge NPC inverter; an FPGA-based pattern classifier using data compression; increasing the image recognition accuracy in machine vision systems with added noise due to technological issues; analysis and control synthesis of continuous-time passive switched linear systems; rigid and competitive fault tolerance for logical information structures in networks; and an online single trial analysis of the P300 event related potential for the disabled.
Maximally Stable Extremal Regions (MSERs) are one of the most prominent interest region detectors in computer vision due to their powerful properties and low computational demands. In general, MSERs are detected in sin...