The development of vehicles that perceive their environment, in particular those using computer vision, requires large databases of sensor recordings obtained from real cars driven in realistic traffic situations. These datasets should be time-stamped to enable synchronization of sensor data from different sources. Furthermore, full surround environment perception requires high frame rates of synchronized omnidirectional video data to prevent information loss at any speed. This paper describes an experimental setup and software environment for recording such synchronized multi-sensor data streams and storing them in a new open source format. The dataset consists of sequences recorded in various environments from a car equipped with an omnidirectional multi-camera, height sensors, an IMU, a velocity sensor, and a GPS. The software environment for reading these datasets will be provided to the public, together with a collection of long multi-sensor and multi-camera data streams stored in the developed format.
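The abstract does not say how the time-stamped streams are aligned, so the following is only a minimal sketch of one common approach: pairing each reference sample with the nearest-in-time sample of another sensor. The function name `match_nearest`, the `max_gap` tolerance, and the millisecond timestamps are all illustrative assumptions, not part of the described software environment.

```python
from bisect import bisect_left


def match_nearest(ref_times, other_times, max_gap):
    """Pair each reference timestamp with the nearest timestamp in other_times.

    other_times must be sorted in ascending order. The result holds, for each
    reference timestamp, the index of its nearest neighbor in other_times, or
    None when no neighbor lies within max_gap.
    """
    matches = []
    for t in ref_times:
        i = bisect_left(other_times, t)
        # Only the samples just before and just after t can be the nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(other_times)]
        if not candidates:
            matches.append(None)
            continue
        best = min(candidates, key=lambda j: abs(other_times[j] - t))
        matches.append(best if abs(other_times[best] - t) <= max_gap else None)
    return matches


# Hypothetical timestamps in milliseconds (not taken from the dataset).
camera_ms = [0, 100, 200, 300]
imu_ms = [10, 60, 110, 160, 210, 500]
pairs = match_nearest(camera_ms, imu_ms, max_gap=20)
```

In this toy input the last camera frame has no IMU sample within 20 ms, so it is left unmatched rather than paired with a stale reading.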
ISBN: 9781467367608 (print)
This paper explores how a locality constraint can enhance both learning and coding schemes, specifically discriminative low-rank dictionary learning and auto-encoders. Previous Fisher-discrimination-based dictionary learning has led to interesting results by learning more discerning sub-dictionaries. In addition, a low-rank regularization term has been introduced to take advantage of the global structure of the data. However, such methods fail to consider the intrinsic manifold structure of the data. To this end, we first apply a locality constraint to dictionary learning to explore whether geometric structure information enhances identification capability. Moreover, inspired by recent advances in auto-encoders for learning compact feature spaces, we propose a locality-constrained collaborative auto-encoder (LCAE) for feature extraction. The improvement from applying locality to dictionary learning and auto-encoders is evaluated on several datasets. Experimental results demonstrate the effectiveness of locality information compared with state-of-the-art methods.
The problem of object recognition is addressed. In the literature this task has generally been considered from a "passive" perspective, where everything is static and there is no definite relation between the object and its environment. We propose an "active" approach to object recognition, based on the capability of the observer to move and give a better description of the object under consideration, and also to take advantage of the relations between the objects and the environment. This can be accomplished at the task level and at the sensor level. The face recognition problem, based on the face-space approach, is considered to demonstrate the advantage of adopting an active retina to sample the face, build a database and perform the recognition task. By using an active space-variant retina, the size of the database is considerably reduced, and consequently so is the processing time for recognition. A comparative experiment using the active and static approaches is presented.
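The abstract does not specify the geometry of the space-variant retina. As a rough illustration of why such sampling shrinks the stored data, the sketch below assumes a log-polar layout, a common model for space-variant sensors, in which ring radii grow geometrically away from the fixation point. The function `log_polar_samples` and all parameter values are hypothetical.

```python
import math


def log_polar_samples(cx, cy, r_min, r_max, n_rings, n_wedges):
    """Return (x, y) sample centers on rings whose radii grow geometrically.

    Sampling is dense near the fixation point (cx, cy) and sparse in the
    periphery. Requires n_rings >= 2 and 0 < r_min < r_max.
    """
    growth = (r_max / r_min) ** (1.0 / (n_rings - 1))
    points = []
    for i in range(n_rings):
        radius = r_min * growth ** i
        for j in range(n_wedges):
            angle = 2.0 * math.pi * j / n_wedges
            points.append((cx + radius * math.cos(angle),
                           cy + radius * math.sin(angle)))
    return points


# Hypothetical retina: 16 rings x 32 wedges covering radii 1 to 64 pixels.
points = log_polar_samples(0.0, 0.0, 1.0, 64.0, 16, 32)
uniform_count = 128 * 128  # pixels in a uniform grid covering the same disc
```

With these illustrative numbers the retina stores 512 samples per view, against 16384 pixels for a uniform 128 x 128 grid over the same field of view.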
We present a novel approach to localizing parts in images of human faces. The approach combines the output of local detectors with a non-parametric set of global models for the part locations based on over one thousan...
We present an approach for identifying the occluding contour and determining its sidedness using an active (i.e., moving) observer. It is based on the non-stationarity property of the visible rim: when the observer's viewpoint is changed, the visible rim is a collection of curves that "slide," rigidly or non-rigidly, over the surface. We show that the observer can deterministically choose three views on the tangent plane of selected surface points to distinguish such curves from stationary surface curves (i.e., surface markings). Our approach demonstrates that the occluding contour can be identified directly, i.e., without first computing surface shape (distance and curvature).
C-arm fluoroscopy is ubiquitous in contemporary surgery, but it lacks the ability to accurately reconstruct three-dimensional information because of the difficulty of obtaining the pose of X-ray images in 3D space. We propose a unified mathematical framework to address the issues of intra-operative pose estimation, correspondence and reconstruction, using simple elliptic curves. In contrast to other fiducial-based tracking methods, our method uses a single ellipse to constrain 5 out of 6 degrees of freedom of the C-arm pose, along with randomly distributed unknown points in the imaging volume (either naturally present or induced by randomly placed beads or other markers in the image space) from two images/views to completely recover the C-arm pose. Preliminary phantom experiments indicate an average C-arm tracking accuracy of 0.51° and 0.12° STD. The method appears to be sufficiently accurate and appealing for many clinical applications, since it uses a simple elliptic fiducial coupled with patient information and interferes minimally with the workspace.
In this paper we consider the problem of aligning multiple non-rigid surface mesh sequences into a single temporally consistent representation of the shape and motion. A global alignment graph structure is introduced ...
ISBN: 0818608625 (print)
Morphological operations are used for segmentation, feature generation, and location extraction. A recursive adaptive thresholding algorithm transforms a gray-level image into a set of multiple level regions of objects. A distance transformation algorithm is then used to transform a binary image into the minimum distance from each object point to the object's boundary. This algorithm uses a morphological erosion with a large structuring element, which may correspond to Euclidean, city-block, or chessboard distance measures. For rotation invariance and precise measurements, the Euclidean distance should be chosen. The large Euclidean structuring element can be decomposed into the maximum of recursive dilations with multiple small structuring components, which allows the easy implementation of this algorithm. A shape library database with hierarchical features is automatically generated. The features extracted are the shape number and the radii and coordinates of the skeletal local-maximum points. Object recognition is achieved by comparing the shape number and the hierarchical radii. Object location is detected by a hierarchical morphological bandpass filter.
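To make the link between erosion and distance concrete, here is a minimal sketch of the simplest case only: a city-block distance transform obtained by repeatedly eroding a binary image with the 3x3 cross. It is not the Euclidean decomposition the abstract describes, and the function names and the choice to treat pixels outside the image as background are assumptions made for illustration.

```python
def erode_cross(img):
    """Binary erosion with the 3x3 cross (4-neighborhood).

    Pixels outside the image are treated as background (an assumption).
    """
    h, w = len(img), len(img[0])

    def at(r, c):
        return img[r][c] if 0 <= r < h and 0 <= c < w else 0

    return [
        [1 if (img[r][c] and at(r - 1, c) and at(r + 1, c)
               and at(r, c - 1) and at(r, c + 1)) else 0
         for c in range(w)]
        for r in range(h)
    ]


def cityblock_distance_transform(img):
    """City-block distance from each object pixel to the nearest background pixel.

    A pixel's distance equals the number of erosion passes it survives plus
    one, so each pass adds 1 to every pixel that is still set.
    """
    h, w = len(img), len(img[0])
    dist = [[0] * w for _ in range(h)]
    current = [row[:] for row in img]
    while any(any(row) for row in current):
        for r in range(h):
            for c in range(w):
                if current[r][c]:
                    dist[r][c] += 1
        current = erode_cross(current)
    return dist


# Hypothetical test image: a 3x3 square object inside a 5x5 frame.
square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
distances = cityblock_distance_transform(square)
```

Each small erosion peels one layer off the object, so the center of the square, which survives one pass longer than its rim, receives distance 2. Replacing the cross with a full 3x3 square would give the chessboard distance instead.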
ISBN: 9781728125060 (digital)
ISBN: 9781728125077 (print)
In this paper we present the Women in Computer Vision Workshop - WiCV 2019, organized in conjunction with CVPR 2019. This event is meant to increase the visibility and inclusion of women researchers in the computer vision field. Computer vision and machine learning have made incredible progress over the past years, but the number of female researchers is still low in both academia and industry. WiCV is organized especially for this reason: to raise the visibility of female researchers, to increase collaborations between them, and to provide mentorship to female junior researchers in the field. In this paper, we present a report of trends over the past years, along with a summary of statistics regarding presenters, attendees, and sponsorship for the current workshop.
The recognition of text in everyday scenes is made difficult by viewing conditions, unusual fonts, and lack of linguistic context. Most methods integrate a priori appearance information and some sort of hard or soft c...