Recognition of three-dimensional (3D) objects is a challenging problem, especially in cluttered or occluded scenes. Many existing methods focus on a specific type of object or scene, or require prior segmentation. We describe a robust and efficient general-purpose 3D object recognition method that combines machine learning procedures with 3D local features, without requiring a priori object segmentation. Experiments validate our method on various object types from engineering and street data scans.
This paper presents a semi-automatic method to fit a template mesh to high-resolution normal data, which is generated using spherical gradient illuminations in a light stage. Template fitting is an important step in building a 3D morphable face model, which can be employed for image-based facial performance capture. In contrast to existing 3D reconstruction approaches, we omit the structured light scanning step used to obtain low-frequency 3D information and rely solely on normal data from multiple views. This reduces the acquisition time by over 50 percent. In our experiments the proposed algorithm is successfully applied to real faces of several subjects. Experiments with synthetic data show that the fitted face template can closely resemble the ground truth geometry.
Streaming of 3D data sets is a key technology for remote rendering and visualization of huge and complex geometrical models such as large-scale city models. Even with a high-speed network it is still difficult to share the 3D information of a city scene among different users. In our work, we have devised a new data representation mechanism and transmission scheme to render the 3D models with less content and lower computational cost. Our approach uses lightweight building geometry and a multi-level textured LOD representation to transmit data via streaming. In addition, we suggest a technique for rendering with selective LODs by emphasizing the regions of interest (ROIs) depending on the importance of the building for a specific user. The LOD distribution is updated over the ROI as the gaze point of the viewer moves over the model surfaces. Preliminary tests and evaluations indicate that our method can feasibly be extended to large-scale mobile applications.
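The gaze-dependent LOD selection described above can be illustrated with a minimal sketch. It assumes that each building is reduced to a 2D center point and that the LOD level depends only on distance from the gaze point; the function name `select_lod`, the `radii` thresholds, and the omission of per-user building importance are all illustrative assumptions, not details from the abstract.

```python
import math

def select_lod(building_xy, gaze_xy, radii=(50.0, 150.0)):
    """Pick an LOD level for one building (0 = most detailed).

    Hypothetical rule: buildings within radii[0] of the gaze point get
    level 0, within radii[1] get level 1, and everything farther away
    gets the coarsest level, len(radii).
    """
    distance = math.dist(building_xy, gaze_xy)
    for level, radius in enumerate(radii):
        if distance <= radius:
            return level
    return len(radii)

# Re-evaluate every building whenever the gaze point moves.
buildings = {"A": (30.0, 0.0), "B": (100.0, 0.0), "C": (300.0, 0.0)}
gaze = (0.0, 0.0)
lods = {name: select_lod(xy, gaze) for name, xy in buildings.items()}
```

In a streaming setting, only buildings whose level changed after a gaze update would need new texture or geometry data to be sent.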
In this paper we present a novel method that performs spatiotemporal editing of 3D multi-party interaction scenes for free-viewpoint browsing, from separately captured data. The main idea is to first propose the augmented Motion History Volume (aMHV) for motion representation. Then, by modeling the correlations between different aMHVs, we can define a multi-party interaction dictionary describing the spatiotemporal constraints for different types of multi-party interaction events. Finally, constraint satisfaction and global optimization methods are proposed to synthesize natural and continuous multi-party interaction scenes. Experiments with real data illustrate the effectiveness of our method.
The availability of active 3D sensing devices such as LiDAR has significantly increased the collection of 3D urban scenes with rich details. The sheer amount of data brings many opportunities but also poses tremendous challenges for both academic research and industrial applications in point cloud classification and building reconstruction. In this paper, we present an online algorithm to automatically detect and segment buildings from large-scale unorganized 3D point clouds of urban scenes acquired by ground-based LiDAR devices. The core idea is that buildings can be observed in a street view separated by empty spaces such as alleys. By progressively projecting 3D points onto views along the scanning path, buildings can be detected as large regions with dense points. Experiments on several large-scale datasets show that our approach can efficiently produce satisfactory results.
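The core idea of detecting buildings as dense regions separated by empty gaps can be sketched in a heavily simplified, offline form. The sketch assumes the scanning path runs along the x-axis and replaces the progressive view projection with a 1D histogram; the function name `detect_building_intervals` and the parameters `bin_width` and `min_count` are hypothetical and not taken from the paper.

```python
from collections import Counter

def detect_building_intervals(points, bin_width=1.0, min_count=5):
    """Return (start, end) intervals along x where point density is high.

    Simplifying assumption: points are (x, y, z) tuples and the scan path
    follows the x-axis, so each point is binned by its x coordinate. Runs
    of consecutive dense bins are reported as building candidates; sparse
    bins between them play the role of alleys.
    """
    counts = Counter(int(x // bin_width) for x, _y, _z in points)
    if not counts:
        return []
    intervals = []
    run_start = None
    # Iterate one bin past the last occupied bin so an open run is closed.
    for b in range(min(counts), max(counts) + 2):
        dense = counts.get(b, 0) >= min_count
        if dense and run_start is None:
            run_start = b
        elif not dense and run_start is not None:
            intervals.append((run_start * bin_width, b * bin_width))
            run_start = None
    return intervals
```

A real implementation would process points as they arrive and work on 2D projected views rather than a 1D histogram; the sketch only shows how density gaps separate building candidates.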
Surface motion capture (SurfCap) enables 3D reconstruction of human performance with detailed cloth and hair deformation. However, there is a lack of tools that allow flexible editing of SurfCap sequences. In this paper, we present a Laplacian editing technique that constrains the mesh deformation to plausible surface shapes learnt from a set of examples. A part-based representation of the mesh enables learning of surface deformation locally in the space of Laplacian coordinates, avoiding correlations between body parts while preserving surface details. This extends the range of animation with natural surface deformation beyond the whole-body poses present in the SurfCap data. We illustrate successful use of our tool on three different characters.
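For readers unfamiliar with Laplacian coordinates, the following sketch computes them with uniform weights: each vertex is encoded as its offset from the centroid of its one-ring neighbors. The uniform weighting and the function name `laplacian_coordinates` are assumptions made for illustration; the abstract does not say which weighting scheme the authors use.

```python
def laplacian_coordinates(vertices, neighbors):
    """Uniform-weight Laplacian (differential) coordinates of a mesh.

    vertices:  list of (x, y, z) tuples.
    neighbors: mapping from vertex index to a non-empty list of the
               indices of its one-ring neighbors.
    Returns one delta vector per vertex: the vertex position minus the
    centroid of its neighbors.
    """
    deltas = []
    for i, v in enumerate(vertices):
        ring = neighbors[i]
        centroid = [sum(vertices[j][k] for j in ring) / len(ring)
                    for k in range(3)]
        deltas.append(tuple(v[k] - centroid[k] for k in range(3)))
    return deltas
```

Because each delta depends only on a vertex and its neighbors, the encoding is invariant to translation and captures local surface detail, which is what makes it suitable for learning deformation part by part.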
The Internet contains a wealth of rich geographic information about our world, including 3D models, street maps, and many other data sources. This information is potentially useful for computer vision applications, such as scene understanding for outdoor Internet photos. However, leveraging this data for vision applications requires precisely aligning input photographs, taken in the wild, within a geographic coordinate frame by estimating their position, orientation, and focal length. To address this problem, we propose a system for aligning 3D structure-from-motion point clouds, produced from Internet imagery, to existing geographic information sources, including Google Street View photos and Google Earth 3D models. We show that our method can produce accurate alignments between these data sources, making it possible to accurately project geographic data into images gathered from the Internet, for example by "Googling" a depth map for an image using sources such as Google Earth.
An object recognition system based on registering repeatable interest segments from 3D surfaces is presented. The strength of this approach lies in its independence of local features, which can be unreliable when corrupted by noise and indistinct for certain objects and surfaces. The proposed framework is based on recent advances in segmenting 3D data into repeatable interest segments, followed by efficient surface registration of model and scene segments, where pose clustering returns the best pose candidates. A quality measure based on reprojection of the model points and pose refinement are then used to select the best pose. The proposed method is demonstrated experimentally to be both accurate and robust when tested against a variety of partially occluded free-form objects in cluttered scenes, achieving an average accuracy of 93% on an accurate, high-resolution LiDAR data set and 81% on a noisy, low-resolution Kinect data set.
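One plausible form of a reprojection-based quality measure is the fraction of model points that land near some scene point after applying a candidate pose. The sketch below is an assumption made for illustration, not the measure defined in the paper; the names `transform` and `pose_quality`, the tolerance `tol`, and the brute-force nearest-neighbor search are all hypothetical.

```python
import math

def transform(point, rotation, translation):
    """Apply a rigid pose (3x3 rotation as nested lists, 3-vector) to a point."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )

def pose_quality(model_points, scene_points, rotation, translation, tol=0.01):
    """Fraction of transformed model points within tol of any scene point.

    Brute-force search for clarity; a spatial index such as a k-d tree
    would be the usual choice for large scenes.
    """
    if not model_points or not scene_points:
        return 0.0
    inliers = 0
    for p in model_points:
        q = transform(p, rotation, translation)
        if min(math.dist(q, s) for s in scene_points) <= tol:
            inliers += 1
    return inliers / len(model_points)
```

Among the candidates returned by pose clustering, the pose with the highest score would be kept and then refined.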
In this paper, we propose a novel method of skeleton estimation for the purpose of constructing and manipulating individualized hand models via a data glove. To reconstruct the actual hand accurately, we derive motion constraints in full 6-DOF at the joints without assuming either a center of rotation or a joint axis. The constraints are derived by regression analysis on the configuration of the finger segments with respect to the sensor data. In order to acquire pairs of configuration and sensor data, we introduce graspable reference objects for reproducing postures with and without the data glove. To achieve accurate regression on fewer samples, we introduce a regression model informed by a detailed investigation using a fused motion capture system, which enabled us to perform optical motion capture and sensor data collection simultaneously. The effectiveness of the method is demonstrated through practical applications involving grasped reference objects.
3D modeling of building architecture from point-cloud scans is a rapidly advancing field. These models are used in augmented reality, navigation, and energy simulation applications. State-of-the-art scanning produces accurate point-clouds of building interiors containing hundreds of millions of points. Current surface reconstruction techniques either do not preserve the sharp features common in man-made structures, do not guarantee watertightness, or are not constructed in a scalable manner. This paper presents an approach that generates watertight triangulated surfaces from input point-clouds, preserving the sharp features common in buildings. The input point-cloud is converted into a voxelized representation, utilizing a memory-efficient data structure. The triangulation is produced by analyzing planar regions within the model. These regions are represented with an efficient number of elements, while still preserving triangle quality. This approach can be applied to data of arbitrary size to produce detailed models. We apply this technique to several data sets of building interiors and analyze the accuracy of the resulting surfaces with respect to the input point-clouds.
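The voxelization step can be sketched with a sparse representation that stores only occupied cells. Using a hash set of integer voxel indices is an assumption made for illustration; the abstract does not say which memory-efficient data structure the authors use, and the name `voxelize` is hypothetical.

```python
import math

def voxelize(points, voxel_size):
    """Map points to the set of integer voxel indices they occupy.

    Only occupied voxels are stored, so memory grows with the scanned
    surface area rather than with the volume of the bounding box.
    """
    return {
        tuple(math.floor(c / voxel_size) for c in p)
        for p in points
    }

# Two nearby points share a voxel; the third falls in a different one.
occupied = voxelize([(0.1, 0.2, 0.3), (0.4, 0.1, 0.2), (1.2, 0.0, -0.1)], 0.5)
```

Using `math.floor` rather than `int` keeps negative coordinates in the correct cell, since `int` truncates toward zero.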