Aiming at the complex operation of active 3D reconstruction technology, which is easily affected by the external environment and equipment, this paper adopts a series of monocular image sequences taken by a cell phone camer...
ISBN (digital): 9798350356816
ISBN (print): 9798350356823
Research on 3D face modeling has been ongoing in the fields of Computer Vision (CV) and Computer Graphics (CG), supporting applications ranging from the creation of synthetic data to the transfer of facial expressions in virtual avatars. This work introduces a framework for 3D face generation that separates identity from expression elements to give fine control over facial expressions. The model generates high-fidelity 3D faces with remarkable appearance and shape by combining the Wasserstein Generative Adversarial Network (WGAN) and Supervised Auto-Encoder (SAE) architectures. In particular, texture synthesis is performed using the Progressive Growing of GANs (ProGAN) technique, while shape formation is handled by the SAE architecture, which allows complex facial traits to be captured. Thanks to the fusion of the WGAN and SAE, the model can produce reconstructions with greater realism while preserving both fine-grained features and overall structure. The proposed method is evaluated both quantitatively and qualitatively: standard-deviation metrics are used for the quantitative analysis, while visual examination of the reconstructed faces serves as the qualitative analysis. The experimental results show that the proposed approach performs better at creating realistically shaped and textured 3D face reconstructions.
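As a rough illustration of how an adversarial critic can be combined with a supervised auto-encoder objective, consider the following minimal PyTorch sketch. It is not the authors' implementation: the network sizes, the ShapeSAE and generator_loss names, and the loss weighting lam are all illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ShapeSAE(nn.Module):
        # Toy supervised auto-encoder over flattened 3D face shapes
        # (the input dimensionality 3*1024 is a hypothetical vertex count).
        def __init__(self, dim=3 * 1024, latent=128):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(dim, 512), nn.ReLU(), nn.Linear(512, latent))
            self.dec = nn.Sequential(nn.Linear(latent, 512), nn.ReLU(), nn.Linear(512, dim))

        def forward(self, x):
            z = self.enc(x)
            return self.dec(z), z

    # Wasserstein critic: outputs an unbounded scalar score, no sigmoid.
    critic = nn.Sequential(nn.Linear(3 * 1024, 512), nn.LeakyReLU(0.2), nn.Linear(512, 1))

    def critic_loss(real, fake):
        # The critic maximizes D(real) - D(fake); we minimize the negation.
        return critic(fake).mean() - critic(real).mean()

    def generator_loss(sae, x, lam=10.0):
        recon, _ = sae(x)
        adv = -critic(recon).mean()    # adversarial term: fool the critic
        sup = F.mse_loss(recon, x)     # supervised reconstruction term
        return adv + lam * sup

In the actual method the supervised term would be driven by labeled identity/expression codes and the texture branch by ProGAN; the sketch only shows how an adversarial and a supervised loss combine.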
ISBN (digital): 9798331515447
ISBN (print): 9798331515454
3D Gaussian Splatting has shown exceptional potential in novel view synthesis and 3D reconstruction by leveraging fast rasterization techniques for real-time rendering. Despite these advancements, incorporating semantic information into Gaussians remains challenging due to 2D Object-level Ambiguity and 3D Object-level Ambiguity, which significantly impact the performance of point-level 3D scene understanding and downstream tasks such as segmentation and scene editing in 3D space. To address these issues, we propose ConsisGaussian, which ensures that all Gaussians corresponding to the same object share a consistent feature. Our method unifies multi-view semantic information, resolving 2D Object-level Ambiguity by assigning each object a single class and performing class-wise feature fusion. Additionally, we extend the differentiable renderer of 3DGS to create an object-z map, enabling inverse projection to assign consistent features to Gaussians in 3D space and overcome 3D Object-level Ambiguity. Our semantic segmentation and scene editing experiments demonstrate that this approach offers a robust solution for precise 3D scene understanding.
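The class-wise fusion step, in which every Gaussian belonging to the same object receives one shared feature, can be sketched in a few lines. This is a simplified NumPy illustration under the assumption that per-Gaussian object IDs have already been recovered (e.g., via the object-z map); the function name and array shapes are hypothetical.

    import numpy as np

    def fuse_features_by_object(features, object_ids):
        # features:   (N, D) per-Gaussian semantic features
        # object_ids: (N,)   integer object label per Gaussian
        # Replaces each feature with the mean feature of its object, so all
        # Gaussians of one object end up with an identical, consistent feature.
        fused = features.copy()
        for obj in np.unique(object_ids):
            mask = object_ids == obj
            fused[mask] = features[mask].mean(axis=0)
        return fused

    feats = np.random.rand(1000, 16).astype(np.float32)   # toy features
    ids = np.random.randint(0, 5, size=1000)              # toy object labels
    consistent = fuse_features_by_object(feats, ids)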
ISBN (digital): 9781665490627
ISBN (print): 9781665490627
3D vehicle shape completion is a key task in urban environment reconstruction, since the available sensors and 3D scanning procedures (such as Mobile Laser Scanning) in an outdoor city scene usually cannot extract entire car shapes, which can be necessary for specifying their geometric properties for further analysis or realistic visualization. In this paper, we propose a novel multi-view-based 3D object point cloud completion technique. In contrast to existing approaches, our method operates on 2D images formed by projecting the point cloud from several virtual camera positions around the object of interest. Both color and geometric information are considered during the process, generating dense textured point clouds that display realistic patterns in the regions missing from the partial inputs. We present both quantitative and qualitative tests on various synthetic and real laser-scanned vehicle point clouds, which demonstrate that our method surpasses existing state-of-the-art approaches. When applied to vehicles from the ShapeNet dataset, our approach outperforms recent techniques in terms of Earth Mover's Distance (EMD) and Chamfer Distance (CD) by 43.8% and 12.17%, respectively.
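The core of the multi-view formulation, rendering the partial point cloud into 2D images from virtual cameras placed around the object, can be illustrated with a minimal pinhole projection plus a z-buffer. This NumPy sketch is a simplification under assumed intrinsics (single view, hard z-buffer), not the paper's renderer.

    import numpy as np

    def project_to_virtual_camera(points, colors, R, t, f=500.0, w=256, h=256):
        # points: (N, 3) world coordinates; colors: (N, 3) RGB in [0, 1]
        # R (3x3), t (3,): world-to-camera rotation and translation
        cam = points @ R.T + t                    # world -> camera frame
        z = cam[:, 2]
        front = z > 1e-6                          # keep points in front of the camera
        u = (f * cam[front, 0] / z[front] + w / 2).astype(int)
        v = (f * cam[front, 1] / z[front] + h / 2).astype(int)
        depth = np.full((h, w), np.inf)
        image = np.zeros((h, w, 3), dtype=np.float32)
        inb = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        for ui, vi, zi, ci in zip(u[inb], v[inb], z[front][inb], colors[front][inb]):
            if zi < depth[vi, ui]:                # z-buffer: nearest point wins
                depth[vi, ui] = zi
                image[vi, ui] = ci
        return depth, image

    pts = np.random.rand(5000, 3)
    cols = np.random.rand(5000, 3)
    d, img = project_to_virtual_camera(pts, cols, np.eye(3), np.array([0.0, 0.0, 2.0]))

Completion would then run on these 2D depth/color images before back-projecting the result to 3D.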
As a popular research direction in deep learning in recent years, virtual try-on technology has gained more acceptance through 2D image-based methods due to their lower cost and time requirements compared to their 3D ...
ISBN (print): 9781450392037
Recent works based on convolutional encoder-decoder architectures and 3DMM parameterization have shown great potential for canonical view reconstruction from a single input image. Conventional CNN architectures benefit from exploiting the spatial correspondence between input and output pixels. However, in 3D face reconstruction, the spatial misalignment between the input image (e.g., a face) and the canonical/UV output makes the feature encoding-decoding process quite challenging. In this paper, to tackle this problem, we propose a new network architecture, namely the Affine Convolution Networks, which enables CNN-based approaches to handle spatially non-corresponding input and output images while maintaining high-fidelity output quality. In our method, an affine transformation matrix is learned from the affine convolution layer for each spatial location of the feature maps. In addition, we represent 3D human heads in UV space with multiple components, including diffuse maps for texture representation, position maps for geometry representation, and light maps for recovering more complex lighting conditions in the real world. All the components can be trained without any manual annotations. Our method is free of parametric models and can generate high-quality UV maps at a resolution of 512 x 512 pixels, while previous approaches normally generate 256 x 256 pixels or smaller. Our code will be released once the paper is accepted.
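To make the idea of a per-location affine transform concrete, here is a toy PyTorch layer: each output position predicts a 2x3 affine matrix that warps its own sampling coordinate, and features are resampled there with grid_sample. This is a sketch of the general mechanism under assumed shapes and initialization, not the paper's Affine Convolution layer.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AffineSamplingLayer(nn.Module):
        def __init__(self, channels):
            super().__init__()
            # Predicts 6 affine parameters (a 2x3 matrix) per spatial location.
            # In practice one would initialize this toward the identity transform.
            self.affine_head = nn.Conv2d(channels, 6, kernel_size=3, padding=1)

        def forward(self, x):
            b, c, h, w = x.shape
            theta = self.affine_head(x).permute(0, 2, 3, 1).reshape(b, h, w, 2, 3)
            ys, xs = torch.meshgrid(
                torch.linspace(-1, 1, h, device=x.device),
                torch.linspace(-1, 1, w, device=x.device),
                indexing="ij")
            coords = torch.stack([xs, ys, torch.ones_like(xs)], dim=-1)  # (h, w, 3)
            # Per-pixel affine warp: grid[b, i, j] = theta[b, i, j] @ (x, y, 1)
            grid = torch.einsum("bhwij,hwj->bhwi", theta, coords)
            return F.grid_sample(x, grid, align_corners=True)

    x = torch.randn(2, 16, 32, 32)
    out = AffineSamplingLayer(16)(x)   # (2, 16, 32, 32)

Because the sampling grid is learned per location rather than fixed, the layer can route features between spatially non-corresponding input and output positions, which is the property the abstract highlights.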
ISBN (digital): 9798331515447
ISBN (print): 9798331515454
The application of mixed reality (MR) technology is highly attractive for the emerging concept of image-guided surgery, as MR can provide surgeons with overlaid information about anatomical structure and navigation, leading to more accurate surgical decision making. The accuracy of the overlaid information depends on the registration and tracking of holograms. Nevertheless, current registration methods either suffer from insufficient accuracy or require complex preparation, hindering their direct translation to clinical scenarios. To address this, we propose 'HL-SfM', a marker-free, fully automated, and high-precision hologram registration and tracking framework based on 3D reconstruction with an optical see-through MR device, the HoloLens 2. Specifically, we use a series of captured RGB images for target region matching, followed by real-scale scene reconstruction via high-precision initial SLAM data. Afterwards, we apply a point-cloud-based registration algorithm for hologram alignment. Additionally, we propose a novel "detect-LoFTR" module, an efficient target feature matching method, for feature selection in the 3D reconstruction process. In the end, HL-SfM achieves an average error of 2.6 mm and 2.4°, outperforming the results obtained directly from the point clouds of the HoloLens 2 AHaT camera, which is also state of the art for marker-free registration among HoloLens2-only methods. Our supplementary materials can be found on Baidu Netdisk or Google Drive.
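The hologram-alignment step is a rigid point-cloud registration problem. As a minimal sketch of such a step, assuming Open3D and a reasonable initial pose from the preceding matching stage, a point-to-point ICP refinement could look like the following (thresholds and the function name are illustrative, not the authors' pipeline):

    import numpy as np
    import open3d as o3d

    def register_point_clouds(source_pts, target_pts, voxel=0.005, threshold=0.01):
        # source_pts, target_pts: (N, 3) arrays in meters.
        # Returns a 4x4 rigid transform mapping the source onto the target.
        src = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(source_pts))
        tgt = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(target_pts))
        src = src.voxel_down_sample(voxel)   # downsample for speed and robustness
        tgt = tgt.voxel_down_sample(voxel)
        result = o3d.pipelines.registration.registration_icp(
            src, tgt, threshold, np.eye(4),  # identity as the initial guess
            o3d.pipelines.registration.TransformationEstimationPointToPoint())
        return result.transformation

In a real setup the identity initialization would be replaced by the coarse pose recovered from the SfM/matching stage, since ICP only refines locally.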
ISBN (digital): 9798350386660
ISBN (print): 9798350386677
Text-guided 3D face reconstruction is an emerging field that leverages artificial intelligence to generate detailed and animatable 3D facial models from textual descriptions. This technology holds promise for revolutionizing digital content creation, avatar design, and virtual interaction by allowing users to convert textual imaginations into realistic 3D representations. This survey paper provides a comprehensive overview of state-of-the-art methods in text-guided 3D face reconstruction, including ClipFace, DreamFace, TG-3DFace, and E3-FaceNet. We discuss their unique features, advantages, limitations, and the innovative techniques they employ to bridge the gap between natural language and 3D visual space. We also highlight the challenges that remain, such as achieving high fidelity and efficiency in the rendering process, and the potential for future advancements in this domain.
Reconstructing a 3D model from a 2D image is an important task in the field of deep learning, aiming to give computers the ability to perceive the 3D world as humans do. In this paper, a lightweight method is p...
ISBN (digital): 9798350377705
ISBN (print): 9798350377712
Radiance field methods, such as Neural Radiance Fields (NeRFs) or 3D Gaussian Splatting (3DGS), have revolutionized graphics and novel view synthesis. Their ability to synthesize new viewpoints with photo-realistic quality, as well as to capture complex volumetric and specular scenes, makes them an ideal visualization for robotic teleoperation setups. Direct camera teleoperation provides high-fidelity operation at the cost of maneuverability, while reconstruction-based approaches offer controllable scenes with lower fidelity. With this in mind, we propose replacing the traditional reconstruction-visualization components of the robotic teleoperation pipeline with online Radiance Fields, offering highly maneuverable scenes with photorealistic quality. This yields three main contributions to the state of the art: (1) online training of Radiance Fields using live data from multiple cameras, (2) support for a variety of radiance methods including NeRF and 3DGS, and (3) a visualization suite for these methods including a virtual reality scene. To enable seamless integration with existing setups, these components were tested with multiple robots in multiple configurations and were displayed using traditional tools as well as a VR headset. The results across methods and robots were compared quantitatively to a mesh reconstruction baseline, and a user study was conducted to compare the different visualization methods. The code and additional samples are available at https://***/***/.
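Structurally, the "online Radiance Field" idea amounts to ingesting posed frames from live cameras while interleaving optimization steps with rendering. The following Python skeleton shows only that data flow; the cameras, the gradient step, and the viewer are stubbed out, and all names are hypothetical rather than taken from the authors' code.

    import queue
    import random
    import threading
    import time

    frames = queue.Queue(maxsize=256)   # shared buffer of (image, pose) pairs

    def capture_loop(camera_id, n=20):
        # Stand-in for one live camera feeding posed frames into the buffer.
        for i in range(n):
            frames.put((f"image-{camera_id}-{i}", f"pose-{camera_id}-{i}"))
            time.sleep(0.005)

    def online_training_loop(steps=100):
        dataset = []
        for _ in range(steps):
            while not frames.empty():
                dataset.append(frames.get())   # grow the training set online
            if dataset:
                batch = random.sample(dataset, min(4, len(dataset)))
                _ = batch                      # a NeRF/3DGS gradient step would go here
            # ...and the current model would be rendered to the operator's viewer here
            time.sleep(0.005)

    cams = [threading.Thread(target=capture_loop, args=(cid,)) for cid in range(2)]
    for c in cams:
        c.start()
    online_training_loop()
    for c in cams:
        c.join()

The key design choice this illustrates is that training never blocks on capture: new views are folded in whenever they arrive, so the operator's rendered scene improves continuously during teleoperation.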