The neural radiance field (NeRF) has shown promising results in preserving the fine details of objects and scenes. However, unlike for explicit shape representations such as meshes, it remains an open problem to build dense correspondences across different NeRFs of the same category, which is essential in many downstream tasks. The main difficulties of this problem lie in the implicit nature of NeRF and the lack of ground-truth correspondence annotations. In this paper, we show that it is possible to bypass these challenges by leveraging the rich semantics and structural priors encapsulated in a pre-trained NeRF-based GAN. Specifically, we exploit such priors from three aspects, namely (1) a dual deformation field that takes latent codes as global structural indicators, (2) a learning objective that regards generator features as geometric-aware local descriptors, and (3) a source of infinite object-specific NeRF samples. Our experiments demonstrate that such priors lead to 3D dense correspondence that is accurate, smooth, and robust. We also show that the established dense correspondence across NeRFs can effectively enable many NeRF-based downstream applications such as texture transfer.
The authors propose Point'n Move, a method that achieves interactive scene object manipulation with exposed region inpainting. Interactivity here comes from intuitive object selection and real-time editing. To achieve this, a Gaussian Splatting radiance field is adopted as the scene representation, and its explicit nature and speed advantage are fully leveraged. The explicit representation makes it possible to devise a dual-stage self-prompting segmentation algorithm that lifts 2D prompt points to 3D masks, to perform mask refinement and merging, to minimize changes, to provide a good initialization for scene inpainting, and to edit in real time without per-edit training; all of these lead to superior quality and performance. The method was tested by editing both forward-facing and 360 scenes. It is also compared against existing methods, showing superior quality despite being more capable and having a speed advantage. We propose Point'n Move, a method that achieves interactive scene object manipulation with exposed region inpainting. Interactivity here refers to intuitive object selection and real-time editing. This is achieved by devising a pipeline that fully exploits the explicit nature of our adopted scene representation. Our method achieves superior quality against existing object removal methods despite being more capable and having a speed advantage.
In the field of scientific visualization, upscaling time-varying volume data is valuable. It can be used in in situ visualization to help scientists overcome the limitations of I/O speed and storage capacity when analysing and visualizing large-scale, time-varying simulation data. This paper proposes self-attention residual network-based spatial super-resolution (SARN-SSR), a spatial super-resolution model based on self-attention residual networks that can generate time-varying data with temporal coherence. SARN-SSR consists of two components: a generator and a discriminator. The generator takes low-resolution volume sequences as input and outputs the corresponding high-resolution volume sequences. The discriminator takes both synthesized and real high-resolution volume sequences as input and outputs a matrix that predicts their realness. To verify the validity of SARN-SSR, four time-varying volume datasets from scientific simulations are used. In addition, SARN-SSR is compared on these datasets, both qualitatively and quantitatively, with two deep learning-based techniques and one traditional technique. The experimental results show that this method yields the time-varying data closest to the ground truth. This paper proposes a novel self-attention residual network-based spatial super-resolution (SARN-SSR) framework for upscaling time-varying volume data in scientific visualization. It utilizes a generator and discriminator based on generative adversarial networks to generate high-resolution volume sequences. Comparative evaluations demonstrate that SARN-SSR outperforms state-of-the-art techniques in generating accurate time-varying volume datasets.
Due to varied personal, social, or even cultural situations, people sometimes conceal or mask their true emotions. These suppressed emotions can be expressed in a very subtle way by brief movements called microexpressions. We investigate human subjects' perception of hidden emotions in virtual faces, inspired by recent psychological experiments. We created animations with virtual faces showing some facial expressions and inserted brief secondary expressions in some sequences, in order to try to convey a subtle second emotion in the character. Our evaluation methodology consists of two sets of experiments, with three different sets of questions. The first experiment verifies that the accuracy and concordance of the participants' responses with synthetic faces match the empirical results obtained with photos of real people in the paper by X.-b. Shen, Q. Wu, and X.-l. Fu, 2012, "Effects of the duration of expressions on the recognition of microexpressions," Journal of Zhejiang University Science B, 13(3), 221-230. The second experiment verifies whether participants could perceive and identify primary and secondary emotions in virtual faces. The third experiment tries to evaluate the participants' perception of realism, deceit, and valence of the emotions. Our results show that most of the participants recognized the foreground (macro) emotion and most of the time they perceived the presence of the second (micro) emotion in the animations, although they did not identify it correctly in some samples. This experiment exposes the benefits of conveying microexpressions in computer graphics characters, as they may visually enhance a character's emotional depth through subliminal microexpression cues, and consequently increase the perceived social complexity and believability.
We present a novel approach for generating isotropic surface triangle meshes directly from unoriented 3D point clouds, with the mesh density adapting to the estimated local feature size (LFS). Popular reconstruction pipelines first reconstruct a dense mesh from the input point cloud and then apply remeshing to obtain an isotropic mesh. This sequential pipeline makes it hard to find a lower-density mesh while preserving more details. Instead, our approach reconstructs both an implicit function and an LFS-aware mesh sizing function directly from the input point cloud, which is then used to produce the final LFS-aware mesh without remeshing. We combine local curvature radius and shape diameter to estimate the LFS directly from the input point clouds. Additionally, we propose a new mesh solver to solve an implicit function whose zero level set delineates the surface without requiring normal orientation. The added value of our approach is generating isotropic meshes directly from 3D point clouds with an LFS-aware density, thus achieving a trade-off between geometric detail and mesh complexity. Our experiments also demonstrate that our method is robust to noise, outliers, and missing data and can preserve sharp features for CAD point clouds.
The limited texture detail in low-resolution facial or eye images presents a challenge for gaze estimation. To address this, FSKT-GE (feature maps similarity knowledge transfer for low-resolution gaze estimation) is proposed, a gaze estimation framework consisting of a high-resolution (HR) network and a low-resolution (LR) network with identical structure. Rather than relying on mere feature imitation, the framework assesses the cosine similarity of feature layers, emphasizing the distribution similarity between the HR and LR networks. This enables the LR network to acquire richer knowledge. The framework uses a combined loss function that incorporates a cosine similarity measurement, a soft loss based on the probability distribution difference and the gaze direction output, and a hard loss from the LR network output layer. The approach is validated on low-resolution datasets derived from the Gaze360 and RT-Gene datasets, demonstrating excellent performance in low-resolution gaze estimation. Evaluations are conducted on both datasets using low-resolution images obtained through 2x, 4x, and 8x down-sampling. On the Gaze360 dataset, the lowest mean angular errors of 10.97 degrees, 11.22 degrees, and 13.61 degrees were achieved, while on the RT-Gene dataset, the lowest mean angular errors of 6.73 degrees, 6.83 degrees, and 7.75 degrees were obtained. Here, a novel approach called feature map similarity-based knowledge transfer for low-resolution gaze estimation (FSKT-GE) is proposed. The motivation behind this work is to address the challenge of accurately estimating gaze direction for low-resolution facial images encountered in unconstrained outdoor environments.
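The combined loss described in the FSKT-GE abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the equal default weights, the use of flattened feature vectors, and in particular the soft loss, which is simplified to the angular difference between the LR and HR gaze predictions and omits the probability-distribution term mentioned in the abstract. It is not the authors' implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def feature_similarity_loss(hr_feats, lr_feats):
    """Mean of (1 - cosine similarity) over paired, flattened HR/LR feature maps."""
    per_layer = [1.0 - cosine_similarity(h, l) for h, l in zip(hr_feats, lr_feats)]
    return sum(per_layer) / len(per_layer)

def angular_error_deg(g1, g2):
    """Angle in degrees between two 3D gaze direction vectors."""
    cos = max(-1.0, min(1.0, cosine_similarity(g1, g2)))
    return math.degrees(math.acos(cos))

def combined_loss(hr_feats, lr_feats, hr_gaze, lr_gaze, true_gaze,
                  w_sim=1.0, w_soft=1.0, w_hard=1.0):
    """Hypothetical weighted sum of similarity, soft, and hard terms."""
    sim = feature_similarity_loss(hr_feats, lr_feats)
    soft = angular_error_deg(lr_gaze, hr_gaze)    # LR prediction vs. HR prediction (assumed form)
    hard = angular_error_deg(lr_gaze, true_gaze)  # LR prediction vs. ground truth
    return w_sim * sim + w_soft * soft + w_hard * hard

if __name__ == "__main__":
    hr_feats = [[1.0, 2.0, 3.0], [0.5, 0.5]]
    lr_feats = [[1.0, 2.1, 2.9], [0.4, 0.6]]
    print(combined_loss(hr_feats, lr_feats,
                        hr_gaze=[0.0, 0.1, -1.0],
                        lr_gaze=[0.05, 0.1, -1.0],
                        true_gaze=[0.0, 0.0, -1.0]))
```

The mean angular errors reported in the abstract correspond to averaging a quantity like `angular_error_deg` over a test set; the similarity term penalizes differences in feature direction rather than feature magnitude, which is one way to read "distribution similarity" as opposed to direct feature imitation.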
Electroencephalography (EEG) is a novel modality for investigating perceptual graphics problems. Until recently, EEG has predominantly been used for clinical diagnosis, in psychology, and by the brain-computer-interface community. Researchers are extending it to help understand the perception of visual output from graphics applications and to create approaches based on direct neural feedback. Researchers have applied EEG to graphics to determine perceived image and video quality by detecting typical rendering artifacts, to evaluate visualization effectiveness by calculating the cognitive load, and to automatically optimize rendering parameters for images and videos on the basis of implicit neural feedback.
Exploring topologically distinctive trajectories provides more options for robot motion planning. Since computing time grows greatly with environment complexity, improving exploration efficiency and picking the optimal trajectory in complex environments are critical issues. To this end, this paper proposes a Graphic- and Timed-Elastic-Band-based approach (GraphicTEB) with spatial completeness and high computing efficiency. The environment is analyzed utilizing computer graphics, where obstacles are extracted as nodes and their relationships are built as edges. Three contributions are presented. 1) By assembling directed detours formed by nodes and segmented paths formed by edges, a generalized path consisting of nodes and edges derives various normal paths efficiently. 2) By multiplying two vectors starting from the obstacle point closest to the waypoint and the boundary point farthest from the waypoint, a novel obstacle gradient is introduced to guide safer optimization. 3) By assigning edges an asymmetric Gaussian model, a trajectory evaluation strategy is designed to reflect the motion tendency and motion uncertainty of dynamic obstacles. Qualitative and quantitative simulations demonstrate that the proposed GraphicTEB achieves spatial completeness, a higher scene pass rate, and the highest computing efficiency. Experiments are implemented in long corridor and broad room scenarios, where the robot goes through gaps safely, finds trajectories quickly, and passes pedestrians politely. Note to Practitioners: The motivation stems from the fact that our daily cruising robot occasionally gets trapped in a corridor with piled obstacles or in a complex dynamic crowd due to the lack of a reliable trajectory. The solution is to search for more topologically distinctive trajectories and pick the optimal one. Considering that existing open-source approaches are either incomplete or highly time-consuming, a method for clustering and searching trajectories in the obstacle-occupied r
This paper proposes a projection system that optically removes the cast shadow in projection mapping. Specifically, we realize large-aperture (LA) projection using a large-format Fresnel lens to suppress cast shadows by condensing the projection light from a wide viewing angle. However, the resolution and contrast of the projected results are significantly degraded by defocus blur, veiling glare, and stray light caused by the aberration of an LA Fresnel lens. To solve these technical problems, we employ two different approaches: optical and digital image processing methods. First, we introduce a residual projector with a typical aperture lens on the same optical axis as the LA projector, projecting the residual (i.e., high-frequency) components attenuated in the LA projection. These projectors play different roles in shadow suppression and blur compensation, both achieved by projecting simultaneously. Second, we optimize the pair of projection images so that they balance the shadow suppression and deblurring performance of our projection system. We implemented a proof-of-concept prototype and validated the above-mentioned techniques through projection experiments and a user study.
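The idea of a residual projector that supplies the high-frequency components lost in the LA projection can be sketched on a 1D signal. This is a simplified illustration under stated assumptions: the LA blur is modelled as a box filter (the real system has defocus blur, veiling glare, and stray light), the residual is taken as the plain difference between the target and its blurred version, and the function names are hypothetical. The abstract's actual method optimizes the image pair, which this sketch does not do; it also ignores that a physical projector cannot emit negative light.

```python
def box_blur(signal, radius):
    """Box filter with a window clamped at the signal borders (assumed blur model)."""
    n = len(signal)
    blurred = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        window = signal[lo:hi]
        blurred.append(sum(window) / len(window))
    return blurred

def split_for_two_projectors(target, radius):
    """Split a target into a low-frequency part and a high-frequency residual.

    low      : what a blurry large-aperture path could still reproduce
    residual : target - low, the detail a sharp second projector would add
    """
    low = box_blur(target, radius)
    residual = [t - l for t, l in zip(target, low)]
    return low, residual

if __name__ == "__main__":
    target = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]  # a sharp bright bar
    low, residual = split_for_two_projectors(target, radius=1)
    recombined = [l + r for l, r in zip(low, residual)]
    print(low)
    print(residual)
    print(recombined)
```

Because the residual contains negative values around edges, a practical system has to add an offset or solve a constrained optimization, which is consistent with the abstract's statement that the pair of projection images is optimized rather than computed by a single subtraction.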
When considering sparse motion capture marker data, one typically struggles to balance its overfitting via a high dimensional blendshape system versus underfitting caused by smoothness constraints. With the current trend towards using more and more data, our aim is not to fit the motion capture markers with a parameterized (blendshape) model or to smoothly interpolate a surface through the marker positions, but rather to find an instance in the high resolution dataset that contains local geometry to fit each marker. Just as is true for typical machine learning applications, this approach benefits from a plethora of data, and thus we also consider augmenting the dataset via specially designed physical simulations that target the high resolution dataset such that the simulation output lies on the same so-called manifold as the data targeted.
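The per-marker lookup described above, finding a dataset instance whose local geometry fits each marker, can be illustrated with a brute-force search. This is only a sketch under assumptions not stated in the abstract: each dataset instance is reduced to one 3D surface point per marker, fit quality is plain Euclidean distance, and the function name and data layout are hypothetical.

```python
import math

def best_instance_per_marker(dataset, markers):
    """For each observed marker, return the index of the best-fitting dataset instance.

    dataset[k][m] is the 3D position (x, y, z) that instance k predicts for marker m.
    markers[m] is the observed 3D position of marker m.
    """
    choices = []
    for m, observed in enumerate(markers):
        best_k = min(range(len(dataset)),
                     key=lambda k: math.dist(dataset[k][m], observed))
        choices.append(best_k)
    return choices

if __name__ == "__main__":
    dataset = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # instance 0
        [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],  # instance 1
    ]
    markers = [(0.0, 0.9, 0.0), (1.0, 0.1, 0.0)]
    print(best_instance_per_marker(dataset, markers))
```

Different markers may select different instances, which is the point of fitting locally rather than with one global blendshape solve; how the selected local geometries are blended into a single surface is not specified in the abstract and is left out here.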