ISBN (print): 9781665445092
We present SPSG, a novel approach to generate high-quality, colored 3D models of scenes from RGB-D scan observations by learning to infer unobserved scene geometry and color in a self-supervised fashion. Our self-supervised approach learns to jointly inpaint geometry and color by correlating an incomplete RGB-D scan with a more complete version of that scan. Notably, rather than relying on 3D reconstruction losses to inform our 3D geometry and color reconstruction, we propose adversarial and perceptual losses operating on 2D renderings in order to achieve high-resolution, high-quality colored reconstructions of scenes. This exploits the high-resolution, self-consistent signal from individual raw RGB-D frames, in contrast to fused 3D reconstructions of the frames, which exhibit inconsistencies from view-dependent effects such as color balancing or pose inconsistencies. Thus, by informing our 3D scene generation directly through 2D signal, we produce high-quality colored reconstructions of 3D scenes, outperforming the state of the art on both synthetic and real data.
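As a rough illustration of supervising 3D geometry through 2D renderings, the sketch below renders toy voxel occupancy grids to depth maps and compares them with an L1 loss. This is a minimal stand-in only: SPSG's actual pipeline uses differentiable rendering of TSDF geometry and color with adversarial and perceptual losses, none of which is reproduced here, and all names are illustrative.

```python
import numpy as np

def render_depth(occ, axis=0):
    """Render a voxel occupancy grid (D, H, W) to a depth map by
    recording, per ray, the index of the first occupied voxel."""
    occ = np.moveaxis(occ.astype(bool), axis, 0)   # rays along axis 0
    depth = occ.argmax(axis=0).astype(float)       # first True index per ray
    depth[~occ.any(axis=0)] = occ.shape[0]         # misses -> far plane
    return depth

def render_loss(pred_occ, target_occ):
    """L1 loss between depth renderings of predicted and target grids --
    a toy stand-in for SPSG's losses on 2D renderings."""
    return np.abs(render_depth(pred_occ) - render_depth(target_occ)).mean()

# A perfect prediction has zero rendering loss.
grid = np.zeros((8, 8, 8))
grid[3:, :, :] = 1
print(render_loss(grid, grid))
```

The point of the toy is the supervision pathway: the loss is computed entirely in 2D image space, yet its gradient (in a differentiable framework) would flow back into the 3D grid.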
ISBN (print): 9781728176055
Variational models are among the state-of-the-art formulations for the resolution of ill-posed inverse problems. Following recent advances in learning-based variational settings, we investigate the end-to-end learning of variational models, more precisely of the regularization term given some observation model, jointly with the associated solver, so that we can optimize reconstruction performance. In the proposed end-to-end setting, both the variational cost and the gradient-based solver are stated as neural networks, using automatic differentiation for the latter. We consider an application to inverse problems with incomplete datasets (image inpainting and multivariate time series interpolation). We experimentally illustrate that this framework can lead to a significant gain in terms of reconstruction performance, including w.r.t. the direct minimization of the variational formulation derived from the known generative model.
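The end-to-end idea can be sketched by unrolling a gradient-based solver of a simple variational cost for 1-D inpainting. Here the regularizer is a fixed quadratic smoothness term and nothing is actually trained; in the paper's setting both the cost and the solver are neural networks whose parameters (e.g. `lam` and `lr` below) would be learned by automatic differentiation. A minimal NumPy sketch under those assumptions:

```python
import numpy as np

def grad_step(x, y, mask, lam, lr):
    # One gradient step on U(x) = ||mask*x - y||^2 + lam*||diff(x)||^2.
    data_grad = 2 * mask * (mask * x - y)
    d = np.diff(x)
    reg_grad = np.zeros_like(x)
    reg_grad[:-1] -= 2 * lam * d
    reg_grad[1:] += 2 * lam * d
    return x - lr * (data_grad + reg_grad)

def solve(y, mask, lam=0.5, lr=0.1, n_steps=200):
    """Unrolled gradient descent: each step is a fixed, differentiable
    layer, so the solver itself can be treated as a neural network."""
    x = y.copy()
    for _ in range(n_steps):
        x = grad_step(x, y, mask, lam, lr)
    return x

rng = np.random.default_rng(0)
truth = np.sin(np.linspace(0, 3 * np.pi, 50))
mask = (rng.random(50) > 0.4).astype(float)   # ~60% of samples observed
y = mask * truth                              # missing entries are zero
x = solve(y, mask)
```

Replacing the quadratic regularizer with a learned network, and learning `lam`/`lr` per step, recovers the spirit of the paper's trainable variational cost and solver.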
ISBN (digital): 9798350362459
ISBN (print): 9798350362466
Digital humans and, especially, 3D facial avatars have attracted a lot of attention in the past years, as they are the backbone of several applications like immersive telepresence in AR or VR. Despite the progress, facial avatars reconstructed from commodity hardware are incomplete and miss out on parts of the side and back of the head, severely limiting the usability of the avatar. This limitation in prior work stems from their requirement of face tracking, which fails for profile and back views. To address this issue, we propose to learn person-specific animatable avatars from images without assuming access to precise facial expression tracking. At the core of our method, we leverage a 3D-aware generative model that is trained to reproduce the distribution of facial expressions from the training data. To train this appearance model, we only assume to have a collection of 2D images with the corresponding camera parameters. For controlling the model, we learn a mapping from 3DMM facial expression parameters to the latent space of the generative model. This mapping can be learned by sampling the latent space of the appearance model and reconstructing the facial parameters from a normalized frontal view, where facial expression estimation performs well. With this scheme, we decouple 3D appearance reconstruction and animation control to achieve high fidelity in image synthesis. In a series of experiments, we compare our proposed technique to state-of-the-art monocular methods and show superior quality while not requiring expression tracking of the training data.
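The control mapping from expression parameters to the generator's latent space can be illustrated with a toy least-squares fit over sampled pairs. Everything here is simulated: the "generator" and the expression estimator are stand-in linear maps, not the paper's 3D-aware model or a real 3DMM tracker.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: latent codes z sampled from the appearance model,
# and 3DMM-style expression parameters p estimated from normalized
# frontal renders of those samples (simulated by an unknown linear map).
A_true = rng.normal(size=(6, 16))     # latent -> expression (unknown)
Z = rng.normal(size=(200, 16))        # sampled latent codes
P = Z @ A_true.T                      # "estimated" expression parameters

# Learn the control mapping p -> z by least squares over the pairs.
W, *_ = np.linalg.lstsq(P, Z, rcond=None)

# Driving the avatar: map a new expression to a latent code.
p_new = rng.normal(size=(1, 6))
z_new = p_new @ W
```

In this linear toy, re-estimating the expression from the driven latent code reproduces `p_new` exactly, which is the consistency the paper's mapping is trained for.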
ISBN (print): 9781510640207
X-ray computed tomography (CT) is widely used in diagnostic imaging. Due to the growing number of CT scans worldwide, the consequent increase in population dose is of concern. Therefore, strategies for dose reduction are investigated. One strategy is to perform interior computed tomography (iCT), where X-ray attenuation data are collected only from an internal region-of-interest. The resulting incomplete measurement is called a truncated sinogram (TS). Missing data from the surrounding structures result in reconstruction artifacts with traditional methods. In this work, a deep learning framework for iCT is presented. The TS is extended with a U-Net convolutional neural network, and the extended sinogram is reconstructed with filtered backprojection (FBP). The U-Net was trained for 300 epochs with L1 loss. Truncated and full sinograms were simulated from CT angiography slice images for training data. 1097/193/152 sinograms from 500 patients were used in the training, validation, and test sets, respectively. Our method was compared with FBP applied to the TS (TS-FBP), adaptive sinogram de-truncation followed by FBP (ADT-FBP), total variation regularization applied to the TS, and FBPConvNet using TS-FBP as input. The best root-mean-square error (0.04 +/- 0.01, mean +/- SD) and peak signal-to-noise ratio (29.5 +/- 2.9 dB) in the test set were observed with the proposed method. However, slightly higher structural similarity indices were observed with FBPConvNet (0.97 +/- 0.01) and ADT-FBP (0.97 +/- 0.01) than with our method (0.96 +/- 0.01). This work suggests that extending truncated sinogram data with a U-Net is a feasible way to reconstruct iCT data without artifacts that render image quality undesirable for medical diagnostics.
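For context, the ADT-FBP baseline mentioned above completes the truncated sinogram with a hand-crafted extrapolation before FBP. A minimal stand-in for such a de-truncation step (the paper replaces it with a learned U-Net extension; the roll-off shape here is illustrative) might look like:

```python
import numpy as np

def extend_sinogram(ts, pad):
    """Pad each truncated projection with a smooth cosine roll-off from
    the edge value down to zero, so FBP sees no hard truncation edge."""
    roll = 0.5 * (1 + np.cos(np.linspace(0, np.pi, pad)))  # 1 -> 0
    left = ts[:, :1] * roll[::-1]    # ramp up to the left edge value
    right = ts[:, -1:] * roll        # ramp down from the right edge value
    return np.hstack([left, ts, right])

ts = np.ones((4, 10))               # toy truncated sinogram (views x bins)
ext = extend_sinogram(ts, pad=5)
```

The extension is continuous at both edges and decays to zero, which is the property that suppresses the bright-rim truncation artifact in the FBP reconstruction.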
In this paper, we introduce an efficient algorithm for reconstructing incomplete images based on optimal least-squares (LS) approximation. Generally, the LS method requires a low-rank basis set that can represent the over...
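A minimal 1-D sketch of the LS idea: fit a low-rank cosine basis to the observed samples only, then evaluate the fitted expansion everywhere, including at the missing samples. The basis choice and dimensions are illustrative, not the paper's.

```python
import numpy as np

def ls_inpaint(y, mask, rank):
    """Reconstruct missing samples by least-squares fitting a low-rank
    cosine basis to the observed samples only."""
    n = len(y)
    t = (np.arange(n) + 0.5) / n
    B = np.cos(np.pi * np.outer(t, np.arange(rank)))  # DCT-II-like basis
    obs = mask.astype(bool)
    c, *_ = np.linalg.lstsq(B[obs], y[obs], rcond=None)
    return B @ c

rng = np.random.default_rng(2)
# Ground truth that lies exactly in the basis (column k = 2).
truth = np.cos(np.pi * 2 * (np.arange(64) + 0.5) / 64)
mask = rng.random(64) > 0.3          # ~70% of samples observed
y = np.where(mask, truth, 0.0)
x = ls_inpaint(y, mask, rank=8)
```

Because the truth lies in the span of the basis and enough samples are observed, the LS fit recovers it exactly; for real images the residual depends on how well the low-rank basis represents the content.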
ISBN (print): 9789811671890; 9789811671883
Restricted by the scanning environment and the shape of the target to be detected, the projection data obtained from computed tomography (CT) are usually incomplete, which leads to a seriously ill-posed problem such as limited-angle CT reconstruction. In this situation, the classical filtered back-projection (FBP) algorithm loses efficacy, especially when the scanning angle is severely limited. By comparison, the simultaneous algebraic reconstruction technique (SART) handles noise better than FBP, but it also suffers from limited-angle artifacts. The total variation (TV) algorithm, in contrast, can address limited-angle artifacts, since it incorporates prior information about the target to be reconstructed, which alleviates the ill-posedness of the problem. Nonetheless, current algorithms have limitations when dealing with the limited-angle CT reconstruction problem. This paper analyses the distribution of the limited-angle artifacts and finds that they emerge globally. Then, motivated by the TV algorithm, tight-frame wavelet decomposition, and group sparsity, this paper presents a regularization model based on sparse multi-level information groups of the images to address limited-angle CT reconstruction, together with a corresponding algorithm, called modified proximal alternating linearized minimization (MPALM), to solve the proposed model. Numerical experiments demonstrate the effectiveness of the presented algorithm compared with the above classical algorithms.
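For reference, the SART baseline discussed above can be written in a few lines for a generic linear system Ax = b; this is the textbook update with row/column-sum normalization, not the paper's MPALM algorithm, and the tiny matrix stands in for a (limited-angle) projector.

```python
import numpy as np

def sart(A, b, n_iter=500, relax=1.0):
    """Simultaneous algebraic reconstruction technique for Ax = b.
    The residual is normalized by row sums before back-projection and
    by column sums after it."""
    row = A.sum(axis=1)
    col = A.sum(axis=0)
    row[row == 0] = 1
    col[col == 0] = 1
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x + relax * (A.T @ ((b - A @ x) / row)) / col
    return x

# Tiny consistent system: three "rays", three "pixels".
A = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])
x_true = np.array([1., 2., 3.])
x = sart(A, A @ x_true)
```

For consistent data and a relaxation factor in (0, 2), the iteration converges to a solution; the limited-angle difficulty arises when A is severely rank-deficient, which is where the paper's regularization takes over.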
ISBN (print): 9781713871088
In many real-world inverse problems, only incomplete measurement data are available for training, which can pose a problem for learning a reconstruction function. Indeed, unsupervised learning using a fixed incomplete measurement process is impossible in general, as there is no information in the nullspace of the measurement operator. This limitation can be overcome by using measurements from multiple operators. While this idea has been successfully applied in various applications, a precise characterization of the conditions for learning is still lacking. In this paper, we fill this gap by presenting necessary and sufficient conditions for learning the underlying signal model needed for reconstruction, which indicate the interplay between the number of distinct measurement operators, the number of measurements per operator, the dimension of the model and the dimension of the signals. Furthermore, we propose a novel and conceptually simple unsupervised learning loss which only requires access to incomplete measurement data and achieves a performance on par with supervised learning when the sufficient condition is satisfied. We validate our theoretical bounds and demonstrate the advantages of the proposed unsupervised loss compared to previous methods via a series of experiments on various imaging inverse problems, such as accelerated magnetic resonance imaging, compressed sensing and image inpainting.
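The nullspace argument can be seen in a tiny NumPy example: each inpainting mask alone leaves part of the signal unidentifiable, while stacking measurements from two masks with complementary supports makes recovery exact. The masks and dimensions are illustrative, not the paper's conditions.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=6)              # unknown signal

# Two incomplete measurement operators (inpainting masks). Each alone
# has a nontrivial nullspace, so x is not identifiable from either one.
M1 = np.diag([1., 1., 1., 0., 0., 0.])
M2 = np.diag([0., 0., 1., 1., 1., 1.])
y1, y2 = M1 @ x, M2 @ x

# Stacking the operators removes the shared nullspace entirely,
# so least squares recovers x exactly.
A = np.vstack([M1, M2])
y = np.concatenate([y1, y2])
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
```

The paper's contribution is the learning-theoretic analogue of this observation: how many distinct operators and measurements are needed to identify not a single signal but the whole signal model.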
ISBN (print): 9781510636804
With the advent of interferometric instruments with 4 telescopes at the VLTI and 6 telescopes at CHARA, the scientific possibility arose to routinely obtain milli-arcsecond scale images of the observed targets. Such an image reconstruction process is typically performed in a Bayesian framework where the function to minimize is made of two terms: the data likelihood and the Bayesian prior. This prior should be based on our prior knowledge of the observed source. Up to now, this prior was chosen from a set of generic and arbitrary functions, such as total variation for example. Here, we present an image reconstruction framework using generative adversarial networks where the Bayesian prior is defined using state-of-the-art radiative transfer models of the targeted objects. We validate this new image reconstruction algorithm on synthetic data with added noise. The generated images display a drastic reduction of artefacts and allow a more straightforward astrophysical interpretation. The results can be seen as a first illustration of how neural networks can provide significant improvements to the image reconstruction post-processing of a variety of astrophysical sources.
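The generator-as-prior idea can be sketched with a linear toy "generator": reconstruction searches the latent space for the image whose simulated measurements best fit the data, so every candidate image lies on the model manifold by construction. All operators below are random stand-ins, not a trained GAN or a real interferometric forward model.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy "generator": images live on a low-dimensional manifold G @ z,
# a linear stand-in for a GAN trained on radiative-transfer images.
G = rng.normal(size=(32, 4))
z_true = rng.normal(size=4)
img = G @ z_true

F = rng.normal(size=(12, 32))        # incomplete "uv-plane" sampling
y = F @ img                          # observed data

# MAP-style reconstruction: gradient descent in latent space on the
# data likelihood term ||F G z - y||^2.
H = 2 * G.T @ F.T @ F @ G            # Hessian of the latent objective
lr = 1.0 / np.linalg.eigvalsh(H).max()
z = np.zeros(4)
for _ in range(2000):
    z -= lr * 2 * (G.T @ (F.T @ (F @ (G @ z) - y)))
recon = G @ z
```

Even though F alone is badly underdetermined (12 measurements for 32 pixels), restricting the search to the 4-dimensional generator manifold makes the problem well-posed.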
A deep learning approach is used to restore old photographs that have suffered significant damage. Unlike typical restoration tasks that are easily handled by supervised learning methods, real-world photo degradation is complex, and a system trained only on synthetic data fails to generalize due to the domain gap between synthetic pictures and actual old photographs. Therefore, a unique triplet domain translation network is trained on huge amounts of synthetic image pairs combined with real photos. Two variational autoencoders (VAEs) are trained to create latent spaces from old and clean images, respectively, and the translation between these two latent spaces is learned using the synthetically paired data. Because the domain gap is closed in the compact latent space, this translation generalizes well to photographs found in the real world. To handle the various degradations mixed throughout an old photograph, a global branch with a partial nonlocal block targets structured defects such as cuts and scratches, while a local branch attacks unstructured defects such as unwanted noise and poor contrast. Fusing the two branches in the latent space increases the ability to correct numerous flaws in old images. Convolutional neural networks (CNNs) outperform plain multi-layer sequential models at identifying distinct marks, shapes, and patterns in images, making them the most efficient method for processing this data, since the CNN applies its filters across every pixel in the image. When it comes to visual quality, the suggested method for repairing old photographs performs better than cutting-edge techniques.
ISBN (print): 9783030872342; 9783030872335
Recent works have used deep learning for accurate parameter estimation in diffusion-weighted magnetic resonance imaging (DW-MRI). However, no prior study has addressed the fetal brain, mainly because obtaining reliable fetal DW-MRI data with accurate ground truth parameters is very challenging. To overcome this obstacle, we present a novel method that uses both fetal scans as well as high-quality pre-term newborn scans. We use the newborn scans to estimate accurate parameter maps. We then use these parameter maps to generate DW-MRI data that match the measurement scheme and noise distributions that are characteristic of fetal scans. To demonstrate the effectiveness and reliability of the proposed data generation pipeline, we use the generated data to train a convolutional neural network for estimating color fractional anisotropy. We show that the proposed machine learning pipeline is significantly superior to standard estimation methods in terms of accuracy and expert assessment of reconstruction quality. Our proposed methods can be adapted for estimating other diffusion parameters of the fetal brain.
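The core data-generation step, simulating DW-MRI signals from known tensor parameter maps under a chosen noise model, can be sketched per voxel as follows. The tensor values, b-values, gradient directions, and noise level are illustrative, not the paper's fetal acquisition scheme.

```python
import numpy as np

def simulate_dwi(D, bvals, bvecs, S0=1.0, sigma=0.02, rng=None):
    """Simulate diffusion-weighted signals S = S0 * exp(-b g^T D g)
    with Rician noise (magnitude of complex Gaussian-corrupted signal)."""
    rng = rng or np.random.default_rng(0)
    # g^T D g for each gradient direction g (rows of bvecs).
    quad = np.einsum('ij,jk,ik->i', bvecs, D, bvecs)
    s = S0 * np.exp(-bvals * quad)
    n1, n2 = rng.normal(0.0, sigma, (2, len(s)))
    return np.hypot(s + n1, n2)

# Anisotropic tensor: fast diffusion along x, as in a white-matter tract.
D = np.diag([1.6e-3, 0.4e-3, 0.4e-3])
bvecs = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                  [1, 1, 0], [1, 0, 1], [0, 1, 1]], float)
bvecs /= np.linalg.norm(bvecs, axis=1, keepdims=True)
bvals = np.full(6, 1000.0)
s = simulate_dwi(D, bvals, bvecs, sigma=0.0)   # noise-free check
```

Attenuation is strongest along the fast-diffusion axis, which is exactly the directional contrast a network trained on such simulated data learns to map to (color) fractional anisotropy.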