ISBN: 9781713899921 (print)
Neural network (NN) denoisers are an essential building block in many common tasks, ranging from image reconstruction to image generation. However, the success of these models is not well understood from a theoretical perspective. In this paper, we aim to characterize the functions realized by shallow ReLU NN denoisers in the common theoretical setting of interpolation (i.e., zero training loss) with a minimal representation cost (i.e., minimal $\ell_2$-norm weights). First, for univariate data, we derive a closed form for the NN denoiser function, find that it is contractive toward the clean data points, and prove that it generalizes better than the empirical MMSE estimator at a low noise level. Next, for multivariate data, we find the NN denoiser functions in closed form under various geometric assumptions on the training data: data contained in a low-dimensional subspace, data contained in a union of one-sided rays, or several types of simplexes. These functions decompose into a sum of simple rank-one piecewise linear interpolations aligned with edges and/or faces connecting training samples. We empirically verify this alignment phenomenon on synthetic data and real images.
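The empirical MMSE estimator that the abstract uses as a baseline can be sketched for the univariate case. This is a minimal illustration, not the paper's code: it assumes additive Gaussian noise with a known standard deviation and a uniform prior over the clean training points, and the names `emmse_denoise`, `clean_points`, and `sigma` are hypothetical. Under those assumptions the estimate is a softmax-weighted average of the clean points.

```python
import math


def emmse_denoise(y, clean_points, sigma):
    """Empirical MMSE estimate of the clean value for a noisy scalar y.

    Assumptions of this sketch (not stated in the abstract): additive
    Gaussian noise with standard deviation sigma, and a uniform prior
    over the clean training points.
    """
    # Gaussian log-likelihood of y under each clean training point.
    logits = [-((y - x) ** 2) / (2.0 * sigma ** 2) for x in clean_points]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    # Posterior mean: a softmax-weighted average of the clean points.
    return sum(w * x for w, x in zip(weights, clean_points)) / total


clean = [-1.0, 0.0, 2.0]
low_noise_estimate = emmse_denoise(0.1, clean, sigma=0.05)
midpoint_estimate = emmse_denoise(0.5, [0.0, 1.0], sigma=1.0)
```

At a low noise level the weights concentrate on the nearest training point, so the estimate collapses onto that point. For an input exactly halfway between two points, the weights are equal and the estimate is their midpoint.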
ISBN: 9781510661844; 9781510661851 (print)
In this paper, the structure of a linear sparse periodic array for two-dimensional scanning is first described. Then, based on its characteristics, an algorithm is presented for fast image reconstruction of the scene in a near-field (NF) multistatic terahertz imaging scenario. Although the algorithm is developed in the Fourier domain, it is compatible with the non-uniform structure of the array and also takes into account the phase deviations caused by multistatic imaging in the NF. The performance of the proposed approach is evaluated with numerical data obtained from electromagnetic simulations in FEKO as well as with experimental data. The results are discussed in terms of computational time on the central processing unit and graphics processing unit, as well as the quality of the reconstructed image.
ISBN: 9781665493468 (print)
Recovering the geometry of a human head from a single image, while factorizing the materials and illumination, is a severely ill-posed problem that requires prior information to be solved. Methods based on 3D Morphable Models (3DMM), and their combination with differentiable renderers, have shown promising results. However, the expressiveness of 3DMMs is limited, and they typically yield over-smoothed and identity-agnostic 3D shapes limited to the face region. Highly accurate full-head reconstructions have recently been obtained with neural fields that parameterize the geometry using multilayer perceptrons. The versatility of these representations has also proved effective for disentangling geometry, materials, and lighting. However, these methods require several tens of input images. In this paper, we introduce SIRA, a method which, from a single image, reconstructs human head avatars with high-fidelity geometry and factorized lights and surface materials. Our key ingredients are two data-driven statistical models based on neural fields that resolve the ambiguities of single-view 3D surface reconstruction and appearance factorization. Experiments show that SIRA obtains state-of-the-art results in 3D head reconstruction while successfully disentangling the global illumination and the diffuse and specular albedos. Furthermore, our reconstructions are amenable to physically based appearance editing and head model relighting.
Scene text image super-resolution (STISR) is a super-resolution task for specific image data, with the objective of enhancing the clarity of low-resolution scene text images, thereby improving the performance of downs...
ISBN: 9798350307184 (print)
High dynamic range (HDR) images capture many more intensity levels than standard ones. Current methods predominantly generate HDR images from 8-bit low dynamic range (LDR) sRGB images that have been degraded by the camera processing pipeline. However, retrieving extremely high dynamic range scenes from such limited bit-depth data is a formidable task. Unlike existing methods, the core idea of this work is to incorporate more informative Raw sensor data to generate HDR images, aiming to recover scene information in hard regions (the darkest and brightest areas of an HDR scene). To this end, we propose a model tailor-made for Raw images, harnessing the unique features of Raw data to facilitate the Raw-to-HDR mapping. Specifically, we learn exposure masks to separate the hard and easy regions of a high dynamic scene. Then, we introduce two important forms of guidance: dual intensity guidance, which guides less informative channels with more informative ones, and global spatial guidance, which extrapolates scene specifics over an extended spatial domain. To verify our Raw-to-HDR approach, we collect a large Raw/HDR paired dataset for both training and testing. Our empirical evaluations validate the superiority of the proposed Raw-to-HDR reconstruction model, as well as our newly captured dataset in the experiments.
Image compression is crucial for efficient storage and transmission of visual data. Traditional methods like JPEG use transform coding, which may result in loss of fine details. Compressed sensing (CS) offers an alter...
ISBN: 9798350353013; 9798350353006 (print)
Image keypoint descriptions that are discriminative and matchable over large changes in viewpoint are vital for 3D reconstruction. However, descriptions output by learned descriptors are typically not robust to camera rotation. While they can be made more robust by, e.g., data augmentation, this degrades performance on upright images. Another approach is test-time augmentation, which incurs a significant increase in runtime. Instead, we learn a linear transform in description space that encodes rotations of the input image. We call this linear transform a steerer since it allows us to transform the descriptions as if the image was rotated. From representation theory, we know all possible steerers for the rotation group. Steerers can be optimized (A) given a fixed descriptor, (B) jointly with a descriptor, or (C) we can optimize a descriptor given a fixed steerer. We perform experiments in these three settings and obtain state-of-the-art results on the rotation-invariant image matching benchmarks AIMS and Roto-360. We publish code and model weights at this https url.
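The steerer idea described above, a linear map in description space that stands in for rotating the input image, can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 4-dimensional descriptors, the block-diagonal quarter-turn matrix `STEERER`, the restriction to 90-degree rotations, and the matching rule in `best_steered_similarity` that keeps the best score over the four steered versions.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical steerer for 90-degree image rotations: two 2x2 quarter-turn
# blocks on the diagonal, so applying it four times gives the identity.
STEERER = [
    [0.0, -1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
    [0.0, 0.0, 1.0, 0.0],
]


def best_steered_similarity(desc_a, desc_b, steerer=STEERER, order=4):
    """Return (best score, number of steerer applications that achieved it).

    desc_a is steered 0..order-1 times in description space and compared
    with desc_b each time, instead of re-describing rotated images.
    """
    best_score, best_k = float("-inf"), 0
    steered = list(desc_a)
    for k in range(order):
        score = dot(steered, desc_b)
        if score > best_score:
            best_score, best_k = score, k
        steered = matvec(steerer, steered)
    return best_score, best_k


desc_upright = [1.0, 0.0, 0.5, 0.5]
# Simulate the description of the same keypoint in a 90-degree-rotated image.
desc_rotated = matvec(STEERER, desc_upright)
score, k = best_steered_similarity(desc_upright, desc_rotated)
```

In this toy setup the best match is found after one application of the steerer, which recovers the simulated quarter turn. The cost is a few small matrix-vector products per descriptor rather than extra forward passes over rotated images.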
ISBN: 9781665496209 (digital); 9781665496209 (print)
Structured illumination microscopy is a widely popular super-resolution technique for live cell imaging capable of surpassing the diffraction limit. Its temporal resolution is limited by the need to capture multiple low-resolution images to reconstruct a single high-resolution image. When observing rapid biological processes, the local movement between frames leads to the formation of reconstruction artifacts, which subsequently impair the data interpretation. We propose to include this type of movement in the definition of the image formation forward problem. The motion can then be estimated from the original data using optical flow, and the optimization problem is solved using the alternating direction method of multipliers. Our approach is tested against other reconstruction techniques on both synthetic and real biological data.
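The abstract does not give the objective function, so the following is only a toy illustration of the alternating direction method of multipliers (ADMM) splitting pattern, not the authors' reconstruction algorithm. It assumes a much simpler problem: minimize 0.5 * ||x - b||^2 + lam * ||z||_1 subject to x = z, with no forward model and no motion estimation. The names `admm_l1_denoise`, `lam`, and `rho` are hypothetical.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the scaled absolute value."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def admm_l1_denoise(b, lam, rho=1.0, iters=200):
    """Toy ADMM for: min 0.5*||x - b||^2 + lam*||z||_1  subject to  x = z.

    x handles the data-fidelity term, z handles the l1 penalty, and u is
    the scaled dual variable that enforces agreement between x and z.
    """
    n = len(b)
    x, z, u = [0.0] * n, [0.0] * n, [0.0] * n
    for _ in range(iters):
        # x-update: closed-form minimizer of the quadratic subproblem.
        x = [(bi + rho * (zi - ui)) / (1.0 + rho) for bi, zi, ui in zip(b, z, u)]
        # z-update: soft-thresholding, the proximal step for the l1 term.
        z = [soft_threshold(xi + ui, lam / rho) for xi, ui in zip(x, u)]
        # Dual update: accumulate the remaining constraint violation.
        u = [ui + xi - zi for ui, xi, zi in zip(u, x, z)]
    return z


solution = admm_l1_denoise([3.0, 0.5, -3.0], lam=1.0)
```

For this toy problem the exact minimizer is the soft-thresholded input, which makes the result easy to check. In a real reconstruction the x-update would involve the image formation forward model rather than a simple average.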
Machine unlearning is a promising paradigm for removing unwanted data samples from a trained model, towards ensuring compliance with privacy regulations and limiting harmful biases. Although unlearning has been shown ...
ISBN: 9781665493468 (print)
Understanding the finer details of a 3D object, such as its contours, is the first step toward a physical understanding of the object. Many real-world application domains require adaptable 3D object shape recognition models, usually with little training data. For this purpose, we develop the first automatically generated contour-labeled dataset, bypassing manual human labeling. Using this dataset, we study the performance of current state-of-the-art instance segmentation algorithms on detecting and labeling the contours. We produce promising visual results with accurate contour prediction and labeling. We demonstrate that our finely labeled contours can help downstream tasks in computer vision, such as 3D reconstruction from a 2D image.