In the domain of computer graphics, achieving high visual quality in real-time rendering remains a formidable challenge due to the inherent time-quality tradeoff. Conventional real-time rendering engines sacrifice visual fidelity for interactive performance, while image generation using path-tracing techniques can be exceedingly time-consuming. In this article, we introduce RenderGAN, a deep learning-based solution designed to address this critical challenge in real-time rendering. RenderGAN uses G-Buffers and information from a real-time rendering engine as inputs to produce output images with exceptional visual fidelity. Its encoder-decoder architecture, trained using the Generative Adversarial Network (GAN) framework with perceptual loss, enhances image realism. To evaluate RenderGAN's effectiveness, we quantitatively compare the generated images with those of a path-tracing engine, obtaining a remarkable Universal Image Quality Index (UIQI) value of 0.898. RenderGAN's open source nature fosters collaboration, driving advancements in real-time computer graphics and rendering techniques. By bridging the gap between real-time and path-tracing rendering, RenderGAN opens new horizons for accelerated image generation, inspiring innovation and unlocking the full potential of real-time visual experiences. Project page: https://***/marcomameli1992/RenderNet
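The Universal Image Quality Index (UIQI) the abstract reports combines correlation, luminance distortion, and contrast distortion of two images into a single score in [-1, 1]. A minimal global sketch of the formula follows (in practice UIQI is computed over sliding windows and averaged; the pixel lists here are illustrative):

```python
def uiqi(x, y):
    """Universal Image Quality Index between two equal-length pixel lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    var_x = sum((v - mx) ** 2 for v in x) / (n - 1)
    var_y = sum((v - my) ** 2 for v in y) / (n - 1)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    # Q = (4 * cov * mx * my) / ((var_x + var_y) * (mx^2 + my^2))
    return (4 * cov * mx * my) / ((var_x + var_y) * (mx ** 2 + my ** 2))

# Identical images score the maximum value of 1.0.
print(uiqi([1, 2, 3, 4], [1, 2, 3, 4]))
```

A score of 0.898 against a path-traced reference therefore indicates close structural and luminance agreement.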
Multi-object tracking (MOT) is a fundamental problem in computer vision that involves tracing the trajectories of foreground targets throughout a video sequence while establishing correspondences for identical objects across frames. With the advancement of deep learning techniques, methods based on deep learning have significantly improved accuracy and efficiency in MOT. This paper reviews several recent deep learning-based MOT methods and categorises them into three main groups: detection-based, single-object tracking (SOT)-based, and segmentation-based methods, according to their core technologies. Additionally, this paper discusses the metrics and datasets used for evaluating MOT performance, the challenges faced in the field, and future directions for research.
Underwater imaging techniques have long been a focus of computer vision research. Underwater imaging frequently suffers from poor image quality and slow restoration speed, hindering human underwater exploration. To enhance quality and improve the real-time performance of underwater image restoration, this paper proposes a lightweight underwater color image restoration network based on multiscale depthwise separable convolution (MDSCN). First, the algorithm tackles the problems of difficult convergence and slow training by improving the AdamW optimizer. Then, we propose a multiscale depthwise separable convolution module operating on the RGB channels, which enables efficient extraction of image features based on underwater light propagation properties. The MDSCN effectively improves both the processing speed and the recovery quality of underwater images. Through experiments and analysis, our algorithm outperforms traditional image processing methods and recent deep learning approaches in terms of visual quality and objective evaluation metrics. Furthermore, our algorithm also outperforms existing deep learning methods in processing speed, demonstrating excellent generalizability and practicality. This research is highly informative for the field of underwater computer vision. The dataset, training weight files, and code are publicly available at https://***/raining-li/underwater-image-processing/tree/master.
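The lightweight design in the abstract rests on depthwise separable convolution: a per-channel k×k (depthwise) filter followed by a 1×1 (pointwise) convolution that mixes channels. The parameter counts below (illustrative only, not the paper's exact MDSCN module) show why this factorization is much cheaper than a standard convolution:

```python
def standard_conv_params(k, c_in, c_out):
    # A standard conv learns one k x k x c_in filter per output channel.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv mixing channels
    return depthwise + pointwise

# e.g. a 3x3 layer mapping 64 -> 128 channels:
std = standard_conv_params(3, 64, 128)
sep = depthwise_separable_params(3, 64, 128)
print(std, sep, round(std / sep, 1))  # roughly an 8x parameter reduction
```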
With the advancement of graphic engines, real-life structures can be digitized with more realistic representations than before. Virtual models obtained from LiDAR (Light Detection and Ranging) data in real-time applications can be inspected in graphic engines without rendering a point cloud. Well-known proprietary software is used to convert scans from LiDAR into triangle meshes that work best in graphics pipelines. However, proprietary software is usually expensive, hard to learn, and requires manual interaction. The proposed methodology generates virtual models from LiDAR with little manual interaction, employing open-source software in an automated workflow for generic conversion. The point cloud is registered for geo-reference, processed for building textured models, and implemented in Unreal Engine 5 for Virtual Reality deployment. Specific improvements were made for the selected study case of the Castro of Santa Trega. Visualization of the model is overall more realistic than rendering every point in a cloud. The average framerate improves by 229% when rendering optimized meshes compared to point clouds, leading to enriched visualization quality and reduced data size. A Virtual Reality (VR) experience was implemented with an average of 143 FPS, surpassing the standard 90 FPS recommended to avoid motion sickness.
Neural networks have become foundational in modern technology, driving advancements across diverse domains such as medicine, law enforcement, and information technology. By enabling algorithms to learn from data and perform tasks autonomously, they eliminate the need for explicit programming. A significant challenge in this field is replicating the uniquely human capacity for creativity: envisioning and realizing novel concepts and tangible creations. Generative Adversarial Networks (GANs), a leading approach in this effort, are especially notable for synthesizing realistic human facial images. Despite the success of GANs, comprehensive comparative studies of face-generating GAN methodologies are limited. This paper addresses this gap by analyzing the scope and capabilities of facial generation, detailing the principles of the original GAN framework, and reviewing prominent GAN variants specifically designed for facial synthesis. Through performance evaluations and fidelity analysis of generated images, this study contributes to a deeper understanding of the potential of GANs to advance creativity in artificial intelligence.
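The original GAN framework the abstract refers to is a two-player minimax game between a generator G and a discriminator D, trained over the value function

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\!\left[\log\!\left(1 - D(G(z))\right)\right]
```

where D learns to distinguish real faces x from generated samples G(z), and G learns to fool D; at the equilibrium of this game the generator's distribution matches the data distribution.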
Rendering 3D virtual scenarios has become a popular alternative for generating per-pixel-labeled image datasets, especially in fields like autonomous driving. The approach is valuable for training neural perception models, such as semantic segmentation models, particularly when data might be scarce, expensive, or difficult to collect. However, fundamental questions persist within the research community regarding the generation and processing of these synthetic images, particularly a better understanding of the key factors influencing the performance of deep learning models trained with such synthetic images. In response, we conducted a series of experiments to elucidate the impact that common aspects involved in the generation of rendered synthetic images may have on the performance of neural semantic segmentation tasks. Our study used a recent autonomous driving synthetic dataset as our main testbed, allowing us to investigate the effect of different approaches when modeling their geometric, material, and lighting details. We also studied the impact of rendering noise, typically produced by path-tracing algorithms, as well as the impact of using different color transformations and tonemapping algorithms.
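One of the color transformations studied above is tonemapping, which compresses the high-dynamic-range radiance produced by a renderer into a displayable range. As one plausible example (the classic global Reinhard operator, not necessarily among the paper's exact choices):

```python
def reinhard_tonemap(hdr):
    """Global Reinhard operator L_out = L / (1 + L).

    Compresses highlights smoothly toward 1 while leaving
    dark values nearly unchanged.
    """
    return [l / (1.0 + l) for l in hdr]

print(reinhard_tonemap([0.0, 1.0, 4.0]))  # dark, mid, and bright radiance values
```

Such operators change pixel statistics, which is why the choice of tonemapping can influence models trained on the resulting synthetic images.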
Removing shadows in images is often a necessary pre-processing task for improving the performance of computer vision applications. Deep learning shadow removal approaches require a large-scale dataset that is challenging to gather. To address the issue of limited shadow data, we present a new and cost-effective method of synthetically generating shadows using 3D virtual primitives as occluders. We simulate the shadow generation process in a virtual environment where foreground objects are composed of mapped textures from the Places-365 dataset. We argue that complex shadow regions can be approximated by mixing primitives, analogous to how 3D models in computer graphics can be represented as triangle meshes. We use the proposed synthetic shadow removal dataset, DLSUSynthPlaces-100K, to train a feature-attention-based shadow removal network without an explicit domain adaptation or style transfer strategy. The results of this study show that the trained network achieves competitive results with state-of-the-art shadow removal networks that were trained purely on typical shadow removal (SR) datasets such as ISTD or SRD. Using a synthetic shadow dataset with only triangular prisms and spheres as occluders produces the best results. The synthetic shadow removal dataset can therefore be a viable alternative for future deep learning shadow removal methods. The source code and dataset can be accessed at this link: https://***/SynthShadowRemoval/.
Keyframes are a standard representation for kinematic motion specification. Recent learned motion-inbetweening methods use keyframes as a way to control generative motion models, and are trained to generate life-like motion that matches the exact poses and timings of input keyframes. However, the quality of generated motion may degrade if the timing of these constraints is not perfectly consistent with the desired motion. Unfortunately, correctly specifying keyframe timings is a tedious and challenging task in practice. Our goal is to create a system that synthesizes high-quality motion from keyframes, even if keyframes are imprecisely timed. We present a method that allows constraints to be retimed as part of the generation process. Specifically, we introduce a novel model architecture that explicitly outputs a time-warping function to correct mistimed keyframes and spatial residuals that add pose details. We demonstrate how our method can automatically turn approximately timed keyframe constraints into diverse, realistic motions with plausible timing and detailed submovements.
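A time-warping function like the one the model above outputs must be strictly monotonic so that frame order is preserved while keyframe timings shift. A common parameterization (an assumption for illustration, not necessarily the paper's exact formulation) maps unconstrained per-frame network outputs to positive step sizes and accumulates them:

```python
import math

def monotonic_time_warp(raw_outputs, duration):
    """Map unconstrained values to strictly increasing times in [0, duration]."""
    steps = [math.exp(v) for v in raw_outputs]  # exp guarantees positive steps
    total = sum(steps)
    times, t = [0.0], 0.0
    for s in steps:
        t += s / total * duration  # normalize so the warp ends at `duration`
        times.append(t)
    return times

warp = monotonic_time_warp([0.0, 1.0, -0.5, 0.2], duration=1.0)
print(warp)  # strictly increasing, starting at 0.0 and ending at 1.0
```

Because the warp is monotonic by construction, retiming can never reorder or collapse keyframes, however the network perturbs its raw outputs.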
A highly integrated Earth-observing satellite can possess several maneuverable payloads to perform different missions simultaneously, which brings some challenges to the method of task scheduling. This paper addresses the selection and scheduling problem of an agile satellite with several independently maneuverable optical payloads. Some differences compared to the traditional scheduling problem of agile satellites are presented and considered in a constrained optimization model. A two-stage method is proposed to accomplish the scheduling of the satellite and payloads in different stages. Clusters are generated from preprocessed tasks by a clique partition algorithm, and their centers are used to calculate the pointing direction of the satellite in the first stage. A multiobjective local search algorithm is introduced to schedule tasks in each selected cluster in the second stage. Considering the time-dependent property of the transition time, the problem of determining the start observation time is transformed into linear programming in a proposed insertion operator that guarantees the feasibility of generated solutions. Two types of instances are created and tested to demonstrate the effectiveness of the proposed method, and some analyses are conducted based on the experimental results.
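The first stage above groups tasks into clusters by partitioning a compatibility graph into cliques, where edges link tasks close enough to share one satellite pointing. A simple greedy partition (an illustrative stand-in, not the paper's exact algorithm) makes the idea concrete:

```python
def greedy_clique_partition(nodes, compatible):
    """Partition nodes into cliques; compatible(a, b) tests for an edge."""
    cliques = []
    for n in nodes:
        for clique in cliques:
            # n may join a clique only if adjacent to every current member.
            if all(compatible(n, m) for m in clique):
                clique.append(n)
                break
        else:
            cliques.append([n])  # no fit: start a new clique
    return cliques

# Hypothetical tasks, compatible when their pointing angles are close:
angles = {"t1": 0.0, "t2": 1.0, "t3": 9.0, "t4": 10.0}
near = lambda a, b: abs(angles[a] - angles[b]) < 2.0
print(greedy_clique_partition(list(angles), near))  # [['t1', 't2'], ['t3', 't4']]
```

Each resulting clique's center then fixes one satellite pointing direction, leaving the per-cluster observation times to the second-stage local search and its linear-programming insertion operator.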
In this special issue of IEEE Transactions on Visualization and Computer Graphics (TVCG), we are pleased to present the top papers from the 32nd IEEE Conference on Virtual Reality and 3D User Interfaces (IEEE VR 2025), held March 8–12, 2025, in Saint-Malo, France.