Visual speech, referring to the visual domain of speech, has attracted increasing attention due to its wide applications, such as public security, medical treatment, military defense, and film entertainment. As a powerful AI strategy, deep learning techniques have extensively promoted the development of visual speech learning. Over the past five years, numerous deep learning based methods have been proposed to address various problems in this area, especially automatic visual speech recognition and generation. To push forward future research on visual speech, this paper will present a comprehensive review of recent progress in deep learning methods on visual speech analysis. We cover different aspects of visual speech, including fundamental problems, challenges, benchmark datasets, a taxonomy of existing methods, and state-of-the-art performance. Besides, we also identify gaps in current research and discuss inspiring future research directions.
The multilayer reflectors of insect epidermis can produce unique structural color through interactions with light. Many fossilized insects, such as amber-entombed wasps, present structural colors. However, how this multilayer structure and its structural colors are preserved during fossilization is not yet understood. We use a transfer matrix method (TMM) and Non-Standard Finite Difference Time Domain (NS-FDTD) simulations to analyze how the expected compression and expansion of the epidermis layer thickness during fossilization affect its structural colors. We estimate the variations in epidermis layer thickness due to fossilization by measuring their color distances. Surprisingly, we find that the structural coloration of the multilayer reflectors, ranging from blue to green, displayed by many insects remains unchanged from about +5% expansion to -12% compression of their thickness. These findings suggest that, first, insects might have kept their original colors during the fossilization process. Second, the appearance of these structural colors in insects might not just be by chance, but could also be a result of specific evolutionary choices.
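As a rough illustration of the transfer matrix method mentioned in this abstract, the sketch below computes the normal-incidence reflectance of a thin-film stack and shows how the reflectance peak moves when every layer thickness is scaled. It is not the authors' code. The refractive indices (1.56 and 1.73), the layer thicknesses, the number of layer pairs, the substrate index, and all function names are hypothetical values chosen only for the example, and dispersion and absorption are ignored.

```python
import cmath
import math


def reflectance(wavelength_nm, layers, n_incident=1.0, n_substrate=1.5):
    """Normal-incidence reflectance of a thin-film stack.

    layers: list of (refractive_index, thickness_nm) tuples, ordered from
    the incident medium towards the substrate.
    """
    # Start from the identity matrix and multiply in each layer's
    # characteristic matrix.
    m11, m12, m21, m22 = 1.0 + 0j, 0j, 0j, 1.0 + 0j
    for n, d in layers:
        delta = 2.0 * math.pi * n * d / wavelength_nm  # phase thickness
        c, s = cmath.cos(delta), cmath.sin(delta)
        a11, a12, a21, a22 = c, 1j * s / n, 1j * n * s, c
        m11, m12, m21, m22 = (
            m11 * a11 + m12 * a21,
            m11 * a12 + m12 * a22,
            m21 * a11 + m22 * a21,
            m21 * a12 + m22 * a22,
        )
    b = m11 + m12 * n_substrate
    c_out = m21 + m22 * n_substrate
    r = (n_incident * b - c_out) / (n_incident * b + c_out)
    return abs(r) ** 2


def scale_thickness(layers, fraction):
    """Scale every thickness, e.g. fraction=-0.12 for 12% compression."""
    return [(n, d * (1.0 + fraction)) for n, d in layers]


def peak_wavelength(layers, lo_nm=380, hi_nm=780):
    """Visible wavelength (1 nm grid) with the highest reflectance."""
    return max(range(lo_nm, hi_nm + 1), key=lambda w: reflectance(w, layers))


# Hypothetical 10-pair reflector; indices and thicknesses are assumptions.
stack = [(1.56, 80.0), (1.73, 72.0)] * 10

if __name__ == "__main__":
    for label, fraction in [("-12%", -0.12), ("0%", 0.0), ("+5%", 0.05)]:
        peak = peak_wavelength(scale_thickness(stack, fraction))
        print(label, peak, "nm")
```

Because the phase thickness depends only on the ratio of optical thickness to wavelength, uniformly scaling the layers shifts the whole spectrum proportionally in this simplified model. Whether the perceived color changes then depends on the color-distance measure, which the sketch does not attempt to reproduce.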
Global Illumination (GI) is a technique that is employed in computer graphics to enhance realism. Various methods have been used to achieve this using computer-generated imagery. The most precise method involves conventional ray tracing, which yields highly realistic results but is computationally intensive and unsuitable for real-time applications. Alternatively, faster algorithms utilize post-processing on rasterization, making them more suitable for real-time scenarios. However, these algorithms are also resource-intensive and may produce inaccurate lighting due to limited information on screen-space features. Our proposal involves utilizing a Generative Adversarial Network (GAN) approach to achieve real-time GI effects, following the methodology of conventional screen-space GI techniques. We take surrounding graphical information into account by going beyond screen-space and producing consistent GI effects that are comparatively closer to their physically correct ray-tracing counterpart. Moreover, our model provides better output quality than another recent model that uses a similar approach, scoring 0.90811 in SSIM, 0.00093 in MSE, and 30.30576 dB in PSNR on our developed dataset.
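Two of the metrics reported in this abstract, MSE and PSNR, are directly related, and a minimal sketch of that relation follows. It assumes pixel intensities normalized to [0, 1], which the abstract does not state, and the function names are illustrative. SSIM is omitted because it needs windowed local statistics.

```python
import math


def mse(image_a, image_b):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(image_a) != len(image_b) or not image_a:
        raise ValueError("images must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(image_a, image_b)) / len(image_a)


def psnr_from_mse(error, peak=1.0):
    """Peak signal-to-noise ratio in dB for a given MSE and peak intensity."""
    if error == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / error)


def psnr(image_a, image_b, peak=1.0):
    return psnr_from_mse(mse(image_a, image_b), peak)


if __name__ == "__main__":
    # Under the [0, 1] assumption, the reported MSE maps to roughly 30.3 dB.
    print(round(psnr_from_mse(0.00093), 2))
```

Under that assumption the reported MSE of 0.00093 corresponds to roughly 30.3 dB, close to the reported PSNR. A small gap is expected if PSNR was averaged per image rather than computed from the averaged MSE.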
Knowing the pole axis of an asteroid is vital to autonomous asteroid exploration efforts. Ground-based initial pole estimation methods are time- and data-intensive and produce estimates with large uncertainties. These errors have a significant impact on proximity navigation, shape modeling, and scientific data for small body missions. In this paper, a new method of obtaining this information from onboard spacecraft imagery is presented. The proposed method estimates the pole from onboard infrared imagery using the camera-asteroid geometry. This method does not require a prior and, because it uses infrared images, is designed to work for the vast majority of approach trajectories. The method is applied to simulated infrared images of asteroids 101955 Bennu and 25143 Itokawa as well as real infrared images of asteroid 162173 Ryugu from the Hayabusa2 mission. The average pole errors using this method on Bennu and Itokawa images are approximately 2 and 6 deg, respectively. The pole estimate error on the Ryugu images is approximately 8 deg. The algorithm is shown to be sensitive to the percentage of spin period imaged and the spacing between the images.
Distributed ray tracing algorithms are widely used when rendering massive scenes, where data utilization and load balancing are the keys to improving performance. One essential observation is that rays are temporally coherent, which indicates that temporal information can be used to improve computational efficiency. In this paper, we use temporal coherence to optimize the performance of distributed ray tracing. First, we propose a temporal coherence-based scheduling algorithm to guide the task/data assignment and scheduling. Then, we propose a virtual portal structure to predict the radiance of rays based on the previous frame, and send the rays with low radiance to a precomputed simplified model for further tracing, which can dramatically reduce the traversal complexity and the overhead of network data transmission. The approach was validated on scenes of sizes up to 355 GB. Our algorithm can achieve a speedup of up to 81% compared to previous algorithms, with a very small mean squared error.
Edge-aware image smoothing refers to the removal of details with edges preserved. It is an essential topic in the field of image processing and computer graphics. In this paper, in order to achieve better edge preservation than the existing models, we propose a robust edge-preserving image filtering method based on a complementary weighting scheme. Both isotropic and anisotropic weights are involved in our model to adapt the fidelity and the regularization terms. To efficiently solve the proposed model, we introduce an effective algorithm based on additive half-quadratic minimization, the alternating direction method of multipliers, and Fourier domain optimization strategies. We experimentally validate the proposed filter on several low-level vision tasks. Both quantitative and qualitative experimental results show significant superiority of our proposed filter compared to existing techniques. Furthermore, the filter exhibits high efficiency and is able to process 720p color images in real time (over 10 fps) on an NVIDIA RTX 3070. Therefore, it is practical for real-world applications.
To address the significant challenges of high false positive and false negative rates in existing algorithms for detecting cervical fluid-based cells, an enhanced Yolov5s network is introduced. This paper details a novel approach that dynamically adjusts the weights of channels and the spatial attention in modules, substantially improving feature extraction from small objects and boosting the detection capabilities of the network. Furthermore, Mixup data augmentation technology is incorporated to counter the issue of imbalanced data categories in the custom dataset. The Complete Intersection over Union loss function is also employed to refine coordinate localization accuracy during training. Tested on the proprietary cervical cytology dataset, the modified Yolov5s achieves a mean Average Precision of 92.1%, surpassing the previous state-of-the-art by 5.6%. This enhancement substantiates the efficacy of the proposed model. Code and models are accessible at .
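For readers unfamiliar with the Complete Intersection over Union (CIoU) loss named in this abstract, the sketch below follows the commonly cited formulation: IoU minus a normalized center-distance penalty minus an aspect-ratio consistency term. It is an illustration, not the paper's training code. The box format `(x1, y1, x2, y2)`, the function names, and the `eps` stabilizer are assumptions.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """CIoU between two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1

    # Plain IoU.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = wa * ha + wb * hb - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers.
    rho2 = ((ax1 + ax2) - (bx1 + bx2)) ** 2 / 4.0 \
        + ((ay1 + ay2) - (by1 + by2)) ** 2 / 4.0

    # Squared diagonal of the smallest box enclosing both.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    c2 = enclose_w ** 2 + enclose_h ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan(wb / (hb + eps)) - math.atan(wa / (ha + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return iou - rho2 / c2 - alpha * v


def ciou_loss(predicted, target):
    """Loss is zero for a perfect match and grows with misalignment."""
    return 1.0 - ciou(predicted, target)
```

Unlike plain IoU, the center-distance term still gives a useful signal when the predicted and target boxes do not overlap, which is one reason the loss is often used for localization.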
Structure from Motion (SfM) is a computer vision technique used to reconstruct three-dimensional (3D) structures from a series of two-dimensional (2D) images or video frames. However, SfM tools struggle with transparent objects, reflective surfaces, and low-resolution frames. In such situations, image-based interactive 3D modeling software packages are employed to model 3D objects and measure dimensions. Our contributions to this work are twofold. First, we have introduced new tools to improve 3D modeling software packages; such tools are aimed at easing the workload for users. Second, we have conducted a comprehensive user study to evaluate the efficacy of popular 3D modeling software packages. The task is to measure certain dimensions for which ground truth measurements are already known. A relative error is calculated for every measurement. Each software tool is evaluated through a survey form, event logs, and the relative error of its measurements. The results of this user study clearly show that our approach to 3D modeling using multiple images has a lower relative error and produces higher quality 3D models than other software packages. In addition, they show that our new tools reduce the time required to complete a task.
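The per-measurement relative error described in this abstract can be sketched as below. This assumes the usual definition, the absolute difference divided by the magnitude of the ground truth; the abstract does not spell out the formula, and the function names are illustrative.

```python
def relative_error(measured, ground_truth):
    """Relative error of one measurement against its known true value."""
    if ground_truth == 0:
        raise ValueError("ground truth must be non-zero")
    return abs(measured - ground_truth) / abs(ground_truth)


def mean_relative_error(pairs):
    """Average relative error over (measured, ground_truth) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one measurement is required")
    return sum(relative_error(m, g) for m, g in pairs) / len(pairs)
```

Dividing by the ground truth makes errors comparable across dimensions of different sizes, so a 5 mm error on a 100 mm edge and a 50 mm error on a 1000 mm edge both count as 5%.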
Rubber trees in coastal habitats are exposed to a high degree of wind stress. An algorithm-hardware synergetic methodology was developed for investigating and predicting rubber tree phenotyping excited by strong winds. The framework includes (1) a custom-designed industrial fan that recreates a variable airflow field at wind speeds of 15, 30 and 45 m/s, coupled with a terrestrial laser scanner and bundled motion sensors to acquire point clouds and vibration data; (2) a graphic model that approximates tree canopies based on foliage clumps, with phenotypic traits derived from point clouds captured while trees are subjected to aerodynamic drag; (3) a forest-specialized k-epsilon turbulence model that calculates the wind characteristic parameters of forest canopies by combining the constructed tree models with a grid-scale subdivision of the wind fluid field; and (4) a digital twin model that incorporates detailed tree phenotypic traits and plant mechanical characteristics, depicting the wind-induced actions of target trees under various wind influences. The results show that tree crowns with spreading forms are prone to yield larger pendulum amplitudes than compact crowns, but trees directly exposed to wind exhibit greater crown volume reductions than trees in sheltered areas. Within tree canopies, a one-fold increase in inlet wind speed intensified crown compression (approximately 17% decrease in crown volume), generated 2.1-fold pressure gradients and increased turbulence kinetic energy by approximately 60%. Moreover, the entire scenario of the adaptation of experimental trees to wind perturbations was visually restored using digital twin techniques, serving as an integral behaviour dataset for further data-driven decision-making. In summary, this paper presents a comprehensive methodology that can decipher the phenotypic manifestations of trees' reactions to wind hazards, with potential applications in phenotyping or e
This special section contains three papers, which are extended contributions of original works presented at the 25th SIBGRAPI - Conference on Graphics, Patterns and Images. Started in 1988, SIBGRAPI has been the main scientific event in computer graphics, image processing, and related areas in Brazil. In 2012, celebrating the 25th anniversary of SIBGRAPI, it was held in the historical city of Ouro Preto, Minas Gerais. Each year, the best papers are selected and their authors are invited to submit extended versions to high-quality journals. After a rigorous peer-review process, the three contributions published in this section were selected from four invited submissions for publication in Computers & Graphics.