ISBN: 9798400717949 (print)
Recent advances in volumetric video technologies are opening the door to new interactive and immersive experiences. This paper presents novel technological enablers for realistic volumetric reconstructions of real-world elements. On the one hand, novel Neural Radiance Field (NeRF) based methods are proposed to increase the quality of 3D scene and object reconstructions, with improved surface smoothness. On the other hand, a real-time volumetric video pipeline is proposed to overcome the limitations in visual quality and resource usage (bandwidth and processing) of state-of-the-art solutions adopted in the holographic communications domain. Preliminary results demonstrate not only the benefits of these enablers compared to state-of-the-art solutions, but also the potential of their joint integration to provide compelling and engaging interactive virtual experiences.
ISBN: 9781665487894 (print)
In recent years, with the development of digital technology, digital image processing has been widely and deeply applied in the field of computer graphics. A digital image processing system is a complex real-time system: it obtains image information from cameras, fax machines, and other scanning equipment and, after digitization, applies coding, filtering, enhancement, restoration, compression, storage, and other processing to the digital image data before finally generating a visual image. This design uses the TMS320C6748 as the core processor of the system, the SAA7113 as the video decoding chip, a CPLD as the sampling controller, and a DDR2 chip as the external expansion memory. The ROM expansion uses a NAND flash memory chip. Image processing is completed by combining software algorithms with this hardware design. The system can be used for image processing and transmission in information communication, image recognition, on-site news reporting, and other fields. This paper analyzes the hardware structure and data processing algorithm of the system in detail. The experimental results show that the system can not only obtain a higher compression ratio but also reduce the distortion of the reconstructed image. The image processing system has a certain practicality.
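The two quantities reported in the abstract above, compression ratio and reconstruction distortion, can be illustrated with a minimal sketch. The abstract does not say which distortion measure was used; PSNR over 8-bit pixel values is an assumption made here for illustration, and the function names are hypothetical.

```python
import math


def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Ratio of original size to compressed size (higher means more compression)."""
    if compressed_bytes <= 0:
        raise ValueError("compressed_bytes must be positive")
    return original_bytes / compressed_bytes


def psnr(original, reconstructed, max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumption: pixels are flattened into a 1D sequence and lie in [0, max_value].
    """
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the original and reconstructed pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10 * math.log10(max_value ** 2 / mse)
```

A higher PSNR corresponds to lower distortion, so a codec that improves on both counts raises the compression ratio without lowering PSNR.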
ISBN: 9798400707032 (print)
With the rapid development of target segmentation techniques, the YOLO family of algorithms has become popular due to its efficiency. In this paper, we propose an improved YOLOv8 model aimed at improving the performance of instance segmentation of aircraft images. We enhance the model's ability to capture the global dependencies of the aircraft in the image by introducing a Non-local attention mechanism, while integrating a bidirectional feature pyramid network (BiFPN) for finer feature fusion. Experiments conducted on the aircraft category of the publicly available COCO dataset show that the improved YOLOv8 model outperforms the original model on several performance metrics, with especially significant improvements in detection accuracy against complex backgrounds. These improvements provide effective technical support for real-time aircraft detection and segmentation, demonstrating the potential of attention mechanisms and advanced feature fusion techniques for practical applications.
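To illustrate what "capturing global dependencies" means for a non-local operation, here is a minimal sketch in which every spatial position attends to every other position. It is a simplification, not the model from the abstract: the learned projections of a full non-local block are replaced by identity mappings, and the function name and data layout are assumptions.

```python
import math


def non_local_block(features):
    """Apply a simplified non-local operation with a residual connection.

    features: list of N position vectors, each a list of C floats
    (a feature map flattened over its spatial positions).
    Assumption: identity embeddings instead of learned projections.
    """
    output = []
    for xi in features:
        # Pairwise similarity between this position and every position.
        scores = [sum(a * b for a, b in zip(xi, xj)) for xj in features]
        # Softmax over all positions (shifted by the max for stability).
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of all position vectors: the global context.
        context = [
            sum(w * xj[c] for w, xj in zip(weights, features))
            for c in range(len(xi))
        ]
        # Residual connection: original feature plus global context.
        output.append([a + b for a, b in zip(xi, context)])
    return output
```

Because the weights span all positions, the cost grows quadratically with the number of positions, which is one reason such blocks are usually inserted at low-resolution stages of a network.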
Image processing and analysis make extensive use of image scaling. In digital image processing, resizing an image is referred to as image scaling. When images are magnified, one of the most important factors is their ...
Video Snapshot Compressive Imaging (SCI) uses a low-speed 2D camera to capture high-speed scenes as snapshot compressed measurements, followed by a reconstruction algorithm to retrieve the high-speed video frames. The...
The real-time obstacle detection and path adjustment system for autonomous robots presented in this paper was created using OpenCV. The combination of image processing techniques enables the robot to identify and navi...
ISBN: 9781728198354 (print)
Real-world image recognition systems often face corrupted input images, which cause distribution shifts and degrade the performance of models. These systems often use a single prediction model in a central server and process images sent from various environments, such as cameras distributed in cities or cars. Such single models face images corrupted in heterogeneous ways at test time. Thus, they must adapt instantly to multiple corruptions during testing rather than being re-trained at a high cost. Test-time adaptation (TTA), which aims to adapt models without accessing the training dataset, is one of the settings that can address this problem. Existing TTA methods indeed work well on a single corruption. However, their adaptation ability is limited when multiple types of corruption occur, which is the more realistic case. We hypothesize that this is because the distribution shift is more complicated, and adaptation becomes more difficult, under multiple corruptions. In fact, we experimentally found that a larger distribution gap remains after TTA. To address the distribution gap during testing, we propose a novel TTA method named Covariance-Aware Feature alignment (CAFe). We empirically show that CAFe outperforms prior TTA methods on image corruptions, including multiple types of corruption.
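The notion of a distribution gap between source and test-time features can be made concrete with a small sketch. This is not the CAFe objective, which the abstract does not specify; it only illustrates one generic way to compare feature statistics, assuming that the source mean and covariance were stored ahead of time so the training data itself is not needed. All names are hypothetical.

```python
def mean_and_cov(batch):
    """Per-dimension mean and (biased) covariance matrix of a feature batch.

    batch: list of feature vectors, each a list of floats of equal length.
    """
    n, d = len(batch), len(batch[0])
    mean = [sum(row[k] for row in batch) / n for k in range(d)]
    cov = [
        [
            sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in batch) / n
            for j in range(d)
        ]
        for i in range(d)
    ]
    return mean, cov


def alignment_gap(source_stats, test_batch):
    """Squared mean difference plus squared Frobenius covariance difference.

    Assumption: source_stats = (mean, cov) was precomputed on source features.
    A TTA method could minimize a quantity of this kind over model parameters.
    """
    src_mean, src_cov = source_stats
    mean, cov = mean_and_cov(test_batch)
    d = len(mean)
    mean_term = sum((a - b) ** 2 for a, b in zip(mean, src_mean))
    cov_term = sum(
        (cov[i][j] - src_cov[i][j]) ** 2 for i in range(d) for j in range(d)
    )
    return mean_term + cov_term
```

Including the off-diagonal covariance entries, rather than only per-dimension variances, is what makes such a measure sensitive to changes in how feature dimensions co-vary.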
ISBN: 9798331527662 (print)
With the rapid development of the mobile Internet in recent years, mobile short video platforms such as Douyin and Kuaishou have shown strong growth in China. To alleviate the problem of "information overload" in short videos, effective recommendation algorithms are needed to help users find the short videos they are interested in. Traditional recommendation algorithms can be divided into three basic categories: collaborative filtering based, content-based, and popularity-based. Except for popularity-based algorithms, both traditional and deep learning recommendation algorithms use the similarities between users or items to "cluster" things and "group" people, so that they can achieve personalized recommendation. Traditional recommendation algorithms are simple, have few model parameters, and offer good online real-time performance, but their expressive power is weak. Deep learning recommendation algorithms have strong expressive power and can mine more hidden patterns in the data, but they have many model parameters and occupy more space resources such as memory and video memory. Their offline training and testing times are much higher than those of traditional recommendation models, and they also pose great challenges to the real-time performance of online recommendation systems. In this paper, we propose a personalized short video recommendation algorithm based on an improved THACIL model, which makes important modifications to the attention mechanism of the original THACIL model. Compared with the original THACIL model, the improved algorithm can not only make use of the advantages of deep learning, but also reduce the complexity of the model and greatly reduce the scale of model parameters. When the relevant indicators such as AUC, accuracy rate and recall rate are slightly higher than the o
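As an illustration of the traditional, similarity-based approach that the abstract above contrasts with deep learning models, here is a minimal item-based collaborative filtering sketch. It is unrelated to the THACIL model itself; the data layout, scoring rule, and function names are assumptions chosen for brevity.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(interactions, user, top_k=2):
    """Rank unseen items by similarity to the items the user interacted with.

    interactions: dict mapping user -> dict mapping item -> score
    (hypothetical format, e.g. 1.0 for a watched video).
    """
    users = sorted(interactions)
    items = sorted({i for scores in interactions.values() for i in scores})
    # Represent each item as a vector of scores over all users.
    vectors = {i: [interactions[u].get(i, 0.0) for u in users] for i in items}
    seen = interactions[user]
    ranking = {}
    for candidate in items:
        if candidate in seen:
            continue
        # Sum of similarities to seen items, weighted by the user's scores.
        ranking[candidate] = sum(
            cosine(vectors[candidate], vectors[s]) * score
            for s, score in seen.items()
        )
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(ranking, key=lambda i: (-ranking[i], i))[:top_k]
```

The model here is just the interaction table, which matches the abstract's point that traditional methods are cheap to serve but limited in what they can express.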
Real-time Magnetic Resonance Imaging (rtMRI) is frequently used in speech production studies as it provides a complete view of the vocal tract during articulation. This study investigates the effectiveness of rtMRI in...
ISBN: 9798350305081 (print)
We present the system architecture for real-time processing of data that originates in large format tiled imaging arrays used in wide area motion imagery ubiquitous surveillance. High performance and high throughput are achieved through approximate computing and fixed-point variable precision (6 bits to 18 bits) arithmetic. The architecture implements a variety of processing algorithms in what we consider today as Third Wave AI and Machine Intelligence, ranging from convolutional neural networks (CNNs) to linear and non-linear morphological processing, probabilistic inference using exact and approximate Bayesian methods, and Deep Neural Network based classification. The processing pipeline is implemented entirely using event-based neuromorphic and stochastic computational primitives. An emulation of the system architecture demonstrated real-time processing of 160 x 120 raw pixel data running on a reconfigurable computing platform (5 Xilinx Kintex-7 FPGAs). The reconfigurable computing implementation was developed to emulate the computational structures for a 2.5D System chiplet design, which was fabricated in the 55nm GF CMOS technology. To optimize the energy efficiency of a mixed level system, a general energy-aware methodology is applied throughout the design process at all levels, from algorithms and architecture all the way down to technology and devices, while keeping to the operational requirements and specifications for the task at hand.
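The trade-off behind variable precision fixed-point arithmetic can be shown with a short sketch that quantizes a real value at different word lengths. The abstract gives only the 6 to 18 bit range; the signed format, the split between integer and fractional bits, round-to-nearest, and saturation on overflow are all assumptions made here, and the function names are hypothetical.

```python
def to_fixed(value: float, total_bits: int, frac_bits: int) -> int:
    """Quantize a real value to a signed fixed-point integer code.

    Assumptions: two's-complement range, round-to-nearest, saturation on overflow.
    """
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    code = round(value * scale)
    return max(lowest, min(highest, code))


def from_fixed(code: int, frac_bits: int) -> float:
    """Convert a fixed-point integer code back to a real value."""
    return code / (1 << frac_bits)


def quantization_error(value: float, total_bits: int, frac_bits: int) -> float:
    """Absolute error introduced by a round trip through fixed point."""
    return abs(value - from_fixed(to_fixed(value, total_bits, frac_bits), frac_bits))
```

For example, `quantization_error(0.3, 6, 4)` is far larger than `quantization_error(0.3, 18, 16)`, which is the kind of accuracy given up in exchange for narrower, cheaper arithmetic units when a stage tolerates approximation.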