This paper investigates video frame extrapolation, which predicts future frames from current and past frames. Although there have been many studies on video frame extrapolation in recent years, most of them suffer from unsatisfactory image quality in the predicted frames, such as severe blurring, because it is difficult to predict the movement of future pixels for multi-modal video frames, especially when frames change quickly. An additional process such as frame alignment or recurrent prediction can improve the quality of the predicted frames, but it hinders real-time extrapolation. Motivated by the significant progress in video frame interpolation using deep learning-based flow estimation, a simplified video frame extrapolation scheme using deep learning-based uni-directional flow estimation is proposed to reduce the processing time compared to conventional video frame extrapolation schemes without compromising the image quality of the predicted frames. In the proposed scheme, the uni-directional flow is first estimated from the current and past frames through a flow network consisting of four flow blocks, and the current frame is then forward-warped through the estimated flow to predict a future frame. The proposed flow network is trained and evaluated using the Vimeo-90K triplet dataset. The performance of the proposed scheme is analyzed using the trained flow network in terms of prediction time and the similarity between predicted and ground-truth frames, measured by the structural similarity index measure and the mean absolute error of pixels, and is compared to that of state-of-the-art schemes such as the Iterative and cycleGAN schemes. Extensive experiments show that the proposed scheme improves prediction quality by 2.1% and reduces prediction time by 99.7% compared to the state-of-the-art scheme.
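The forward-warping step and the mean-absolute-error metric mentioned in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the flow here is supplied by hand rather than estimated by a network, it is assumed to give each pixel's displacement from the current frame to the future frame, and the function names `forward_warp` and `mean_absolute_error`, the nearest-pixel splatting, and the hole-filling rule are all assumptions made for illustration.

```python
def forward_warp(frame, flow):
    """Forward-warp a grayscale frame (a list of rows) by a per-pixel flow.

    flow[y][x] is an assumed (dx, dy) displacement from the current frame to
    the future frame. Each source pixel is splatted to its rounded target
    position; targets that receive no pixel (holes) fall back to the value
    of the current frame at that position.
    """
    h, w = len(frame), len(frame[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            tx, ty = x + int(round(dx)), y + int(round(dy))
            if 0 <= tx < w and 0 <= ty < h:
                out[ty][tx] = frame[y][x]  # later writes overwrite earlier ones
    # Crude hole filling (an assumption, not taken from the abstract).
    for y in range(h):
        for x in range(w):
            if out[y][x] is None:
                out[y][x] = frame[y][x]
    return out


def mean_absolute_error(a, b):
    """Mean absolute pixel difference between two equally sized frames."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


# Toy example: a single bright pixel moving one step to the right.
current = [[0, 9, 0, 0]]
flow = [[(1, 0)] * 4]
ground_truth = [[0, 0, 9, 0]]
predicted = forward_warp(current, flow)
```

In this toy case the warped frame matches the ground truth exactly, whereas simply repeating the current frame would not. Practical systems typically use sub-pixel, differentiable splatting so the flow network can be trained end to end.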
Autonomous vehicles require real-time image processing to improve their capabilities by allowing them to understand and respond appropriately to their environment. This paper examines the present state of real-time im...
Edge-device-based object detection is crucial in many real-world applications, such as self-driving cars, ADAS, and driver behavior analysis. Although deep learning (DL) has become the de facto approach for object detection, the limited computing resources of embedded devices and the large model size of current DL-based methods make real-time object detection on edge devices difficult. To overcome these difficulties, this work proposes a novel YOLOv4-dense model that detects objects accurately and quickly; it is built on top of the YOLOv4 framework but with substantial improvements. More specifically, many CSP layers are pruned because they slow down inference. To address the problem of losing small objects, a dense block is introduced. In addition, a lightweight two-stream YOLO head is designed to further reduce the computational complexity of the model. Experimental results on the NVIDIA JETSON TX2 embedded platform demonstrate that YOLOv4-dense achieves higher accuracy and faster speed with a smaller model size. For instance, on the KITTI dataset, YOLOv4-dense obtains 84.3% mAP and 22.6 FPS with only 20.3 M parameters, surpassing state-of-the-art models with a comparable parameter budget, such as YOLOv3-tiny, YOLOv4-tiny, and PP-YOLO-tiny, by a large margin.
Addressing the limitations of current video display solutions in terms of channel capacity, this article introduces a multi-channel independent video merging and real-time display system powered by Field Programmable ...
This paper discusses the application of computer vision and advanced computational algorithms in evaluating the teaching risk and teaching effect of track and field. Because of the inherent uncertainty and risk of PE and sports activities (especially track and field), it is necessary to establish an effective method of analyzing and processing video data to detect and track moving objects, so as to identify potential risks in real time. This method not only improves the safety of students in track and field classes, but also provides valuable insights for improving teaching methods and reducing sports injuries. This paper discusses the background subtraction motion detection algorithm, which is very important for dynamic image modeling and shadow suppression, and can realize accurate motion state detection. The ultimate goal is to ensure the healthy development of school sports and optimize the teaching results of track and field sports.
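As a rough illustration of background-subtraction motion detection in general, the sketch below keeps a running-average background model and flags pixels that deviate from it. The abstract does not specify the paper's actual algorithm, so the running-average model, the function names `update_background` and `motion_mask`, and the `alpha` and `threshold` values are all assumptions; shadow suppression is not covered here.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background model (running average).

    alpha is an assumed learning rate: larger values adapt faster to
    scene changes but absorb slow-moving objects sooner.
    """
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(row_b, row_f)]
        for row_b, row_f in zip(background, frame)
    ]


def motion_mask(background, frame, threshold=25):
    """Mark pixels whose absolute difference from the background exceeds
    an assumed intensity threshold as moving (True)."""
    return [
        [abs(f - b) > threshold for b, f in zip(row_b, row_f)]
        for row_b, row_f in zip(background, frame)
    ]


# Toy example: a static 2x2 scene in which one pixel suddenly brightens.
background = [[10.0, 10.0], [10.0, 10.0]]
frame = [[10, 200], [10, 10]]
mask = motion_mask(background, frame)
updated = update_background(background, frame, alpha=0.5)
```

Only the changed pixel is flagged as moving, and the background model drifts toward the new value at a rate set by `alpha`.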
With the increasing demand for multimedia on the Internet, video technology has gradually become the mainstream of multimedia transmission on the Internet. In order to avoid overflow and underflow of buffer in transmi...
This research paper presents a novel Virtual Gym Tracker AI Pose Estimation system designed to enhance virtual fitness experiences. Leveraging advanced deep learning techniques and real-time image analysis, the syste...
For safety and security reasons, the indoor/outdoor working environments of various industries require the use of many cameras for automated surveillance. In such a context, a major challenge for automated monitoring sy...
The arrival of Sora marks a new era for text-to-video diffusion models, bringing significant advancements in video generation and potential applications. However, Sora, along with other text-to-video diffusion models,...