Search results - Inner Mongolia University Library

Special issue on smart cameras for real-time image and video processing

Journal of Real-Time Image Processing, 2020, vol. 17, no. 6, pp. 1755-1756

Authors: Shan, Caifeng; Brea, Victor Manuel; Velipasalar, Senem. Affiliations: Shandong Univ Sci & Technol, Qingdao, Peoples R China; Univ Santiago de Compostela, La Coruna, Spain; Syracuse Univ, Syracuse, NY, USA

A Video Frame Extrapolation Scheme Using Deep Learning-Based Uni-Directional Flow Estimation and Pixel Warping

IEEE Access, 2023, vol. 11, pp. 105885-105891

Authors: Ban, Tae-Won. Affiliation: Gyeongsang Natl Univ, Dept Intelligent Commun Engn, Jinju 52828, Gyeongsangnam, South Korea

This paper investigates video frame extrapolation, which can predict future frames from current and past frames. Although there have been many studies on video frame extrapolation in recent years, most of them suffer from the unsatisfactory image quality of the predicted frames such as severe blurring because it is difficult to predict the movement of future pixels for multi-modal video frames, especially with fast changing frames. An additional process such as frame alignment or recurrent prediction can improve the quality of the predicted frames, but it hinders real-time extrapolation. Motivated by the significant progress in video frame interpolation using deep learning-based flow estimation, a simplified video frame extrapolation scheme using deep learning-based uni-directional flow estimation is proposed to reduce the processing time compared to conventional video frame extrapolation schemes without compromising the image quality of the predicted frames. In the proposed scheme, the uni-directional flow is first estimated from the current and past frames through a flow network consisting of four flow blocks and the current frame is forward-warped through the estimated flow to predict a future frame. The proposed flow network is trained and evaluated using the Vimeo-90K triplet dataset. The performance of the proposed scheme is analyzed using the trained flow network in terms of prediction time as well as the similarity between predicted and ground truth frames such as the structural similarity index measure and mean absolute error of pixels, and compared to that of the state-of-the-art schemes such as Iterative and cycleGAN schemes. Extensive experiments show that the proposed scheme improves prediction quality by 2.1% and reduces prediction time by 99.7% compared to the state-of-the-art scheme.

Keywords: video frame extrapolation; video frame prediction; flow estimation; flow network; deep learning
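
The forward-warping step and the mean-absolute-error metric named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the flow here is a hand-written integer displacement per pixel rather than the output of a flow network, holes are filled with the current-frame value, and collisions are resolved by last-write-wins. All names (`forward_warp`, `mean_absolute_error`) are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[float]]            # grayscale image, indexed frame[y][x]
Flow = List[List[Tuple[int, int]]]   # per-pixel (dx, dy) displacement (assumed integer)


def forward_warp(current: Frame, flow: Flow) -> Frame:
    """Move each pixel of `current` along its flow vector to predict the next frame."""
    height, width = len(current), len(current[0])
    # Start from a copy so that unfilled holes keep the current-frame value
    # (a simple hole-filling assumption, not taken from the paper).
    predicted = [row[:] for row in current]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            tx, ty = x + dx, y + dy
            # Pixels that would leave the frame are dropped.
            if 0 <= tx < width and 0 <= ty < height:
                predicted[ty][tx] = current[y][x]
    return predicted


def mean_absolute_error(a: Frame, b: Frame) -> float:
    """Average absolute pixel difference between two equally sized frames."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


current = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
shift_right = [[(1, 0)] * 3 for _ in range(2)]   # every pixel moves one column right
predicted = forward_warp(current, shift_right)
```

With a uniform one-pixel shift, the first column is a hole and keeps its old value, which shows why practical schemes need a better strategy for occluded or newly revealed regions.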

Real Time Image Processing for Autonomous Vehicles

2024 Asian Conference on Intelligent Technologies, ACOIT 2024

Authors: Chandrashekhar, Mhamane Sanjeev; Bachute, Bhagyashri Amol; Gadgoli, Amruta K.; Rankhamb, Dinesh Dattatraya; Madri, Shrinivas. Affiliations: Department of Electronics and Telecommunication Engineering, Shree Siddheshwar Women's College of Engineering, Maharashtra, Solapur 413002, India; Department of Artificial Intelligence & Data Science, VVP Institute of Technology, Maharashtra, Solapur 413008, India

ISBN: (print) 9798350374933

Autonomous vehicles require real-time image processing to improve their capabilities by allowing them to understand and respond appropriately to their environment. This paper examines the present state of real-time image processing for self-driving vehicles, including the techniques employed, challenges, and advancements. This article investigates methods such as semantic segmentation, object recognition, categorization, and depth estimation, with an emphasis on enhancing vehicle perception, navigation, and decision-making. The article delves into the implementations of significant algorithms on embedded platforms, their computational efficiency, and their deployment in real-world scenarios. In conclusion, the report investigates prospective avenues for additional research to improve the reliability and efficiency of the real-time image processing systems of autonomous automobiles. © 2024 IEEE.

Keywords: Semantic Segmentation

YOLOv4-dense: A smaller and faster YOLOv4 for real-time edge-device based object detection in traffic scene

IET Image Processing, 2023, vol. 17, no. 2, pp. 570-580

Authors: Jiang, Yue; Li, Wenjing; Zhang, Jun; Li, Fang; Wu, Zhongcheng. Affiliations: Chinese Acad Sci, High Magnet Field Lab, HFIPS, Hefei 230031, Peoples R China; Univ Sci & Technol China, Hefei, Peoples R China; High Magnet Field Lab Anhui Prov, Hefei, Peoples R China

Edge-device-based object detection is crucial in many real-world applications, such as self-driving cars, ADAS, and driver behavior analysis. Although deep learning (DL) has become the de facto approach for object detection, the limited computing resources of embedded devices and the large model size of current DL-based methods make real-time object detection on edge devices difficult. To overcome these difficulties, this work proposes a novel YOLOv4-dense model that detects objects accurately and quickly; it is built on top of the YOLOv4 framework but with substantial improvements. More specifically, many CSP layers are pruned because they slow down inference. To address the problem of losing small objects, a dense block is introduced. In addition, a lightweight two-stream YOLO head is designed to further reduce the computational complexity of the model. Experimental results on the NVIDIA JETSON TX2 embedded platform demonstrate that YOLOv4-dense achieves higher accuracy and faster speed with a smaller model size. For instance, on the KITTI dataset, YOLOv4-dense obtains 84.3% mAP and 22.6 FPS with only 20.3 M parameters, surpassing state-of-the-art models with a comparable parameter budget, such as YOLOv3-tiny, YOLOv4-tiny, and PP-YOLO-tiny, by a large margin.

Keywords: traffic engineering computing; NVIDIA JETSON TX2 embedded platform; feature extraction; dense block; YOLOv4-dense obtains; edge-device-based object detection; automobiles; driver behavior analysis; current DL-based methods; smaller YOLOv4; neural nets; YOLOv4-tiny; faster YOLOv4; optical, image and video signal processing; computer vision and image processing techniques; convolutional neural nets; object detection; deep learning (artificial intelligence); smaller model size; novel YOLOv4-dense model; YOLOv4 framework; edge devices; YOLOv3-tiny; embedded devices; real-time object detection; real-time edge-device; computational complexity

Implementation of Highly Modular and Scalable Video Processing System Design Based on FPGA

17th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2024

Authors: Mei, Weichun; Shen, Shuhua; Wang, Fucheng; Liang, Laili; Shen, Shuxian; Yang, Zhenyu; Zhu, Genwang; Lu, Zhiyang; Sun, Maoyun. Affiliations: School of Communication and Electronics Engineering, East China Normal University, Shanghai 200241, China; School of Computer Engineering, Jiangsu University of Technology, Jiangsu, Changzhou 213001, China; College of Physics and Information Engineering, Fuzhou University, Fujian, Fuzhou 350108, China

ISBN: (print) 9798331507398

Addressing the limitations of current video display solutions in terms of channel capacity, this article introduces a multi-channel independent video merging and real-time display system powered by Field Programmable Gate Array (FPGA). The highly modular system enables scalable channel expansion through subsystem cascading, catering to diverse video processing needs. It accepts video signals ranging from 480P to 1080P, supporting both progressive and interlaced scanning formats. The videos are processed, reconstructed, and merged for seamless display, theoretically supporting any number of channels that are multiples of four. The subsystems can be cascaded flexibly without specific order constraints, and they can operate independently, ensuring stable performance. Parallel processing and DDR3-SDRAM caching technology enable real-time video processing and stable display. The Intel Cyclone V FPGA controls the entire process, including input/output, scaling, splicing, and cascading. This system finds diverse practical applications across various industries, including video surveillance, gaming, meetings, and medical environments. © 2024 IEEE.

Keywords: video signal processing
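
The merging step described in the abstract above (several independent channels spliced into one display frame) can be pictured in software as tiling. The sketch below is only an illustration in Python, not the FPGA design: it assumes four channels that have already been scaled to the same size and places them in a 2x2 grid, a layout the abstract does not specify. The function name `merge_2x2` is hypothetical.

```python
from typing import List, Sequence

Frame = List[List[int]]  # one channel's frame as rows of pixel values


def merge_2x2(channels: Sequence[Frame]) -> Frame:
    """Tile four equally sized frames: channels 0 and 1 on top, 2 and 3 below."""
    if len(channels) != 4:
        raise ValueError("expected exactly four channels")
    height, width = len(channels[0]), len(channels[0][0])
    for frame in channels:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all channels must already be scaled to the same size")
    # Concatenate rows side by side, then stack the two halves vertically.
    top = [channels[0][y] + channels[1][y] for y in range(height)]
    bottom = [channels[2][y] + channels[3][y] for y in range(height)]
    return top + bottom


merged = merge_2x2([[[1, 1]], [[2, 2]], [[3, 3]], [[4, 4]]])
```

Under this assumed layout, cascading subsystems would amount to feeding one merged frame in as a channel of the next stage after rescaling.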

Computer Vision and Advanced Computational Algorithms for Risk Assessment and Performance Enhancement in Track and Field Teaching

International Journal of e-Collaboration, 2025, vol. 21, no. 1

Authors: Cui, Liming; Li, Lemin. Affiliations: Jiaozuo Univ, Fac Tai Chi Boxing, Jiaozuo, Peoples R China; Jiaozuo Normal Coll, Fac Foreign Languages & Business, Jiaozuo 454000, Peoples R China

This paper discusses the application of computer vision and advanced computational algorithms in evaluating the teaching risk and teaching effect of track and field. Because of the inherent uncertainty and risk of PE and sports activities (especially track and field), it is necessary to establish an effective method of analyzing and processing video data to detect and track moving objects, so as to identify potential risks in real time. This method not only improves the safety of students in track and field classes, but also provides valuable insights for improving teaching methods and reducing sports injuries. This paper discusses the background subtraction motion detection algorithm, which is very important for dynamic image modeling and shadow suppression, and can realize accurate motion state detection. The ultimate goal is to ensure the healthy development of school sports and optimize the teaching results of track and field sports.

Keywords: Instructional Effect Assessment; Target Detection; Track and Field Sports; Video Analysis
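
Background subtraction, the motion detection technique named in the abstract above, can be sketched in a few lines. This is a generic running-average variant, not the paper's algorithm; it does not attempt shadow suppression, and the parameter values (`alpha`, `threshold`) and function names are assumptions chosen for illustration.

```python
from typing import List

Frame = List[List[float]]  # grayscale image, indexed frame[y][x]


def update_background(background: Frame, frame: Frame, alpha: float = 0.05) -> Frame:
    """Blend the new frame into the background model (exponential running average)."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
        for brow, frow in zip(background, frame)
    ]


def motion_mask(background: Frame, frame: Frame, threshold: float = 25.0) -> List[List[bool]]:
    """Mark pixels whose absolute difference from the background exceeds the threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(brow, frow)]
        for brow, frow in zip(background, frame)
    ]


background = [[10.0, 10.0], [10.0, 10.0]]
frame = [[10.0, 200.0], [10.0, 10.0]]      # one pixel changed sharply
mask = motion_mask(background, frame)
background = update_background(background, frame, alpha=0.5)
```

A small `alpha` makes the background adapt slowly, so a briefly moving athlete stays in the foreground mask instead of being absorbed into the model.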

Research of Technology about Video Control Based on Real-time Transmission

7th IEEE International Conference on Information Systems and Computer Aided Education, ICISCAE 2024

Authors: Wang, Haoxue; Li, Jinglan; Liu, Huaiqiang. Affiliation: Linyi University, Linyi, China

ISBN: (print) 9798350350760

With the increasing demand for multimedia on the Internet, video has gradually become the mainstream of multimedia transmission on the Internet. Because the video bit rate cannot exactly track changes in network bandwidth, this paper proposes an algorithm that forcibly adjusts the transmission rate in order to avoid buffer overflow and underflow during transmission and to keep the image visually continuous for the viewer. Building on the characteristics of the VTP adaptive coding and packaging scheme, an improved packaging scheme is proposed for real-time video control. Experimental results on an end-to-end video transmission system show that the video transmission control technology can improve the packet loss rate and real-time transmission. © 2024 IEEE.

Keywords: Packet loss
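
The idea of adjusting the transmission rate to keep a buffer away from overflow and underflow can be sketched as a simple threshold controller. The abstract does not give the paper's algorithm, so everything below is an assumption: the buffer is taken to be a receiver-side buffer, and the watermarks, step size, rate limits, and the name `adjust_rate` are hypothetical.

```python
def adjust_rate(rate_kbps: float, buffer_level: float, capacity: float,
                low: float = 0.2, high: float = 0.8, step: float = 0.1,
                min_rate: float = 100.0, max_rate: float = 8000.0) -> float:
    """Return a new sending rate based on how full the receiver buffer is."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    occupancy = buffer_level / capacity
    if occupancy > high:
        # Buffer is close to overflow: slow the sender down.
        rate_kbps *= (1.0 - step)
    elif occupancy < low:
        # Buffer is close to underflow: speed the sender up.
        rate_kbps *= (1.0 + step)
    # Keep the rate inside the assumed operating range.
    return max(min_rate, min(max_rate, rate_kbps))


slower = adjust_rate(1000.0, buffer_level=90, capacity=100)
faster = adjust_rate(1000.0, buffer_level=10, capacity=100)
steady = adjust_rate(1000.0, buffer_level=50, capacity=100)
```

Between the two watermarks the rate is left alone, which avoids oscillating on every small change in buffer level.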

Virtual Gym Tracker: AI Pose Estimation

2nd IEEE International Conference on Advances in Information Technology, ICAIT 2024

Authors: Rane, Milind; Date, Ameya; Deshmukh, Vaishnavi; Deshpande, Prutha; Dharmadhikari, Aryan. Affiliation: VIT, Pune, India

ISBN: (print) 9798350383867

This research paper presents a novel Virtual Gym Tracker AI Pose Estimation system designed to enhance virtual fitness experiences. Leveraging advanced deep learning techniques and real-time image analysis, the system accurately tracks and reconstructs user movements in real time. It offers precise pose estimation, enabling form correction, performance analysis, and interactive coaching. Extensive experiments confirm its accuracy and adaptability across diverse exercise scenarios, making it a valuable tool for virtual gym environments. © 2024 IEEE.

Keywords: AI; deep-learning; video-processing; Virtual Gym Tracker

Image Monitoring System Based on Deep Neural Network

2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023

Authors: Weng, Junhong; Zeng, Lingfeng; Song, Yingjie. Affiliation: Shenzhen Power Supply Bureau Co., Ltd., China Southern Power Grid, Guangdong, Shenzhen, China

ISBN: (print) 9798350331417

For safety and security reasons, the indoor/outdoor working environments of various industries require the use of many cameras for automated surveillance. In this context, a major challenge for an automated monitoring system is achieving high-precision real-time performance for image classification and object detection. In this paper, we present a novel image surveillance system based on a combined approach derived from YOLO V5. The system first detects moving targets using background subtraction. Then, we propose a modified YOLO V5 algorithm for accurately detecting and categorizing different objects in images captured in a video stream. The system runs in real time and can analyze multiple video streams simultaneously. The results of the experiments show that this system has good performance and could be widely applied in several areas, such as security, surveillance, and traffic management. © 2023 IEEE.

Keywords: deep neural network; monitoring system; YOLOv5
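
The two-stage flow in the abstract above (moving-target detection first, object detection second) suggests that the detector only needs to run when something moved. The sketch below shows one way such gating could look; how the paper actually couples the two stages is not stated, so this is an assumption. The `detector` callable stands in for any object detector and is hypothetical, as are the function names.

```python
from typing import Callable, List, Optional, Tuple

Mask = List[List[bool]]          # True where background subtraction found motion
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def motion_bounding_box(mask: Mask) -> Optional[Box]:
    """Return the smallest box containing all moving pixels, or None if nothing moved."""
    points = [(x, y) for y, row in enumerate(mask) for x, moving in enumerate(row) if moving]
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def detect_if_moving(mask: Mask, detector: Callable[[Box], List[str]]) -> List[str]:
    """Run the (hypothetical) detector only on frames that contain motion."""
    box = motion_bounding_box(mask)
    return detector(box) if box is not None else []


mask = [
    [False, False, False],
    [False, True, True],
    [False, False, True],
]
labels = detect_if_moving(mask, lambda box: ["object"])  # stub detector for illustration
```

Skipping the detector on static frames is one plausible way to save compute when many video streams are analyzed at once.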

VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models

38th Conference on Neural Information Processing Systems, NeurIPS 2024

Authors: Wang, Wenhao; Yang, Yi. Affiliations: University of Technology Sydney, Australia; Zhejiang University, China

The arrival of Sora marks a new era for text-to-video diffusion models, bringing significant advancements in video generation and potential applications. However, Sora, along with other text-to-video diffusion models, is highly reliant on prompts, and there is no publicly available dataset that features a study of text-to-video prompts. In this paper, we introduce VidProM, the first large-scale dataset comprising 1.67 Million unique text-to-video Prompts from real users. Additionally, this dataset includes 6.69 million videos generated by four state-of-the-art diffusion models, alongside some related data. We initially discuss the curation of this large-scale dataset, a process that is both time-consuming and costly. Subsequently, we underscore the need for a new prompt dataset specifically designed for text-to-video generation by illustrating how VidProM differs from DiffusionDB, a large-scale prompt-gallery dataset for image generation. Our extensive and diverse dataset also opens up many exciting new research areas. For instance, we suggest exploring text-to-video prompt engineering, efficient video generation, and video copy detection for diffusion models to develop better, more efficient, and safer models. The project (including the collected dataset VidProM and related code) is publicly available at https://*** under the CC-BY-NC 4.0 License. © 2024 Neural information processing systems foundation. All rights reserved.
