Image recognition and processing technology is an important application area of artificial intelligence. With growing demand for various types of intelligent video analysis, the importance of usi...
ISBN: 9798350353006 (print)
Reconstructing High Dynamic Range (HDR) video from image sequences captured with alternating exposures is challenging, especially in the presence of large camera or object motion. Existing methods typically align low dynamic range sequences using optical flow or attention mechanisms for deghosting. However, they often struggle to handle large complex motions and are computationally expensive. To address these challenges, we propose a robust and efficient flow estimator tailored for real-time HDR video reconstruction, named HDRFlow. HDRFlow has three novel designs: an HDR-domain alignment loss (HALoss), an efficient flow network with a multi-size large kernel (MLK), and a new HDR flow training scheme. The HALoss supervises our flow network to learn an HDR-oriented flow for accurate alignment in saturated and dark regions. The MLK can effectively model large motions at a negligible cost. In addition, we incorporate the synthetic Sintel dataset into our training data, using both its provided forward flow and a backward flow generated by us to supervise our flow network, which improves performance in large-motion regions. Extensive experiments demonstrate that HDRFlow outperforms previous methods on standard benchmarks. To the best of our knowledge, HDRFlow is the first real-time HDR video reconstruction method for video sequences captured with alternating exposures, capable of processing 720p inputs in 25 ms. Project website: https://***/HDRFlow/.
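The abstract does not spell out how HALoss is defined, so the following is only a rough sketch of what comparing two differently exposed frames "in the HDR domain" can look like. It assumes a gamma of 2.2 as the camera response and mu-law compression with mu = 5000, both common conventions in HDR reconstruction work rather than details taken from this paper. All function names are hypothetical.

```python
import math

GAMMA = 2.2   # assumed approximation of the camera response curve
MU = 5000.0   # assumed mu-law compression constant


def capture(radiance, exposure):
    """Simulate an LDR pixel in [0, 1] from linear radiance and exposure time."""
    return min(radiance * exposure, 1.0) ** (1.0 / GAMMA)


def ldr_to_linear(pixel, exposure):
    """Undo the gamma curve and normalise by exposure time."""
    return (pixel ** GAMMA) / exposure


def mu_law(h):
    """Compress linear radiance so dark and bright regions weigh similarly."""
    return math.log(1.0 + MU * h) / math.log(1.0 + MU)


def hdr_domain_l1(ldr_a, exp_a, ldr_b, exp_b):
    """Mean L1 distance between two frames after mapping both to the HDR domain."""
    total = 0.0
    for pa, pb in zip(ldr_a, ldr_b):
        ha = mu_law(ldr_to_linear(pa, exp_a))
        hb = mu_law(ldr_to_linear(pb, exp_b))
        total += abs(ha - hb)
    return total / len(ldr_a)


# The same unsaturated scene captured at two exposures.
scene = [0.01, 0.05, 0.2]
short_frame = [capture(r, 1.0) for r in scene]
long_frame = [capture(r, 4.0) for r in scene]
loss = hdr_domain_l1(short_frame, 1.0, long_frame, 4.0)
```

For a static, unsaturated scene the two frames differ in the LDR domain but agree almost exactly once mapped to the HDR domain, which is why a distance of this kind can serve as an alignment signal across alternating exposures.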
ISBN: 9798331506520 (print)
The proceedings contain 86 papers. The topics discussed include: robust real-time monitoring of complex human activities using multi-modal video analytics; a robust approach for classifying laparoscopic video distortions using ResNet-50; enhancing x-ray image classification through neural architecture; revolutionary MRI imaging for Alzheimer’s: cutting-edge GANs and vision transformer solutions; advanced deep learning strategies for breast cancer image analysis; identifying surgical instruments in pedagogical cataract surgery videos through an optimized aggregation network; enhancing auxiliary cancer classification task for multi-task breast ultrasound diagnosis network; and bioinspired computer vision for effective extended reality applications.
The paper provided a brief analysis of video denoising characteristics, discussed and analyzed various existing video denoising methods, and proposed a new video denoising algorithm based on bidirectional time fusion ...
When capturing videos with cameras, noise can occur due to variations in lighting conditions, movements of subjects or cameras, and the quality of camera sensors. The presence of noise complicates object detection and...
With the development of video array imaging technology, the accuracy and stability of video image acquisition have been improved. In this situation, this study proposes a novel approach with monitoring matrix-based imag...
The Sign Language Recognition System has been designed to capture video input, process it to detect hand gestures, and translate these gestures into readable text. The project consists of several key components and st...
ISBN: 9781510688384 (print)
The proceedings contain 39 papers. The topics discussed include: a zero-reference nighttime road image enhancement method; SFPCDDFuse: an enhanced CDDFUSE infrared and visible image fusion model; recognition of side-scan sonar images under long-tail distribution; comparison of different masking strategies of MAE; real-time traffic sign recognition for driving: a hybrid approach integrating efficient Mamba models with dilated convolution; Mask R-CNN-based method for recognizing external breakage of transmission lines; light-aware luminance adaptive enhancement network for RGBT video object detection; and wireless multi-path VR audio and video file synchronous transmission technology based on two-way transmission.
ISBN: 9798350344196 (print)
This paper presents an integrated image processor architecture designed for real-time interfacing and processing of high-resolution thermal video obtained from an uncooled infrared focal plane array (IRFPA), built on a modern system-on-chip field-programmable gate array (SoC FPGA). Our processor provides a one-chip solution that runs non-uniformity correction (NUC) algorithms and contrast enhancement methods (CEM) seamlessly. We have employed NUC algorithms that utilize multiple coefficients to ensure robust image quality, free from ghosting effects and blurring. These algorithms include polynomial modeling-based thermal drift compensation (TDC), two-point correction (TPC), and run-time discrete flat field correction (FFC). To address the memory bottlenecks originating from the parallel execution of NUC algorithms in real time, we designed accelerators and parallel caching modules for pixel-wise algorithms based on a multi-parameter polynomial expression. Furthermore, we designed a specialized accelerator architecture to minimize the interruption time for run-time FFC. The implementation on the XC7Z020CLG400 SoC FPGA with the QuantumRed VR thermal module demonstrates that our image-processing module achieves a throughput of 60 frames per second (FPS) when processing 14-bit 640x480 resolution infrared video acquired from an uncooled IRFPA.
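To make the two-point correction (TPC) step more concrete, here is a minimal software sketch of the textbook form of the technique: each pixel gets its own gain and offset, derived from two uniform reference frames, so that every pixel reports the same value for the same scene temperature. This is not the paper's FPGA implementation. The function names, the flat-list frame layout, and the sample values are assumptions for illustration.

```python
def two_point_coefficients(cold_frame, hot_frame):
    """Per-pixel gain and offset from two uniform reference frames.

    Frames are flat, row-major lists of raw sensor counts. Assumes every
    pixel responds differently to the two references (no dead pixels).
    """
    n = len(cold_frame)
    cold_mean = sum(cold_frame) / n
    hot_mean = sum(hot_frame) / n
    gains, offsets = [], []
    for cold, hot in zip(cold_frame, hot_frame):
        gain = (hot_mean - cold_mean) / (hot - cold)
        gains.append(gain)
        offsets.append(cold_mean - gain * cold)
    return gains, offsets


def apply_tpc(raw_frame, gains, offsets):
    """Correct one raw frame pixel by pixel: corrected = gain * raw + offset."""
    return [g * x + o for x, g, o in zip(raw_frame, gains, offsets)]


# Hypothetical 2x2 sensor with visibly non-uniform pixel responses.
cold = [1000, 1100, 950, 1020]
hot = [3000, 3300, 2800, 3100]
gains, offsets = two_point_coefficients(cold, hot)
corrected_cold = apply_tpc(cold, gains, offsets)
corrected_hot = apply_tpc(hot, gains, offsets)
```

After correction both reference frames become flat, at the mean of the respective raw frame. Because the work per pixel is a single multiply and add with two stored coefficients, reading those coefficients from memory tends to dominate the cost, which is consistent with the memory bottleneck the abstract mentions.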
ISBN: 9783031837920; 9783031837937 (print)
Real-time identification of road damage, particularly potholes, is essential for enhancing road safety and minimizing vehicle damage. Traditional road inspection methods are often labor-intensive and inefficient. To overcome these limitations, we propose an automated system that leverages deep learning to detect potholes in real time from video input. Utilizing advanced computer vision techniques, specifically the YOLO (You Only Look Once) model, the system can detect potholes under varying lighting conditions, including daylight, darkness, and vehicle headlights. The system aims to improve road monitoring by automatically identifying potholes and notifying authorities of their locations. By integrating geolocation services, the system pinpoints pothole locations using latitude and longitude coordinates and sends real-time alerts via a Telegram bot. Additionally, image enhancement techniques are employed to optimize detection performance in low-light conditions. The system has demonstrated high accuracy in detecting large potholes and identifying multiple potholes within a single frame. Through automated pothole detection and location sharing, this solution has the potential to significantly enhance road maintenance efficiency, thereby improving road safety and reducing accidents caused by road damage.
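The alerting step described above could be sketched roughly as follows. The abstract does not give the message format or detection threshold, so the `Detection` type, the 0.5 confidence cutoff, the `CHAT_ID` placeholder, and the wording of the alert are all assumptions. The returned dictionary uses the `chat_id` and `text` fields that the Telegram Bot API `sendMessage` method accepts. No network call is made here.

```python
from dataclasses import dataclass

CHAT_ID = "<authority-chat-id>"  # hypothetical placeholder, not from the paper


@dataclass
class Detection:
    """One detector output for a frame (hypothetical structure)."""
    label: str
    confidence: float


def build_alert(detections, latitude, longitude, min_confidence=0.5):
    """Build a sendMessage-style payload, or None if no confident pothole."""
    hits = [
        d for d in detections
        if d.label == "pothole" and d.confidence >= min_confidence
    ]
    if not hits:
        return None
    text = f"{len(hits)} pothole(s) detected at {latitude:.6f}, {longitude:.6f}"
    return {"chat_id": CHAT_ID, "text": text}


# One frame with a confident pothole, a weak pothole, and an unrelated object.
frame_detections = [
    Detection("pothole", 0.9),
    Detection("pothole", 0.3),
    Detection("car", 0.95),
]
payload = build_alert(frame_detections, 12.9716, 77.5946)
```

Keeping payload construction separate from sending makes the filtering logic easy to test without a bot token. A real deployment would post the payload to the bot's `sendMessage` endpoint and would likely rate-limit repeated alerts for the same location.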