The rapid adoption of Advanced Driver Assistance Systems (ADAS) in modern vehicles, aiming to elevate driving safety and experience, necessitates the real-time processing of high-definition video data. This requirement brings considerable computational complexity and memory demands, highlighting a critical research gap for a design that integrates high FPS throughput with optimal Mean Average Precision (mAP) and Mean Intersection over Union (mIoU). Performance improvement at lower cost, multi-tasking ability on a single hardware platform, and seamless integration into memory-constrained devices are also essential for boosting ADAS performance. Addressing these challenges, this study proposes an ADAS multi-task learning hardware-software co-design approach built on the Kria KV260 Multi-Processor System-on-Chip Field Programmable Gate Array (MPSoC-FPGA) platform. The approach enables efficient real-time execution of deep learning algorithms specific to ADAS applications. Using the BDD100K+Waymo, KITTI, and CityScapes datasets, our ADAS multi-task learning system endeavours to provide accurate and efficient multi-object detection, segmentation, and lane and drivable area detection in road images. The system deploys a segmentation-based object detection strategy, using a ResNet-18 backbone encoder and a Single Shot Detector architecture, coupled with quantization-aware training to improve inference performance without compromising accuracy. The ADAS multi-task learning offers customization options for various ADAS applications and can be further optimized for increased precision and reduced memory usage. Experimental results showcase the system's capability to perform real-time multi-class object detection, segmentation, lane detection, and drivable area detection on road images at approximately 25.4 FPS using a 1920 x 1080 Full HD camera. Impressively, the quantized model demonstrates a 51% mAP for object detection and a 56.62% mIoU for image segmentation...
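As a concrete illustration of the quantization-aware training step, the sketch below shows the generic PyTorch eager-mode QAT pattern (fake-quantize, fine-tune, convert) on a toy module. It is a minimal stand-in, not the paper's pipeline: deployment to the Kria KV260 typically goes through AMD/Xilinx's Vitis AI quantizer, and the real network is a ResNet-18 encoder with an SSD head rather than the single fused block used here.

import torch
import torch.nn as nn
import torch.ao.quantization as tq

# Toy stand-in for the quantized backbone; names and sizes are illustrative.
class TinyBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # float -> int8 boundary
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.bn = nn.BatchNorm2d(16)
        self.relu = nn.ReLU()
        self.dequant = tq.DeQuantStub()  # int8 -> float boundary

    def forward(self, x):
        return self.dequant(self.relu(self.bn(self.conv(self.quant(x)))))

model = TinyBackbone().train()
model.qconfig = tq.get_default_qat_qconfig("fbgemm")
tq.fuse_modules_qat(model, [["conv", "bn", "relu"]], inplace=True)
tq.prepare_qat(model, inplace=True)

# ... the normal training loop runs here with fake-quant observers active ...

model.eval()
int8_model = tq.convert(model)  # materialize int8 weights for inference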
This paper describes a low-cost computer vision system able to obtain traffic metrics at urban intersections. The proposed system is based on a Bayesian-network-based reasoning model. It employs data extracted from background subtraction and contrast analysis techniques, applied to predefined regions of interest in the video sequences, to evaluate different traffic metrics. The system has been designed to work with already-installed urban cameras in order to reduce installation costs. It can therefore be configured to work with different image sizes and video frame rates, as well as to process images taken from different distances and perspectives. The validity of the proposed system has been demonstrated on a Raspberry Pi platform, tested with two real surveillance video cameras managed by the local authority of Cartagena (Spain) under different environmental lighting conditions. Using this hardware, the system is able to process VGA grayscale images at a rate of 8 frames per second.
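The background-subtraction-plus-ROI idea in this abstract can be sketched in a few lines of OpenCV. This is a hedged illustration only: the file name and ROI coordinates are hypothetical, a fixed occupancy threshold stands in for the paper's Bayesian-network reasoning, and MOG2 is one common subtractor choice, not necessarily the one the authors used.

import cv2

ROIS = [(100, 200, 80, 60), (300, 220, 80, 60)]  # hypothetical (x, y, w, h) lanes

cap = cv2.VideoCapture("intersection.mp4")       # hypothetical input video
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    mask = subtractor.apply(gray)                # per-frame foreground mask
    for i, (x, y, w, h) in enumerate(ROIS):
        roi = mask[y:y + h, x:x + w]
        occupancy = cv2.countNonZero(roi) / float(w * h)
        # a fixed threshold replaces the Bayesian-network inference here
        print(f"ROI {i}: occupancy={occupancy:.2f} occupied={occupancy > 0.2}")
cap.release()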
ISBN: (Print) 9798400705250
Misfocus is ubiquitous for almost all video producers, degrading video quality and often causing expensive delays and reshoots. Current autofocus (AF) systems are vulnerable to sudden disturbances, such as subject movement or lighting changes, that are common in real-world and on-set conditions. Single-image defocus deblurring methods are temporally unstable when applied to videos and cannot recover details obscured by temporally varying defocus blur. In this paper, we present an end-to-end solution that allows users to correct misfocus during post-processing. Our method generates and parameterizes defocused videos into sharp layered neural atlases and propagates consistent focus tracking back to the video frames. We introduce a novel differentiable disk blur layer for more accurate point spread function (PSF) simulation, coupled with a circle of confusion (COC) map estimation module with knowledge transferred from current single-image defocus deblurring (SIDD) networks. Our pipeline offers consistent, sharp video reconstruction and effective subject-focus correction and tracking directly on the generated atlases. Furthermore, our approach achieves results comparable to the state-of-the-art optical flow estimation approach on defocused videos.
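For reference, the circle-of-confusion map estimated by such a module is grounded in thin-lens geometry. The formula below is the standard textbook expression, not necessarily the paper's exact parameterization: for a lens of focal length f with aperture diameter A = f/N focused at distance s, an object at distance d projects a blur disk of diameter

c(d) = A \,\frac{\lvert d - s \rvert}{d} \cdot \frac{f}{s - f}

so c(d) vanishes at the focus plane d = s and grows as the object moves away from it, which is exactly the behaviour a differentiable disk blur layer must reproduce.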
The rising demand for high-quality displays has spurred active research in high dynamic range (HDR) imaging, which has the potential to replace standard dynamic range imaging. This is due to HDR's features, such as accurate reproduction of a scene with its entire spectrum of visible lighting and color depth. But this capability comes with expensive capture, display, storage, and distribution resource requirements. Also, displaying HDR images/video content on an ordinary display device with limited dynamic range requires some form of adaptation. Many adaptation algorithms, widely known as tone mapping (TM) operators, have been studied and proposed in the last few decades. In this article, we present a comprehensive survey of 60 TM algorithms that have been implemented on hardware for acceleration and real-time performance. In this state-of-the-art survey, we discuss TM algorithms that have been implemented on GPU, FPGA, and ASIC in terms of their hardware specifications and performance. Output image quality is an important metric for TM algorithms. From our literature survey, we found that various objective quality metrics have been used to demonstrate the quality of those algorithms' hardware implementations. We compile the metrics used in these studies and analyze the relationship between hardware cost, image quality, and computational efficiency. Currently, machine learning-based (ML) algorithms have become an important tool for solving many image processing tasks, and this article concludes with a discussion of future research directions for realizing ML-based TM operators in hardware.
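As a concrete example of the kind of operator this survey covers, the snippet below sketches the classic Reinhard global tone mapping curve, one of the simplest and most frequently hardware-accelerated TM operators. It is a software reference sketch under stated assumptions (Rec. 709 luminance weights, a default key of 0.18), not any specific hardware implementation from the survey.

import numpy as np

def reinhard_global_tm(hdr_rgb, key=0.18, eps=1e-6):
    # luminance from linear RGB via Rec. 709 weights
    lum = (0.2126 * hdr_rgb[..., 0]
           + 0.7152 * hdr_rgb[..., 1]
           + 0.0722 * hdr_rgb[..., 2])
    log_avg = np.exp(np.mean(np.log(eps + lum)))  # geometric mean (scene key)
    scaled = key / log_avg * lum                  # normalize scene key to `key`
    mapped = scaled / (1.0 + scaled)              # compress luminance to [0, 1)
    ratio = mapped / np.maximum(lum, eps)         # per-pixel gain
    return np.clip(hdr_rgb * ratio[..., None], 0.0, 1.0)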
With the development of artificial intelligence technology, urban traffic management has become increasingly convenient, and the task of illegal parking detection has become a major research focus. Currently, most ill...
Security is a significant concern at all locations where CCTV cameras are installed. Security is a top priority; you must invest considerable time and effort to keep track of everything. Shortly, developments in comput...
This study aims to enhance the detection accuracy and efficiency of cotton bolls in complex natural environments. Addressing the limitations of traditional methods, we developed an automated detection system based on computer vision, designed to optimize performance under variable lighting and weather conditions. We introduce COTTON-YOLO, an improved model based on YOLOv8n, incorporating specific algorithmic optimizations and data augmentation techniques. Key innovations include the C2F-CBAM module to boost feature recognition capabilities, the Gold-YOLO neck structure for enhanced information flow and feature integration, and the WIoU loss function to improve bounding box precision. These advancements significantly enhance the model's environmental adaptability and detection precision. Comparative experiments with the baseline YOLOv8 model demonstrated substantial performance improvements with COTTON-YOLO, particularly a 10.3% increase in the AP50 metric, validating its superiority in accuracy. Additionally, COTTON-YOLO showed efficient real-time processing capabilities and a low false detection rate in field tests. The model's performance in static and dynamic counting scenarios was assessed, showing high accuracy in static cotton boll counting and effective tracking of cotton bolls in video sequences using the ByteTrack algorithm, maintaining low false detection and ID-switch rates even in complex backgrounds.
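The WIoU loss mentioned above extends a plain IoU loss with a distance-based focusing factor. Only the shared IoU core is sketched below as a hedged reference; the focusing term and the exact WIoU variant used by COTTON-YOLO are not reproduced here.

import torch

def iou_loss(pred, target, eps=1e-7):
    # boxes as (x1, y1, x2, y2); pred and target have matching shapes
    ix1 = torch.maximum(pred[..., 0], target[..., 0])
    iy1 = torch.maximum(pred[..., 1], target[..., 1])
    ix2 = torch.minimum(pred[..., 2], target[..., 2])
    iy2 = torch.minimum(pred[..., 3], target[..., 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou = inter / (area_p + area_t - inter + eps)
    return 1.0 - iou  # WIoU would multiply this by a focusing weight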
ISBN: (Print) 9798400716164
News broadcasters must produce engaging video clips quicker than ever to ensure their successful positioning in the market. This is due, in part, to the growing number of news sources and changes in media consumption among target audiences. This evolution has amplified the need to produce news clips quickly, a requirement that remains at odds with traditionally manual and time-consuming video editing processes. Despite advances in automating video news production, current systems have yet to meet the automation level and quality standards required for professional news broadcasting. Addressing this gap, we propose a novel transformer-based framework for automatically composing news clips to streamline the editing process. Our framework is predicated on a vision-language feature embedding mechanism and a cross-attention transformer architecture designed to generate multi-shot news clips that are semantically coherent with the editorial text and stylistically consistent with professional editing benchmarks. Our framework composes news clips 2 minutes in length from source material ranging from 20 minutes to 2 hours in less than 5 minutes on a single GPU. In our user study, target groups with different experience levels rated the generated videos on a 6-point Likert scale. Users rated the news clips generated by our framework with an average score of 4.13 and the manually edited news clips with an average score of 4.58.
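The cross-attention coupling of editorial text and candidate shots can be pictured with a minimal sketch: text tokens act as queries over shot embeddings, and the attention weights hint at which shots match the script. All dimensions, names, and the single-layer design below are illustrative assumptions, not the paper's architecture.

import torch
import torch.nn as nn

class TextToShotCrossAttention(nn.Module):
    def __init__(self, dim=512, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, text_tokens, shot_embeddings):
        # queries from the editorial text; keys/values from the shots
        fused, weights = self.attn(query=text_tokens,
                                   key=shot_embeddings,
                                   value=shot_embeddings)
        return fused, weights

text = torch.randn(1, 64, 512)    # e.g. 64 text tokens from the script
shots = torch.randn(1, 200, 512)  # e.g. 200 candidate shot embeddings
fused, w = TextToShotCrossAttention()(text, shots)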
ISBN: (Print) 9798400701085
Low-Light Video Enhancement (LLVE) has received considerable attention in recent years. One of the critical requirements of LLVE is inter-frame brightness consistency, which is essential for maintaining the temporal coherence of the enhanced video. However, most existing single-image-based methods fail to address this issue, resulting in a flickering effect that degrades the overall quality after enhancement. Moreover, 3D Convolutional Neural Network (CNN)-based methods, which are designed for video to maintain inter-frame consistency, are computationally expensive, making them impractical for real-time applications. To address these issues, we propose an efficient pipeline named FastLLVE that leverages the Look-Up Table (LUT) technique to maintain inter-frame brightness consistency effectively. Specifically, we design a learnable Intensity-Aware LUT (IA-LUT) module for adaptive enhancement, which addresses the low-dynamic problem in low-light scenarios. This enables FastLLVE to perform low-latency and low-complexity enhancement operations while maintaining high-quality results. Experimental results on benchmark datasets demonstrate that our method achieves state-of-the-art (SOTA) performance in terms of both image quality and inter-frame brightness consistency. More importantly, FastLLVE can process 1080p videos at 50+ frames per second (FPS), which is 2x faster than SOTA CNN-based methods in inference time, making it a promising solution for real-time applications. The code is available at https://***/Wenhao-Li777/FastLLVE.
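A generic 3D LUT lookup, the primitive FastLLVE builds on, can be written with a single trilinear grid_sample call. This sketch applies a plain RGB-to-RGB LUT; the paper's IA-LUT additionally conditions the lookup on a learned intensity dimension, which is omitted here, and the blue-major LUT layout is an assumption of this sketch.

import torch
import torch.nn.functional as F

def apply_3d_lut(img, lut):
    # img: (B, 3, H, W) in [0, 1]; lut: (3, S, S, S) indexed lut[c, b, g, r]
    b, _, h, w = img.shape
    # grid_sample wants coords in [-1, 1], ordered (x, y, z) = (r, g, b)
    grid = img.permute(0, 2, 3, 1) * 2.0 - 1.0        # (B, H, W, 3)
    grid = grid.view(b, 1, h, w, 3)                   # (B, D=1, H, W, 3)
    lut = lut.unsqueeze(0).expand(b, -1, -1, -1, -1)  # (B, 3, S, S, S)
    out = F.grid_sample(lut, grid, mode="bilinear",   # trilinear on 5D input
                        padding_mode="border", align_corners=True)
    return out.reshape(b, 3, h, w)

img = torch.rand(1, 3, 64, 64)
identity_lut = torch.stack(torch.meshgrid(
    *(torch.linspace(0, 1, 17),) * 3, indexing="ij"), dim=0).flip(0)
out = apply_3d_lut(img, identity_lut)  # identity LUT: out ≈ img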
Using MRI to reliably diagnose brain tumors is important, but it is often time-consuming. The study uses an automated method for brain tumor detection and classification using image processing techniques and convention...