ISBN:
(Print) 9798350395334; 9798350395327
This project introduces a transformative object detection system designed to enhance the navigational capabilities of visually impaired individuals through advanced computer vision. Using the You Only Look Once (YOLO) model trained on the Common Objects in Context (COCO) dataset, the system provides real-time, accurate object detection and classification. The application processes both static images and live video feeds, enabling blind users to receive auditory announcements of nearby objects, thereby assisting with spatial awareness and environmental interaction. The system leverages a pre-trained YOLO model to ensure robust detection performance, achieving a peak detection accuracy of 99%. By delivering object labels and bounding-box coordinates audibly, the application serves as a critical tool for improving the daily independence and quality of life of people with visual impairments. This project not only highlights the potential of deep learning in assistive technologies but also underscores the importance of adaptive solutions in inclusive technology development.
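The announcement step described above, turning detected labels and bounding boxes into spoken output, can be sketched in plain Python. The detection tuple format, the position thresholds, and the `announce_detections` helper are illustrative assumptions, not the authors' code; in a real system the detections would come from a YOLO model and the resulting string would be passed to a text-to-speech engine.

```python
def announce_detections(detections, frame_width):
    """Convert (label, confidence, bbox) detections into a spoken-style
    announcement string. bbox is (x1, y1, x2, y2) in pixels.

    Hypothetical helper illustrating the audible-feedback idea; the
    one-third frame split used to say left/ahead/right is an assumption."""
    phrases = []
    for label, conf, (x1, y1, x2, y2) in detections:
        cx = (x1 + x2) / 2  # horizontal centre of the detected object
        if cx < frame_width / 3:
            side = "on your left"
        elif cx > 2 * frame_width / 3:
            side = "on your right"
        else:
            side = "ahead"
        phrases.append(f"{label} {side} ({conf:.0%} confidence)")
    return "; ".join(phrases) if phrases else "no objects detected"

# Example: two detections in a 900-pixel-wide frame
dets = [("person", 0.97, (50, 100, 200, 400)),
        ("chair", 0.88, (700, 300, 880, 500))]
print(announce_detections(dets, 900))
# → person on your left (97% confidence); chair on your right (88% confidence)
```

In practice the returned string would be handed to a text-to-speech library rather than printed.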
Synthetic Aperture Radar (SAR) images have a wide range of applications due to their all-weather and all-day working conditions. However, SAR images with different scenarios and imaging conditions are insufficient or ...
This paper presents examination results on the feasibility of automatic unclear-region detection in a CAD system for colorectal tumors using real-time endoscopic video images. We confirmed that it is possible to realize a CAD system with a clear-region navigation function, consisting of unclear-region detection by YOLOv2 and classification by AlexNet and SVMs, on customizable embedded DSP cores. Moreover, we confirmed that the real-time CAD system can be constructed as a low-power ASIC using customizable embedded DSP cores.
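The two-stage pipeline described above, gating classification on an unclear-region check, can be sketched as follows. The `detect_unclear` and `classify` callables and the 0.2 threshold are stand-ins for the paper's YOLOv2 detector and AlexNet+SVM classifier, chosen purely for illustration.

```python
def navigate_and_classify(frames, detect_unclear, classify):
    """For each endoscopic frame, run the unclear-region detector first;
    only frames whose unclear-area fraction is low enough are classified.

    detect_unclear(frame) -> fraction of the frame judged unclear (0..1)
    classify(frame)       -> classification label for a clear frame
    Both callables stand in for the YOLOv2 detector and AlexNet+SVM
    classifier described in the paper; the 0.2 threshold is illustrative."""
    results = []
    for i, frame in enumerate(frames):
        unclear = detect_unclear(frame)
        if unclear > 0.2:
            # Navigation function: flag the region instead of classifying it.
            results.append((i, "skip: unclear region", unclear))
        else:
            results.append((i, classify(frame), unclear))
    return results

# Toy stand-ins: frames are numbers, judged "unclear" if value > 5
frames = [1, 7, 3]
out = navigate_and_classify(frames,
                            detect_unclear=lambda f: 0.5 if f > 5 else 0.1,
                            classify=lambda f: "non-tumor")
print(out)  # the middle frame is skipped as unclear
```

On the embedded DSP target, each stage would of course run as a fixed-function kernel rather than a Python callable; the sketch only shows the gating logic.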
In recent years, with the extensive application of deep learning methods, face recognition technology has developed considerably. To address the problem of video surveillance in power networks, a video surveillance meth...
Hyperspectral imaging and artificial intelligence (AI) have transformed imaging and data processing through their ability to capture and analyze detailed spectral information. This paper explores the integration of hy...
Real-time control, diversified functions, system integration, and miniaturization are important development directions for video electronics systems. Embedded design based on FPGA can manage system resources more reaso...
ISBN:
(Digital) 9781665496209
ISBN:
(Print) 9781665496209
Transformer has shown outstanding performance in time-series data processing, which can clearly facilitate quality assessment of video sequences. However, the quadratic time and memory complexity of Transformer potentially impedes its application to long video sequences. In this work, we study a mechanism for sharing attention across video clips in the video quality assessment (VQA) scenario. Consequently, an efficient architecture that integrates shared multi-head attention (MHA) into Transformer is proposed for VQA, which greatly eases the time and memory complexity. A long video sequence is first divided into individual clips. The quality features derived by an image quality model on each frame in a clip are aggregated by a shared MHA layer. The aggregated features across all clips are then fed into a global Transformer encoder for quality prediction at the sequence level. The proposed model, with a lightweight architecture, demonstrates promising performance in no-reference VQA (NR-VQA) modelling on publicly available databases. The source code can be found at https://***/junyongyou/lagt_vqa.
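The clip-level aggregation step can be sketched in plain Python: one attention scorer, shared across all clips, pools each clip's per-frame quality features into a single vector. This is a deliberately simplified single-head stand-in for the paper's shared MHA layer (the real model uses learned multi-head attention and feeds the pooled vectors into a global Transformer encoder); `w_score` and the dot-product scoring are illustrative assumptions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def shared_clip_attention(clips, w_score):
    """Aggregate per-frame quality features within each clip using ONE
    shared attention scorer, mirroring the shared-MHA idea: the same
    weights w_score are reused for every clip, so attention cost grows
    with clip length rather than with the full sequence length.

    clips: list of clips, each a list of frame feature vectors.
    w_score: weight vector scoring each frame (a simplified stand-in
    for learned multi-head attention)."""
    aggregated = []
    for clip in clips:
        scores = [sum(w * x for w, x in zip(w_score, feat)) for feat in clip]
        attn = softmax(scores)                      # frame weights sum to 1
        dim = len(clip[0])
        pooled = [sum(a * feat[d] for a, feat in zip(attn, clip))
                  for d in range(dim)]              # attention-weighted mean
        aggregated.append(pooled)
    return aggregated  # one feature vector per clip, for the global encoder

clips = [[[1.0, 0.0], [0.0, 1.0]],                  # clip 1: two frames
         [[0.5, 0.5], [1.0, 1.0], [0.0, 0.0]]]      # clip 2: three frames
out = shared_clip_attention(clips, w_score=[1.0, 0.0])
print(out)
```

Because the scorer is shared, adding more clips adds only linear cost, which is the point of the shared-attention design.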
The real-time query of surveillance video plays a significant role in many fields such as public safety, smart cities, and abnormality monitoring. However, with the exponential growth of surveillance video data, traditi...
ISBN:
(Print) 9798400704369
Retrieval-augmented generation (RAG) systems combine the strengths of language generation and information retrieval to power many real-world applications like chatbots. Using RAG for the understanding of videos is appealing, but there are two critical limitations. One-time, upfront conversion of all content in a large corpus of videos into text descriptions entails high processing times. Also, not all information in the rich video data is typically captured in the text descriptions. Since user queries are not known a priori, developing a system for video-to-text conversion and interactive querying of video data is challenging. To address these limitations, we propose an incremental RAG system called iRAG, which augments RAG with a novel incremental workflow to enable interactive querying of a large corpus of videos. Unlike traditional RAG, iRAG quickly indexes large repositories of videos, and in the incremental workflow, it uses the index to opportunistically extract more details from selected portions of the videos to retrieve context relevant to an interactive user query. Such an incremental workflow avoids long video-to-text conversion times and overcomes information loss due to the conversion of video to text by performing on-demand, query-specific extraction of details in the video data. This ensures high-quality responses to interactive user queries that are often not known a priori. To the best of our knowledge, iRAG is the first system to augment RAG with an incremental workflow to support efficient interactive querying of a large corpus of videos. Experimental results on real-world datasets demonstrate 23x to 25x faster video-to-text ingestion, while ensuring that the latency and quality of responses to interactive user queries are comparable to responses from a traditional RAG where all video data is converted to text upfront before any user querying.
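The incremental workflow described above can be sketched as a small class: a cheap coarse index is built once over all clips at ingestion time, and expensive detailed descriptions are extracted lazily, only for the clips a query actually touches, then cached. The `coarse_describe` and `detailed_describe` callables and the keyword-overlap retrieval are illustrative stand-ins, not the iRAG implementation.

```python
class IncrementalVideoRAG:
    """Sketch of an iRAG-style incremental workflow: fast one-time coarse
    indexing, with on-demand, query-specific extraction of details.

    coarse_describe / detailed_describe stand in for a fast tagging model
    and an expensive video-to-text model; retrieval here is naive keyword
    overlap, purely for illustration."""

    def __init__(self, clips, coarse_describe, detailed_describe):
        self.clips = clips
        self.detailed_describe = detailed_describe
        # One-time, fast ingestion: coarse tags for every clip.
        self.index = {cid: coarse_describe(c) for cid, c in clips.items()}
        self.detail_cache = {}   # detailed text, filled in lazily per query

    def query(self, question, top_k=2):
        q_words = set(question.lower().split())
        # Rank clips by keyword overlap between the query and coarse tags.
        ranked = sorted(self.index,
                        key=lambda cid: -len(q_words & set(self.index[cid])))
        context = []
        for cid in ranked[:top_k]:
            if cid not in self.detail_cache:     # extract details on demand
                self.detail_cache[cid] = self.detailed_describe(self.clips[cid])
            context.append(self.detail_cache[cid])
        return context  # context that would be passed to the language model

clips = {"c1": "a red car drives past", "c2": "a dog plays in a park"}
rag = IncrementalVideoRAG(clips,
                          coarse_describe=lambda c: c.split()[:3],
                          detailed_describe=lambda c: "DETAIL: " + c)
print(rag.query("where is the dog", top_k=1))
# → ['DETAIL: a dog plays in a park']
```

Note that after this query only the dog clip has been converted in detail; the car clip never pays the expensive extraction cost, which is the source of the ingestion-time savings the paper reports.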
Medical high-definition electronic endoscopes have high requirements on real-time performance and video quality. Compared with software-based image algorithms, the algorithms based on field programmable gate array (FP...