Search Results - Inner Mongolia University Library

2024 IEEE International Conference on Signal Processing and Advance Research in Computing, SPARC 2024

Authors: Ranjan, Arti; Ravinder, M.

Affiliations: Indira Gandhi Delhi Technical University for Women, Department of Computer Science and Engineering, New Delhi, India; Sharda University, Department of Computer Science and Engineering, Greater Noida, India

ISBN: (print) 9798350385199

The growing demand for real-time image processing on edge devices calls for novel approaches that balance computational efficiency with high performance. This paper introduces an integrated solution combining ShuffleNet, a lightweight convolutional neural network, with the LLaVA-Phi model for efficient image deblurring and descriptive analysis on mobile devices. ShuffleNet's structural efficiency, characterized by channel shuffling and depth-wise convolutions, is exploited to deblur images swiftly, while LLaVA-Phi interprets the imagery to generate concise natural language descriptions. Our unified approach significantly enhances both the visual clarity of images and the accuracy of their associated descriptions with minimal computational overhead. Experimental results reveal substantial improvements over existing methods, confirming the efficacy of our approach for enhanced real-time image processing in computationally limited environments. © 2024 IEEE.

Keywords: Convolutional neural networks


Unsupervised Video Skimming with Adaptive Hierarchical Shot Detection


37th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI)

Authors: Cardoso, Leonardo Vilela; Werneck, July F. M.; Guimaraes, Silvio Jamil E.; Patrocinio, Zenilton K. G., Jr.

Affiliations: Pontifical Catholic Univ Minas Gerais (PUC Minas), Lab Image & Multimedia Data Sci (IMSci), Belo Horizonte, MG, Brazil

ISBN: (print) 9798350376043; 9798350376036

Video skimming involves generating a concise representation of a video that captures all its significant information. However, conventional skimming techniques often fail to capture different shots in a video due to their inability to detect scene modifications and incorporate the hierarchical structure of video content. This work proposes an unsupervised hierarchical method for video skimming, called Hierarchical time-aware Skimming (HieTaSkim), in which video content is modeled as a graph, and an adaptive strategy is employed to produce hierarchical graph cuts. Those cuts are used to identify the most relevant video segments, or keyshots, allowing the extraction of frame sequences that convey the video's central message and resulting in a more effective and accurate video summary. Experimental results demonstrate that the proposed approach outperforms other state-of-the-art unsupervised methods for video skimming, achieving an F-score of 39.9 on the SumMe dataset, which represents an improvement of at least 10%.

Keywords: Video analysis


Research on a Parallel Algorithm for Video Image Compression of Transmission Line Inspection


2023 International Conference on Big Data Mining and Information Processing, BDMIP 2023

Authors: Hu, Meihui; Li, Kai; Wan, Jiao; Chen, Tao; Xiang, Zhiwei

Affiliations: State Grid Xinjiang Information & Telecommunication Company, Xinjiang, Urumqi 830017, China; State Grid Xinjiang Electric Power Co. Ltd., Xinjiang, Urumqi 830017, China

ISBN: (print) 9798400709166

Unmanned aerial vehicles (UAVs) offer simple operation, sensitive response, flexible flight, long battery life and low cost, and have become a conventional means of power inspection. However, the large volume of video data places a burden on the hardware at the data acquisition end of the system, so the sampling performance of the data acquisition end of the video compression system needs to be improved. In this paper, a parallel algorithm for video image compression of power transmission line inspection is proposed. Using the functions provided by MPI (Message Passing Interface), the search and matching process between image domain blocks and value-domain (range) blocks is distributed to multiple processors for simultaneous execution. The experimental results show that, when only one computing node is used, the two parallel modes achieve very similar CPU utilization efficiency when decompressing images with the same compression ratio. As the number of computing nodes increases, the efficiency of the MPI parallel mode gradually decreases, while the efficiency of the MPI+OpenMP hybrid mode increases. This study has reference and practical value for real-time processing of transmission line inspection data. © 2023 ACM.
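
The work distribution described in this abstract can be pictured with a minimal sketch. It is not the authors' implementation: the function names (`split_indices`, `match_chunk`, `match_all`) are hypothetical, plain mean squared error stands in for whatever matching criterion the paper uses, and a serial loop over worker ranks stands in for the MPI processes (no MPI calls are made).

```python
from typing import Dict, List, Sequence, Tuple

Block = Sequence[float]  # a flattened image block (assumed representation)


def mse(a: Block, b: Block) -> float:
    """Mean squared error between two equally sized blocks."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def split_indices(n_items: int, n_workers: int, rank: int) -> range:
    """Contiguous slice of range(n_items) owned by worker `rank`."""
    base, extra = divmod(n_items, n_workers)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


def match_chunk(range_blocks: List[Block], domain_blocks: List[Block],
                indices: range) -> List[Tuple[int, int]]:
    """For each owned range block, find the index of the closest domain block."""
    return [
        (i, min(range(len(domain_blocks)),
                key=lambda d: mse(range_blocks[i], domain_blocks[d])))
        for i in indices
    ]


def match_all(range_blocks: List[Block], domain_blocks: List[Block],
              n_workers: int) -> Dict[int, int]:
    """Serial stand-in for 'each rank matches its slice, then results are gathered'."""
    gathered: List[Tuple[int, int]] = []
    for rank in range(n_workers):
        owned = split_indices(len(range_blocks), n_workers, rank)
        gathered.extend(match_chunk(range_blocks, domain_blocks, owned))
    return dict(gathered)


domain = [[0, 0, 0, 0], [10, 10, 10, 10], [5, 5, 5, 5]]
ranges = [[1, 0, 1, 0], [9, 10, 9, 10], [5, 6, 5, 4], [0, 0, 0, 1], [10, 10, 10, 9]]
matches = match_all(ranges, domain, n_workers=2)
```

In a real MPI program each process would presumably run only the `match_chunk` call for its own rank and a gather step would replace the loop; the block search itself is independent per range block, which is what makes it easy to distribute.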

Keywords: Image compression


Real-time underwater video feed enhancement for Autonomous Underwater Vehicles (AUV)


Conference on Multimodal Image Exploitation and Learning

Authors: Hasan, Yusuf; Ali, Athar

Affiliations: Aligarh Muslim Univ, Zakir Hussain Coll Engn & Technol, Dept Comp Engn, Aligarh, Uttar Pradesh, India; Univ Buckingham, Sch Comp, Buckingham, England

ISBN: (print) 9781510673854; 9781510673847

In underwater exploration, Autonomous Underwater Vehicles (AUVs) face challenges due to the adverse effects of the aquatic environment on optical sensors, resulting in sub-optimal data acquisition. To overcome this, we propose a novel solution utilizing a Generative Adversarial Network (GAN) model. Rooted in the U-Net architecture, our model processes the low-quality AUV camera feed, generating enhanced representations of the underwater scene. The discriminator focuses on evaluating current image patches, capturing high-frequency properties with fewer parameters and achieving a 15% improvement in model accuracy. This approach facilitates real-time preprocessing in visually guided underwater robot autonomy pipelines, overcoming challenges associated with underwater visibility.

Keywords: Autonomous Underwater Vehicles; YOLO; Generative Adversarial Networks


A Feasibility Study of Real-Time Image Processing Techniques for Small Flying Object Detection in Drones


2024 IEEE International Conference on Consumer Electronics, ICCE 2024

Authors: Loftus, Neil; Parlato, Cade; McGinty, Amelia; Kizilay, Furkan; Narman, Husnu S.

Affiliations: Marshall University, Department of Computer Sciences and Electrical Engineering, United States; Manisa Celal Bayar University, Turkey

ISBN: (print) 9798350324136

Drone usage is increasing significantly in our daily life, from military to delivery purposes. Although drones are also used to detect objects by using different techniques, they are limited in their ability to detect small flying objects such as birds and to respond quickly enough to avoid unintended collisions while flying at high speed. In this paper, we investigate the feasibility of using machine learning and image processing methods in drones to detect birds mid-flight and respond in a way that ensures their safety. This Real-Time Bird Detection (RTBD) system is designed to detect birds so that a proper response or evasive action can be taken by the drone. To avoid erroneous responses and to observe the drone's autonomous collision-avoidance behavior, we have developed an application with a graphical interface to easily control the drone's video feed and process that information using a machine-learning model. The application can also detect whether a bird is close enough to interfere with the drone's flight path. Our test results show that the drone identified bird images within a 50-millisecond window, with Precision exceeding 96% when Confidence exceeded 80%. © 2024 IEEE.

Keywords: Drones; Machine Learning; Object Detection


Real-time adaptive skin detection using skin color model updating unit in videos


Journal of Real-Time Image Processing, 2022, Vol. 19, No. 2, pp. 303-315

Authors: Zhang, Kun; Wang, Yedong; Li, Wenyuan; Li, Changlu; Lei, Zhichun

Affiliations: Tianjin Univ, Sch Microelect, Tianjin 300072, Peoples R China; Hisense Visual Technol Co Ltd, Display R&D Dept, Qingdao 266071, Peoples R China

Skin color plays an important role in color image processing and human-computer interaction. However, factors such as rapidly changing illumination, various color styles, and camera characteristics make skin detection a challenging task. In particular, the real-time requirement of practical applications is difficult to meet. In this paper, face detection and alignment are applied to select facial reference points for modeling the skin color distribution. Moreover, we propose the concept of a skin color model updating unit (SCMUU), together with an approach for detecting it, based on the fact that the skin color distribution remains consistent over a range of frames. Using one model for all frames of an SCMUU avoids the redundant operation of frame-by-frame updating. When no reliable faces are detected, two strategies are introduced to compensate and to reduce the computational cost. If a similar previous SCMUU is found, its model parameters are used. Otherwise, fixed thresholds are used instead and the interval between two consecutive face detections is increased. In addition, the time-consuming steps are accelerated using a graphics processing unit (GPU) with CUDA. Experimental results show that, compared with other existing methods, the proposed method achieves good real-time performance and accuracy for skin detection in videos of various resolutions under different illumination conditions.

Keywords: Skin modeling; Skin detection; Adaptive thresholds; Face detection; Video processing


MAAD-GAN: Memory-Augmented Attention-Based Discriminator GAN for Video Anomaly Detection


8th International Conference on Computer Vision and Image Processing (CVIP)

Authors: Sethi, Anikeit; Saini, Krishanu; Singh, Rituraj; Tiwari, Aruna; Saurav, Sumeet; Singh, Sanjay; Chauhan, Vikas

Affiliations: Indian Inst Technol, Comp Sci & Engn, Indore 452020, Madhya Pradesh, India; CSIR CEERI, Intelligent Syst Grp, Pilani 333031, Rajasthan, India; Natl Taipei Univ Technol, Elect & Comp Sci, Taipei 106, Taiwan

ISBN: (digital) 9783031585357

ISBN: (print) 9783031585340; 9783031585357

The detection of anomalies in video data is of great importance in various applications, such as surveillance and industrial monitoring. This paper introduces a novel approach, named MAAD-GAN, for video anomaly detection (VAD) utilizing Generative Adversarial Networks (GANs). The MAAD-GAN framework combines a Wide Residual Network (WRN) in the generator with a memory module to learn the normal patterns present in the training video dataset, enabling the generation of realistic samples. To address the challenge of detecting subtle anomalies and those with motion characteristics, we propose the integration of self-attention in the discriminator model. Our proposed model MAAD-GAN enhances the ability to distinguish between real and generated samples, ensuring that anomalous samples are distorted when reconstructed. Experimental evaluations show the effectiveness of MAAD-GAN as compared to traditional methods on UCSD (University of California, San Diego) Peds2, CUHK Avenue, and ShanghaiTech datasets.

Keywords: Anomaly Detection; Generative Adversarial Networks; Deep Learning; Memory Network


Speech2rtMRI: Speech-Guided Diffusion Model for Real-time MRI Video of the Vocal Tract during Speech


2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025

Authors: Nguyen, Hong; Foley, Sean; Huang, Kevin; Shi, Xuan; Feng, Tiantian; Narayanan, Shrikanth

Affiliations: Signal Analysis and Interpretation Lab, University of Southern California, Los Angeles, CA 90089, United States

ISBN: (print) 9798350368741

Understanding speech production both visually and kinematically can inform second language learning system designs, as well as the creation of speaking characters in video games and animations. In this work, we introduce a data-driven method to visually represent articulator motion in Magnetic Resonance Imaging (MRI) videos of the human vocal tract during speech based on arbitrary audio or speech input. We leverage large pre-trained speech models, which are embedded with prior knowledge, to generalize the visual domain to unseen data using a speech-to-video diffusion model. Our findings demonstrate that the visual generation significantly benefits from the pre-trained speech representations. We also observed that evaluating phonemes in isolation is challenging but becomes more straightforward when assessed within the context of spoken words. Limitations of the current results include unsmooth tongue motion and video distortion when the tongue contacts the palate. The source code is publicly available at: https://***/Hong7Cong/***. © 2025 IEEE.

Keywords: Inverse problems; Real-time MRI; Speech production modeling; Speech-guided video; Video Diffusion Model


Dual-Channel Visible Light Communication System for Enhanced V2V Video Streaming


7th International Conference on Signal Processing and Information Security, ICSPIS 2024

Authors: Tettey, Daniel K.; Elamassie, Mohammed; Uysal, Murat

Affiliations: Özyeǧin University, Electrical & Electronics Engineering, Istanbul, Turkey; Engineering Division, Abu Dhabi, United Arab Emirates

ISBN: (print) 9798350368673

Visible light communication (VLC) operates on the principle of modulating light-emitting diodes (LEDs) for data transmission at frequencies imperceptible to the human eye. In vehicular communication, VLC leverages existing vehicle lighting infrastructure, such as headlights and taillights, to transmit data. This enables the sharing of real-time video feeds from onboard vehicle cameras with other vehicles, allowing drivers to see beyond the immediate traffic ahead. In this paper, we present an experimental study that demonstrates vehicle-to-vehicle (V2V) video streaming by using both headlights as wireless transmitters. This approach reduces the likelihood of signal degradation or interruptions caused by obstacles, movement, or changing road conditions. By leveraging multiple light sources, the system ensures a more stable and consistent data flow, improving overall performance and robustness in dynamic vehicular environments. Our experimental setup utilizes modified software-defined radio platforms for baseband processing and a custom-designed frontend featuring truck low-beam LED headlights. A real-time video streaming demonstration is conducted with our prototype to validate the feasibility of dual-channel VLC for vehicular connectivity using both headlights. © 2024 IEEE.

Keywords: Radio communication


Keyframe Extraction Approaches for Intelligent Video Analysis


4th International Conference on Sentiment Analysis and Deep Learning, ICSADL 2025

Authors: Sabarivasan, N.; Thangavel, Senthil Kumar

Affiliations: Amrita School of Computing, Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India

ISBN: (print) 9798331523923

Real-time violence detection is essential for protecting the safety and security of people, especially on college campuses, which are dynamic and crowded. Manual surveillance systems are popular but inefficient, as they require constant human attention and are not capable of quickly identifying and responding to violent *** the existing violence detection model takes considerable time to process the entire video. The proposed model extracts features as keyframes from the video using MobileNet V3 and K-medoids clustering. By isolating and analyzing critical frames, redundancy in the video data can be avoided and significant violent activities identified. This enhances the efficiency and responsiveness of surveillance systems. This work primarily contributes a customized violent-activities dataset, implements a keyframe extraction algorithm to optimize the use of computational resources, and incorporates lightweight algorithms for seamless deployment with existing CCTV networks. The experimental results clearly demonstrate the ability of this approach to detect keyframes, thereby offering a scalable and practical solution. © 2025 IEEE.
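
To make the clustering step in this abstract concrete, here is a minimal sketch of keyframe selection with K-medoids. It is an illustration under stated assumptions, not the paper's pipeline: the per-frame feature vectors are toy numbers standing in for MobileNet V3 embeddings, the Euclidean distance and the evenly spaced initialization are choices made here, and the function name `k_medoids_keyframes` is hypothetical.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_medoids_keyframes(features: List[Sequence[float]], k: int,
                        max_iter: int = 100) -> List[int]:
    """Return the sorted frame indices chosen as the k cluster medoids."""
    n = len(features)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of frames")
    dist = [[euclidean(features[i], features[j]) for j in range(n)]
            for i in range(n)]
    # Assumed initialization: evenly spaced frames (deterministic).
    medoids = [(i * n) // k for i in range(k)]
    for _ in range(max_iter):
        # Assignment step: each frame joins its nearest medoid;
        # a medoid always stays in its own cluster.
        clusters = {m: [m] for m in medoids}
        for i in range(n):
            if i in clusters:
                continue
            nearest = min(medoids, key=lambda m: dist[i][m])
            clusters[nearest].append(i)
        # Update step: the new medoid minimizes total distance within its cluster.
        new_medoids = sorted(
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            for members in clusters.values()
        )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return sorted(medoids)


# Toy one-dimensional "embeddings" for six frames forming two visual groups.
frame_features = [[0.0], [1.0], [2.0], [50.0], [51.0], [52.0]]
keyframes = k_medoids_keyframes(frame_features, k=2)  # frames 1 and 4
```

Because a medoid is always an actual frame, the selected indices can be used directly as keyframes, which is one reason K-medoids suits this task better than a centroid-based method whose cluster centers need not correspond to any real frame.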

Keywords: Security systems
