Grayscale video capture remains a popular, low-cost approach for security and surveillance-related tasks, especially on edge devices. We create a deep-learning solution to colorize grayscale videos in real-time withou...
The proliferation of drone technology in surveillance, media, and commercial applications has intensified the need for robust privacy protection measures, especially in regions with strict data protection laws like th...
ISBN (Print): 9798350320565
Due to the rapid temporal and fine-grained nature of complex human assembly atomic actions, traditional action segmentation approaches that require spatial (and often temporal) downsampling of video frames often lose the vital fine-grained spatial and temporal information required for accurate classification within the manufacturing domain. In order to fully utilise the higher-resolution video data often collected within the manufacturing domain and to facilitate the real-time, accurate action segmentation required for human-robot collaboration, we present a novel hand-location-guided, high-resolution feature-enhanced model. We also propose a simple yet effective method for deploying offline-trained action recognition models for real-time action segmentation of temporally short, fine-grained actions, through surround sampling during training and temporally aware label cleaning at inference. We evaluate our model on a novel action segmentation dataset containing 24 atomic actions (plus background) drawn from video of a real-world robotic assembly production line. We show that both high-resolution hand features and traditional frame-wide features improve fine-grained atomic action classification, and that through temporally aware label cleaning our model is capable of surpassing similar encoder/decoder methods while allowing for real-time classification.
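The abstract leaves the temporally aware label cleaning step at a high level; the snippet below is a minimal Python sketch of one plausible form of it: a centred sliding-window majority vote over per-frame class predictions that suppresses single-frame flicker from an offline-trained recognition model run frame by frame. The function name, window size, and voting rule are assumptions rather than the authors' implementation.

    from collections import Counter

    def clean_labels(frame_predictions, window=7):
        """Sliding-window majority vote over per-frame class predictions.

        An illustrative stand-in for temporally aware label cleaning: each
        frame's label is replaced by the most common label in a centred
        window, removing isolated spurious predictions.
        """
        half = window // 2
        cleaned = []
        for i in range(len(frame_predictions)):
            lo = max(0, i - half)
            hi = min(len(frame_predictions), i + half + 1)
            votes = Counter(frame_predictions[lo:hi])
            cleaned.append(votes.most_common(1)[0][0])
        return cleaned

    # A one-frame burst of class 3 is smoothed away:
    print(clean_labels([0, 0, 0, 3, 0, 0, 5, 5, 5, 5]))
    # -> [0, 0, 0, 0, 0, 0, 5, 5, 5, 5]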
ISBN (Print): 9798350360875; 9798350360868
In applications related to traffic management, recognizing specific vehicle types is important. This research aims to improve traffic management systems by designing and implementing a lightweight Convolutional Neural Network (CNN) for vehicle-type detection from aerial photos. The study aims to develop a model that is both accurate in classification and computationally efficient, providing the real-time processing capability required for dynamic traffic monitoring. It does this by employing a dataset of high-resolution aerial images captured by drones. The main issue to be addressed is that vehicles appear differently depending on the angles, sizes, and environmental factors present in aerial imagery. The lightweight CNN architecture is specifically designed to balance performance and computational efficiency, which is critical for deployment in real-time traffic management applications, including low-power devices such as the Raspberry Pi. It optimizes parameter counts and employs approaches that speed up training without sacrificing accuracy. The study's key findings show that the proposed model outperforms pre-trained models in terms of both accuracy and efficiency. The model achieves a testing accuracy of 99.31% while remaining compact, making it ideal for real-time applications.
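The paper's exact network is not reproduced above, so the following PyTorch sketch only illustrates the kind of parameter-light design the abstract describes, using depthwise separable convolutions to keep the parameter count small enough for a device like the Raspberry Pi; the layer widths, class count, and the name LightweightVehicleCNN are assumptions.

    import torch
    import torch.nn as nn

    class DepthwiseSeparableConv(nn.Module):
        """Depthwise + pointwise convolution: a standard way to cut parameters."""
        def __init__(self, in_ch, out_ch, stride=1):
            super().__init__()
            self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride, 1,
                                       groups=in_ch, bias=False)
            self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
            self.bn = nn.BatchNorm2d(out_ch)
            self.act = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.act(self.bn(self.pointwise(self.depthwise(x))))

    class LightweightVehicleCNN(nn.Module):
        """Illustrative compact classifier for aerial vehicle-type images."""
        def __init__(self, num_classes=4):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                DepthwiseSeparableConv(16, 32, stride=2),
                DepthwiseSeparableConv(32, 64, stride=2),
                DepthwiseSeparableConv(64, 128, stride=2),
                nn.AdaptiveAvgPool2d(1),
            )
            self.classifier = nn.Linear(128, num_classes)

        def forward(self, x):
            x = self.features(x).flatten(1)
            return self.classifier(x)

    model = LightweightVehicleCNN(num_classes=4)
    print(sum(p.numel() for p in model.parameters()))  # total parameter count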
Searching for available parking spaces can be a painful experience for drivers, who must drive around until they find a vacant spot. This study proposes a new method to automatically detect available parking spaces. The proposed system identifies empty parking spaces using grayscale images obtained from any type of video camera. The method was found to successfully identify parking availability under different conditions. The method was tested using real-life data and achieved a detection rate of 99.7%. This method can be applied in real time to monitor parking availability and guide drivers to empty spaces. The method has several advantages, including simple algorithms, the use of low-quality black-and-white images, and simple hardware. Moreover, the system can provide enormous cost savings for locations with existing black-and-white surveillance cameras, since the existing cameras do not need to be replaced with new high-quality ones.
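The abstract only states that the algorithms are simple and operate on grayscale frames; the sketch below, assuming predefined spot coordinates and a reference image of the empty lot, shows one such scheme: a spot is flagged as occupied when enough of its pixels differ from the reference. The ROI coordinates, thresholds, and function names are illustrative, not the paper's method.

    import cv2
    import numpy as np

    # Hypothetical (x, y, w, h) regions of individual spots in the camera view.
    PARKING_SPOTS = [(40, 60, 50, 90), (100, 60, 50, 90), (160, 60, 50, 90)]

    def occupied(gray_frame, empty_reference, roi, diff_threshold=25, fill_ratio=0.2):
        """A spot is occupied when enough pixels differ from the empty-lot reference."""
        x, y, w, h = roi
        patch = gray_frame[y:y + h, x:x + w]
        ref = empty_reference[y:y + h, x:x + w]
        diff = cv2.absdiff(patch, ref)
        changed = np.count_nonzero(diff > diff_threshold)
        return changed / float(w * h) > fill_ratio

    def available_spots(gray_frame, empty_reference):
        """Indices of spots currently judged to be empty."""
        return [i for i, roi in enumerate(PARKING_SPOTS)
                if not occupied(gray_frame, empty_reference, roi)]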
ISBN (Digital): 9789819916399
ISBN (Print): 9789819916382; 9789819916399
When an image algorithm is applied directly to a video scene and the video is processed frame by frame, an obvious pixel-flickering phenomenon occurs; this is the problem of temporal inconsistency. In this paper, a temporal consistency enhancement algorithm based on pixel flicker correction is proposed to enhance video temporal consistency. The algorithm consists of a temporal stabilization module (TSM-Net), an optical flow constraint module, and a loss calculation module. The innovation of TSM-Net is that a ConvGRU network is embedded layer by layer, with a dual-channel parallel structure, in the decoder, which effectively enhances the network's ability to extract information in the temporal domain through feature fusion. This paper also proposes a hybrid loss based on optical flow, which sums the temporal loss and the spatial loss to better balance the contributions of the two terms during training. It improves temporal consistency while ensuring better perceptual similarity. Since the algorithm does not require optical flow during testing, it achieves real-time performance. This paper conducts experiments on public datasets to verify the effectiveness of the pixel flicker correction algorithm.
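As a rough illustration of a hybrid loss of the kind described (the exact formulation used in the paper is not reproduced above), the PyTorch sketch below combines a spatial L1 term toward the per-frame processed target with a temporal term that penalises deviation from the previous output warped by backward optical flow; the weighting lam and the function names are assumptions.

    import torch
    import torch.nn.functional as F

    def warp(frame, flow):
        """Warp a frame (N, C, H, W) with a backward optical flow (N, 2, H, W)."""
        n, _, h, w = frame.shape
        ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
        base = torch.stack((xs, ys), dim=0).float().to(frame.device)   # (2, H, W)
        coords = base.unsqueeze(0) + flow                              # (N, 2, H, W)
        # Normalise pixel coordinates to [-1, 1] for grid_sample.
        gx = 2.0 * coords[:, 0] / (w - 1) - 1.0
        gy = 2.0 * coords[:, 1] / (h - 1) - 1.0
        grid = torch.stack((gx, gy), dim=-1)                           # (N, H, W, 2)
        return F.grid_sample(frame, grid, align_corners=True)

    def hybrid_loss(out_t, out_prev, processed_t, flow_prev_to_t, lam=10.0):
        """Spatial fidelity to the per-frame target plus temporal smoothness."""
        spatial = F.l1_loss(out_t, processed_t)
        temporal = F.l1_loss(out_t, warp(out_prev, flow_prev_to_t))
        return spatial + lam * temporal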
Agriculture is a vital sector for ensuring global food security and promoting sustainable development in every country. Additionally, accurate prediction of crop yield and best harvest time is vital as it will help th...
ISBN (Print): 9798350362923; 9798350362916
The integration of artificial intelligence in industrial automation has led to significant advancements in new automation techniques. One such aspect of industrial automation is sorting consumables on conveyor belt systems via image processing. Typically, these applications use expensive, dedicated, purpose-built hardware and bespoke image-processing code. This paper discusses the development of such an image-processing sorting conveyor belt using low-cost processors instead of dedicated, purpose-built hardware. This is achieved by placing at the core of the system a Convolutional Neural Network (CNN), specifically tailored for hue-based image processing and implemented on a Raspberry Pi 4B. A standard Pi camera attached to the Raspberry Pi captures images for real-time object classification. A key innovation of the system is the use of a pixel-based trigger mechanism for image capture, which significantly improves the accuracy and efficiency of the sorting process. The system achieves an accuracy rate of 92.74% in classifying the objects it has been trained on, underscoring the efficacy of the approach. Additionally, the system operates in a dual-mode capacity, enabling not only the sorting of existing object types but also the learning of, and adaptation to, new objects through user input. This feature enhances the system's versatility and applicability in various industrial contexts. The paper details the design, implementation, and testing of this AI-driven sorting mechanism, highlighting its potential as a scalable and low-cost solution for modern industrial sorting needs.
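The pixel-based trigger is not specified in detail above; the OpenCV sketch below assumes a fixed strip of the frame over the belt is monitored and a capture fires when enough pixels deviate from the belt's hue, after which the saved image would be handed to the classifier. The ROI, belt hue, thresholds, and camera index are hypothetical.

    import cv2
    import numpy as np

    TRIGGER_ROI = (200, 0, 40, 480)   # hypothetical strip across the belt (x, y, w, h)
    BELT_HUE = 30                     # assumed hue of the empty belt surface

    def object_present(frame_bgr, hue_tolerance=15, pixel_fraction=0.05):
        """True when enough pixels in the trigger strip differ from the belt hue."""
        x, y, w, h = TRIGGER_ROI
        hsv = cv2.cvtColor(frame_bgr[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
        hue = hsv[:, :, 0].astype(np.int16)
        deviating = np.count_nonzero(np.abs(hue - BELT_HUE) > hue_tolerance)
        return deviating / float(w * h) > pixel_fraction

    cap = cv2.VideoCapture(0)          # e.g. the Pi camera exposed via V4L2
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        if object_present(frame):
            cv2.imwrite("capture.jpg", frame)   # hand the frame to the CNN classifier
            break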
With the increasing popularity of digital video applications, video restoration techniques have become increasingly important. This paper presents a flow-based video restoration method that aims to achieve high-qualit...
ISBN (Print): 9798350367782; 9798350367775
In the current technological landscape, cross-modal retrieval systems have become essential, bridging the gap between diverse data types to boost accessibility and interaction across digital platforms. Our research enhances these systems by aiming for the efficient handling of low-resolution inputs, a common challenge in many real-life settings, while ensuring robust performance even when high-resolution data is unavailable. The paper introduces an advancement to the Local-Global Scene Graph Matching (LGSGM) architecture for cross-modal image/text retrieval by incorporating a lightweight replacement for the scene graph generation module. The novel MiT-RelTR scene graph generation model is used to optimize the retrieval process. Our contribution improved caption retrieval by achieving a 0.4% increase in Recall@10, signifying boosted accuracy in processing textual data. Conversely, it resulted in a 0.9% decline in image retrieval Recall@10. Nonetheless, the system's inference speed improved notably, with a 38% increase in frames per second (FPS), bolstering its fitness for real-time applications. These findings illustrate the trade-offs and benefits of refining system components and suggest a need for balanced optimization strategies that benefit all modalities equally.
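Recall@10 is the standard retrieval metric being reported; as a small, self-contained illustration (not tied to the LGSGM codebase), the sketch below computes Recall@K from a query-gallery similarity matrix in which the ground-truth match for query i is assumed to sit at gallery index i.

    import numpy as np

    def recall_at_k(similarity, k=10):
        """Fraction of queries whose ground-truth item ranks in the top K.

        similarity[i, j] scores query i (e.g. an image) against gallery item j
        (e.g. a caption); the correct match for query i is assumed to be item i.
        """
        ranks = np.argsort(-similarity, axis=1)                 # best match first
        hits = (ranks[:, :k] == np.arange(len(similarity))[:, None]).any(axis=1)
        return hits.mean()

    # Toy check: a perfect diagonal similarity matrix gives Recall@1 = 1.0
    print(recall_at_k(np.eye(3), k=1))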