Search results - Inner Mongolia University Library

27th International Conference on Interactive Collaborative Learning-ICL

Authors: Badea, Cosmin; Costea, Antonia; Halmaghi, Alexandra; Hancu, Alexandra; Bogdan, Ioana Corina; Modran, Horia Alexandru (Transilvania Univ Brasov Dept Elect & Comp Brasov Romania)

ISBN: (print) 9783031835193; 9783031835209

The integration of artificial intelligence (AI) and unmanned aerial vehicle (UAV) technologies presents a significant advancement in enhancing safety in traffic, workplace, and healthcare environments. This study explores the application of AI-driven computer vision algorithms in UAVs to detect and mitigate risks associated with substance abuse, fatigue, and health impairments. Utilizing sophisticated image processing techniques, such as edge detection and support vector machine (SVM) algorithms, drones are equipped to autonomously monitor and analyze ocular characteristics and facial expressions of individuals. The research employs a mobile phone camera and Python-based libraries to conduct real-time assessments, providing critical data to medical and industrial professionals. The study demonstrates the potential of drones to enhance safety by checking sobriety and monitoring worker health. The experimental setup includes a detailed workflow for real-time video detection and facial analysis, leveraging pre-trained models and convolutional neural networks. The results confirm the effectiveness of this approach, highlighting significant progress in AI and UAV technology. Future work aims to transition these innovations from laboratory conditions to practical, real-world applications, continuously enhancing the algorithms and expanding their applicability across various safety-critical scenarios.

Keywords: Drones; Surveillance; Computer Vision; Artificial Intelligence; Image processing; Healthcare

Edge deployed satellite image classification with TinEViT a X-Cube-AI compatible efficient vision transformer

Conference on Real-Time Image Processing and Deep Learning

Authors: Halford, Gavin; Depoian, Arthur C., II; Bailey, Colleen P. (Univ North Texas Dept Elect Engn Denton TX 76207 USA)

ISBN: (print) 9781510673878; 9781510673861

Lower resolutions and a lack of distinguishing features in large satellite imagery datasets make identification tasks challenging for traditional image classification models. Vision Transformers (ViT) address these issues by creating deeper spatial relationships between image features. Self-attention mechanisms are applied to better understand not only which features correspond to which classification profile, but also how the features correspond to each other within each separate category. These models, integral to computer vision machine learning systems, depend on extensive datasets and rigorous training to develop highly accurate yet computationally demanding systems. Deploying such models in the field can present significant challenges on resource-constrained devices. This paper introduces a novel approach to address these constraints by optimizing an efficient Vision Transformer (TinEViT) for real-time satellite image classification that is compatible with the STMicroelectronics AI integration tool, X-Cube-AI.

Keywords: ViT; STM-32; X-Cube-AI; Satellite Image Classification; Efficient AI; EuroSAT
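
The self-attention step that the abstract refers to can be illustrated with a minimal sketch of single-head scaled dot-product attention. This is a simplification under stated assumptions: queries, keys and values are all taken to be the raw token vectors (no learned projections), and the function names are hypothetical. It is not the TinEViT architecture.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: Q = K = V = tokens (no learned projection matrices).
    Returns (outputs, attention_weights).
    """
    d = len(tokens[0])
    # Pairwise similarity between every query token and every key token.
    scores = [[sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
              for q in tokens]
    weights = [softmax(r) for r in scores]
    # Each output is a weighted average of all value tokens.
    outputs = [[sum(w * v[j] for w, v in zip(wr, tokens)) for j in range(d)]
               for wr in weights]
    return outputs, weights
```

Each output token is a mixture of all input tokens, which is how the model relates features to one another across the whole image rather than only within a local window.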

Innovative Hand Gesture Recognition Techniques for Volume Adjustment in Real-Time

4th International Conference on Artificial Intelligence and Signal Processing

Authors: Kanagamalliga, S.; Vinayagam, P.; Yazhvanan, M. A.; Winaayag, Amit J. (Saveetha Engn Coll Dept Elect & Commun Engn Chennai Tamil Nadu India)

ISBN: (digital) 9798350350654

ISBN: (print) 9798350350661; 9798350350654

Hand gesture recognition is an advanced system that identifies hand movements in real-time video for applications such as volume control. The challenge in designing such a system lies in identifying the hand and creating gestures recognizable by a single hand. This technology finds use in various fields, including sign language interpretation. The primary concept involves hand recognition, utilizing the Haar-cascade classifier (HCC) to implement hand motion recognition with OpenCV and Python. The research explores a method for identifying hand gestures based on shape-based feature recognition. The system configuration includes a single camera that captures user gestures and feeds them into the recognition system. The main target of gesture recognition is to develop a system capable of identifying specific human motions and using them to transmit control data to devices. With real-time gesture recognition, users can control a workstation by making specific gestures in front of the camera. Leveraging the OpenCV module, we create a hand gesture recognition system that allows device control without needing a keyboard or mouse. This approach involves several stages: capturing the hand gesture using a camera, processing the video frame to segment the hand, and recognizing the gesture based on shape features. The HCC is employed for hand recognition due to its efficiency and accuracy in identifying hand regions in real-time. The implementation of this system promises a user-friendly and intuitive way of interacting with devices. By eliminating the need for physical input devices, it enhances accessibility and convenience. This research discusses the development of a hand gesture recognition system for volume control, highlighting the techniques used and the potential applications of this technology in improving human-computer interaction. The findings suggest that such systems can significantly enhance the user experience by providing an alternative, non-contact metho

Keywords: Hand Gesture Recognition; Real-time video processing; Haar-Cascade Classifier; Shape-Based Feature Recognition; Human-Computer Interaction

MOD-IR: moving objects detection from UAV-captured video sequences based on image registration

Multimedia Tools and Applications, 2023, vol. 83, no. 16, pp. 46779-46798

Authors: Bouhlel, Fatma; Mliki, Hazar; Hammami, Mohamed (Univ Sfax MIRACL FSS Fac Sci Sfax Rd Sokra Km 3 Sfax 3018 Tunisia; Univ Sfax MIRACL Lab Sfax Tunisia; Univ Carthage Natl Inst Appl Sci & Technol Tunis Tunisia)

Detecting moving objects from a freely moving camera, such as one mounted on an Unmanned Aerial Vehicle (UAV), is an important and challenging problem. This paper introduces a new MOD-IR method for moving objects detection from UAV-captured video sequences. The proposed method consists of four steps: (1) feature extraction and matching, (2) frame registration, (3) moving objects detection and (4) moving objects detection post-processing. Our method stands out from those of the literature in a number of ways. First, we enhanced the method's effectiveness and robustness by handling the constraints related to this field through extracting robust features, on the one hand, and automatically defining the optimum threshold, on the other. Second, we proposed an efficient method able to deal with real-time applications by extracting keypoint features instead of pixel-to-pixel model estimation, and by simulating the search for the matching features among multiple trees. Finally, we involved the quick-shift segmentation in parallel with the first three steps, in order to enhance and accelerate the moving objects detection task. Relying on quantitative and qualitative evaluations of the proposed method on a variety of sequences extracted from several datasets (such as DARPA VIVID-EgTest05, Hopkins 155, UCF Aerial Action, etc.), we assessed the performance of our method compared to the state-of-the-art reference methods. Furthermore, the time cost evaluation has enabled us to emphasize that our MOD-IR method is the optimal choice for real-time applications, owing to its lower computational time requirement compared to the reference methods.

Keywords: Moving objects detection; UAV; Feature extraction and matching; Image registration; Max entropy thresholding; Quick-shift segmentation
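
The keywords name max entropy thresholding as the way the threshold is defined automatically. A minimal sketch of a Kapur-style maximum-entropy threshold over a histogram (for example, of frame-difference magnitudes after registration) is given here. The function name and the choice of histogram are assumptions for illustration; the record does not describe the authors' exact procedure.

```python
import math

def max_entropy_threshold(hist):
    """Return the bin index t that maximizes the summed entropy of the
    background (bins <= t) and foreground (bins > t) distributions."""
    total = sum(hist)
    probs = [h / total for h in hist]
    best_t, best_score = 0, float("-inf")
    cum = 0.0  # probability mass of bins 0..t
    for t in range(len(probs) - 1):
        cum += probs[t]
        if cum <= 0.0 or cum >= 1.0:
            continue  # one class would be empty
        rest = 1.0 - cum
        h_bg = -sum((p / cum) * math.log(p / cum)
                    for p in probs[:t + 1] if p > 0)
        h_fg = -sum((p / rest) * math.log(p / rest)
                    for p in probs[t + 1:] if p > 0)
        if h_bg + h_fg > best_score:
            best_score, best_t = h_bg + h_fg, t
    return best_t
```

Pixels whose difference magnitude exceeds the returned threshold would then be treated as candidate moving-object pixels.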

Lightweight Network Towards Real-Time Image Denoising on Mobile Devices

30th IEEE International Conference on Image Processing (ICIP)

Authors: Liu, Zhuoqun; Jin, Meiguang; Chen, Ying; Liu, Huaida; Yang, Canqian; Xiong, Hongkai (Alibaba Grp Hangzhou Peoples R China; Shanghai Jiao Tong Univ Shanghai Peoples R China)

ISBN: (print) 9781728198354

Deep convolutional neural networks have achieved great progress in image denoising tasks. However, their complicated architectures and heavy computational cost hinder their deployments on mobile devices. Some recent efforts in designing lightweight denoising networks focus on reducing either FLOPs (floating-point operations) or the number of parameters. However, these metrics are not directly correlated with the on-device latency. In this paper, we identify the real bottlenecks that affect the CNN-based models' run-time performance on mobile devices: memory access cost and NPU-incompatible operations, and build the model based on these. To further improve the denoising performance, the mobile-friendly attention module MFA and the model reparameterization module RepConv are proposed, which enjoy both low latency and excellent denoising performance. To this end, we propose a mobile-friendly denoising network, namely MFDNet. The experiments show that MFDNet achieves state-of-the-art performance on real-world denoising benchmarks SIDD and DND under real-time latency on mobile devices. The code and pre-trained models will be released.

Keywords: Image Denoising; Mobile-friendly; Network Design
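
The abstract mentions a model reparameterization module (RepConv) without giving details. As a hedged illustration of the general idea of structural reparameterization, the sketch below folds three parallel branches (a 3x3 convolution, a 1x1 convolution and an identity shortcut) into a single 3x3 kernel that gives the same output. The single-channel setup, the omission of batch normalization and all function names are assumptions; the paper's RepConv may differ.

```python
def conv3x3(img, k):
    """Zero-padded 3x3 cross-correlation on a 2D list of floats."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += k[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc
    return out

def multi_branch(img, k3, k1):
    """Training-time form: 3x3 branch + 1x1 branch (scalar k1) + identity."""
    y3 = conv3x3(img, k3)
    return [[y3[r][c] + k1 * img[r][c] + img[r][c]
             for c in range(len(img[0]))] for r in range(len(img))]

def merge_branches(k3, k1):
    """Inference-time form: fold the 1x1 and identity branches into the
    centre tap of the 3x3 kernel."""
    merged = [row[:] for row in k3]
    merged[1][1] += k1 + 1.0
    return merged
```

At inference, `conv3x3(img, merge_branches(k3, k1))` replaces the three-branch computation with one convolution, which reduces memory accesses for the same result.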

A Lightweight Model for Fast and Detail-Preserving Human Segmentation in Video

IEEE Transactions on Emerging Topics in Computational Intelligence, 2025

Authors: Zhang, Xiaoyao; Yao, Lan; Zeng, Feng (Cent South Univ Sch Comp Science and Engn Changsha 410017 Peoples R China; Hunan Univ Sch Math Changsha 410012 Peoples R China)

With the rapid development of artificial intelligence, human segmentation in video is becoming increasingly important in the field of computer vision. However, existing segmentation models suffer from inaccurate segmentation, slow processing, and large model size, which limit their deployment on resource-constrained devices. To this end, we propose a lightweight model called Efficient Memory Aggregation U-shaped Network (EMAUnet) for human segmentation in video, which is based on a traditional U-shaped network and attention mechanism. In EMAUnet, memory modules are combined with segmentation modules, enabling end-to-end learning of semantic extraction patterns for human images. Mobile-inverse bottleneck convolution (MBConv) is used as the network backbone because it has relatively few parameters and low computational complexity. Inverted sub-pixel down-sampling (ISP) is proposed to minimize information loss and achieve detail-preserving segmentation. Coordinate attention (CA) is adopted to precisely locate the portrait area. Moreover, bidirectional memory update (BMU) and memory update trigger (MUT) are proposed to improve memory resource utilization and reduce unnecessary computation. Experimental results show that, compared with the classic model ISNet, EMAUnet increases mIoU, FPS and pixel accuracy by 2.3%, 25.0% and 1.8%, respectively, while reducing the number of parameters and the model size by 33.3% and 45.7%, respectively.

Keywords: Image segmentation; Computational modeling; Accuracy; Semantics; Object segmentation; Streaming media; Real-time systems; Memory management; Convolution; Training; video human segmentation; lightweight model; U-shaped network; attention mechanism
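
The abstract does not define inverted sub-pixel down-sampling (ISP). One plausible reading, assumed here purely for illustration, is the inverse of sub-pixel up-sampling, often called space-to-depth: spatial resolution is reduced by moving pixels into extra channels instead of discarding them. The function names are hypothetical and the paper's ISP may be defined differently.

```python
def space_to_depth(img, r=2):
    """Rearrange an H x W grid into r*r channels of size (H/r) x (W/r).
    No pixel is dropped, so the operation is lossless."""
    h, w = len(img), len(img[0])
    if h % r or w % r:
        raise ValueError("height and width must be divisible by r")
    return [[[img[y * r + dy][x * r + dx] for x in range(w // r)]
             for y in range(h // r)]
            for dy in range(r) for dx in range(r)]

def depth_to_space(channels, r=2):
    """Inverse rearrangement, used here only to show that the original
    grid can be recovered exactly."""
    hh, ww = len(channels[0]), len(channels[0][0])
    out = [[None] * (ww * r) for _ in range(hh * r)]
    for c, ch in enumerate(channels):
        dy, dx = divmod(c, r)  # channel index encodes the sub-pixel offset
        for y in range(hh):
            for x in range(ww):
                out[y * r + dy][x * r + dx] = ch[y][x]
    return out
```

Unlike strided pooling, this kind of down-sampling keeps every input value, which is consistent with the stated goal of minimizing information loss.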

Arithmetically Fast Position Transformation for View Synthesis and Depth Estimation

IEEE Access, 2025, vol. 13, pp. 24661-24671

Authors: Wegner, Krzysztof; Grajek, Tomasz; Klimaszewski, Krzysztof (Mucha Sp Zoo PL-61626 Poznan Poland; Poznan Univ Tech Inst Multimedia Telecommun PL-60965 Poznan Poland)

In this paper we present a method for fast computation of the matrix transformation used to transform the positions of scene objects between different camera positions, virtual or real. The process finds extensive use in virtual view generation in Free Viewpoint Video (FVV) and virtual reality applications as well as in depth estimation algorithms. The proposed method relies on the reformulation of the matrix equation used in the process. As a result, the number of necessary arithmetic operations is reduced and some of the calculations can be reused for consecutive pixel position transformations. The presented algorithm produces the same output as an unoptimized algorithm, with an approximately 22% reduction in processing time averaged over the examined representative test sequences.

Keywords: Cameras; Vectors; Three-dimensional displays; Software algorithms; Depth measurement; Mathematical models; Training; TV; Recording; Image resolution; Depth estimation; DIBR; multiview and depth; position transformation; view synthesis
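
To illustrate how calculations can be reused for consecutive pixel positions, the sketch below applies a 4x4 matrix M to the vector (x, y, d, 1), where d is a per-pixel depth-related value. Because M(x, y, d, 1) = x·M[:,0] + y·M[:,1] + d·M[:,2] + M[:,3], the terms in y and the constant column are fixed along an image row, and the x term can be updated by one addition per pixel. The vector layout and function names are assumptions; this is not necessarily the reformulation used in the paper.

```python
def transform_naive(M, x, y, d):
    """Full 4x4 matrix-vector product for one pixel (16 multiplications)."""
    v = (x, y, d, 1.0)
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

def transform_row(M, y, depths):
    """Transform pixels (0, y), (1, y), ... of one row.

    The row-constant part is computed once; stepping to the next pixel
    only adds column 0 of M, so each pixel needs 4 multiplications.
    """
    base = [M[i][1] * y + M[i][3] for i in range(4)]  # value at x = 0, d = 0
    out = []
    for d in depths:
        out.append([base[i] + M[i][2] * d for i in range(4)])
        base = [base[i] + M[i][0] for i in range(4)]  # advance x by one
    return out
```

In floating point, the incremental additions in this sketch can differ from the naive product in the last bits, so an implementation that must reproduce the unoptimized output exactly would need more care than is shown here.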

Computer Vision Based Hybrid Classroom Attention Monitoring

2024 IEEE International Conference on Information Technology, Electronics and Intelligent Communication Systems, ICITEICS 2024

Authors: Rawat, Saniya; Rodrigues, Malivia; Sheregar, Prateeksha; Wagaskar, Kalpita Ajinkya; Tripathy, Amiya Kumar (Mumbai India)

ISBN: (print) 9798350382693

This research presents a novel computer vision-based attention monitoring system designed for both online and offline contexts. Leveraging advanced image processing and machine learning algorithms, the system analyzes human gaze patterns, eye movements, and facial expressions to accurately gauge attention levels. In online scenarios, the system employs real-time webcam-based gaze tracking and facial recognition to provide immediate insights into user engagement during activities like video conferencing and virtual meetings. For offline analysis, recorded video footage is retrospectively examined, facilitating applications in education, workplace productivity, and user experience assessments. Privacy considerations are addressed through the implementation of privacy-preserving techniques. Experimental results demonstrate the system's efficacy in monitoring attention dynamics across diverse settings, contributing to a deeper understanding of human attention in various domains. © 2024 IEEE.

Keywords: Video conferencing

A colour image segmentation method and its application to medical images

Signal, Image and Video Processing, 2024, vol. 18, no. 2, pp. 1635-1648

Authors: Halim, Abdul; Kumar, B. V. Rathish; Niranjan, Ajay; Nigam, Aditya; Schneider, Walter; Ahuja, Chirag K.; Pathak, Sudhir K. (King Abdullah Univ Sci & Technol Thuwal Saudi Arabia; Hari Singh Coll Dept Math Munger 811213 Bihar India; IIT Dept Math & Stat Kanpur 208016 UP India; Univ Pittsburgh Sch Med Neurol Surg Pittsburgh PA USA; IIT Mandi Sch Comp & Elect Engn Mandi India; Univ Pittsburgh Learning Res & Dev Ctr Pittsburgh PA USA; PGIMER Chandigarh Dept Radio Diag & Imaging Chandigarh India)

In this paper, we propose a segmentation model using an anisotropic multi-well potential-based nonlinear transient PDE for colour images. A channel-wise greyscale classification approach is devised for colour image segmentation. The time evolution of the PDE model is carried out by the implicit-explicit convexity splitting approach. Further, we consider the fractional version of the time-discretised model by replacing the Laplacian with its fractional counterpart. The spatial terms are approximated by the Fourier basis under the pseudo-spectral method. The convergence and the stability of the numerical scheme are elaborated. Both models (fractional and non-fractional) are tested on some synthetic images and a few real-world standard test images. The results on synthetic images are compared with those from the literature using the Dice similarity index, the Jaccard similarity index and the BF score. The method is then successfully applied to several medical images to segment them.

Keywords: Colour segmentation; Multi-well potential; Nonlinear PDE; Fractional PDEs; Medical imaging
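
Two of the evaluation measures named in the abstract have standard definitions: for a predicted mask A and a reference mask B, Dice = 2|A∩B| / (|A| + |B|) and Jaccard = |A∩B| / |A∪B|. A minimal sketch for flat binary masks follows; the function name and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def dice_jaccard(pred, truth):
    """Return (Dice, Jaccard) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    size_p = sum(1 for p in pred if p)
    size_t = sum(1 for t in truth if t)
    union = size_p + size_t - inter
    if union == 0:
        return 1.0, 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * inter / (size_p + size_t), inter / union
```

For example, masks `[1, 1, 0, 0]` and `[1, 0, 1, 0]` share one of three labelled pixels, giving a Dice index of 0.5 and a Jaccard index of 1/3.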

A Fast Image Dehazing Using Encoder–Decoder Deep Neural Network

International Conference on Advances in Signal Processing and Communication Engineering, ICASPACE 2023

Authors: Gurjar, Prakhar; Pavan Kumar, Balla; Kumar, Arvind (National Institute of Technology Kurukshetra India)

ISBN: (print) 9789819705610

Image quality is degraded in bad weather conditions such as haze or fog. This problem can affect image processing applications such as computer vision, security, and other real-time image processing systems. Hence, image dehazing is essential for these applications to improve their performance. Many earlier dehazing algorithms were implemented using the atmospheric light-scattering model and enhancement-based techniques. However, these algorithms are complex and take longer to perform dehazing, which makes them unsuitable for real-time image processing applications. To overcome this drawback, an encoder–decoder deep neural network (EDDNN) is designed in this manuscript for fast image dehazing. The proposed EDDNN contains a total of four layers: input, encoder, decoder, and output. The proposed EDDNN is trained and tested with the most popular dataset, called realistic single image dehazing (RESIDE). The proposed EDDNN is fast in execution, which suits real-time image processing systems (RTIPS), and also effectively eliminates the haze effect from the image. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.

Keywords: Deep neural networks
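
For context, the atmospheric light-scattering model that the abstract attributes to earlier algorithms is commonly written as I = J·t + A·(1 − t), where I is the hazy pixel, J the haze-free pixel, t the transmission and A the atmospheric light. The sketch below applies and inverts this model on a flat list of intensities. The function names and the lower bound `t_min` are assumptions, and this is the classical baseline, not the proposed EDDNN.

```python
def add_haze(clear, transmission, airlight):
    """Forward model: I = J * t + A * (1 - t), per pixel."""
    return [j * t + airlight * (1.0 - t)
            for j, t in zip(clear, transmission)]

def remove_haze(hazy, transmission, airlight, t_min=0.1):
    """Inverse model: J = (I - A) / max(t, t_min) + A.
    t_min avoids dividing by a near-zero transmission."""
    return [(i - airlight) / max(t, t_min) + airlight
            for i, t in zip(hazy, transmission)]
```

Model-based methods must first estimate t and A from the hazy image, which is a large part of their cost; an end-to-end network instead learns the mapping from hazy to clear images directly.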
