Search Results - Inner Mongolia University Library

Conference on Real-Time Processing of Image, Depth, and Video Information

Authors: Semenishchev, Evgenii; Zdanova, Marina; Alepko, Andrey; Voronin, Viacheslav. Affiliation: Moscow State Tech Univ STANKIN, 1a Vadkovsky, Moscow 127055, Russia

ISBN: (print) 9781510673199; 9781510673182

The article proposes an approach to developing computationally simple and fast algorithms for data preprocessing and the selection of stable features. The following algorithms are used: 1. A modified method of multicriteria processing in local windows. The method is based on minimizing an objective function, which makes it possible both to reduce the noise component in locally stationary areas and to preserve and strengthen transition boundaries; 2. A method of reducing the scope of clusters, which changes the number of color histograms by absorbing nearby areas while preserving objects; 3. A method of non-local change in color balance, which selects areas on a dark or light background when the color balance is shifted; 4. An edge detector based on the analysis of local areas in various data layers. Effectiveness was tested on a set of images obtained by a flip chip machine, images from a microcircuit analyzer, and data from a product production line. The analyzed frames had low resolution and poor lighting. Images are captured in the RGB color space.

Keywords: image preprocessing; edge detection; object detection; low resolution; image analysis


MoReSo: A DNN Framework Expediting Content-based Video Image Retrieval (CBVIR)


32nd European Signal Processing Conference (EUSIPCO)

Authors: Li, Sinian; Profeta, Doruk Barokas; Dauwels, Justin. Affiliation: Delft Univ Technol, Signal Proc Syst, Dept Microelect, Delft, Netherlands

ISBN: (print) 9789464593617; 9798331519773

With the exponential growth of video data, individuals, particularly scholars in the fields of history and sociology, are increasingly reliant on video materials. However, locating specific frames within videos remains laborious and time-consuming. Advanced machine learning-assisted video processing techniques have emerged, including text-based video search, video summarization, real-time object detection, and person re-identification. In contrast to these, the main challenge of retrieving video frames based on given visual content is how to pinpoint the instance occurrences efficiently and accurately. To expedite the process while maintaining retrieval performance, we propose a two-stage approach combining KeyFrame Extraction (KFE) and Content-based Image Retrieval (CBIR), underpinned by a DNN-empowered framework called MoReSo. Our innovations include 1) the integration of improved statistical features with dynamic clustering in the KFE stage and 2) the development of the MoReSo framework, which consists of MobileNet and ResNet backbones with an SOA layer to jointly represent video frames, achieving a 2.67x increase in efficiency compared to existing solutions. Our framework is evaluated on two datasets: the annotated EHM Historical Database provided by digital history researchers and the widely used Oxford and Paris image retrieval benchmark datasets. The experimental results show that the proposed framework and scheme outperform other models in the CBVIR task. We make our code available for further exploration through our GitHub repository. This repository contains the implementation of our model and a CBVIR system with a GUI prototype.

Keywords: Content-Based Video Image Retrieval; Content-Based Image Retrieval; Key Frame Extraction; Image Retrieval from Video
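
As a rough illustration of the keyframe extraction stage described in this abstract, the sketch below keeps a frame only when its intensity histogram differs enough from the last kept frame. This is a simplified stand-in, not the statistical features or dynamic clustering used in MoReSo; the function names, the bin count, and the threshold are all assumptions made for the example.

```python
from typing import List, Sequence


def histogram(frame: Sequence[int], bins: int = 8, max_value: int = 256) -> List[float]:
    """Normalized intensity histogram of a flat sequence of pixel values."""
    counts = [0] * bins
    for p in frame:
        counts[min(p * bins // max_value, bins - 1)] += 1
    total = float(len(frame)) or 1.0  # avoid division by zero on empty frames
    return [c / total for c in counts]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two histograms (range 0..2)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def extract_keyframes(frames: Sequence[Sequence[int]], threshold: float = 0.5) -> List[int]:
    """Return indices of frames whose histogram departs from the last keyframe.

    Hypothetical rule for illustration: a frame becomes a keyframe when its
    L1 histogram distance to the previous keyframe exceeds `threshold`.
    """
    keyframes: List[int] = []
    reference = None
    for i, frame in enumerate(frames):
        h = histogram(frame)
        if reference is None or l1_distance(h, reference) > threshold:
            keyframes.append(i)
            reference = h
    return keyframes


# Toy video: two dark frames, two bright frames, one dark frame.
video = [[10] * 16, [12] * 16, [200] * 16, [205] * 16, [10] * 16]
print(extract_keyframes(video))
```

Only the selected keyframes would then be passed to the more expensive image retrieval stage, which is where a two-stage design saves time.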


Vehicle Type Classification using Lightweight CNN from Aerial Images for Traffic Management Applications


IEEE 19th Conference on Industrial Electronics and Applications (ICIEA)

Authors: Tellefsen, Anders; Yakkati, Rakesh Reddy; Cenkeramaddi, Linga Reddy. Affiliation: Univ Agder, Dept ICT, N-4879 Grimstad, Norway

ISBN: (print) 9798350360875; 9798350360868

In traffic management applications, recognizing the specific type of a vehicle is important. This research aims to improve traffic management systems by designing and implementing a lightweight Convolutional Neural Network (CNN) for vehicle-type detection from aerial photos. The goal is a model that classifies accurately and is computationally efficient enough to provide the real-time processing capability required for dynamic traffic monitoring. The study employs a dataset of high-resolution aerial images taken by drones. The main issue to be addressed is that cars appear differently depending on the angles, sizes, and environmental factors present in aerial imagery. The lightweight CNN architecture is specifically designed to balance performance and computational efficiency, which is critical for real-time traffic management applications, including deployment on low-power devices such as the Raspberry Pi. It optimizes parameter counts and employs approaches that speed up training without sacrificing accuracy. The study's key findings show that the suggested model outperforms pre-trained models in terms of both accuracy and efficiency. The model achieves a testing accuracy of 99.31% while remaining compact, making it well suited to real-time applications.

Keywords: Lightweight Convolutional Neural Network; Vehicle Images; Vehicle Type Classification; Real-time Processing; Aerial Image Analysis; Traffic Management Systems; Model Optimization; Computational Efficiency


A Motion Injury Detection and Prevention System Based on Image Recognition and Deep Learning


3rd International Conference on Integrated Circuits and Communication Systems, ICICACS 2025

Authors: Shao, Kang; Zhang, Limin; Ma, Yunlong. Affiliation: Weifang Engineering Vocational College, Shandong, Weifang, China

ISBN: (print) 9798331508456

With the popularization of sports and fitness activities, how to effectively monitor and prevent sports injuries has become an important challenge. This article proposes a sports injury detection and prevention system. First, the system collects real-time motion image data of athletes through high-definition cameras and uses YOLOv5 (You Only Look Once v5) for efficient object detection and action recognition. It then uses long short-term memory (LSTM) networks to analyze time series data during the motion process. Finally, multimodal data fusion is introduced to analyze the data comprehensively. The experimental results show that the system achieves an accuracy of 88% when processing 5000 samples. In terms of real-time performance, the system has a response time of 0.60 seconds under 5000 samples. According to these results, the proposed system outperforms traditional models in accuracy, real-time performance, and computational resource consumption in sports injury detection, demonstrating its effectiveness and feasibility in practical applications. © 2025 IEEE.

Keywords: Image fusion


Real-Time Multiclass Face Spoofing Recognition Through Spatiotemporal Convolutional 3D Features


22nd International Conference on Image Analysis and Processing (ICIAP)

Authors: Giurato, Salvatore; Ortis, Alessandro; Battiato, Sebastiano. Affiliation: Univ Catania, Image Proc Lab, Dipartimento Matemat & Informat, Viale A Doria 6, I-95125 Catania, Italy

ISBN: (print) 9783031510229; 9783031510236

Face recognition is used in numerous authentication applications; unfortunately, such systems are susceptible to spoofing attacks such as paper and screen attacks. In this paper, we propose a method that recognises whether a face detected in a video is not real and, if so, the type of attack performed. We propose to learn temporal features using a 3D convolutional network, which is better suited to temporal information. Besides summarizing temporal information, the 3D ConvNet allows us to build a real-time method, since analysing clips is much more efficient than analysing single frames. The learned features are classified using a binary classifier to distinguish whether the person in the video clip is real (i.e. live) or not, while a multi-class classifier recognises whether the person is real or the type of attack (screen, paper, etc.). We performed our tests on 5 public datasets: Replay Attack, Replay Mobile, MSU-MSFD, Rose-Youtu, RECOD-MPAD.

Keywords: Antispoofing Attack 3D Features Multi-Class detection liveness


Development of a Cost-Effective Artificial Intelligence-Based Image Processing Sorting Mechanism for Conveyor Belt System


16th International Conference on Human System Interaction (HSI)

Authors: Luwes, Nicolaas; Pretorius, Wilhelmus. Affiliation: Cent Univ Technol, Dept Elect Elect & Comp Engn, Ctr Sustainable Smart Cities, Bloemfontein, South Africa

ISBN: (print) 9798350362923; 9798350362916

Integration of artificial intelligence in industrial automation has led to significant advancements in new techniques for automation. One such aspect of industrial automation is sorting consumables on conveyor belt systems via image processing. Typically, these applications use expensive, dedicated, focus-driven hardware and individual image-processing code. This paper discusses the development of such an image-processing sorting conveyor belt using low-cost processors instead of dedicated, focus-driven hardware. At the core of the system is a Convolutional Neural Network (CNN), specifically tailored for hue-based image processing and implemented on a Raspberry Pi 4B. A standard Pi camera, attached to the Raspberry Pi, captures images for real-time object classification. A key innovation of the system is a pixel-based trigger mechanism for image capture, which significantly improves the accuracy and efficiency of the sorting process. The system achieves an accuracy rate of 92.74% in classifying objects as trained, underscoring the efficacy of the approach. Additionally, the system operates in a dual-mode capacity, enabling not only the sorting of existing object types but also learning and adaptation to new objects through user input. This feature enhances the system's versatility and applicability in various industrial contexts. The paper details the design, implementation, and testing of this AI-driven sorting mechanism, highlighting its potential as a scalable and low-cost solution for modern industrial sorting needs.

Keywords: Convolutional Neural Network (CNN); Raspberry Pi; Hue-Based Image; Object Classification
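
The abstract does not spell out how the pixel-based trigger works. One plausible reading, sketched below purely as an assumption, is to watch a small region of the frame and fire the capture when enough pixels deviate from the empty-belt background. The function name, the `pixel_delta` and `min_fraction` parameters, and their default values are hypothetical.

```python
from typing import Sequence


def should_trigger(
    roi: Sequence[Sequence[int]],
    background: Sequence[Sequence[int]],
    pixel_delta: int = 30,
    min_fraction: float = 0.2,
) -> bool:
    """Decide whether an object has entered the watched region.

    `roi` and `background` are equally sized 2D grids of grayscale values.
    A pixel counts as changed when it differs from the background by more
    than `pixel_delta`; the trigger fires when the changed fraction reaches
    `min_fraction`. Both thresholds are illustrative assumptions.
    """
    changed = 0
    total = 0
    for row, bg_row in zip(roi, background):
        for p, b in zip(row, bg_row):
            total += 1
            if abs(p - b) > pixel_delta:
                changed += 1
    return total > 0 and changed / total >= min_fraction


belt = [[50] * 4 for _ in range(4)]                  # empty conveyor belt
noise = [[52] * 4 for _ in range(4)]                 # small lighting change
item = [[200] * 4, [200] * 4, [50] * 4, [50] * 4]    # object covers half the region

print(should_trigger(noise, belt))  # small deviations should not fire
print(should_trigger(item, belt))   # a large changed area should fire
```

A trigger of this kind lets the classifier run once per object instead of on every frame, which matters on a low-cost processor.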


Real Time Crop Harvest Time Prediction Model Using Raspberry Pi and Image Processing


3rd International Conference for Advancement in Technology, ICONAT 2024

Authors: Bhandarkar, Mrunalini S.; Dewan, Basudha; Bansal, Payal. Affiliations: Poornima University, Jaipur, India; Poornima College of Engineering, Jaipur, India

ISBN: (print) 9798350354171

Agriculture is a vital sector for ensuring global food security and promoting sustainable development in every country. Accurate prediction of crop yield and the best harvest time is equally important, as it helps farmers maximize their profit through the right use of available resources. The proposed system predicts crop harvest time using a Raspberry Pi and image processing techniques. The Raspberry Pi, which has a camera module installed, takes pictures of the crops over time. Image analysis techniques are applied to assess visual indicators of crop maturity, such as color, size, and texture. The system forecasts the best time to harvest by comparing these visual cues with past data. This approach eliminates the need for manual inspection and enables timely decision-making. The system incorporates a user-friendly interface accessible through a web or mobile application, presenting real-time and predicted harvest data. Notifications alert users to optimal harvest conditions. Continuous improvement is achieved through periodic updates and refinement of the image analysis algorithms, ensuring accurate predictions and adaptability to different crop types. This technique offers a practical and efficient solution for determining the ideal harvest time, facilitating resource optimization, and maximizing crop yields. © 2024 IEEE.

Keywords: Convolutional neural networks


MiT-RelTR: An Advanced Scene Graph-based Cross-modal Retrieval Model for Real-time Capabilities


4th International Mobile, Intelligent, and Ubiquitous Computing Conference

Authors: Essam, Mohammad; Shedeed, Howida A.; Khattab, Dina. Affiliation: Ain Shams Univ, Fac Comp & Informat Sci, Sci Comp, Cairo, Egypt

ISBN: (print) 9798350367782; 9798350367775

In the current technological landscape, cross-modal retrieval systems have become essential, bridging the gap between diverse data types to boost accessibility and interaction across digital platforms. Our research enhances these systems by aiming for efficient handling of low-resolution inputs, a common challenge in various real-life fields, while ensuring robust performance even when high-resolution data is unavailable. The paper introduces an advancement to the Local-Global Scene Graph Matching (LGSGM) architecture for cross-modal image/text retrieval by incorporating a lightweight replacement of the scene graph generation module. The novel MiT-RelTR scene graph generation model is used to optimize the retrieval process. Our contribution improved caption retrieval, achieving a 0.4% increase in Recall@10, which signifies improved accuracy in processing textual data. Conversely, it resulted in a 0.9% decline in image retrieval Recall@10. Nonetheless, the system's inference speed improved notably, with a 38% increase in frames per second (FPS), bolstering its fitness for real-time applications. These findings illustrate the trade-offs and benefits of refining system components and suggest a need for balanced optimization strategies that equally benefit all modalities.

Keywords: Cross-modal Retrieval; Real-time; Image Retrieval; Text Retrieval; Scene Graph
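
For readers unfamiliar with the Recall@10 figures quoted in this abstract, the sketch below computes Recall@K under the definition commonly used in image/text retrieval: the fraction of queries with at least one relevant item among the top K results. Whether the paper uses exactly this variant is an assumption, and the function and argument names are made up for the example.

```python
from typing import Dict, Hashable, Sequence, Set


def recall_at_k(
    ranked_results: Dict[Hashable, Sequence[Hashable]],
    relevant: Dict[Hashable, Set[Hashable]],
    k: int = 10,
) -> float:
    """Fraction of queries with at least one relevant item in the top k.

    `ranked_results` maps each query id to its ranked list of retrieved ids;
    `relevant` maps each query id to the set of ground-truth ids.
    """
    if not ranked_results:
        return 0.0
    hits = 0
    for query_id, ranking in ranked_results.items():
        truth = relevant.get(query_id, set())
        if any(item in truth for item in ranking[:k]):
            hits += 1
    return hits / len(ranked_results)


ranked = {"q1": ["a", "b", "c"], "q2": ["d", "e", "f"]}
truth = {"q1": {"c"}, "q2": {"z"}}
print(recall_at_k(ranked, truth, k=3))  # q1 is a hit, q2 is not
print(recall_at_k(ranked, truth, k=2))  # the relevant item for q1 is at rank 3
```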


Real-time task scheduling with image resizing for criticality-based machine perception


Real-Time Systems, 2022, vol. 58, no. 4, pp. 430-455

Authors: Hu, Yigong; Liu, Shengzhong; Abdelzaher, Tarek; Wigness, Maggie; David, Philip. Affiliations: Univ Illinois, Champaign, IL 61820, USA; US DEVCOM Army Res Lab, Adelphi, MD, USA

This paper extends a previous conference publication that proposed a real-time task scheduling framework for criticality-based machine perception, leveraging image resizing as the tool to control the trade-off between accuracy and execution time. Criticality-based machine perception reduces the computing demand of on-board AI-based machine inference pipelines (that run on embedded hardware) in applications such as autonomous drones and cars. By segmenting inputs, such as individual video frames, into smaller parts and allowing the downstream AI-based perception module to process some segments ahead of (or at a higher quality than) others, limited machine resources are spent more judiciously on more important parts of the input (e.g., on foreground objects in lieu of backgrounds). In recent work, we explored the use of image resizing as a way to offer a middle ground between full-resolution processing and dropping, thus allowing more flexibility in handling less important parts of the input. In this journal extension, we make the following contributions: (i) We relax a limiting assumption of our prior work, namely, the need for a "perfect sensor" to identify which parts of the image are more critical. Instead, we investigate the use of real LiDAR measurements for quick-and-dirty image segmentation ahead of AI-based processing. (ii) We explore another degree of freedom in the scheduler, namely, merging several nearby objects into a consolidated segment for downstream processing. We formulate the scheduling problem as an optimal resize-merge problem and design a solution for it. Experiments on an AI-powered embedded platform with a real-world driving dataset demonstrate the practicality and effectiveness of our proposed framework.

Keywords: Machine perception; Real-time scheduling; Cyber-physical systems
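
To make the resizing trade-off concrete, here is a small greedy sketch: segments are downscaled in order of increasing criticality until the estimated processing cost fits a time budget. This is not the paper's optimal resize-merge formulation (merging is omitted entirely); the cost model (cost proportional to the square of the scale factor), the scale levels, and all names are assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Segment = Tuple[str, float, float]  # (name, criticality, full-resolution cost)


def choose_scales(
    segments: List[Segment],
    budget: float,
    scales: Sequence[float] = (1.0, 0.5, 0.25),
) -> Dict[str, float]:
    """Pick a resize factor per segment so the total cost fits the budget.

    Assumed cost model: processing a segment at scale s costs full_cost * s**2,
    since the pixel count shrinks quadratically with the linear scale.
    Least critical segments are shrunk first; if the budget still cannot be
    met, every segment ends at the smallest scale.
    """
    level = {name: 0 for name, _, _ in segments}
    full = {name: cost for name, _, cost in segments}

    def total_cost() -> float:
        return sum(full[n] * scales[level[n]] ** 2 for n in level)

    for name, _, _ in sorted(segments, key=lambda s: s[1]):
        while total_cost() > budget and level[name] < len(scales) - 1:
            level[name] += 1
    return {name: scales[level[name]] for name in level}


frame_segments = [
    ("pedestrian", 0.9, 10.0),
    ("car", 0.6, 10.0),
    ("background", 0.1, 20.0),
]
print(choose_scales(frame_segments, budget=25.0))
print(choose_scales(frame_segments, budget=12.0))
```

With the larger budget only the background is downscaled; with the tighter one the car is shrunk as well, while the most critical segment keeps full resolution.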


On-the-Fly CT Image Pre-processing on MPSoC-FPGAs


37th International Conference on Architecture of Computing Systems (ARCS)

Authors: Passaretti, Daniele; Pionteck, Thilo. Affiliation: Otto von Guericke Univ, D-39106 Magdeburg, Germany

ISBN: (print) 9783031661457; 9783031661464

Due to the increasing number of tumors, new interventional Computed Tomography (CT) procedures have been proposed that aim to optimize workflow and enable time-effective diagnosis and treatment. To support tumor ablation procedures, CT scanners must pre-process 2D projections and reconstruct 3D slices of the human body in real time, while data are acquired. This paper proposes a lightweight processing architecture for MPSoC-FPGA that performs the "CT pre-processing phase" on the fly; this phase consists of the pixel processing of 2D images. It is also suitable for exploring different data formats that can be selected at design time to improve performance while preserving image quality. This article focuses on the cosine and redundancy weighting steps, which cannot be implemented following the standard method on embedded MPSoC-FPGA, due to the high resource utilization costs of their arithmetic operations. Therefore, this work proposes different optimizations that reduce the number of operations to compute and the amount of on-chip memory required in comparison to the standard algorithm. Finally, the proposed architecture has been implemented and instantiated within a Control Data Acquisition System (CDAS) architecture running on the XC7Z045 AMD-Xilinx MPSoC-FPGA and integrated into an open-interface CT scanner assembled in our laboratory. Here, the optimized weighting steps use up to 33.8 times fewer DSPs than the implementation based on the standard solution. Furthermore, the architecture adds only 80 ns of latency, making it 7.9 times faster than the implementation based on the standard solution.

Keywords: Filtered Back-Projection; Real-time Data Processing; Signal Processing; System-on-Chip; High-Level Synthesis
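
As background for the cosine weighting step named in this abstract, the sketch below shows the textbook form for a flat detector: each projection pixel at detector coordinates (u, v) is multiplied by D / sqrt(D² + u² + v²), where D is the source-to-detector distance. This is the standard software formulation, not the optimized FPGA design from the paper; the geometry (detector centred on the central ray, square pixels) and all names are assumptions.

```python
import math
from typing import List


def cosine_weights(
    rows: int, cols: int, pixel_pitch: float, source_detector_distance: float
) -> List[List[float]]:
    """Cosine weight for every pixel of a flat detector centred on the central ray.

    u and v are the horizontal and vertical offsets of a pixel centre from the
    detector centre, in the same length unit as the source-to-detector distance.
    """
    d = source_detector_distance
    centre_u = (cols - 1) / 2.0
    centre_v = (rows - 1) / 2.0
    weights = []
    for r in range(rows):
        v = (r - centre_v) * pixel_pitch
        row = []
        for c in range(cols):
            u = (c - centre_u) * pixel_pitch
            row.append(d / math.sqrt(d * d + u * u + v * v))
        weights.append(row)
    return weights


def apply_weights(projection: List[List[float]], weights: List[List[float]]) -> List[List[float]]:
    """Multiply a 2D projection by the weight table, pixel by pixel."""
    return [[p * w for p, w in zip(p_row, w_row)] for p_row, w_row in zip(projection, weights)]


table = cosine_weights(rows=3, cols=3, pixel_pitch=1.0, source_detector_distance=100.0)
weighted = apply_weights([[2.0] * 3 for _ in range(3)], table)
print(table[1][1], table[0][0])
print(weighted[1][1])
```

The square root and division per pixel are the kind of arithmetic the abstract describes as costly on embedded hardware. Because the table depends only on geometry, this sketch computes it once and reuses it for every projection.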

