Search Results - Inner Mongolia University Library

22nd IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Authors: Ashrafee, Alif; Khan, Akib Mohammed; Irbaz, Mohammad Sabik; Al Nasim, Md Abdullah. Affiliations: Islamic Univ Technol Dept Comp Sci & Engn Gazipur Bangladesh; Pioneer Alpha Ltd Machine Learning Team Dhaka Bangladesh

ISBN: (print) 9781665458245

Automatic License Plate Recognition systems aim to provide a solution for detecting, localizing, and recognizing license plate characters from vehicles appearing in video frames. However, deploying such systems in the real world requires real-time performance in low-resource environments. In our paper, we propose a two-stage detection pipeline paired with Vision API that provides real-time inference speed along with consistently accurate detection and recognition performance. We used a haar-cascade classifier as a filter on top of our backbone MobileNet SSDv2 detection model. This reduces inference time by only focusing on high confidence detections and using them for recognition. We also impose a temporal frame separation strategy to distinguish between multiple vehicle license plates in the same clip. Furthermore, there are no publicly available Bangla license plate datasets, for which we created an image dataset and a video dataset containing license plates in the wild. We trained our models on the image dataset and achieved an AP(0.5) score of 86% and tested our pipeline on the video dataset and observed reasonable detection and recognition performance (82.7% detection rate, and 60.8% OCR F1 score) with real-time processing speed (27.2 frames per second).

Keywords: Computer vision; Image recognition; Conferences; Pipelines; Focusing; Licenses; Streaming media

5th International Symposium on Signal and Image Processing, ISSIP 2024

5th International Symposium on Signal and Image Processing, ISSIP 2024

ISBN: (print) 9789819795147

The proceedings contain 31 papers. The special focus in this conference is on Signal and Image Processing. The topics include: A Blockchain-Based Secure Framework for Storage and Access of Surveillance Video Data; Encountering the Challenges and Awareness of Internet Security; Data-Driven Strategies for Twitter Engagement: Hashtag Recommendations and API Insights; Analyzing and Visualizing AI Decision-Making for Human-Centered Interaction and Trust; AI-ML-Based Handover Techniques in Next-Generation Wireless Networks; Castle Defender: Design and Implementation of Game Based on Pygame; Optimizing Healthcare: Enhancing Disease Management with Recommendation Systems; Optimizing Diabetes Prediction Models for Enhanced Health Data Processing; Comparative Analysis of Classification Models Using Various Feature Sets; An Early Detection of Tomato Plant Disease with Deep Reinforcement Learning and CNN; IoT-Powered Health Monitoring System for Protecting Vital Organs Through Cloud-Based Diagnosis; LASKV: Lightweight Authenticated Session Key Agreement for Video Surveillance System; Data Analyses of Factors Influencing Sustainable Adoption of Online Public Services: An Extended UTAUT Analysis in North Macedonia; A Comprehensive Review on Real-Time Campus Compass; Deep Learning Approaches for Early Detection of Bovine Respiratory Diseases in Cattle; Identifying Suitable Cloud Storage Services from Cross Cloud Platforms Based on the Requirements; Storage-Savvy Frame Recorder: Enhancing Storage Efficiency and Inspection Speed; Lossless Medical Image Compression Using Block-Wise Burrows–Wheeler Transform and Modified Run Length Encoding; Advancing Software Defect Detection and Prevention: Bridging Gaps in Early-Stage and Evolving Software Systems; A New Way to Communicate: Deep Learning for Deaf and Dumb People; Dependent Binomial: A Family of Distributions Derivable by Modifying a Base.

Feasibility Study for Computer-Aided Diagnosis System with Navigation Function of Clear Region for Real-Time Endoscopic Video Image on Customizable Embedded DSP Cores

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2022, Vol. E105A, No. 1, pp. 58-62

Authors: Odagawa, Masayuki; Koide, Tetsushi; Tamaki, Toru; Yoshida, Shigeto; Mieno, Hiroshi; Tanaka, Shinji. Affiliations: Hiroshima Univ Res Inst Nanodevice & Bio Syst Higashihiroshima 7398527 Japan; Cadence Design Syst Japan Yokohama Kanagawa 2220033 Japan; Nagoya Inst Technol Dept Comp Sci Nagoya Aichi 4668555 Japan; Med Corp JR Hiroshima Hosp Dept Gastroenterol Hiroshima 7320057 Japan; Hiroshima Univ Dept Endoscopy & Med Grad Sch Biomed & Hlth Sci Hiroshima 7348553 Japan

This paper examines the feasibility of automatic unclear-region detection in a CAD system for colorectal tumors operating on real-time endoscopic video. We confirmed that a CAD system with a clear-region navigation function, consisting of unclear-region detection by YOLO2 and classification by AlexNet and SVMs, can be realized on customizable embedded DSP cores. Moreover, we confirmed that the real-time CAD system can be built as a low-power ASIC using customizable embedded DSP cores.

Keywords: medical image/video processing; computer-aided diagnosis system (CAD); navigation function; convolutional neural network (CNN); support vector machine (SVM); customizable embedded digital signal processor (DSP)

Automated Recognition of Optic Disc and Blood Vessels in Diabetic Fundoscopy Images Using Real-Time Image Analysis

7th IEEE International Conference on Multimedia Information Processing and Retrieval (MIPR)

Authors: Li, Kaixuan; Chen, Wei-Bang; Lu, Yongjin; Wang, Xiaoliang; Gao, He. Affiliations: Virginia State Univ Dept Comp Sci Petersburg VA 23806 USA; Oakland Univ Dept Math & Stat Rochester MI USA; SUNY Coll Oswego Dept Comp Sci Oswego NY USA

ISBN: (print) 9798350351439; 9798350351422

Diabetic retinopathy (DR) is a sight-threatening condition associated with diabetes, characterized by damage to the retinal blood vessels. Key to the automation of DR staging is the identification of various symptoms directly or closely associated with retinal blood vessels, as well as the number of these symptoms in the four quadrants of the retina separated by the optic disc. Therefore, precise identification of the optic disc (OD) and blood vessels in fundus images is crucial for DR stage diagnosis but is often time-consuming and requires expert analysis. This study introduces a thresholding-based approach for the automated localization of the OD and the detection of blood vessels in fundus images of diabetic patients. Our algorithm is more robust than some deep learning-based algorithms, achieving more accurate results, particularly in advanced DR stages where the resemblance between various symptoms and blood vessels complicates the extraction of blood vessels. Additionally, our computer vision system can achieve OD localization and blood vessel segmentation in real time. The experimental results on a dataset selected by an ophthalmologist from a Kaggle dataset, ensuring data quality, show that the proposed algorithm can achieve an accuracy higher than 94% for both OD localization and blood vessel detection, outperforming some state-of-the-art algorithms.

Keywords: diabetic retinopathy; optical disc localization; blood vessel detection; fundus image analysis; automated ophthalmic diagnostics

Time Based Image Synthesis: Light and Color Conversion Using Machine Learning Techniques

4th International Conference on Technological Advancements in Computational Sciences, ICTACS 2024

Authors: Labh, Jyoti Ranjan; Dwivedi, R.K. Affiliation: College of Computing Sciences and Information Technology Teerthanker Mahaveer University Moradabad India

ISBN: (print) 9798350387490

Camera technology and computational imaging have made astounding progress in photography. Many top smartphone cameras and DSLRs now use a specialized neural image signal processor (ISP) to convert noisy raw sensor images into the final, polished output. Training these neural ISPs for low-light processing requires a sizable dataset of image pairs, each consisting of a noisy, short-exposure raw picture and a clean, long-exposure ground-truth raw image. Capturing these aligned pairs takes considerable effort and time, and motion blur in the long-exposure ground truth frequently makes the process difficult. Artificial nighttime datasets can be used to train image processing algorithms and neural ISPs that are more resilient and effective. To overcome this difficulty, we propose a model that converts the colour and lighting of a raw picture to artificially create realistic raw photos for a different time of day; brighter photos are typically easier to take, have less noise, and are less likely to suffer from motion blur. The objective of the proposed framework is to transform raw photographs taken at one time of day into simulated images of a different time, with different levels of noise and other low-light features. This artificial time-based dataset brings more ease and flexibility to training neural ISPs for time-based rendering. Additionally, with our method, neural ISPs trained on synthetic data mixed with a small percentage of real data can perform comparably to models trained only on real photos taken at different times. The ability to synthesize accurate raw photos of midnight and other night times has significant implications for computational photography, image processing applications, and even object recognition under difficult nocturnal conditions. © 2024 IEEE.

Keywords: Color photography

In-sensor neural network for real-time KWS by image processing

Real-Time Processing of Image, Depth and Video Information 2023

Authors: Vitolo, Paola; Esposito, Pio; Pau, Danilo; Liguori, Rosalba; Di Benedetto, Luigi; Licciardo, Gian Domenico. Affiliations: Fisciano 84084, Italy; Agrate Brianza 20864, Italy

ISBN: (digital) 9781510662636

ISBN: (print) 9781510662629

KeyWord Spotting (KWS), i.e. the capability to identify vocal commands as they are pronounced, is becoming one of the most important features of Human-Machine Interface (HMI), also thanks to the pervasive diffusion of high-performance MEMS audio sensors with very reduced dimensions. In-Sensor Computing (ISC) appears the most viable solution to get the maximum advantage of KWS, since the dimensions of MEMS microphones remain reduced and minimally invasive. ISC, indeed, represents the extreme evolution of the edge computing paradigm, where the processing circuits are moved close to the audio sensor, integrated into its auxiliary circuitry or in the same package. However, ISC introduces severe area and power constraints and must trade off with processing speed to meet real-time operations naturally required by KWS. In this work, we want to show a neural network-based KWS suitable for ISC contexts, when audio sensor data are converted into MEL spectrogram images and a Depthwise Separable Convolutional Neural Network (DSCNN) with feature extraction capabilities is designed. To show the advantages of the above approach, the DSCNN is compared with an alternative Fully Connected Neural Network (FCNN), operating on audio signals not converted into images. The considered models have been profiled on a microcontroller and implemented on an FPGA. Their performances are compared in terms of classification accuracy and HW resources. Comparisons show that the FCNN is very far from meeting the ISC real-time processing requirements, showing a number of parameters and a frame latency respectively of 3 and 1 orders of magnitude higher than required by the DSCNN alternative when mapped to a Xilinx Zynq Ultrascale+ MPSoC. © 2023 SPIE.
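The abstract above mentions converting audio into MEL spectrogram images without giving details. As a rough illustration only, the sketch below shows the frequency warping that a mel spectrogram typically relies on, using the widely used formula m = 2595 log10(1 + f/700). The function names, the 8 kHz upper limit, and the band count are assumptions for illustration and are not taken from the paper.

```python
import math

def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (common 2595 * log10 variant)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(f_min, f_max, n_bands):
    """Return n_bands + 2 edge frequencies (Hz), equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]

# Hypothetical setup: 10 bands covering 0 to 8000 Hz.
edges = mel_band_edges(0.0, 8000.0, 10)
for low, high in zip(edges, edges[1:]):
    print(f"{low:8.1f} Hz to {high:8.1f} Hz")
```

Bands that are equally wide in mels become progressively wider in Hz, so low frequencies, where much speech energy sits, get finer resolution in the resulting image.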

Keywords: Low power electronics

Exploring Motion Cues for Video Test-Time Adaptation

31st ACM International Conference on Multimedia (MM)

Authors: Zeng, Runhao; Deng, Qi; Xu, Huixuan; Niu, Shuaicheng; Chen, Jian. Affiliations: Shenzhen Univ Guangdong Key Lab Elect Control & Intelligent Rob Coll Mech & Control Engn Shenzhen Peoples R China; Shenzhen Univ Ind Res Ctr Intelligent Mfg Shenzhen Peoples R China; South China Univ Technol Guangzhou Peoples R China; Nanyang Technol Univ Singapore Singapore

ISBN: (print) 9798400701085

Test-time adaptation (TTA) aims at boosting the generalization capability of a trained model by conducting self-/un-supervised learning during testing in real-world applications. Though TTA on image-based tasks has seen significant progress, TTA techniques for video remain scarce. Naively introducing image-based TTA methods into video tasks may achieve limited performance, since these methods do not consider the special nature of video tasks, e.g., the motion information. In this paper, we propose leveraging motion cues in videos to design a new test-time learning scheme for video classification. We extract spatial appearance and dynamic motion clip features using two sampling rates (i.e., slow and fast) and propose a fast-to-slow unidirectional alignment scheme to align fast motion and slow appearance features, thereby enhancing the motion encoding ability. Additionally, we propose a slow-fast dual contrastive learning strategy to learn a joint feature space for fast- and slow-sampled clips, guiding the model to extract discriminative video features. Lastly, we introduce a stochastic pseudo-negative sampling scheme to provide better adaptation supervision by selecting a more reliable pseudo-negative label compared to the pseudo-positive label used in prior TTA methods. This technique reduces the adaptation difficulty often caused by poor performance on out-of-distribution test data before adaptation. Our approach significantly improves performance on various video classification backbones, as demonstrated through extensive experiments on two benchmark datasets.

Keywords: test-time adaptation; video classification; motion encoding

Video Smoke Detection Algorithm Based on a Spatial-Temporal Neural Network Model

9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024

Authors: Cao, Zhen; Zhang, Xi. Affiliation: Shenyang Fire Science and Technology Research Institute Ministry of Emergency Management Shenyang China

ISBN: (print) 9798350376548

A spatial-temporal neural network algorithm for video smoke detection is proposed to address three problems in video detection networks: misclassification of static, smoke-like backgrounds; false alarms; and inconsistent behavior of the original model across detection environments. Building on the original YOLOv4 neural network, this paper introduces the k-means++ algorithm and a genetic algorithm. The clustering algorithm groups the ground-truth boxes of the image dataset to produce more suitable anchors, and the genetic algorithm then adjusts these anchors to fit the needs of smoke detection. A dual-stream network is added to the original model to extract information from the first stage of the YOLO algorithm, further filtering smoke characteristics and removing erroneous information, which improves the overall network's ability to detect smoke and fog in video images. Compared with the traditional YOLOv4 network, the proposed algorithm achieves an improvement of 8.51%. In actual tests, the algorithm better met the alarm-time requirements of the smoke alarm test procedure for early fire monitoring and visual-image alarm systems, improved detection accuracy while maintaining detection speed, and performed better across different scenes. © 2024 IEEE.
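The anchor-clustering step mentioned in the abstract above can be pictured with a small sketch. It assumes boxes are reduced to (width, height) pairs and uses 1 - IoU as the distance, a common convention for YOLO-style anchors that the abstract does not confirm. The genetic-algorithm adjustment is omitted, and all function names and sample boxes are hypothetical.

```python
import random

def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_pp_seeds(boxes, k, rng):
    """k-means++ seeding: boxes far from every seed are more likely picked.

    Assumes the boxes contain at least k distinct shapes.
    """
    seeds = [rng.choice(boxes)]
    while len(seeds) < k:
        # Squared distance to the nearest seed, with distance = 1 - IoU.
        weights = [min(1.0 - iou_wh(b, s) for s in seeds) ** 2 for b in boxes]
        seeds.append(rng.choices(boxes, weights=weights, k=1)[0])
    return seeds

def refine(boxes, anchors, iterations=10):
    """Lloyd-style updates: assign each box to its best-IoU anchor, then average."""
    for _ in range(iterations):
        clusters = [[] for _ in anchors]
        for b in boxes:
            best = max(range(len(anchors)), key=lambda i: iou_wh(b, anchors[i]))
            clusters[best].append(b)
        # Empty clusters keep their previous anchor.
        anchors = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c)) if c else a
            for c, a in zip(clusters, anchors)
        ]
    return anchors

# Hypothetical ground-truth box sizes: three small and three large.
boxes = [(10, 12), (12, 10), (11, 11), (100, 120), (120, 100), (110, 110)]
seeds = kmeans_pp_seeds(boxes, 2, random.Random(0))
print(refine(boxes, seeds))
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which is the usual motivation for this choice in anchor selection.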

Keywords: dual-stream network; K-means; safety science; smoke detection; YOLOv4

Simulation Study on Feature Extraction of Human Body Motion Video Images Based on Information Entropy

5th Asia-Pacific Conference on Image Processing, Electronics and Computers, IPEC 2024

Authors: Zhao, Lun. Affiliation: Shandong Vocational College of Science and Technology Weifang China

ISBN: (print) 9798350374407

This study addresses the deficiencies in the analysis of local parameters of target features in human motion video images during the rapid extraction of local features. This deficiency leads to inaccurate description of low-level feature maps when constructing hierarchical structures of video images, thereby increasing extraction time and resulting in lower recall and precision rates. To address this issue, a method for rapidly extracting local features of target objects in human motion video images is proposed. By utilizing the information entropy between adjacent frames of human motion video images and combining it with density functions to determine initial clustering centers, the number of clusters is calculated to facilitate the analysis of local parameters of target features in human motion video images. The study constructs a hierarchical structure of human motion video images by selecting feature maps from the convolutional output layer. The method introduces information entropy to describe low-level feature maps of human motion video images and combines it with regional averaging to describe high-level feature maps, ultimately completing the extraction of local features in human motion video images. Experimental results demonstrate that the proposed method achieves rapid extraction of local features in human motion video images, with shorter feature extraction times and higher recall and precision rates. © 2024 IEEE.
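The abstract above relies on the information entropy between adjacent frames but does not say how it is computed. One plausible reading, offered here only as a hedged sketch, is the Shannon entropy of the histogram of absolute pixel differences. The function name and the tiny 2x2 frames are hypothetical, and the paper's actual definition may differ.

```python
import math
from collections import Counter

def difference_entropy(prev_frame, next_frame):
    """Shannon entropy (bits) of absolute pixel differences between two frames.

    Frames are equal-sized 2D lists of grayscale intensities.
    """
    diffs = [abs(a - b)
             for row_prev, row_next in zip(prev_frame, next_frame)
             for a, b in zip(row_prev, row_next)]
    total = len(diffs)
    # H = sum over difference values of p * log2(1 / p).
    return sum((c / total) * math.log2(total / c)
               for c in Counter(diffs).values())

static = [[10, 10], [10, 10]]
moving = [[10, 60], [110, 160]]
print(difference_entropy(static, static))  # identical frames: 0.0
print(difference_entropy(static, moving))  # four distinct differences: 2.0
```

Under this reading, a pair of nearly identical frames scores close to zero, while frames with varied motion score higher, which is what would make the value usable for picking initial cluster centers.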

Keywords: Motion capture

Video Stabilization based on Sub-pixel Keypoints

4th International Conference on Defence Technology, ICDT 2024

Authors: Zhong, Bihua; Wang, Fangyi; Liu, Tao; Liu, Chengyi; Bai, Hongyang; Wan, Gang. Affiliations: National Key Laboratory of Transient Physics Nanjing University of Science and Technology Nanjing 210094 China; School of Energy and Power Engineering Nanjing University of Science and Technology Nanjing 210094 China; Key Laboratory of Maritime Intelligent Cyberspace Technology Nanjing University of Science and Technology Ministry of Education Nanjing 210094 China; School of Electronic and Optical Engineering Nanjing University of Science and Technology Nanjing 210094 China

With the continuous advancement of seeker technology and image processing techniques, the precision of guided weapons has increasingly improved. However, due to the rigidly fixed structure between the seeker and the guided weapon, the weapon is prone to experiencing disturbances and vibrations during flight, which can deteriorate the real-time video quality monitored by the seeker and, consequently reduce the precision of the guided weapon. To enhance the precision of guided weapons, it is necessary to mitigate the adverse effects of vibrations through video stabilization techniques. This paper proposes a video stabilization framework based on sub-pixel keypoints detection. It utilizes a lightweight network to detect keypoints, employs the Lucas-Kanade (LK) optical flow method for motion estimation, and smooths the camera's motion path with an adaptive filter. Experimental results show that the proposed algorithm has an average processing time of approximately 0.2s per frame, achieving a stability index of 0.9262, a PSNR index of 22.3074, and an SSIM index of 0.7188. This demonstrates a balanced performance in terms of computational efficiency and stabilization, exhibiting excellent comprehensive performance. © 2024 Institute of Physics Publishing. All rights reserved.
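The path-smoothing stage described in the abstract above can be illustrated in one dimension. The paper uses an adaptive filter whose details are not given, so this sketch swaps in a plain centered moving average. The per-frame motions stand in for the output of optical-flow motion estimation, and every name and value is hypothetical.

```python
def cumulative_path(frame_motions):
    """Accumulate per-frame displacements into a camera trajectory."""
    path, position = [], 0.0
    for motion in frame_motions:
        position += motion
        path.append(position)
    return path

def moving_average(path, radius):
    """Smooth a trajectory with a centered window, clipped at both ends."""
    smoothed = []
    for i in range(len(path)):
        window = path[max(0, i - radius):min(len(path), i + radius + 1)]
        smoothed.append(sum(window) / len(window))
    return smoothed

def corrections(frame_motions, radius=1):
    """Offset to apply to each frame so it follows the smoothed path."""
    path = cumulative_path(frame_motions)
    smooth = moving_average(path, radius)
    return [s - p for s, p in zip(smooth, path)]

# Hypothetical horizontal jitter of one pixel per frame, alternating direction.
jitter = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(corrections(jitter))
```

A real pipeline would smooth translation, rotation, and scale together and warp each frame by its correction. A fixed window is the simplest choice; an adaptive filter would instead vary the amount of smoothing with the motion.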

Keywords: Optical flows
