Search results - Inner Mongolia University Library

12th IEEE International Symposium on Signal, Image, Video, and Communications, ISIVC 2024

Authors: Marasco, Nicholas; Roberge, Vincent; Elghamrawy, Haidy; Tarbouchi, Mohammed; Noureldin, Aboelmagd (Royal Military College of Canada, Kingston, Department of Electrical and Computer Engineering, Canada)

ISBN: (print) 9798350385267

Equipment health monitoring (EHM) techniques are increasing in their ability to accurately diagnose defective equipment. This increase in capability comes with an increase in computational complexity. For these techniques to be useful in real applications, the algorithms must be computable in real time. The Fast Orthogonal Search (FOS) algorithm shows the potential to be effective in a variety of EHM applications. In this paper, we demonstrate that the FOS algorithm can be accelerated to real-time processing on real examples of ship-radiated noise by using parallel processing, making it suitable for use in EHM. © 2024 IEEE.

Keywords: Graphics processing unit

Image Processing based Real-Time Online Attendance Monitoring System using Facial Recognition

3rd International Conference on Image Processing and Robotics, ICIPRoB 2024

Authors: Fernando, Ravishka; Athauda, Hashini (Cinec Campus, Department of Electrical and Electronics Engineering, Malabe, Sri Lanka)

ISBN: (print) 9798350374766

With the advancement of technology and due to the recent pandemic situation, the education sector has turned to online teaching. The main problem here is the inconvenience and irregularity of recording student attendance. To ensure reliable attendance records and reduce wasted time, this research explores and implements an automated attendance marking system using facial recognition technology. This enables students to be marked present or absent in real time and helps teachers identify which students are present or absent for the online session. The novelty of this research is a more efficient and accurate method for attendance marking that eliminates manual and false attendance marking in online sessions. The proposed system employs image processing and machine learning techniques, namely Haar Cascade features and the LBPH algorithm, to accurately detect and recognize the face of a student. The performance of the system is evaluated on its own dataset, which consists of images of students captured through a video stream from a web camera, and the results demonstrate, through a confusion matrix, its effectiveness in accurately recognizing faces and marking attendance in real time. The results showed that the attendance system achieved 99.22% accuracy and can accurately mark the attendance of students in an Excel sheet. This real-time GUI-based system automates the traditional attendance marking process and provides real-time attendance data. © 2024 IEEE.

Keywords: Computer vision

A Survey on 360° Images and Videos in Mixed Reality: Algorithms and Applications

Journal of Computer Science & Technology, 2023, vol. 38, no. 3, pp. 473-491

Authors: 张方略; 赵军红; 张赟; Stefanie Zollmann (School of Engineering and Computer Science, Victoria University of Wellington, Wellington 6012, New Zealand; College of Media Engineering, Communication University of Zhejiang, Hangzhou 310018, China; Department of Computer Science, University of Otago, Dunedin 9054, New Zealand)

Mixed reality technologies provide real-time and immersive experiences, which bring tremendous opportunities in entertainment, education, and enriched experiences that are not directly accessible owing to safety or *** research in this field has been in the spotlight in the last few years as the metaverse went *** recently emerging omnidirectional video streams, i.e., 360° videos, provide an affordable way to capture and present dynamic real-world *** the last decade, fueled by the rapid development of artificial intelligence and computational photography technologies, the research interests in mixed reality systems using 360° videos with richer and more realistic experiences have dramatically increased to unlock the true potential of the *** this survey, we cover recent research aimed at addressing the above issues in the 360° image and video processing technologies and applications for mixed *** survey summarizes the contributions of the recent research and describes potential future research directions about 360° media in the field of mixed reality.

Keywords: 360° image; mixed reality; 360° image processing; virtual reality scene reconstruction; virtual reality content manipulation

An FPGA based Real-Time Video Processing system on Zynq 7010

2nd International Conference on Advances in Computational Intelligence and Communication, ICACIC 2023

Authors: Singha, Antareep (Puducherry Technological University, Department of Mechatronics Engineering, Puducherry, India)

ISBN: (print) 9798350318456

Real-time image processing involves the transformation of incoming signals, primarily from a camera, into a format that can be readily interpreted by a display device. This process is heavily reliant on precise timing constraints, demanding efficient hardware execution. This paper proposes an innovative method for interfacing the OV7670 Complementary Metal Oxide Semiconductor (CMOS) camera with an FPGA-based real-time image processing system on a Zynq 7010 platform, using the open-source Digilent Dynamic Clock Generator. The architecture is characterized by its ability to control the camera output signals in parallel with processing those signals and converting them from RGB to DVI format on the fly. In lieu of the traditional PLL-based clocking wizard, which provides a fixed clock signal, the open-source Dynamic Clock Generator has been incorporated in the architecture to generate the essential pixel clock, meeting the real-time clocking requirements. The RGB to DVI (Digital Visual Interface) block has been coded in VHDL to convert the output of the AXI4-Stream to Video Out Xilinx IP core to TMDS (Transition-Minimized Differential Signaling) data, to be interpreted by an HDMI-compatible monitor. © 2023 IEEE.

Keywords: Cameras

Lip-Audio Modality Fusion for Deep Forgery Video Detection

Computers, Materials & Continua, 2025, vol. 82, no. 2, pp. 3499-3515

Authors: Yong Liu; Zhiyu Wang; Shouling Ji; Daofu Gong; Lanxin Cheng; Ruosi Cheng (College of Cyberspace Security, Information Engineering University, Zhengzhou 450001, China; Research Institute of Intelligent Networks, Zhejiang Lab, Hangzhou 311121, China; College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China; Henan Key Laboratory of Cyberspace Situation Awareness, Zhengzhou 450001, China; Key Laboratory of Cyberspace Security, Ministry of Education, Zhengzhou 450001, China)

In response to the problem of traditional methods ignoring audio modality tampering, this study aims to explore an effective deep forgery video detection technique that improves detection precision and reliability by fusing lip images and audio signals. The main method used is lip-audio matching detection technology based on the Siamese neural network, combined with MFCC (Mel Frequency Cepstrum Coefficient) feature extraction of band-pass filters, an improved dual-branch Siamese network structure, and a two-stream network structure design. First, the video stream is preprocessed to extract lip images, and the audio stream is preprocessed to extract MFCC features. Then, these features are processed separately through the two branches of the Siamese network. Finally, the model is trained and optimized through fully connected layers and loss functions. The experimental results show that the testing accuracy of the model in this study on the LRW (Lip Reading in the Wild) dataset reaches 92.3%; the recall rate is 94.3%; and the F1 score is 93.3%, significantly better than the results of CNN (Convolutional Neural Networks) and LSTM (Long Short-Term Memory) models. In the validation of multi-resolution image streams, the highest accuracy of dual-resolution image streams reaches 94%. Band-pass filters can effectively improve the signal-to-noise ratio of deep forgery video detection when processing different types of audio signals. The real-time processing performance of the model is also excellent, and it achieves an average score of up to 5 in user research. These data demonstrate that the method proposed in this study can effectively fuse visual and audio information in deep forgery video detection, accurately identify inconsistencies between video and audio, and thus verify the effectiveness of lip-audio modality fusion technology in improving detection performance.

Keywords: Deep forgery video detection; lip-audio modality fusion; mel frequency cepstrum coefficient; siamese neural network; band-pass filter
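
As a consistency check on the figures reported in this abstract, the F1 score is the harmonic mean of precision and recall. The sketch below assumes that the reported 92.3% "testing accuracy" plays the role of precision, which the abstract does not state; the function name `f1_score` is only illustrative. Under that assumption, the 94.3% recall yields the reported 93.3% F1.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Assumption: 0.923 is treated as precision; the abstract calls it "testing accuracy".
f1 = f1_score(0.923, 0.943)
print(f"{f1:.3f}")  # prints 0.933
```

If the 92.3% figure is instead plain classification accuracy, the three numbers are not tied together by this formula and the check does not apply.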

Practical real-time image compression for resource-challenged devices

Conference on Multimodal Image Exploitation and Learning

Authors: Pham, Kevin; Depoian, Arthur C., II; Bailey, Colleen P. (Univ North Texas, Dept Elect Engn, Denton, TX 76207, USA)

ISBN: (print) 9781510673854; 9781510673847

In the era of rapidly expanding image data, the demand for improved image compression algorithms has grown significantly, particularly with the integration of deep learning approaches into traditional image processing tasks. However, many of the existing solutions in this domain are burdened by computational complexity, rendering them unsuitable for real-time deployment on standard devices as they often necessitate complex systems and substantial energy consumption. This work addresses the growing paradigm of edge computing for real-time applications by introducing a novel, on-edge device solution. This innovative approach aims to strike a balance between efficiency and accuracy, adhering to the practical constraints of real-world deployment. By presenting demonstrations of the proposed solution's performance on readily available devices, we provide tangible evidence of its applicability and viability in real-world scenarios. This advance contributes to the ongoing dialogue about the need for accessible and efficient image compression algorithms that can be deployed in real-time applications on edge devices, bridging the gap between the demanding computational requirements of deep learning and the practical limitations of everyday hardware. As data continues to surge, solutions like this become ever more critical in ensuring effective image compression, aligning with on-edge computing within AI. This research paves the way for improved image processing in real-time applications while conserving computational resources and energy.

Keywords: image compression; efficient AI; real time processing

Transfer learning and Machine Learning Classification for Laparoscopic Video Distortion Detection

8th IEEE International Conference on Image and Signal Processing and their Applications, ISPA 2024

Authors: Mohamed, Belmokeddem; Kamila, Khemis; Salim, Loudjedi (University Abou-Bekr Belkaid of Tlemcen, Faculty of Technology, Biomedical Engineering Department, Tlemcen, Algeria; University Abou-Bekr Belkaid of Tlemcen, Surgery B, Tlemcen Hospital, Department of Medicine, Tlemcen, Algeria)

ISBN: (print) 9798350309249

Distortions like blur and smoke in real-time laparoscopic videos often result from lens contamination. Detecting these distortions automatically and "in real time" is a step preceding automatic lens cleaning and leads to a clear view for surgeons, hence reducing surgical time and minimizing risks for the patient. Our approach leveraged transfer learning, transposing knowledge from natural images to the laparoscopic video domain. We utilized a pre-trained ResNet50 convolutional neural network (CNN) to extract image features, subsequently processed by a cascade of support vector machine (SVM) classifiers to categorize various distortions. This last strategy amalgamates outputs from two binary classifiers. The first classifier distinguishes videos as good or distorted. The second classifier focuses on smoke and blur detection. The first classifier attains 99% accuracy. The second classifier achieves 100% accuracy in detecting smoke and blur. These values demonstrate the effectiveness of our approach, which combines ResNet50-based transfer learning and cascaded SVM classification for automatically detecting smoke and blur distortions in laparoscopic videos. Such results are promising for the detection of the remaining distortions. © 2024 IEEE.

Keywords: Smoke
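
The two-stage cascade described in this abstract can be sketched as plain control flow. This is a minimal illustration, not the authors' implementation: the classifiers are passed in as hypothetical callables standing in for trained SVMs, the feature vectors are made up rather than ResNet50 outputs, and the second stage is assumed to be a binary smoke-versus-blur decision.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def cascade_predict(
    features: Features,
    is_distorted: Callable[[Features], bool],
    is_smoke: Callable[[Features], bool],
) -> str:
    """Stage 1 separates good from distorted; stage 2 runs only on distorted input."""
    if not is_distorted(features):
        return "good"
    return "smoke" if is_smoke(features) else "blur"


# Hypothetical stand-ins for the trained classifiers: thresholds on a 2-D feature vector.
def stage1(f: Features) -> bool:
    return f[0] > 0.5


def stage2(f: Features) -> bool:
    return f[1] > 0.5


print(cascade_predict([0.1, 0.9], stage1, stage2))  # good
print(cascade_predict([0.9, 0.9], stage1, stage2))  # smoke
print(cascade_predict([0.9, 0.1], stage1, stage2))  # blur
```

One consequence of this structure is that the second classifier never sees input the first stage labels as good, so its accuracy is only meaningful on distorted samples.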

Efficient Camera Pose Adjustment to a Mirror Array for Structured Light Field video Acquisition

2024 Conference on Visual Communications and Image Processing

Authors: Maeda, Shunsuke; Kodama, Kazuya; Harnamoto, Takayuki (Tokyo Univ Sci, Grad Sch Engn, Niijuku 6-3-1, Katsushika Ku, Tokyo 1258585, Japan; Res Org Informat & Syst, Natl Inst Informat, Hitotsubashi 2-1-2, Chiyoda Ku, Tokyo 1018430, Japan)

ISBN: (print) 9798331529543; 9798331529550

We previously implemented an inexpensive imaging system that combines a single real camera with a mirror array located along a paraboloid. It allows us to robustly acquire dynamic light fields composed of multi-view videos by providing a virtual camera array whose viewpoints exist in the mirrors. Specifically, when the real camera is moved to the focus of the paraboloid, the virtual viewpoints in the mirrors become equally spaced, achieving multi-view imaging with structured disparity. In this paper, we discuss an efficient method for adjusting the pose of a single camera to acquire high-quality dynamic light fields as multi-view videos. We introduce indicator values determined from the corners of the mirror array detected in the acquired images while adjusting the camera. Using these values for camera adjustment, we can easily tell how to move the camera position and virtually correct its angle through a homography transform. Experimental results of simulations demonstrate that our proposed method achieves structured light field video acquisition with equally spaced virtual viewpoints. No camera rotation requiring complex devices is needed, and only the camera position is controlled by a simple 3D system such as XYZ stages.

Keywords: light field; multi-view imaging; mirror; homography
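
For readers unfamiliar with the homography transform mentioned in this abstract, the sketch below shows how a 3x3 homography maps a 2-D image point using homogeneous coordinates. It is a generic illustration only: the matrices are hypothetical, and how the authors estimate their homography from the detected mirror corners is not described here.

```python
def apply_homography(H, point):
    """Map a 2-D point through a 3x3 homography using homogeneous coordinates."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Dividing by w converts back from homogeneous to image coordinates.
    return (u / w, v / w)


# Hypothetical matrices: a pure translation by (2, 3) and a uniform scale by 1/2.
translate = [[1, 0, 2], [0, 1, 3], [0, 0, 1]]
halve = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]

print(apply_homography(translate, (1, 1)))  # (3.0, 4.0)
print(apply_homography(halve, (1, 1)))      # (0.5, 0.5)
```

A rotation of the camera about its optical center induces exactly this kind of mapping on the image, which is why an angle error can be corrected on the image rather than by physically rotating the camera.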

Simple yet Effective Video-Based Epileptic Tonic-Clonic Seizure Detection

APSIPA Transactions on Signal and Information Processing, 2024, vol. 13, no. 1

Authors: Yazaki, Yoshinao; Watanabe, Satsuki; Tanaka, Yuichi (Tokyo Univ Agr & Technol, Grad Sch BASE, Tokyo, Japan; Saitama Med Univ Hosp, Dept Psychiat, Hidaka, Japan; Osaka Univ, Grad Sch Engn, Osaka, Japan)

Epilepsy, a prevalent neurological disorder, often leads to tonic-clonic seizures characterized by loss of consciousness and uncontrolled motor activity. Prompt detection of these seizures is crucial for effective nursing and diagnosis. This paper introduces a novel analysis, eliminating the need for body attachments or special equipment like markers or specific clothing. Our approach is straightforward: each video frame is segmented into blocks, and the average values of these blocks are computed. We then analyze the temporal changes in these averages using spectrograms. Our findings indicate that during tonic-clonic seizures, dominant frequency components typically range from 1 to 6 Hz and decrease as the seizure progresses. By capitalizing on these clinical observations, we have formulated effective detection rules. Experimental evaluations reveal that our method not only accurately detects epileptic seizures but also operates approximately four times faster than real time on standard desktop computers. This efficiency and accuracy underscore the potential of our method as a practical tool in epilepsy monitoring and management.

Keywords: Epilepsy; tonic-clonic seizure; video analysis; image processing and image feature extraction
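
The pipeline in this abstract (block averages per frame, then frequency analysis of how those averages change over time) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' method: it uses a single discrete Fourier transform over one window instead of a spectrogram, the frames are synthetic 2x2 grayscale images, and the names `block_means` and `dominant_frequency` are hypothetical.

```python
import cmath
import math


def block_means(frame, block):
    """Average pixel value of each block x block tile in a 2-D grayscale frame."""
    rows, cols = len(frame), len(frame[0])
    means = []
    for r in range(0, rows - block + 1, block):
        for c in range(0, cols - block + 1, block):
            tile = [frame[i][j] for i in range(r, r + block) for j in range(c, c + block)]
            means.append(sum(tile) / len(tile))
    return means


def dominant_frequency(signal, fps):
    """Frequency in Hz of the largest non-DC DFT bin of a real-valued signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    best_k, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * fps / n


# Synthetic clip (assumption): 2 s at 30 fps, brightness oscillating at 3 Hz.
fps, seconds, f0 = 30, 2, 3.0
frames = [
    [[128 + 50 * math.sin(2 * math.pi * f0 * t / fps)] * 2 for _ in range(2)]
    for t in range(fps * seconds)
]
series = [block_means(frame, 2)[0] for frame in frames]
peak = dominant_frequency(series, fps)
print(f"{peak:.1f} Hz, inside the 1-6 Hz band: {1.0 <= peak <= 6.0}")
```

A detection rule in the spirit of the abstract would track this peak over successive windows and look for values in the 1 to 6 Hz range that fall as time passes; the thresholds for such a rule are not given in the abstract.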

An Overview of Moving Object Detection Using YOLO Deep Learning Models

2nd International Conference on Disruptive Technologies, ICDT 2024

Authors: Dwivedi, Upendra; Joshi, Kireet; Shukla, Surendra Kumar; Rajawat, Anand Singh (G L Bajaj Institute of Technology and Management, Department of Applied Computational Science and Engineering, Greater Noida, India; Graphic Era Deemed to Be University, Department of Computer Science and Engineering, Dehradun, India; SVKM S Nmims Mpstme, Department of Computer Science and Engineering, Shirpur Campus, Shirpur, India; School of Computer Sciences and Engineering, Sandip University, Nashik, India)

ISBN: (print) 9798350371055

Computer vision is a promising domain that focuses on emerging approaches, algorithms, and technologies that give machines the computing capability to analyze visual data, such as image files, video files, and real-time video streams. In computer vision and image processing, detecting objects in images and videos has been a topic of extensive research. This is accomplished by applying various computer vision techniques to analyze the visual data and determine the class of objects in image and video files. One widespread approach to object detection is the use of deep learning models. YOLO (You Only Look Once) is a convolutional neural network that offers a fast and efficient solution to object detection using deep learning. It enables computers to handle real-time detection of objects in video frames and to accurately locate and classify moving objects. YOLO is able to simultaneously detect and classify objects in an efficient manner using convolutional neural networks. © 2024 IEEE.

Keywords: Deep Learning; image processing; Motion Detection; Moving Object Detection; YOLO
