Classroom emotion is an important dimension for evaluating teaching effectiveness, and applying image processing to online teaching emotion analysis has become an inevitable trend. Aiming at the problems of low expression-recognition accuracy, the lack of a clear emotion scheme for online teaching evaluation, and the limited applicability of existing expression-recognition models, this paper studies a classroom video image emotion analysis method for online teaching quality evaluation. First, the classroom video image emotion analysis task is divided into a facial expression recognition task and a facial feature point localization task, and multi-task learning is carried out so that the model switches between the two tasks in real time according to the type of input. A tag attention mechanism is proposed to deeply mine the key facial regions in classroom video images, keeping the distribution of each center sample and its neighborhood samples compact in the feature space. Finally, based on the expression activity of teachers and students in the online classroom, online classroom teaching emotion is analyzed, and online teaching quality is evaluated indirectly. Experimental results verify the validity of the model.
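The abstract does not specify the compactness objective used in the feature space; a center-loss-style term is one standard way to pull samples toward their class center. The function name and formulation below are illustrative assumptions, not the paper's actual loss.

```python
def center_compactness_loss(features, labels, centers):
    """Center-loss-style compactness term: L = (1/2N) * sum_i ||x_i - c_{y_i}||^2.

    features: list of feature vectors (lists of floats)
    labels:   list of class indices, one per feature vector
    centers:  dict mapping class index -> class-center vector

    A generic formulation used only to illustrate feature-space
    compactness; the paper's loss over center and neighborhood
    samples is not given in the abstract.
    """
    total = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return total / (2.0 * len(features))
```

With features exactly on their class centers the loss is zero; it grows quadratically as samples drift away from their center.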
The realm of video content has witnessed exponential growth, and alongside it emerges the challenge of enhancing video quality efficiently. Traditional techniques, although valuable, often fall short of addressing the...
With the rapid development of the automobile industry, the problem of traffic congestion has become increasingly prominent. In order to reduce traffic accidents, improve road transportation efficiency and ensure road ...
ISBN:
(print) 9783031661457; 9783031661464
Due to the increasing number of tumors, new interventional Computed Tomography (CT) procedures have been proposed that aim to optimize workflow and enable time-effective diagnosis and treatment. To support tumor ablation procedures, CT scanners must pre-process 2D projections and reconstruct 3D slices of the human body in real time, while data are acquired. This paper proposes a lightweight processing architecture for MPSoC-FPGA that performs the "CT pre-processing phase" on the fly; this phase consists of the pixel-level processing of 2D images. The architecture also supports exploring different data formats, selectable at design time, to improve performance while preserving image quality. This article focuses on the cosine and redundancy weighting steps, which cannot be implemented with the standard method on an embedded MPSoC-FPGA due to the high resource utilization of their arithmetic operations. Therefore, this work proposes optimizations that reduce both the number of operations and the amount of on-chip memory required compared with the standard algorithm. Finally, the proposed architecture has been implemented and instantiated within a Control Data Acquisition System (CDAS) architecture running on the XC7Z045 AMD-Xilinx MPSoC-FPGA and integrated into an open-interface CT scanner assembled in our laboratory. Here, the optimized weighting steps use up to 33.8 times fewer DSPs than the implementation based on the standard solution. Furthermore, they add only 80 ns of latency, making them 7.9 times faster than the standard implementation.
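In its standard (FDK-style) form, the cosine weighting step referenced above scales each detector pixel by D / sqrt(D^2 + u^2 + v^2). The sketch below shows that textbook formulation in floating point; the parameter names are assumptions, and the paper's reduced-operation, FPGA-friendly variant is not reproduced here.

```python
import math

def cosine_weight_table(n_u, n_v, du, dv, sdd):
    """Standard cosine pre-weighting table for a 2D CT projection.

    Each detector pixel at physical offset (u, v) from the detector
    center is scaled by sdd / sqrt(sdd^2 + u^2 + v^2), where sdd is
    the source-to-detector distance. Textbook formulation only; the
    optimized MPSoC-FPGA implementation in the paper differs.
    """
    cu = (n_u - 1) / 2.0  # detector center, in pixel indices
    cv = (n_v - 1) / 2.0
    table = []
    for j in range(n_v):
        v = (j - cv) * dv
        row = []
        for i in range(n_u):
            u = (i - cu) * du
            row.append(sdd / math.sqrt(sdd * sdd + u * u + v * v))
        table.append(row)
    return table
```

The weight is exactly 1.0 at the central ray and falls off toward the detector edges, which is why a naive per-pixel square root is so costly on an FPGA and motivates precomputation or algebraic simplification.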
ISBN:
(print) 9798350362923; 9798350362916
Nowadays, ensuring road safety is a crucial issue that demands continuous development of measures to minimize the risk of accidents. This paper presents the development of a driver fatigue detection method based on the analysis of facial images. A video camera was used to monitor the driver's condition in real time. The detection method analyzes facial features related to the eye and mouth areas, such as blinking frequency, yawning, the mouth aspect ratio (MAR), and the duration of eye closure. The method was implemented in Python using a convolutional neural network (CNN). To validate the method, a dataset was created containing eye images subjected to various modifications, including the use of corrective glasses. The model's results confirm the method's effectiveness in detecting fatigue, achieving an average accuracy of 92% for eye detection and 82% for yawning detection under well-lit conditions.
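Blink frequency and eye-closure duration of the kind described above are commonly derived from the eye aspect ratio (EAR), the eye-side analogue of the MAR named in the abstract. The 6-landmark ordering below is the widely used convention, assumed here since the paper's exact landmark layout is not given.

```python
import math

def _dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def eye_aspect_ratio(pts):
    """EAR over 6 eye landmarks p1..p6: p1/p4 are the horizontal eye
    corners, (p2, p6) and (p3, p5) are vertical landmark pairs.

        EAR = (|p2 - p6| + |p3 - p5|) / (2 * |p1 - p4|)

    An open eye yields a roughly constant EAR; a closing eye drives it
    toward 0, so counting consecutive frames below a threshold flags
    prolonged eye closure and blinks.
    """
    vertical = _dist(pts[1], pts[5]) + _dist(pts[2], pts[4])
    horizontal = _dist(pts[0], pts[3])
    return vertical / (2.0 * horizontal)
```

The same ratio computed over mouth landmarks gives the MAR, where a sustained high value indicates yawning.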
Real-time magnetic resonance imaging (rtMRI) of the midsagittal plane of the mouth is of interest for speech production research. In this work, we focus on estimating utterance-level rtMRI video from the spoken phonem...
Object detection involves analyzing images to determine where objects are located and what they are. This idea extends to real-time settings where numerous items must be detected simultaneously by using mult...
ISBN:
(print) 9783031581809; 9783031581816
Dance video analysis and interpretation have been challenging tasks in computer vision due to the lack of annotated data. Several videos are available in the public domain, but they are hardly ever annotated because annotation incurs much time, cost, and expert knowledge. This paper addresses this problem in the field of Bharatanatyam, an Indian Classical Dance (ICD) form. Video annotation itself has broad coverage: it may include the annotation of objects, activities, hand gestures, facial expressions, the semantics of a video, and more. This paper, however, annotates the elementary postures of Bharatanatyam dance videos. The proposed tool takes a video as input and segments it into motion and stationary (non-motion) frames. The non-motion frames belong to the elementary postures and are our point of interest. After segmentation, the tool recognizes the postures and labels each posture's duration of occurrence in the form of frame numbers. The dataset on which it is applied covers most of the basic dance variations in Bharatanatyam, which had not previously been addressed on this scale. The basic dance variations used to learn Bharatanatyam are called Adavus. The paper covers 13 Adavus and their 52 variations, performed by three dancers, which we use as our dataset. This annotation tool may contribute significantly to the state of the art by saving the time and cost involved in manual annotation. The tool uses deep learning for dance video annotation and achieves an accuracy above 75%.
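The motion vs. stationary segmentation step described above can be sketched with simple mean absolute frame differencing. The threshold value and the flat-grayscale-list frame representation below are assumptions for illustration, not the tool's actual implementation.

```python
def split_motion_frames(frames, thresh=5.0):
    """Label each frame as motion (True) or stationary (False) by the
    mean absolute grayscale difference from the previous frame.

    frames: list of equal-length flat grayscale pixel lists.
    The first frame has no predecessor and is labeled stationary.
    Runs of stationary frames are the candidate elementary-posture
    segments that the annotation tool would then recognize and label.
    """
    labels = [False]
    for prev, cur in zip(frames, frames[1:]):
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        labels.append(diff > thresh)
    return labels
```

A posture segment is then the frame-number range of each maximal stationary run, which matches the tool's output of posture durations as frame numbers.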
ISBN:
(print) 9781728198354
Unsupervised Video Object Segmentation (UVOS) refers to the challenging task of segmenting the prominent object in videos without manual guidance. Recent work on UVOS falls into two approaches, appearance-based and appearance-motion-based methods, each with its own limitations. Appearance-based methods do not consider the motion of the target object because they exploit correlation information between randomly paired frames. Appearance-motion-based methods suffer from a dominant dependency on optical flow because they fuse appearance with motion. In this paper, we propose a novel framework for UVOS that addresses the limitations of both approaches in terms of both time and scale. Temporal Alignment Fusion aligns the saliency information of adjacent frames with the target frame to leverage adjacent-frame information. Scale Alignment Decoder predicts the target object mask by aggregating multi-scale feature maps via continuous mapping with an implicit neural representation. We present experimental results on the public benchmark datasets DAVIS 2016 and FBMS, which demonstrate the effectiveness of our method. Furthermore, we outperform state-of-the-art methods on DAVIS 2016.
ISBN:
(print) 9798350302233
In this paper, we experimentally demonstrate an optical camera communications (OCC) system that uses a wearable light-emitting diode (LED) array as the transmitter. Wearable devices are powerful tools for supporting Internet of Things (IoT) systems because of their sensing, processing, and communication capabilities. The term "wearable devices" refers to a wide range of products that can be integrated into clothing and accessories, allowing real-time data detection, storage, and exchange without human intervention. This paper presents a practical evaluation of an LED-based wearable transmitter for an OCC system to demonstrate its feasibility. In particular, an LED array attached to the body is modulated using on-off keying to transmit data via visible light, and a smartphone camera captures video of the user wearing the device while moving slightly in a static position in the room. Finally, the data are decoded from the video frames using an image processing algorithm that tracks the source and demodulates the signal.
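The demodulation step described above can be sketched as thresholding the tracked LED region's mean brightness per frame. The midpoint threshold and the one-symbol-per-frame assumption below are simplifications; a real OCC receiver must also handle rolling-shutter effects and frame-rate/symbol-rate mismatch, which this sketch ignores.

```python
def demodulate_ook(brightness):
    """Recover an on-off-keyed bit per video frame from the mean
    brightness of the tracked LED region.

    brightness: list of per-frame mean intensities of the LED region.
    Thresholds at the midpoint of the observed range; assumes one
    symbol per frame and perfect synchronization.
    """
    lo, hi = min(brightness), max(brightness)
    threshold = (lo + hi) / 2.0
    return [1 if b > threshold else 0 for b in brightness]
```

The midpoint threshold is robust to uniform ambient-light offsets, since both the "on" and "off" levels shift together.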