Robots have emerged as versatile tools with significant potential to enhance teaching and learning in classrooms, laboratories, play homes, crèches, and even at home. Their engaging nature ca...
K-Nearest Neighbors (KNN), a simple and widely used algorithm, is extremely valuable in the field of machine learning. Finding an optimal value of the nearest-neighbor parameter in the KNN algorithm has been a ...
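To make the parameter-selection problem concrete, here is a minimal sketch of one common way to pick the neighbor count: leave-one-out cross-validation over a set of candidate values. This is a generic illustration, not the method of the paper summarized above (whose abstract is truncated here). The helper names `knn_predict`, `loo_accuracy`, and `best_k`, and the toy data, are assumptions made for the example.

```python
from collections import Counter
from math import dist
from typing import List, Sequence, Tuple

# A labeled sample: (feature vector, class label).
Sample = Tuple[Sequence[float], str]


def knn_predict(train: List[Sample], query: Sequence[float], k: int) -> str:
    """Majority vote among the k training samples closest to `query`."""
    neighbors = sorted(train, key=lambda s: dist(s[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def loo_accuracy(data: List[Sample], k: int) -> float:
    """Leave-one-out accuracy: predict each sample from all the others."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(data)


def best_k(data: List[Sample], candidates: Sequence[int]) -> int:
    """Return the candidate k with the highest leave-one-out accuracy."""
    return max(candidates, key=lambda k: loo_accuracy(data, k))


# Hypothetical toy data: two well-separated clusters.
toy_data: List[Sample] = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"),
]
chosen_k = best_k(toy_data, [1, 3, 5])
```

On ties, `best_k` keeps the earliest candidate in the list, so ordering the candidates from small to large favors the smaller neighborhood.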
ISBN: (Print) 9798400706295
Recent media have increasingly shifted towards multimedia formats that simultaneously utilize visual and linguistic information. Research on multimodal AI is actively conducted to analyze large-scale multimodal data effectively. Multimodal AI fuses the probability and feature values extracted from single modalities by a backbone model, enabling the simultaneous analysis of multimodal information. This allows for discovering new insights that may not be detectable through single-modality analysis. Depending on the data collection environment, multimedia can be classified into one-to-one and one-to-many modality balances. Previous multimodal AI approaches analyze one-to-many relationships by downsampling or duplicating data to fit a one-to-one relationship. In this paper, we optimize multimedia analysis in one-to-one and one-to-many modality balances based on the local and global context analysis capabilities of multimodal AI and the multimodal analysis characteristics of backbone models. The multimedia analysis system employs late score and feature fusion to independently analyze the local context as the baseline for multimodal AI. In contrast, early and hierarchical feature fusion are used for comprehensive global context analysis. The backbone models used include ViT and RoBERTa to analyze the overall structure of multimodal data and BEiT and DeBERTa to analyze structural features. Experimental results show that, in the duplication method, late score and feature fusion, which independently analyze the local context of multimodal data, are 0.56% more accurate and achieve an F1 score that is 0.025 higher. Additionally, BEiT and DeBERTa, which analyze structural features, demonstrate a 0.2% increase in accuracy and a 0.0167 improvement in F1 score. In the downsampling method, early and hierarchical feature fusion, which comprehensively analyze the global context, outperform by 1.17% in accuracy and 0.0164 in F1 score. Furthermore, ViT and RoBERTa, which foc
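The fusion strategies and one-to-many handling named in this abstract can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the paper's implementation. It assumes late score fusion is a weighted average of per-modality class probabilities, early feature fusion is a concatenation of per-modality feature vectors, and downsampling keeps only the first text item. All function names are hypothetical.

```python
from typing import Any, List, Sequence, Tuple


def late_score_fusion(prob_image: Sequence[float],
                      prob_text: Sequence[float],
                      weight_image: float = 0.5) -> List[float]:
    """Late fusion (assumed form): each modality is scored on its own,
    then the class probabilities are combined by a weighted average."""
    if len(prob_image) != len(prob_text):
        raise ValueError("both modalities must score the same classes")
    w = weight_image
    return [w * p_img + (1.0 - w) * p_txt
            for p_img, p_txt in zip(prob_image, prob_text)]


def early_feature_fusion(feat_image: Sequence[float],
                         feat_text: Sequence[float]) -> List[float]:
    """Early fusion (assumed form): concatenate the feature vectors so a
    single downstream classifier sees both modalities together."""
    return list(feat_image) + list(feat_text)


def duplicate_to_one_to_one(image: Any,
                            texts: Sequence[Any]) -> List[Tuple[Any, Any]]:
    """One-to-many to one-to-one by duplication: repeat the image for
    every text item paired with it."""
    return [(image, text) for text in texts]


def downsample_to_one_to_one(image: Any,
                             texts: Sequence[Any]) -> List[Tuple[Any, Any]]:
    """One-to-many to one-to-one by downsampling: keep a single text item.
    Keeping the first item is an arbitrary choice made for this sketch."""
    return [(image, texts[0])] if texts else []


# Hypothetical inputs: two-class probabilities from an image model and a text model.
fused_scores = late_score_fusion([0.8, 0.2], [0.4, 0.6])
fused_features = early_feature_fusion([1.0, 2.0], [3.0])
pairs = duplicate_to_one_to_one("img_0", ["caption a", "caption b"])
```

Duplication preserves every text item at the cost of repeating the image, while downsampling discards text items to keep one pair per image; the abstract reports that the preferable fusion strategy differs between the two.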
Sound event detection is pivotal in various applications, including environmental monitoring and surveillance systems, enhancing situational awareness and response strategies. This paper investigates the intricacies o...
This study investigates the use of deep learning algorithms for vehicle detection and speed estimation in traffic surveillance systems. Convolutional Neural Networks (CNNs) are the main tool used in this ...
In an era marked by remarkable technological advancements, the way we create and share information has undergone a profound transformation. This paradigm shift is epitomized by the NextGen Dynamic Video Generator usin...
Digital game-based learning (DGBL) has been viewed as an effective teaching strategy that encourages students to pick up and learn a subject. This paper explores its viability to help increase the reach and efficiency...
Traffic management plays a crucial role in urban planning and development, with pressing challenges related to congestion, safety, and environmental impact. In this study, we propose a real-time traffic control ...
Making a long-established algorithm even faster has always been a fascinating and challenging problem in the field of algorithms, which motivated us to take on the challenge of imp...
Drowsiness during driving poses high risks for the driver, the co-passengers, and other people on the road. Advanced Driver Assistance Systems (ADAS) have been proposed to reduce these risks. This work introduce...