Long Short-Term Memory (LSTM) networks are particularly useful in recommender systems since user preferences change over time. Unlike traditional recommender models which assume static user-item interactions, LSTM mod...
Multimodal sentiment analysis on images with textual content is a research area aiming to understand the sentiment conveyed by visual and textual elements in the images. While multimodal sentiment analysis on images a...
Class imbalance is a significant and emerging issue in machine learning, in which the number of majority class instances is much greater than the number of minority class instances. In real applications, a...
Deep convolutional neural network architectures have in recent years been widely used for enhancing various computer vision tasks, such as image classification, semantic segmentation, and object detection. With great a...
A key goal of clustering is data reduction. In center-based clustering of complex objects, therefore, not only the number of clusters but also the complexity of the centers plays a crucial role. We propose LBudget Clust...
The increasing complexity of designing, deploying, and maintaining Cyber-Physical Systems (CPS), particularly those incorporating multiple interacting robots, presents significant challenges regarding programming and ...
Visual question answering (VQA) is a multimodal task, involving a deep understanding of the image scene and the question’s meaning and capturing the relevant correlations between both modalities to infer the appropriate *** this paper, we propose a VQA system intended to answer yes/no questions about real-world images, in *** support a robust VQA system, we work in two directions: (1) Using deep neural networks to semantically represent the given image and question in a fine-grained manner, namely ResNet-152 and Gated Recurrent Units (GRU). (2) Studying the role of the utilized multimodal bilinear pooling fusion technique in the *** the model complexity and the overall model *** fusion techniques could significantly increase the model complexity, which seriously limits their applicability for VQA *** far, there is no evidence of how efficient these multimodal bilinear pooling fusion techniques are for VQA systems dedicated to yes/no ***, a comparative analysis is conducted between eight bilinear pooling fusion techniques, in terms of their ability to reduce the model complexity and improve the model performance in this case of VQA *** indicate that these multimodal bilinear pooling fusion techniques have improved the VQA model’s performance, until reaching the best performance of 89.25%. Further, experiments have proven that the number of answers in the developed VQA system is a critical factor that *** the effectiveness of these multimodal bilinear pooling techniques in achieving their main objective of reducing the model *** Multimodal Local Perception Bilinear Pooling (MLPB) technique has shown the best balance between the model complexity and its performance, for VQA systems designed to answer yes/no questions.
Authors:
Petkar, Taniya G.; Kumar, Praveen; Sarate, Kirtiksha U.
Faculty of Engineering and Technology, Department of Computer Science & Medical Engineering, Maharashtra, Sawangi, Wardha 442001, India
Faculty of Engineering and Technology, Department of Computer Science and Design, Maharashtra, Sawangi, Wardha 442001, India
By enabling precise, individualized, and effective treatments, the integration of artificial intelligence (AI) and machine learning (ML) into wound and skin healing is revolutionizing healthcare. Artificial intelligen...
Authors:
Warbhe, Mohan K.; Bore, Joy Jordan; Chaudari, Shiv Nath
Faculty of Engineering and Technology, Department of Computer Science and Design, Maharashtra, Sawangi, Wardha 442001, India
Faculty of Engineering and Technology, Department of Computer Science and Medical Engineering, Maharashtra, Sawangi, Wardha 442001, India
The proposed web application for tomato leaf disease detection exemplifies the transformative power of Artificial Intelligence and Computer Vision in modern agriculture. Addressing the critical issue of early and accu...
An ultra-wideband (UWB) slotted compact Vivaldi antenna with a microstrip line feed was evaluated for microwave imaging (MI) applications. The recommended FR4 substrate-based Vivaldi antenna is 50×50×1.5 mm³...