The reckless, unthinking use of the Internet and the rise in cybercrime and hacking, which have led to the loss of large amounts of sensitive data, the risk of piracy, etc., were the motivation for protecting right...
Visual question answering (VQA) is a multimodal task, involving a deep understanding of the image scene and the question’s meaning and capturing the relevant correlations between both modalities to infer the appropriate *** this paper, we propose a VQA system intended to answer yes/no questions about real-world images, in *** support a robust VQA system, we work in two directions: (1) Using deep neural networks to semantically represent the given image and question in a fine-grained manner, namely ResNet-152 and Gated Recurrent Units (GRU). (2) Studying the role of the utilized multimodal bilinear pooling fusion technique in the *** the model complexity and the overall model *** fusion techniques could significantly increase the model complexity, which seriously limits their applicability for VQA *** far, there is no evidence of how efficient these multimodal bilinear pooling fusion techniques are for VQA systems dedicated to yes/no ***, a comparative analysis is conducted between eight bilinear pooling fusion techniques, in terms of their ability to reduce the model complexity and improve the model performance in this case of VQA *** indicate that these multimodal bilinear pooling fusion techniques have improved the VQA model’s performance, until reaching the best performance of 89.25%. Further, experiments have proven that the number of answers in the developed VQA system is a critical factor that *** the effectiveness of these multimodal bilinear pooling techniques in achieving their main objective of reducing the model *** Multimodal Local Perception Bilinear Pooling (MLPB) technique has shown the best balance between the model complexity and its performance, for VQA systems designed to answer yes/no questions.
Nowadays, the Internet of Things (IoT) is progressively becoming a fundamental part of our lives. It revolutionizes various industries by enabling seamless connectivity between devices, and it increases automation and eff...
Parkinson’s disease (PD) is a neurodegenerative disorder with slow progression whose symptoms can be identified at late stages. Early diagnosis and treatment of PD can help to relieve the symptoms and delay progressi...
In the field of digital imaging and computer vision, haze and smoke removal (dehazing) is a popular research area studied by a large number of computer scientists. However, conventional join...
Network intrusion detection systems (NIDSs) play an important role in protecting network infrastructure from cyber threats. Traditional NIDS often rely on signature-based or rule-based methods, which can contention to...
Due to the extreme growth in digital information and data, cybersecurity has become one of the major concerns addressed by recent research, organizations, and governments. However, traditional security methods are fin...
Drowsiness detection is a critical aspect of ensuring safety in various domains, including transportation, online learning, and multimedia consumption. This research paper presents a comprehensive investigation into d...
Narrator disambiguation is a field within hadith science that studies unidentified narrators in hadith narration chains, also known as sanads. Sanads can be represented as graphs, with the nodes representing the narra...
Predicting and controlling crowd dynamics in emergencies is one of the main objectives of simulated emergency exercises. However, during emergency exercises, there is often a lack of sense of danger by the actors invo...