Visual question answering (VQA) is a multimodal task, involving a deep understanding of the image scene and the question’s meaning and capturing the relevant correlations between both modalities to infer the appropriate *** this paper, we propose a VQA system intended to answer yes/no questions about real-world images, in *** support a robust VQA system, we work in two directions: (1) Using deep neural networks to semantically represent the given image and question in a fine-grained manner, namely ResNet-152 and Gated Recurrent Units (GRU). (2) Studying the role of the utilized multimodal bilinear pooling fusion technique in the *** the model complexity and the overall model *** fusion techniques could significantly increase the model complexity, which seriously limits their applicability for VQA *** far, there is no evidence of how efficient these multimodal bilinear pooling fusion techniques are for VQA systems dedicated to yes/no ***, a comparative analysis is conducted between eight bilinear pooling fusion techniques, in terms of their ability to reduce the model complexity and improve the model performance in this case of VQA *** indicate that these multimodal bilinear pooling fusion techniques have improved the VQA model’s performance, until reaching the best performance of 89.25%. Further, experiments have proven that the number of answers in the developed VQA system is a critical factor that *** the effectiveness of these multimodal bilinear pooling techniques in achieving their main objective of reducing the model *** Multimodal Local Perception Bilinear Pooling (MLPB) technique has shown the best balance between the model complexity and its performance, for VQA systems designed to answer yes/no questions.
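To make the complexity trade-off in the abstract above concrete, the following is a minimal sketch of one generic low-rank bilinear fusion (projecting each modality, then taking an element-wise product). It is not the MLPB technique or any of the eight methods compared in the paper. The function names, the rank, and the 1024-dimensional question vector are assumptions chosen for illustration; 2048 is the size of the pooled ResNet-152 feature.

```python
import math


def full_bilinear_params(dim_x, dim_y, n_answers):
    """Weights in a full bilinear map: one dim_x-by-dim_y matrix per answer."""
    return dim_x * dim_y * n_answers


def low_rank_params(dim_x, dim_y, n_answers, rank):
    """Weights in the low-rank variant: two projections plus an output layer."""
    return dim_x * rank + dim_y * rank + rank * n_answers


def project(vec, weights):
    """Multiply a vector by a matrix stored as weights[input_index][output_index]."""
    n_out = len(weights[0])
    return [sum(vec[i] * weights[i][j] for i in range(len(vec)))
            for j in range(n_out)]


def low_rank_fuse(x, y, U, V, P):
    """Fuse image vector x and question vector y into answer scores.

    Each modality is projected to a shared rank-sized space, squashed with
    tanh, combined by element-wise product, then mapped to answer scores.
    """
    hx = [math.tanh(v) for v in project(x, U)]
    hy = [math.tanh(v) for v in project(y, V)]
    joint = [a * b for a, b in zip(hx, hy)]
    return project(joint, P)


if __name__ == "__main__":
    # Assumed sizes: 2048-d image feature, 1024-d question feature, 2 answers.
    print(full_bilinear_params(2048, 1024, 2))
    print(low_rank_params(2048, 1024, 2, rank=256))
```

Under these assumed sizes the full bilinear map needs 4,194,304 weights and the rank-256 variant needs 786,944. Because the full count scales with the number of answers while most of the low-rank count does not, the relative saving shrinks when there are only two answers, which is one way to read the abstract's remark that the number of answers matters.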
The manual process of evaluating answer scripts is strenuous. Evaluators use the answer key to assess the answers in the answer scripts. Advancements in technology and the introduction of new learning paradigms need a...
Optical wireless networks emerge as a promising solution to the ever-growing data demand for user-centric indoor applications. This work demonstrates a novel approach to advance multi-beam radiation patterns in indoor...
With the development of deep learning in recent years, code representation learning techniques have become the foundation of many software engineering tasks such as program classification [1] and defect detection. Earlier approaches treat code as token sequences and use CNN, RNN, and Transformer models to learn code representations.
Wireless Federated Learning (FL) is a distributed Artificial Intelligence (AI) framework, enabling decision-making at the network edge where data are generated. However, wireless transmissions of model updates from ed...
Fog computing is a key enabling technology of 6G systems as it provides quick and reliable computing and data storage services which are required for several 6G *** Intelligence (AI) algorithms will be an integral part of 6G systems and efficient task offloading techniques using fog computing will improve their performance and *** this paper, the focus is on the scenario of Partial Offloading of a Task to Multiple Helpers (POMH) in which larger tasks are divided into smaller subtasks and processed in parallel, hence expediting task ***, using POMH presents challenges such as breaking tasks into subtasks and scaling these subtasks based on many interdependent factors to ensure that all subtasks of a task finish simultaneously, preventing resource ***, applying matching theory to POMH scenarios results in dynamic preference profiles of helping devices due to changing subtask sizes, resulting in a difficult-to-solve, externalities *** paper introduces a novel many-to-one matching-based algorithm, designed to address the externalities problem and optimize resource allocation within POMH ***, we propose a new time-efficient preference profiling technique that further enhances time optimization in POMH *** performance of the proposed technique is thoroughly evaluated in comparison to alternate baseline schemes, revealing many advantages of the proposed *** simulation findings indisputably show that the proposed matching-based offloading technique outperforms existing methodologies in the literature, yielding a remarkable 52 reduction in task latency, particularly under high workloads.
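The requirement that all subtasks of a task finish simultaneously can be illustrated with a deliberately simplified sketch. It assumes each helper has a single fixed processing rate and ignores transmission delay, queuing, and the other interdependent factors the abstract mentions; it is not the paper's matching-based algorithm. The function names `split_task` and `finish_times` are hypothetical.

```python
def split_task(total_size, rates):
    """Split a task across helpers in proportion to their processing rates.

    With size_i = total_size * rate_i / sum(rates), every helper needs the
    same time, total_size / sum(rates), to process its share.
    """
    total_rate = sum(rates)
    if total_rate <= 0 or any(r < 0 for r in rates):
        raise ValueError("rates must be non-negative with a positive sum")
    return [total_size * r / total_rate for r in rates]


def finish_times(sizes, rates):
    """Processing time of each subtask; helpers with zero rate get no work."""
    return [s / r if r > 0 else 0.0 for s, r in zip(sizes, rates)]


if __name__ == "__main__":
    rates = [1.0, 3.0]            # assumed units of work per second
    sizes = split_task(100.0, rates)
    print(sizes)                  # share assigned to each helper
    print(finish_times(sizes, rates))
```

In a real POMH setting the subtask sizes feed back into each helper's preferences, which is the externalities problem the abstract describes; this sketch only shows the equal-finish-time condition for fixed rates.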
Precise polyp segmentation is vital for the early diagnosis and prevention of colorectal cancer (CRC) in clinical ***, due to scale variation and blurry polyp boundaries, it is still a challenging task to achieve satisfactory segmentation performance with different scales and *** this study, we present a novel edge-aware feature aggregation network (EFA-Net) for polyp segmentation, which can fully make use of cross-level and multi-scale features to enhance the performance of polyp ***, we first present an edge-aware guidance module (EGM) to combine the low-level features with the high-level features to learn an edge-enhanced feature, which is incorporated into each decoder unit using a layer-by-layer ***, a scale-aware convolution module (SCM) is proposed to learn scale-aware features by using dilated convolutions with different ratios, in order to effectively deal with scale ***, a cross-level fusion module (CFM) is proposed to effectively integrate the cross-level features, which can exploit the local and global contextual ***, the outputs of CFMs are adaptively weighted by using the learned edge-aware feature, which are then used to produce multiple side-out segmentation *** results on five widely adopted colonoscopy datasets show that our EFA-Net outperforms state-of-the-art polyp segmentation methods in terms of generalization and *** implementation code and segmentation maps will be publicly available at https://***/taozh2017/EFANet.
This research concentrates on author profiling using transfer learning models for classifying age and gender. The investigation encompassed a diverse set of transfer learning techniques, including RoBERTa, BERT, ALBER...
This paper develops a new distributed attention-enabled multi-agent reinforcement learning method for frequency regulation of power systems. Specifically, the controller of each generator is modelled as an agent, and ...
We observe an infinite sequence of independent identically distributed random variables X1, X2, . . . drawn from an unknown distribution p over [n], and our goal is to estimate the entropy H(p) = − E[log p(X)] within ...
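As a point of reference for the quantity H(p) = − E[log p(X)] defined above, the following is a minimal sketch of the naive plug-in estimator, which substitutes empirical frequencies for p. It is a standard baseline only, not the method of the paper, whose abstract is truncated here; the function name `plug_in_entropy` is hypothetical, and entropy is measured in nats (natural logarithm).

```python
import math
from collections import Counter


def plug_in_entropy(samples):
    """Estimate H(p) by replacing p(x) with the empirical frequency of x.

    Returns the entropy in nats. This estimator stores a count for every
    distinct symbol seen, so its memory use grows with the support size.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


if __name__ == "__main__":
    # Assumed toy stream over the alphabet {1, 2}.
    print(plug_in_entropy([1, 2, 1, 2, 1, 2]))
```

On a finite sample the plug-in estimate is biased downward, which is one reason more careful estimators are studied.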