This study is an extension of our previously published manuscript [2]. Building upon the foundations laid in our earlier work, this research further refines and extends the transformer-based approach to enhance its di...
Knowledge graphs are used to alleviate the problems of data sparsity and cold starts in recommendation systems. However, most existing approaches ignore the hierarchical structure of the knowledge graph. In this paper...
Generalized Zero-shot Learning (GZSL) aims to recognize categories unseen during training. Despite the success of generative-based methods in GZSL, they often suffer from bias towards seen data, impacting performance....
We focus on maximizing a non-negative k-submodular function under a knapsack constraint. As a generalization of submodular functions, a k-submodular function considers k distinct, non-overlapping subsets instead of a ...
In recent years, visual language models have made significant advancements in the fields of computer vision and natural language processing. The BLIP-2 model effectively bridges modality gaps through its lightweight Q...
Facial expression recognition is an essential aspect of computer vision, significantly influencing human-computer interaction, education, security monitoring, and autonomous driving. However, the subtle differences be...
In this paper, we discuss a generalization of Vieta’s theorem (Vieta’s formulas) to the case of Clifford geometric algebras. We compare the generalized Vieta’s formulas with the ordinary Vieta’s formulas for characte...
Most existing Surgical Visual Question Answering (VQA) systems use naive fusion strategies for text and image modalities and lack localized answering. The limited availability of annotated med...
In order to provide more comprehensive medical services and personalized health monitoring according to individual needs, Body Area Networks (BANs) have been extensively studied by many researchers. As BANs involve th...
Automatically generating meaningful, detailed textual descriptions of images is a difficult task at the intersection of computer vision and natural language processing, which makes an AI-powered image caption generator a useful tool. In this study, we present a novel captioning method that uses an attention mechanism to focus on the relevant regions of an image as each caption is generated. Our model, which uses deep neural networks to extract image features and produce captions, obtains state-of-the-art results on benchmark datasets, confirming that the attention mechanism improves the quality of the generated captions. We also provide a thorough evaluation of the approach and discuss potential future directions for improving image caption generation.