The amount of violent content shared on social networks makes them a very unhealthy space. This has given rise to a growing research domain that involves filtering social media content using Artificial Intelligence-powered ...
COVID-19 is a disastrous infection that the whole world tackled for two years, from 2020 to 2021. As this virus was new and doctors had no idea about it, they treated patients to save lives with all their possible experi...
The utilization of Artificial Intelligence in automatically generating radiology reports presents a promising solution for enhancing the efficiency of the diagnostic process and reducing human error. However, existing...
Image captioning, the task of generating descriptive sentences for images, has advanced significantly with the integration of semantic ***, traditional models still rely on static visual features that do not evolve with the changing linguistic context, which can hinder the ability to form meaningful connections between the image and the generated ***. This limitation often leads to captions that are less accurate or ***. In this paper, we propose a novel approach to enhance image captioning by introducing dynamic interactions where visual features continuously adapt to the evolving linguistic *** model strengthens the alignment between visual and linguistic elements, resulting in more coherent and contextually appropriate ***, we introduce two innovative modules: the Visual Weighting Module (VWM) and the Enhanced Features Attention Module (EFAM). The VWM adjusts visual features using partial attention, enabling dynamic reweighting of the visual inputs, while the EFAM further refines these features to improve their relevance to the generated ***. By continuously adjusting visual features in response to the linguistic context, our model bridges the gap between static visual features and dynamic language ***. We demonstrate the effectiveness of our approach through experiments on the MS-COCO dataset, where our method outperforms state-of-the-art techniques in terms of caption quality and contextual *** results show that dynamic visual-linguistic alignment significantly enhances image captioning performance.
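The idea of visual features being reweighted as the linguistic context changes can be illustrated with a generic scaled dot-product attention step. This is only a minimal sketch under stated assumptions: the abstract does not define "partial attention", so the function below is plain soft attention rather than the VWM or EFAM, and the names `reweight_visual_features`, `regions`, and `language_state` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reweight_visual_features(regions, language_state):
    """Scale each region feature by its attention weight for the current language state.

    regions: list of feature vectors, one per image region (hypothetical input).
    language_state: vector summarising the words generated so far (hypothetical input).
    """
    scale = math.sqrt(len(language_state))
    # Score each region by its scaled dot product with the language state.
    scores = [
        sum(r * h for r, h in zip(region, language_state)) / scale
        for region in regions
    ]
    weights = softmax(scores)
    # Reweight the visual features; they change whenever language_state changes.
    reweighted = [[w * x for x in region] for w, region in zip(weights, regions)]
    return weights, reweighted


# Two toy regions; the language state shifts between decoding steps.
regions = [[1.0, 0.0], [0.0, 1.0]]
weights_step1, _ = reweight_visual_features(regions, [2.0, 0.0])
weights_step2, _ = reweight_visual_features(regions, [0.0, 2.0])
```

With the same two regions, the first language state favours the first region and the second state favours the second, which is the sense in which the features follow the linguistic context rather than staying static.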
Partial video copy detection (PVCD) aims to discover copy segments of query videos from a video database, which plays an important role in video copyright protection, filtering, tracking, etc. For a large-scale video ...
Recently, Transformer-based methods for single image super-resolution (SISR) have achieved better performance than methods based on convolutional neural networks (CNNs). Exploiting self-attention mechanis...
Neural networks excel at mining static graphs, but learning from streaming graphs without forgetting previous knowledge is an emerging challenge known as continual graph learning (CGL). Despite recent ...
Recent advancements in satellite technologies have resulted in the emergence of Remote Sensing (RS) images. Hence, a primary research imperative is designing a precise retrieval model for retrieving the most ...
Extracting structured event knowledge, including event triggers and corresponding arguments, from military texts is fundamental to many applications, such as intelligence analysis and decision assistance. However, eve...
Question-answering (QA) models find answers to a given *** necessity of automatically finding answers is increasing because it is very important and challenging from the large-scale QA data ***. In this paper, we deal with the QA pair matching approach in QA models, which finds the most relevant question and its recommended answer for a given *** studies for the approach performed on the entire dataset or datasets within a category that the question writer manually ***. In contrast, we aim to automatically find the category to which the question belongs by employing the text classification model and to find the answer corresponding to the question within the *** to the text classification model, we can effectively reduce the search space for finding the answers to a given ***, the proposed model improves the accuracy of the QA matching model and significantly reduces the model inference ***, to improve the performance of finding similar sentences in each category, we present an ensemble embedding model for sentences, improving the performance compared to the individual embedding *** real-world QA data sets, we evaluate the performance of the proposed QA matching ***. As a result, the accuracy of our final ensemble embedding model based on the text classification model is 81.18%, which outperforms the existing models by 9.81%∼14.16% ***, in terms of the model inference speed, our model is faster than the existing models by 2.61∼5.07 times due to the effective reduction of search spaces by the text classification model.
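The two-stage pipeline described in this abstract (classify the question into a category, then match it only against questions in that category using an ensemble of sentence representations) can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' model: the data in `QA_PAIRS` is made up, `classify` is a bag-of-words stand-in for the text classification model, and `ensemble_similarity` averages two simple count-based representations in place of the learned embedding models, which the abstract does not name.

```python
import math
from collections import Counter

# Toy (category, question, answer) triples; all entries are invented for illustration.
QA_PAIRS = [
    ("billing", "how do i update my credit card", "Open Settings > Payment."),
    ("billing", "why was i charged twice", "Duplicate charges are refunded automatically."),
    ("shipping", "where is my package", "Track it from the Orders page."),
    ("shipping", "how long does delivery take", "Delivery takes 3 to 5 days."),
]


def word_vec(text):
    """Bag-of-words counts (first stand-in 'embedding')."""
    return Counter(text.split())


def char_trigram_vec(text):
    """Character trigram counts (second stand-in 'embedding')."""
    padded = f"  {text} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def ensemble_similarity(q1, q2):
    """Average the similarity scores from the two representations."""
    return 0.5 * cosine(word_vec(q1), word_vec(q2)) + 0.5 * cosine(
        char_trigram_vec(q1), char_trigram_vec(q2)
    )


def classify(query):
    """Stand-in classifier: pick the category whose pooled questions best match the query."""
    pooled = {}
    for category, question, _ in QA_PAIRS:
        pooled.setdefault(category, Counter()).update(word_vec(question))
    query_vec = word_vec(query)
    return max(pooled, key=lambda c: cosine(query_vec, pooled[c]))


def answer(query):
    """Stage 1: predict the category. Stage 2: search only within that category."""
    category = classify(query)
    candidates = [(q, a) for c, q, a in QA_PAIRS if c == category]
    best_q, best_a = max(candidates, key=lambda qa: ensemble_similarity(query, qa[0]))
    return category, best_q, best_a


category, matched_question, matched_answer = answer("where is my order package")
```

Because stage 2 compares the query only with the candidates in the predicted category, the number of similarity computations shrinks with the category size, which is the source of the inference speed-up the abstract reports.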