Spatiotemporal data imputation plays a crucial role in various fields such as traffic flow monitoring, air quality assessment, and climate prediction. However, spatiotemporal data collected by sensors often suffer fro...
Semantic segmentation, one of the fundamental tasks in computer vision, requires classifying every pixel in an image and is therefore time-consuming. With the rise of technologies such as auton...
Traditional neural networks suffer from exploding or vanishing gradients, and from inefficiency caused by limited parallelism, when correcting grammatical errors in long texts. In this paper, Chines...
Rolling bearings, a basic component of most mechanical equipment, play a key role in modern processing, manufacturing, and production. This paper proposes three fault diagnosi...
Image-text retrieval is a fundamental cross-modal task that aims to align the representation spaces of the image and text modalities. Existing cross-interactive image-text retrieval methods generate image and sentence embeddings independently, introduce interaction-based networks for cross-modal reasoning, and then retrieve matches using matching metrics. However, existing approaches do not fully exploit the semantic relationships among multimodal knowledge to enhance fine-grained, implicit cross-modal semantic reasoning. In this paper, we propose the Multimodal Knowledge Graph-guided Cross-modal Graph Network (MKCGN), which exploits multimodal knowledge graphs to explore cross-modal relationships and enhance global representations. In MKCGN, each image yields semantic and spatial graphs, which together form its visual graph, and each sentence yields a textual graph based on word semantic relations. The visual and textual graphs are each used to implement inter-modal reasoning. We then obtain interest embeddings of image regions and text words from the entity embeddings in a Multimodal Knowledge Graph (MKG), which approximates and aligns the representation spaces of regions and words to a certain extent. This yields effective inter-modal interactions and lets the model learn fine-grained cross-modal communication through a graph node contrast loss for inter-modal semantic reasoning. Finally, we mine the implicit semantics and potential relationships of images and texts through the MKG to enhance the global representations, and use a cross-modal contrast loss to narrow the space of coarse-grained cross-modal representations. Experiments on the MS-COCO and Flickr30K benchmark datasets show that the proposed MKCGN outperforms state-of-the-art image-text retrieval methods.
Obsessive-Compulsive Disorder (OCD) is a hereditary mental illness, and unaffected first-degree relatives (UFDRs) are also at high risk. This study constructed a framework based on traditional machine learning and deep l...
Mobile edge computing (MEC) has been proposed to provide mobile devices with both satisfactory computing resources and latency. Key issues in MEC include task offloading and power allocation (TOPA), for which deep rei...
In modern technology environments, raising users’ privacy awareness is crucial. Existing efforts have largely focused on privacy policy presentation and have failed to systematically address the fundamental challenge of user motivat...
Data mining and knowledge discovery are essential aspects of extracting valuable insights from vast datasets. Neural topic models (NTMs) have emerged as a valuable unsupervised tool in this field. However, the predomi...
Convolutional neural network (CNN)-based and Transformer-based methods have recently made significant strides in time series forecasting; they excel at modeling local temporal variations or capturing long-term depend...