Multi-Modal Relation Extraction (MMRE) plays a key role in various multimedia applications, including recommendation and information retrieval systems. MMRE aims to extract the semantic relation between entities by leveraging context from a text-image pair. Because MMRE relies on context from images, learning from noisy images emerges as a research problem in its own right. For instance, subtle variations in similar images can act as noise and potentially affect the predictions made by MMRE models. To tackle this problem, current work either uses attention mechanisms to fuse relevant text and image features or devises data augmentation techniques (e.g., via generative models) to improve generalization. However, current performance remains unsatisfactory. To improve on it, we propose a Dual-Aspect Noise-based Regularization framework that encompasses two techniques: 1) noise removal through an adaptive gating mechanism, and 2) fighting noise with noise to improve feature stability in the learning process. We find that combining these techniques encourages the model to focus on more relevant image features for MMRE. We carry out extensive experiments and demonstrate that our proposed model is further enhanced by exploring data augmentation techniques. This additional improvement leads the model to achieve state-of-the-art performance on the widely used Multi-modal Neural Relation Extraction (MNRE) dataset, and shows its effectiveness and generalizability on the Multi-Modal Named Entity Recognition task.
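The abstract does not specify how the gate or the injected noise is computed, so the following is only a minimal sketch of the two ideas under stated assumptions: a single scalar sigmoid gate (the hypothetical `gated_fusion`) that scales the image feature before adding it to the text feature, and Gaussian perturbation of features (the hypothetical `add_gaussian_noise`) as a stand-in for "fighting noise with noise". All names, weights, and the additive fusion rule are illustrative and are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(text_feat, image_feat, w_text, w_image, bias):
    """Fuse two equal-length feature vectors with a scalar gate.

    Assumed form (not from the paper):
        g = sigmoid(w_text . text_feat + w_image . image_feat + bias)
        fused = text_feat + g * image_feat
    A gate near 0 suppresses the image feature; near 1 keeps it.
    """
    score = sum(w * t for w, t in zip(w_text, text_feat))
    score += sum(w * v for w, v in zip(w_image, image_feat))
    g = sigmoid(score + bias)
    fused = [t + g * v for t, v in zip(text_feat, image_feat)]
    return fused, g


def add_gaussian_noise(feat, std, rng):
    """Perturb each feature with zero-mean Gaussian noise of the given std."""
    return [x + rng.gauss(0.0, std) for x in feat]


text = [0.2, -0.1, 0.5]
image = [1.0, 2.0, -1.0]
weights = [0.1, 0.1, 0.1]

# A strongly negative bias closes the gate, so the image is ignored.
fused_closed, g_closed = gated_fusion(text, image, weights, weights, bias=-50.0)

# A strongly positive bias opens the gate, so the image is added almost fully.
fused_open, g_open = gated_fusion(text, image, weights, weights, bias=50.0)

# Noisy copy of the image feature, as would be used during training.
noisy_image = add_gaussian_noise(image, std=0.1, rng=random.Random(0))
```

In a trained model the gate weights would be learned, and the gate would more likely be a vector (one value per feature dimension) than a single scalar; the scalar version is used here only to keep the sketch short.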
Average filtering plays a vital role in image smoothing tasks. However, existing quantum image weighted average filtering methods suffer from high circuit complexity. Therefore, this paper proposes an improved quantum...
This paper investigates resilient consensus control for teleoperation systems under denial-of-service (DoS) attacks. We design resilient controllers with auxiliary systems based on sampled positions of both master and...
RGB-D Rail Surface Defect Inspection (RSDI) is a critical measure for ensuring transportation safety. It improves inspection accuracy by using depth maps, but the issue of poor-quality depth maps in rail defect datase...
A Doherty Power Amplifier (DPA) has been designed and optimized specifically for compact mobile base station deployment, operating within a frequency range of 3.3 GHz to 3.6 GHz. The amplifier utilizes the proprietary...
With the proliferation of data-intensive industrial applications, the collaboration of computing powers among standalone edge servers is vital to provision such services for smart devices. In this paper, we propose an...
As one of the most important forensic tasks, reconstruction of the original information in tampered images is a key step for tampering detection and localization. Currently, a number of methods have been designed to e...
Urban tree species identification is crucial for forest management and ecosystem assessment. Mobile Laser Scanning (MLS) provides significant advantages for this task through its flexibility in navigating complex urba...
Part-of-speech (POS) tagging is a basic task in the field of natural language processing. This paper builds a POS tagger based on an improved Hidden Markov model (HMM), employing word clustering and a syntactic parsing model. First, to overcome the defects of the classical HMM, a new statistical model, the Markov family model (MFM), is introduced. Second, to address the problem of data sparseness, we propose a bottom-up hierarchical word clustering algorithm. We then combine syntactic parsing with POS tagging. The POS tagging experiments show that the improved model outperforms HMMs under the same testing conditions: precision rises from 94.642% to 97.235%.
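For context, the classical HMM baseline that the abstract compares against is usually decoded with the Viterbi algorithm. The sketch below shows that baseline only; it does not implement the Markov family model, the word clustering, or the parsing step, none of which are detailed in the abstract. The tag set, the probability tables, and the `unk` fallback for unseen words are hypothetical toy values chosen for illustration.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most probable tag sequence for `words` under a first-order HMM.

    start_p[t]    : probability that a sentence starts with tag t
    trans_p[p][t] : probability of tag t following tag p
    emit_p[t][w]  : probability of word w given tag t
    unk           : assumed fallback probability for unseen (tag, word) pairs
    """
    if not words:
        return []
    # best[i][t] is the best log-probability of any path ending in tag t at word i.
    best = [{t: math.log(start_p[t]) + math.log(emit_p[t].get(words[0], unk))
             for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + math.log(trans_p[p][t])) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + math.log(emit_p[t].get(words[i], unk))
            back[i][t] = prev
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy model (hypothetical numbers, for illustration only).
tags = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans_p = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
emit_p = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "barks": 0.05},
    "VERB": {"barks": 0.5, "dog": 0.01},
}

tagged = viterbi(["the", "dog", "barks"], tags, start_p, trans_p, emit_p)
```

The `unk` fallback is a crude answer to the data sparseness the abstract mentions; the paper's own remedy is hierarchical word clustering, which this sketch does not attempt.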
In this paper, we investigate the application of the Unmanned Aerial Vehicle (UAV)-enabled relaying system in emergency communications, where one UAV is applied as a relay to help transmit information from ground users to a Base Station (BS). We maximize the total transmitted data from the users to the BS by optimizing the user communication scheduling and association along with the power allocation and the trajectory of the UAV. To solve this non-convex optimization problem, we propose the traditional Convex Optimization (CO) and the Reinforcement Learning (RL)-based approaches. Specifically, we apply the block coordinate descent and successive convex approximation techniques in the CO approach, while applying the soft actor-critic algorithm in the RL approach. The simulation results show that both approaches can solve the proposed optimization problem and obtain good results. Moreover, the RL approach establishes emergency communications more rapidly than the CO approach once the training process has been completed.
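The abstract names block coordinate descent without giving the UAV problem's objective or constraints, so the sketch below illustrates only the general technique on a toy problem. It assumes a hypothetical two-variable convex quadratic, f(x, y) = (x - 1)^2 + (y - 2)^2 + x*y, and alternates exact minimization over one variable while the other is held fixed. The function, the starting point, and the iteration count are illustrative and unrelated to the paper's formulation.

```python
def objective(x, y):
    """Toy convex objective (hypothetical, not the UAV problem)."""
    return (x - 1.0) ** 2 + (y - 2.0) ** 2 + x * y


def block_coordinate_descent(x0, y0, iterations=50):
    """Alternate exact minimization over x and y.

    Setting df/dx = 2(x - 1) + y = 0 with y fixed gives x = 1 - y / 2.
    Setting df/dy = 2(y - 2) + x = 0 with x fixed gives y = 2 - x / 2.
    """
    x, y = x0, y0
    history = [objective(x, y)]
    for _ in range(iterations):
        x = 1.0 - y / 2.0   # minimize over the x block, y held fixed
        y = 2.0 - x / 2.0   # minimize over the y block, x held fixed
        history.append(objective(x, y))
    return x, y, history


x_opt, y_opt, history = block_coordinate_descent(5.0, 5.0)
```

Each block update is an exact minimization, so the objective never increases from one step to the next. In the non-convex setting the abstract describes, each block subproblem would typically be replaced by a convex surrogate, which is the role successive convex approximation plays.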