Index recommendation is essential for improving query performance in database management systems (DBMSs) through creating an optimal set of indexes under specific constraints. Traditional methods, such as heuristic an...
Current large language models (LLMs) often struggle to produce accurate solutions on the first attempt for code generation. Prior research tackles this challenge by generating multiple candidate solutions and validati...
Multi-Modal Relation Extraction (MMRE) plays a key role in various multimedia applications, including recommendation and information retrieval systems. MMRE aims to extract the semantic relation between entities by leveraging context from a text-image pair. Because this context comes from images, learning from noisy images becomes a research problem in its own right. For instance, subtle variations in similar images can act as noise and potentially affect the predictions made by MMRE models. To tackle this problem, current work uses attention mechanisms to fuse relevant text and image features, or devises data augmentation techniques (e.g., via generative models) to improve generalization. However, performance remains unsatisfactory. To improve it, we propose a Dual-Aspect Noise-based Regularization framework that encompasses two techniques: 1) noise removal through an adaptive gating mechanism, and 2) fighting noise with noise to improve feature stability in the learning process. We find that combining these techniques encourages the model to focus on more relevant image features for MMRE. We carry out extensive experiments and demonstrate that our proposed model is further enhanced by exploring data augmentation techniques. This additional improvement leads the model to achieve state-of-the-art performance on the widely used Multi-modal Neural Relation Extraction (MNRE) dataset and to show its effectiveness and generalizability on the Multi-Modal Named Entity Recognition task.
Text-to-SQL, the task of translating natural language questions into SQL queries, plays a crucial role in enabling non-experts to interact with databases. While recent advancements in large language models (LLMs) have...
Continuous cognitive diagnosis models (CDMs) are vital tools for assessing students’ mastery of knowledge points. However, traditional probability-based CDMs are prone to falling into local optima due to their u...
Part-of-speech (POS) tagging is a basic task in the field of natural language processing. This paper builds a POS tagger based on an improved hidden Markov model (HMM) by employing word clustering and a syntactic parsing model. First, to overcome the defects of the classical HMM, a new statistical model, the Markov family model (MFM), is introduced. Second, to solve the problem of data sparseness, we propose a bottom-up hierarchical word clustering algorithm. Then we combine syntactic parsing with POS tagging. The POS tagging experiments show that the improved model outperforms HMMs under the same testing conditions: precision improves from 94.642% to 97.235%.
Cross-modal retrieval is crucial in understanding latent correspondences across modalities. However, existing methods implicitly assume well-matched training data, which is impractical as real-world data inevitably in...
Existing low-rank adaptation (LoRA) methods face challenges on sparse large language models (LLMs) due to the inability to maintain sparsity. Recent works introduced methods that maintain sparsity by augmenting LoRA t...
Dual-view gaze target estimation in classroom environments has not been thoroughly explored. Existing methods lack consideration of depth information, primarily focusing on 2D image information and neglecting the late...
Knowledge Graph (KG)-augmented Large Language Models (LLMs) have recently propelled significant advances in complex reasoning tasks, thanks to their broad domain knowledge and contextual awareness. Unfortunately, curr...