Large Language Models (LLMs) have achieved notable success in commonsense reasoning tasks, benefiting from their rich world knowledge acquired through extensive pretraining. While approaches like Chain-of-Thought (CoT...
Planning, as the core module of agents, is crucial in various fields such as embodied agents, web navigation, and tool use. With the development of large language models (LLMs), some researchers treat large language...
ISBN:
(Print) 9798350344868; 9798350344851
Abductive natural language commonsense reasoning is a task that aims to infer the most plausible explanation in narrative text for observed events. Previous works mostly concentrate on utilizing powerful pre-trained language models and making better use of additional training data to learn abundant event commonsense knowledge. However, the causal effect hidden in the language reasoning process, and the explicit constraint of the causal effect between events, have not been explored, resulting in biased inference: a model may focus on one observed event and make a wrong prediction while ignoring the other, helpful event. To reveal this problem, we modify the original task by appending unrelated text to the context, which does not change the causal relation. Typical methods perform worse on this new task because they fail to exploit the complementarity between the two observations. Motivated by eliminating the shortcut of incomplete observation and exploiting the complementarity of the two observations, we propose an incomplete-observation bias suppression method to guide the training process. Results show our approach eases the problem revealed by the new task, and it also achieves competitive results on the original task.
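The modified-task construction described in this abstract can be sketched as follows. This is a minimal, hypothetical illustration of the probe (appending causally unrelated text to one observation), not the paper's actual dataset or code; all example sentences and the function name are invented for illustration.

```python
def make_modified_example(obs1, obs2, hypotheses, distractor):
    """Append causally unrelated text to the second observation; the causal
    relation between the two observations is unchanged, so the correct
    hypothesis should be unchanged as well."""
    return {
        "obs1": obs1,
        "obs2": obs2 + " " + distractor,
        "hypotheses": hypotheses,
    }

# Toy abductive example: a robust model's prediction should not flip
# when the unrelated distractor is appended.
example = make_modified_example(
    "Tom left his ice cream on the kitchen table.",
    "When he came back, the ice cream had melted.",
    ["Tom was gone for an hour.", "Tom put the ice cream in the freezer."],
    "A neighbor's radio was playing a weather report.",  # unrelated text
)
print(example["obs2"])
```

A model that shortcuts on a single observation may change its prediction under this perturbation, which is the bias the proposed training method suppresses.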
Multilingual neural machine translation (MNMT) offers the convenience of translating between multiple languages with a single model. However, MNMT often suffers from performance degradation in high-resource languages ...
Large Language Models (LLMs) have demonstrated proficiency in addressing tasks that necessitate a combination of task planning and the usage of external tools, such as weather and calculator APIs. However, real-world ...
ISBN:
(Print) 9798331534110; 9798331534103
Passwords are the most widely used authentication method and play a crucial role in the field of information security. In this study, we explore the effectiveness of applying machine learning (ML) and natural language processing (NLP) techniques to password classification. We compare the performance of classifiers by using eight ML techniques and four NLP techniques to classify user-created passwords. The experimental results show that the classifier using a combination of Bag-of-Words and Logistic Regression outperforms other classifiers, achieving an accuracy of 98.53% and a recall for weak passwords of 99.68%.
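The winning combination reported above, Bag-of-Words features with logistic regression, can be sketched in a few lines. This is a minimal, self-contained toy (character-level counts, plain SGD, eight invented passwords), not the paper's feature set, hyperparameters, or data.

```python
import math
from collections import Counter

def bow_features(password, vocab):
    """Character-level bag-of-words: count of each vocabulary character."""
    counts = Counter(password)
    return [counts.get(ch, 0) for ch in vocab]

def train_logreg(X, y, lr=0.5, epochs=300):
    """Plain SGD logistic regression (no regularization)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            # clamp z for numerical stability in exp()
            p = 1.0 / (1.0 + math.exp(-max(min(z, 30.0), -30.0)))
            g = p - yi  # gradient of the log-loss w.r.t. z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0

# Toy labels: 1 = weak, 0 = strong (illustrative only, not the paper's data).
weak = ["password", "12345678", "qwerty12", "abc12345"]
strong = ["X9$kQz!7pL", "T&4vRm#2eW", "g8@NcY%5sD", "Hq7!zB$3uF"]
vocab = sorted({ch for pw in weak + strong for ch in pw})
X = [bow_features(pw, vocab) for pw in weak + strong]
y = [1] * len(weak) + [0] * len(strong)
w, b = train_logreg(X, y)
train_acc = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)
print(f"training accuracy: {train_acc:.2f}")
```

On this linearly separable toy set the classifier fits the training data perfectly; the paper's reported 98.53% accuracy is on real user-created passwords, a much harder setting.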
Despite advancements, fine-tuning Large Language Models (LLMs) remains costly due to the extensive parameter count and substantial data requirements for model generalization. Accessibility to computing resources remai...
ISBN:
(Print) 9798891760615
News recommendation is one of the most widely commercialized applications of natural language processing research, aiming to recommend news according to user interests. News recall plays an important role in news recommendation: it recalls candidates from a very large news database. Recent research on news recall mostly adopts a dual-encoder architecture, as it provides a much faster recall scheme, and encodes each word equally. However, these works face two challenges: irrelevant-word distraction and weak dual-encoder interaction. Therefore, we propose a model with Topic-aware Attention and powerful Dual-encoder Interaction for recall in news recommendation (TADI). To avoid irrelevant-word distraction, TADI designs a Topic-aware Attention (TA) module, which weights words according to news topics. To enhance dual-encoder interaction, TADI provides a cheap yet powerful interaction module, namely Dual-encoder Interaction (DI), which helps the dual encoders interact based on two auxiliary targets. Performance comparisons between TADI and state-of-the-art methods across a series of experiments verify the effectiveness of TADI.
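The core idea of topic-aware attention, weighting words by their relevance to the news topic before pooling, can be sketched as below. This is a generic attention-pooling sketch under assumed toy embeddings, not TADI's actual formulation or parameters.

```python
import math

def topic_aware_attention(word_vecs, topic_vec):
    """Score each word vector against the topic vector, softmax the scores,
    and return the attention-weighted pooled representation."""
    scores = [sum(w * t for w, t in zip(wv, topic_vec)) for wv in word_vecs]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
    total = sum(exps)
    attn = [e / total for e in exps]
    dim = len(topic_vec)
    pooled = [sum(a * wv[i] for a, wv in zip(attn, word_vecs))
              for i in range(dim)]
    return pooled, attn

# Toy 2-d embeddings: the second "word" aligns with the topic direction,
# so it should receive the largest attention weight.
words = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
topic = [0.0, 1.0]
pooled, attn = topic_aware_attention(words, topic)
print(attn)
```

Words unrelated to the topic direction receive small weights, which is how this style of attention suppresses irrelevant-word distraction in the pooled news representation.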
Story video-text alignment, a core task in computational story understanding, aims to align video clips with corresponding sentences in their descriptions. However, progress on the task has been held back by the scarc...
We investigate how to elicit compositional generalization capabilities in large language models (LLMs). Compositional generalization empowers LLMs to solve complex problems by combining foundational skills, a critical...