The dynamic nature of language, particularly evident in the realm of slang and memes on the Internet, poses serious challenges to the adaptability of Large Language Models (LLMs). Traditionally anchored to static data...
We present Sailor, a family of open language models ranging from 0.5B to 14B parameters, tailored for South-East Asian (SEA) languages. Built from Qwen1.5, Sailor models accept 200B to 400B tokens during continual pre-train...
Recently, Large Language Models (LLMs) have shown impressive language capabilities, yet most of them exhibit highly unbalanced performance across different languages. Multilingual alignment based on the translation paral...
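The abstract is cut off here, but alignment from translation parallel data generally means turning sentence pairs into supervision that ties the languages together. The sketch below is a hypothetical illustration of one common recipe (building translation-style instruction pairs from a parallel corpus); the field names and prompt template are assumptions, not the paper's actual data format.

```python
# Hypothetical illustration: turning a parallel corpus into instruction-style
# alignment examples. Field names and the prompt template are assumptions.
parallel_corpus = [
    {"en": "The weather is nice today.", "th": "วันนี้อากาศดี"},
    {"en": "Where is the train station?", "th": "สถานีรถไฟอยู่ที่ไหน"},
]

def to_alignment_examples(pairs, src="en", tgt="th"):
    """Build (prompt, response) pairs that ask the model to translate,
    encouraging its representations of the two languages to line up."""
    examples = []
    for pair in pairs:
        prompt = f"Translate the following {src} sentence into {tgt}:\n{pair[src]}"
        examples.append({"prompt": prompt, "response": pair[tgt]})
    return examples

for ex in to_alignment_examples(parallel_corpus):
    print(ex["prompt"], "->", ex["response"])
```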
We evaluate the robustness of several large language models on multiple datasets. Robustness here refers to the relative insensitivity of a model's answers to meaning-preserving variants of its input. Benchmar...
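To make that notion of robustness concrete, the sketch below feeds a model several meaning-preserving rewordings of the same question and measures how often its answers agree. The model function and paraphrases are hypothetical stand-ins, not the paper's benchmark or metric.

```python
# Minimal consistency check: fraction of paraphrases whose answer matches the
# most common answer across all rewordings of the same question.
from collections import Counter

def consistency(model_answer, paraphrases):
    answers = [model_answer(p).strip().lower() for p in paraphrases]
    most_common, count = Counter(answers).most_common(1)[0]
    return count / len(answers)

# Toy stand-in for an LLM call; a real evaluation would query the model under test.
def toy_model(prompt):
    return "Paris" if "capital" in prompt.lower() else "unknown"

variants = [
    "What is the capital of France?",
    "Name the capital city of France.",
    "France's capital is which city?",
]
print(consistency(toy_model, variants))  # 1.0 means fully insensitive to rewording
```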
Reinforcement Learning from Human Feedback (RLHF) is a crucial approach to aligning language models with human values and intentions. A fundamental challenge in this method lies in ensuring that the reward model accur...
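As a concrete anchor for the reward-modeling step mentioned above, the snippet below sketches the standard pairwise (Bradley-Terry) loss commonly used to train reward models on preference data: the model should score the chosen response above the rejected one. This is the generic formulation, not the paper's specific contribution, and the tiny linear "reward head" is a placeholder for a real encoder.

```python
# Standard pairwise reward-model loss: -log sigmoid(r_chosen - r_rejected),
# averaged over the batch. Feature dimension and reward head are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

reward_head = nn.Linear(16, 1)          # placeholder reward model r(x, y)
chosen_feats = torch.randn(8, 16)       # features of preferred responses
rejected_feats = torch.randn(8, 16)     # features of rejected responses

r_chosen = reward_head(chosen_feats).squeeze(-1)
r_rejected = reward_head(rejected_feats).squeeze(-1)

loss = -F.logsigmoid(r_chosen - r_rejected).mean()
loss.backward()
print(float(loss))
```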
MBPP is a popular dataset for evaluating the task of code generation from natural language. Despite its popularity, there are three problems: (1) it relies on providing test cases to generate the right signature, (2) ...
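To illustrate problem (1): MBPP prompts pair a short natural-language description with assert-based test cases, and it is typically the asserts, not the description, that tell the model the expected function name and signature. The sketch below is a generic MBPP-style checker; the task, tests, and candidate solution are made up for illustration.

```python
# Illustrative MBPP-style task: the description alone does not name the function,
# but the assert tests pin down the expected signature add_two_numbers(a, b).
description = "Write a function to add two numbers."
tests = [
    "assert add_two_numbers(1, 2) == 3",
    "assert add_two_numbers(-1, 1) == 0",
]

# A candidate completion as a model might produce it.
candidate = """
def add_two_numbers(a, b):
    return a + b
"""

def passes_tests(code, test_cases):
    """Execute the candidate code, then run each assert in the same namespace."""
    namespace = {}
    try:
        exec(code, namespace)              # define the candidate function
        for test in test_cases:
            exec(test, namespace)          # run each assert against it
        return True
    except Exception:
        return False

print(passes_tests(candidate, tests))  # True
```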
Multilingual pre-trained language models (mPLMs) have demonstrated notable effectiveness in zero-shot cross-lingual transfer: they can be fine-tuned solely on tasks in the source language and subsequently applied ...
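The snippet below sketches what that transfer setup looks like in practice: a single multilingual encoder fine-tuned only on English NLI is applied, unchanged, to inputs in another language. The checkpoint name is an assumed public example (an XNLI-tuned XLM-R), not a model from the paper; substitute whatever source-language fine-tuned model you actually have.

```python
# Zero-shot cross-lingual inference with an mPLM fine-tuned on English data.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "joeddav/xlm-roberta-large-xnli"   # assumed example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# German premise/hypothesis pair, even though fine-tuning used English only.
premise = "Der Hund schläft auf dem Sofa."
hypothesis = "Ein Tier ruht sich aus."
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities despite no German fine-tuning
```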
The disconnect between tokenizer creation and model training in language models allows for specific inputs, such as the infamous _SolidGoldMagikarp token, to induce unwanted model behaviour. Although such 'glitch t...
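As a hands-on illustration of the phenomenon: some vocabulary entries created by the tokenizer are rarely or never seen during training, so their embeddings can stay close to initialization. A common diagnostic, sketched below with GPT-2, is to flag tokens whose input-embedding norms are extreme outliers; this is a generic heuristic, not necessarily the detection method the paper proposes.

```python
# Heuristic scan for candidate under-trained ("glitch") tokens: rank vocabulary
# entries by input-embedding norm and inspect the extremes. GPT-2 is used only
# because it is small and public.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

embeddings = model.get_input_embeddings().weight.detach()   # [vocab, dim]
norms = embeddings.norm(dim=-1)

# Tokens with unusually small norms are candidates for being under-trained.
lowest = torch.argsort(norms)[:10]
for token_id in lowest.tolist():
    print(token_id, repr(tokenizer.convert_ids_to_tokens(token_id)), float(norms[token_id]))
```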
Citywalk, a recently popular form of urban travel, requires genuine personalization and an understanding of fine-grained requests beyond what traditional itinerary planning provides. In this paper, we introduce the novel task of ...
Large Language Models (LLMs) have significantly advanced natural language processing, demonstrating exceptional reasoning, tool usage, and memory capabilities. As their applications expand into multi-agent environment...