GraphQL is a powerful query language for APIs that allows clients to fetch precise data efficiently and flexibly, querying multiple resources with a single request. However, crafting complex GraphQL query operations c...
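For illustration of the single-request pattern described above, here is a minimal Python sketch that sends one GraphQL query fetching two related resources; the endpoint URL and the user/posts schema fields are hypothetical, invented only for this example.

import json
import urllib.request

# One GraphQL query fetching a user and their posts in a single request.
# The endpoint and schema below are hypothetical, for illustration only.
query = """
query {
  user(id: "42") {
    name
    posts(first: 3) {
      title
    }
  }
}
"""

request = urllib.request.Request(
    "https://api.example.com/graphql",  # hypothetical endpoint
    data=json.dumps({"query": query}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    print(json.load(response))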
Finding appropriate prompts for a specific task has become an important issue as the use of Large Language Models (LLMs) has expanded. Reinforcement Learning (RL) is widely used for prompt tuning, but its inherent ...
Audio separation in real-world scenarios, where mixtures contain a variable number of sources, presents significant challenges due to limitations of existing models, such as over-separation, under-separation, and depe...
ISBN: (Print) 9798891760608
We analyze the masked language modeling pretraining objective function from the perspective of the distributional hypothesis. We investigate whether the better sample efficiency and generalization capability of models pretrained with masked language modeling can be attributed to the semantic similarity encoded in the pretraining data's distributional property. Via a synthetic dataset, our analysis suggests that the distributional property indeed leads to better sample efficiency in pretrained masked language models, but it does not fully explain their generalization capability. We also conduct analyses over two real-world datasets and demonstrate that the distributional property does not explain the generalization ability of pretrained natural language models either. Our results illustrate our limited understanding of model pretraining and provide future research directions.
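To make the objective under analysis concrete, the following is a generic PyTorch sketch of the standard masked language modeling loss, not the paper's experimental setup; the toy batch, vocabulary size, and the random logits standing in for a model's output are placeholders.

import torch
import torch.nn.functional as F

# Standard MLM objective: corrupt ~15% of tokens with a [MASK] id, then
# average cross-entropy over the masked positions only.
vocab_size, mask_token_id = 1000, 0
input_ids = torch.randint(1, vocab_size, (8, 32))        # toy batch of token ids
mask = torch.rand(input_ids.shape) < 0.15                # positions to mask
corrupted = input_ids.masked_fill(mask, mask_token_id)   # what a real model would see

logits = torch.randn(8, 32, vocab_size)                  # stand-in for model(corrupted)
labels = input_ids.masked_fill(~mask, -100)              # -100 marks ignored positions
loss = F.cross_entropy(logits.view(-1, vocab_size), labels.view(-1), ignore_index=-100)
print(loss.item())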
Large language models (LLMs) have demonstrated emergent capabilities across diverse reasoning tasks via the popular Chain-of-Thought (CoT) prompting. However, such a simple and fast CoT approach often encounters limitati...
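For readers unfamiliar with the technique, a minimal illustration of CoT prompting follows; the question and exemplar are invented for this example, and no particular model is assumed.

# Chain-of-Thought prompting: a few-shot exemplar that spells out intermediate
# reasoning steps is prepended so the model imitates them before answering.
# The prompts below are invented for illustration only.
direct_prompt = "Q: A farmer has 15 sheep and buys 8 more. How many are there now? A:"
cot_prompt = (
    "Q: A farmer has 15 sheep and buys 8 more. How many are there now?\n"
    "A: Let's think step by step. The farmer starts with 15 sheep. "
    "Buying 8 more gives 15 + 8 = 23. The answer is 23."
)
print(cot_prompt)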
Automated red teaming is an effective method for identifying misaligned behaviors in large language models (LLMs). Existing approaches, however, often focus primarily on improving attack success rates while overlookin...
Stories are not only designed to entertain but also to encode lessons reflecting their authors' beliefs about the world. In this paper, we propose a new task of narrative schema labelling based on the concept of "...
Training a unified multilingual model promotes knowledge transfer but inevitably introduces negative interference. Language-specific modeling methods show promise in reducing interference. However, they often rely on ...
Collecting diverse human opinions is costly and challenging. This has led to a recent trend of exploiting large language models (LLMs) to generate diverse data as a potentially scalable and efficient solution. However,...
Data is a crucial element in large language model (LLM) alignment. Recent studies have explored using LLMs for efficient data collection. However, LLM-generated data often suffers from quality issues, with underrepres...