ISBN (print): 9798891760608
Nearest neighbor machine translation (kNN-MT), which interpolates target token probabilities with estimates derived from additional examples, has achieved significant improvements and attracted extensive interest in recent years. However, existing research does not explicitly consider the source context when retrieving similar examples, potentially leading to suboptimal performance. To address this, we comprehensively revisit the role of source context and propose a simple and effective method for improving neural machine translation via source context enhancement, demonstrating its crucial role in both retrieving superior examples and determining more suitable interpolation coefficients. Furthermore, we reveal that the probability estimation can be further optimized by incorporating a source-aware distance calibration module. Comprehensive experiments show that our proposed approach can be seamlessly integrated with representative kNN-MT baselines, resulting in substantial improvements over these strong baselines across a number of settings and domains. Remarkably, these improvements can reach up to 1.6 BLEU points.
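To make the interpolation concrete, below is a minimal sketch of the vanilla kNN-MT step that this line of work builds on, assuming a PyTorch decoder hidden state is used as the retrieval query. The function name, the fixed coefficient `lam`, and the softmax temperature are illustrative placeholders; the paper itself estimates the coefficient from source context rather than fixing it.

```python
import torch
import torch.nn.functional as F

def knn_mt_interpolate(query, datastore_keys, datastore_values,
                       p_mt, vocab_size, k=8, temperature=10.0, lam=0.4):
    """Minimal sketch of the vanilla kNN-MT interpolation step.

    query:            decoder hidden state at the current step, shape (d,)
    datastore_keys:   cached hidden states in the datastore, shape (N, d)
    datastore_values: target token ids paired with each key (int64), shape (N,)
    p_mt:             the base NMT model's distribution over the vocab, shape (V,)
    lam:              interpolation coefficient (fixed here for illustration)
    """
    # L2 distances between the query and every datastore key
    dists = torch.cdist(query.unsqueeze(0), datastore_keys).squeeze(0)  # (N,)

    # Retrieve the k nearest neighbors (negate so larger = closer)
    knn_neg_dists, knn_idx = torch.topk(-dists, k)
    knn_tokens = datastore_values[knn_idx]                               # (k,)

    # Turn negative distances into neighbor weights
    weights = F.softmax(knn_neg_dists / temperature, dim=0)             # (k,)

    # Scatter neighbor weights onto the vocabulary to form p_kNN
    p_knn = torch.zeros(vocab_size)
    p_knn.scatter_add_(0, knn_tokens, weights)

    # Interpolate the retrieval distribution with the model distribution
    return lam * p_knn + (1.0 - lam) * p_mt
```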
ISBN (print): 9798891760608
Large language models (LLMs) have shown remarkable reasoning capabilities, particularly with chain-of-thought (CoT) prompting. However, LLMs sometimes still struggle with problems that are easy for humans, such as generating action plans to achieve given goals in an environment, or performing complex math or logical reasoning. The deficiency stems from the key fact that LLMs lack an internal world model to predict the world state (e.g., environment status, intermediate variable values) and simulate long-term outcomes of actions. This prevents LLMs from performing deliberate planning akin to human brains, which involves exploring alternative reasoning paths, anticipating future states and rewards, and iteratively refining existing reasoning steps. To overcome these limitations, we propose a new LLM reasoning framework, Reasoning via Planning (RAP). RAP repurposes the LLM as both a world model and a reasoning agent, and incorporates a principled planning algorithm based on Monte Carlo Tree Search for strategic exploration in the vast reasoning space. During reasoning, the LLM (as agent) incrementally builds a reasoning tree under the guidance of the LLM (as world model) and rewards, and efficiently obtains a high-reward reasoning path with a proper balance between exploration and exploitation. We apply RAP to various challenging reasoning problems including plan generation, math reasoning, and logical inference, and demonstrate its superiority over strong baselines. RAP with LLaMA-33B even surpasses CoT with GPT-4, achieving 33% relative improvement in a plan generation setting.
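A rough sketch of the Monte Carlo Tree Search loop described above follows, with the agent and world-model roles abstracted as callables. The class and function names, the UCT constant, and the iteration budget are assumptions for illustration, not RAP's actual implementation.

```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state = state       # partial reasoning trace / predicted world state
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0         # accumulated reward

def uct_score(child, parent_visits, c=1.0):
    # Upper Confidence bound for Trees: trade off exploitation and exploration
    if child.visits == 0:
        return float("inf")
    return child.value / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def mcts_reason(root_state, propose_actions, predict_next_state, reward_fn, n_iterations=32):
    """Sketch of MCTS-style deliberate reasoning in the spirit of RAP.

    propose_actions(state)        -> candidate next reasoning steps (LLM as agent)
    predict_next_state(state, a)  -> predicted next state (LLM as world model)
    reward_fn(state)              -> scalar reward for a state
    """
    root = Node(root_state)
    for _ in range(n_iterations):
        # 1) Selection: descend by UCT until reaching a leaf
        node = root
        while node.children:
            node = max(node.children, key=lambda ch: uct_score(ch, node.visits))
        # 2) Expansion: let the agent LLM propose next reasoning steps
        if node.visits > 0 or node is root:
            for action in propose_actions(node.state):
                node.children.append(Node(predict_next_state(node.state, action), parent=node))
            if node.children:
                node = random.choice(node.children)
        # 3) Evaluation: score the reached state with the reward function
        reward = reward_fn(node.state)
        # 4) Backpropagation: push the reward up to the root
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    if not root.children:
        return root_state
    # Return the most-visited child as the chosen next reasoning step
    return max(root.children, key=lambda ch: ch.visits).state
```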
ISBN (digital): 9798400712487
ISBN (print): 9798400712487
Large Language Models (LLMs) have demonstrated remarkable performance in various application domains, largely due to their self-supervised pre-training on extensive high-quality text datasets. However, despite the importance of constructing such datasets, many leading LLMs lack documentation of their dataset construction and training procedures, leaving LLM practitioners with a limited understanding of what makes a high-quality training dataset for LLMs. To fill this gap, we initially identified 18 characteristics of high-quality LLM training datasets, as well as 10 potential data pre-processing methods and 6 data quality assessment methods, through detailed interviews with 13 experienced LLM professionals. We then surveyed 219 LLM practitioners from 23 countries across 5 continents. We asked our survey respondents to rate the importance of these characteristics, provide a rationale for their ratings, specify the key data pre-processing and data quality assessment methods they used, and highlight the challenges encountered during these processes. From our analysis, we identified 13 crucial characteristics of high-quality LLM datasets that received a high rating, accompanied by the key rationales provided by respondents. We also identified some widely-used data pre-processing and data quality assessment methods, along with 7 challenges encountered during these processes. Based on our findings, we discuss the implications for researchers and practitioners aiming to construct high-quality training datasets for optimizing LLMs.
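As a hedged illustration of the kind of data pre-processing methods practitioners commonly report (not the paper's own list of methods), the toy pipeline below applies two widely used steps: exact deduplication and rule-based length filtering. The thresholds and names are invented for the example.

```python
import hashlib

def preprocess_corpus(documents, min_chars=200, max_chars=100_000):
    """Toy sketch of two commonly reported pre-processing steps:
    exact deduplication and simple rule-based quality filtering.
    Thresholds are illustrative, not values from the paper."""
    seen_hashes = set()
    kept = []
    for doc in documents:
        text = doc.strip()
        # Rule-based filter: discard documents that are too short or too long
        if not (min_chars <= len(text) <= max_chars):
            continue
        # Exact deduplication via a content hash
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue
        seen_hashes.add(digest)
        kept.append(text)
    return kept
```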
ISBN (print): 9798891760608
Questions in open-domain question answering are often ambiguous, allowing multiple interpretations. One approach to handling them is to identify all possible interpretations of the ambiguous question (AQ) and to generate a long-form answer addressing them all, as suggested by Stelmakh et al. (2022). While it provides a comprehensive response without bothering the user for clarification, considering multiple dimensions of ambiguity and gathering corresponding knowledge remains a challenge. To cope with the challenge, we propose a novel framework, TREE OF CLARIFICATIONS (TOC): it recursively constructs a tree of disambiguations for the AQ via few-shot prompting that leverages external knowledge, and uses the tree to generate a long-form answer. TOC outperforms existing baselines on ASQA in a few-shot setup across all metrics, while surpassing fully-supervised baselines trained on the whole training set in terms of Disambig-F1 and Disambig-ROUGE. Code is available at ***/gankim/tree-of-clarifications.
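A minimal sketch of how such recursive disambiguation could be organized is given below, assuming the LLM prompting and retrieval components are wrapped in the hypothetical callables `retrieve`, `disambiguate`, `answer_leaf`, and `compose`. This illustrates the tree-then-compose idea, not the released TOC code.

```python
def build_clarification_tree(question, retrieve, disambiguate, max_depth=2):
    """Recursively disambiguate a question into a tree of interpretations.

    retrieve(question)               -> external passages for the question
    disambiguate(question, passages) -> list of disambiguated sub-questions
                                        (e.g., via few-shot prompting an LLM)
    Returns a nested dict: {question: [subtrees...]}.
    """
    if max_depth == 0:
        return {question: []}
    passages = retrieve(question)
    interpretations = disambiguate(question, passages)
    children = [build_clarification_tree(q, retrieve, disambiguate, max_depth - 1)
                for q in interpretations]
    return {question: children}

def answer_from_tree(tree, answer_leaf, compose):
    """Answer every leaf interpretation, then compose one long-form answer."""
    (question, children), = tree.items()
    if not children:
        return answer_leaf(question)
    partial_answers = [answer_from_tree(child, answer_leaf, compose) for child in children]
    return compose(question, partial_answers)
```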
ISBN (print): 9789819794331; 9789819794348
Large language models (LLMs) have become foundational to numerous natural language processing tasks; however, decoding coherent and contextually relevant text remains a complex challenge. In open-ended generation, maximizing probability is often not the appropriate objective, as with sampling methods the continuation tends to be incoherent and repetitive to varying degrees. We propose Merge Decoding (MD), which merges information from shallow layers, such as sequential information, with the final task-specific layer, thereby generating coherent and rich text. MD works across three scales of the LLaMA family (7B, 13B, 30B), achieving higher quality text in open-ended text generation (WikiText, WikiNews, BookCorpus) and enhancing reasoning capabilities in downstream tasks (GSM8K, StrategyQA). https://***/YcChou/MergeDecoding.
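Below is a toy sketch of mixing a shallow layer's signal with the final layer, assuming a Hugging Face-style causal LM that exposes hidden states. `shallow_layer`, `alpha`, and the simple logit-mixing rule are assumptions for illustration, not the paper's exact merging scheme.

```python
import torch

@torch.no_grad()
def merge_decoding_step(model, input_ids, shallow_layer=4, alpha=0.2):
    """Toy sketch: mix logits from a shallow hidden state and the final
    hidden state through the same LM head, then pick the next token.
    Hyper-parameters are illustrative only."""
    outputs = model(input_ids, output_hidden_states=True)
    hidden_states = outputs.hidden_states          # tuple: embeddings + one per layer
    final_h = hidden_states[-1][:, -1, :]          # last position, final layer
    shallow_h = hidden_states[shallow_layer][:, -1, :]

    lm_head = model.get_output_embeddings()
    merged_logits = (1 - alpha) * lm_head(final_h) + alpha * lm_head(shallow_h)

    next_token = torch.argmax(merged_logits, dim=-1, keepdim=True)
    return torch.cat([input_ids, next_token], dim=-1)
```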
Multimodal Large Language Models (MLLMs) have shown promising results in various tasks, but their ability to perceive the visual world with deep, hierarchical understanding similar to humans remains uncertain. To addr...
The progress of natural language processing (NLP) is primarily driven by machine learning that optimizes a system on a large-scale set of task-specific labeled examples. This learning paradigm limits the ability of ma...
Built on the power of LLMs, numerous multimodal large language models (MLLMs) have recently achieved remarkable performance on various vision-language tasks. However, most existing MLLMs and benchmarks primarily focus...
Retrieval-augmented large language models (R-LLMs) combine pre-trained large language models (LLMs) with information retrieval systems to improve the accuracy of factual question-answering. However, current libraries ...
We study the code generation behavior of instruction-tuned models built on top of code pre-trained language models when they could access an auxiliary function to implement a function. We design several ways to provid...