LLMs can help humans working with long documents, but are known to hallucinate. Attribution can increase trust in LLM responses: The LLM provides evidence that supports its response, which enhances verifiability. Exis...
ISBN: (Print) 9798350349122; 9798350349115
The advent of pre-trained language models (PLMs) has revolutionized the field of natural language processing (NLP), enabling models to leverage vast amounts of language and world knowledge for various downstream tasks. However, conventional fine-tuning of these models requires updating the full set of parameters, which is computationally intensive and impractical in resource-constrained settings. Parameter-efficient tuning methods have been developed to mitigate this, but they still leave room for improvement in inference latency, optimization difficulty, and input sequence length. In this paper, we introduce a new reparameterization-based parameter-efficient tuning method, Additive Delta Tuning (ADT). ADT fine-tunes only part of the row and column parameters in the PLM weight matrix. This approach significantly reduces the computational cost compared to existing methods such as Low-Rank Adaptation (LoRA), while maintaining competitive performance. We conducted extensive experiments across multiple natural language understanding (NLU) tasks, including text classification, text entailment, and named entity recognition, to evaluate the effectiveness of ADT. Our experimental results demonstrate that ADT achieves comparable or better performance than traditional fine-tuning while updating less than 0.3% of the parameters, showing that ADT is an effective method for parameter-efficient tuning. Our method matches LoRA's results on multiple NLU tasks at a lower computational cost.
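A minimal NumPy sketch of the additive-delta idea the abstract describes: the frozen weight matrix receives an additive update that is nonzero only on a small set of selected rows and columns, so only those entries are trainable. The selection, the exact update rule, and all names here are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in = 768, 768
W = rng.standard_normal((d_out, d_in))     # frozen pre-trained weight matrix

rows = [0, 1]    # rows selected for tuning (assumed selection strategy)
cols = [5, 6]    # columns selected for tuning

delta_rows = np.zeros((len(rows), d_in))   # trainable row deltas
delta_cols = np.zeros((d_out, len(cols)))  # trainable column deltas

def effective_weight(W, delta_rows, delta_cols):
    """Frozen W plus additive deltas on the chosen rows and columns."""
    W_eff = W.copy()
    W_eff[rows, :] += delta_rows
    W_eff[:, cols] += delta_cols
    return W_eff

trainable = delta_rows.size + delta_cols.size
total = W.size
# With this toy selection the trainable fraction is ~0.52%; a sparser
# real selection would bring it under the 0.3% the abstract reports.
print(f"trainable fraction: {trainable / total:.4%}")
```

With zero-initialized deltas the effective weight starts identical to the pre-trained matrix, so adaptation begins from the frozen model's behavior.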
Knights and knaves problems represent a classic genre of logical puzzles where characters either tell the truth or lie. The objective is to logically deduce each character's identity based on their statements. The...
Finetuning language agents with reasoning-action trajectories is effective, but obtaining these trajectories from human annotations or stronger models is costly and sometimes impractical. In this paper, we investigate...
ISBN: (Print) 9798891760615
Computer Vision (CV), natural language processing (NLP), and Recommender Systems (RecSys) are three prominent AI applications that have traditionally developed independently, resulting in disparate modeling and engineering methodologies. This has impeded these fields' ability to benefit directly from each other's advancements. With the recent development of foundation models, large language models have emerged as a potential general-purpose interface for unifying different modalities and problem formulations. In light of this, we propose the development of a multimodal foundation model (MFM) considering visual, textual, and personalization modalities under the P5 recommendation paradigm, thus named VIP5 (Visual P5), to unify various modalities and recommendation tasks. This enables the processing of multiple modalities in a shared architecture for improved recommendations. To achieve this, we introduce multimodal personalized prompts to accommodate multiple modalities under a shared format. Additionally, we propose a parameter-efficient training method for foundation models, which involves freezing the P5 backbone and fine-tuning lightweight adapters, resulting in improved recommendation performance and increased efficiency in terms of training time and memory usage. Code and data of VIP5 are available at https:// ***/jeykigung/VIP5.
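The freeze-backbone-plus-adapter recipe the abstract mentions can be sketched as a frozen layer followed by a small trainable bottleneck (down-project, nonlinearity, up-project, residual). Dimensions and names below are illustrative assumptions, not VIP5's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_bottleneck = 512, 16

W_backbone = rng.standard_normal((d_model, d_model))  # frozen backbone weights

# Trainable adapter weights; W_up starts at zero so the adapter is
# initially an identity pass-through around the frozen layer.
W_down = rng.standard_normal((d_model, d_bottleneck)) * 0.01
W_up = np.zeros((d_bottleneck, d_model))

def layer_with_adapter(x):
    h = x @ W_backbone.T           # frozen backbone transform
    a = np.maximum(h @ W_down, 0)  # down-project + ReLU
    return h + a @ W_up            # up-project + residual connection

x = rng.standard_normal((4, d_model))
out = layer_with_adapter(x)

adapter_params = W_down.size + W_up.size   # 16384
backbone_params = W_backbone.size          # 262144
```

Only the adapter matrices would receive gradients, which is where the training-time and memory savings the abstract reports come from.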
As Large Language Models (LLMs) continue to exhibit remarkable performance in natural language understanding tasks, there is a crucial need to measure their ability for human-like multi-step logical reasoning. Existin...
Legal case retrieval (LCR) aims to provide similar cases as references for a given fact description. This task is crucial for promoting consistent judgments in similar cases, effectively enhancing judicial fairness an...
ISBN: (Print) 9789819794362; 9789819794379
Recent advancements in large-scale pre-trained automatic speech recognition (ASR) foundation models (e.g., Whisper) have exhibited remarkable performance in speech processing tasks. However, fine-tuning such models for low-resource languages can be computationally expensive and prone to overfitting. Prompting methods offer a solution by designing specific prompts in the inputs that guide the model's behavior for targeted tasks, facilitating parameter-efficient adaptation. This paper presents the first exploration of various prompt tuning methods and optimized strategies for low-resource ASR based on Whisper. Moreover, we propose a shallow integration method that exploits the advantages of deep prompt tuning and reparameterization. Extensive experiments on the Common Voice and FLEURS datasets show that prompt tuning is competitive with full fine-tuning and LoRA while using fewer trainable parameters. Notably, the shallow integration strategy yields impressive results, especially for small models.
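The core of soft prompt tuning, as used in setups like the one above, is a small matrix of learnable vectors prepended to the frozen input representations, with gradients flowing only into those vectors. The shapes and names below are a generic sketch, not the paper's Whisper-specific configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_prompt, d_model, seq_len = 8, 384, 20

prompt = rng.standard_normal((n_prompt, d_model)) * 0.02  # trainable soft prompt
x_embeds = rng.standard_normal((seq_len, d_model))        # frozen input embeddings

def with_soft_prompt(prompt, x_embeds):
    """Prepend the learnable prompt vectors along the sequence dimension."""
    return np.concatenate([prompt, x_embeds], axis=0)

inputs = with_soft_prompt(prompt, x_embeds)
# The model then consumes 8 prompt positions followed by the 20 real positions;
# note the prompt consumes part of the usable input sequence length.
print(inputs.shape)
```

Deep prompt tuning extends this by injecting such vectors at every layer rather than only at the input, which is the variant the shallow integration method builds on.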
Recent work has shown that Large Language Models (LLMs) can unintentionally leak sensitive information present in their training data. In this paper, we present MoPeθ (Model Perturbations), a new method to identify w...
ISBN: (Print) 9781450372527
The societal impact of pre-trained language models has prompted researchers to probe them for strong associations between protected attributes and value-loaded terms, from slurs to prestigious job titles. Such work is said to probe models for bias or fairness (or such probes 'into representational biases' are said to be 'motivated by fairness'), suggesting an intimate connection between bias and fairness. We provide conceptual clarity by distinguishing between association biases [11] and empirical fairness [56] and show that the two can be independent. Our main contribution, however, is showing why this should not come as a surprise. To this end, we first provide a thought experiment showing how association bias and empirical fairness can be completely orthogonal. Next, we provide empirical evidence that there is no correlation between bias metrics and fairness metrics across the most widely used language models. Finally, we survey the sociological and psychological literature and show that it provides ample support for expecting these metrics to be uncorrelated.