Knowledge graphs (KGs) can provide explainable reasoning for large language models (LLMs), alleviating their hallucination problem. Knowledge graph question answering (KGQA) is a typical benchmark to evaluate the meth...
ISBN: (Print) 9789819794362; 9789819794379
The rapid advancement of Large Language Models (LLMs) has revolutionized both academia and industry, leveraging Transformer architectures and pre-training objectives to achieve unprecedented performance. To fully exploit the potential of LLMs, fine-tuning them on specific downstream tasks is essential. However, traditional full fine-tuning poses significant computational challenges, prompting the emergence of Parameter-Efficient Fine-Tuning (PEFT) methods, especially reparameterization-based ones. In this survey, we delve into reparameterization-based PEFT methods, which aim to fine-tune LLMs at reduced computational cost while preserving their knowledge. We systematically analyze their design principles and divide these methods into six categories, examining the trainable parameter complexity, GPU memory consumption, training time cost, accuracy, and limitations of each. We summarize the challenges facing reparameterization-based PEFT methods and propose future directions.
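To make the reparameterization idea concrete, below is a minimal sketch of a LoRA-style low-rank adapter in PyTorch. The class and variable names are illustrative assumptions, not drawn from the surveyed paper or any particular library.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Reparameterizes the weight update of a frozen linear layer as a
    # low-rank product: base(x) + (alpha / r) * B A x, training only A and B.
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pre-trained weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable adapter parameters: {trainable}")  # 12,288 vs. 590,592 in the frozen base layer

Because only A and B receive gradients, optimizer state and gradient memory scale with the rank r rather than with the full weight matrix, which is the source of the memory and time savings these methods target.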
Large language models (LLMs) have shown surprisingly good performance in multilingual neural machine translation (MNMT) even if not being trained explicitly for translation. Yet, they still struggle with translating l...
This study aims to address the pervasive challenge of quantifying uncertainty in large language models (LLMs) without logit-access. Conformal Prediction (CP), known for its model-agnostic and distribution-free feature...
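For readers unfamiliar with conformal prediction, the following is a generic split-conformal sketch for a black-box (logit-free) LLM classifier, using answer frequencies from repeated sampling as the nonconformity signal; this scoring choice is an illustrative assumption, not the specific method proposed in the paper.

import math

def calibrate_threshold(nonconformity_scores, alpha=0.1):
    # Finite-sample conformal quantile over a held-out calibration set.
    n = len(nonconformity_scores)
    k = min(math.ceil((n + 1) * (1 - alpha)), n)
    return sorted(nonconformity_scores)[k - 1]

def prediction_set(candidate_freqs, threshold):
    # Keep every candidate answer whose nonconformity (1 - sampling frequency)
    # does not exceed the calibrated threshold.
    return [c for c, f in candidate_freqs.items() if 1.0 - f <= threshold]

# Calibration scores: 1 - frequency of the gold answer among repeated samples.
cal_scores = [0.2, 0.5, 0.1, 0.3, 0.4]
q_hat = calibrate_threshold(cal_scores, alpha=0.2)
print(prediction_set({"A": 0.7, "B": 0.2, "C": 0.1}, q_hat))  # e.g. ['A']

Under the usual exchangeability assumption, the resulting prediction set contains the true answer with probability at least 1 - alpha, regardless of the underlying model.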
Social intelligence is essential for understanding complex human expressions and social interactions. While large multimodal models (LMMs) have demonstrated remarkable performance in social intelligence question answe...
We find arithmetic ability resides within a limited number of attention heads, with each head specializing in distinct operations. To delve into the reason, we introduce the Comparative Neuron Analysis (CNA) method, w...
ISBN: (Print) 9798891761834
The proceedings contain 33 papers. The topics discussed include: LeGen: complex information extraction from legal sentences using generative models; summarizing long regulatory documents with a multi-step pipeline; enhancing legal expertise in large language models through composite model integration: the development and evaluation of Law-Neo; uOttawa at LegalLens-2024: transformer-based classification experiments; Quebec automobile insurance question-answering with retrieval-augmented generation; rethinking legal judgement prediction in a realistic scenario in the era of large language models; information extraction for planning court cases; and towards an automated pointwise evaluation metric for generated long-form legal summaries.
Large language models (LLMs) have achieved impressive performance across various domains, but the limited context window and the expensive computational cost of processing long texts restrict their more comprehensive ...
We present a large-scale study of linguistic bias exhibited by ChatGPT covering ten dialects of English (Standard American English, Standard British English, and eight widely spoken non-"standard" varieties f...
Multilingual pre-trained models (mPLMs) have shown impressive performance on cross-lingual transfer tasks. However, the transfer performance is often hindered when a low-resource target language is written in a differ...