The prevalent use of large language models (LLMs) in various domains has drawn attention to the issue of "hallucination", which refers to instances where LLMs generate factually inaccurate or ungrounded info...
The rapid advancement of foundation models (FMs) across language, image, audio, and video domains has shown remarkable capabilities in diverse tasks. However, the proliferation of FMs brings forth a critical challenge...
As long-context large language models (LLMs) gain increasing attention for their ability to handle extensive inputs, the demand for effective evaluation methods has become critical. Existing evaluation methods, howeve...
We explore the alignment of values in Large Language Models (LLMs) with specific age groups, leveraging data from the World Value Survey across thirteen *** a diverse set of prompts tailored to ensure response robustn...
Language models can be manipulated by adversarial attacks, which introduce subtle perturbations to input data. While recent attack methods can achieve a relatively high attack success rate (ASR), we've observed th...
Understanding satire and humor is a challenging task for even current Vision-language models. In this paper, we propose the challenging tasks of Satirical Image Detection (detecting whether an image is satirical), Und...
ISBN: (Print) 9798891760608
Abstract grammatical knowledge, of parts of speech and grammatical patterns, is key to the capacity for linguistic generalization in humans. But how abstract is grammatical knowledge in large language models? In the human literature, compelling evidence for grammatical abstraction comes from structural priming: a sentence that shares the same grammatical structure as a preceding sentence is processed and produced more readily. Because confounds exist when using stimuli in a single language, evidence of abstraction is even more compelling from crosslingual structural priming, where use of a syntactic structure in one language primes an analogous structure in another language. We measure crosslingual structural priming in large language models, comparing model behavior to human experimental results from eight crosslingual experiments covering six languages, and four monolingual structural priming experiments in three non-English languages. We find evidence for abstract monolingual and crosslingual grammatical representations in the models that function similarly to those found in humans. These results demonstrate that grammatical representations in multilingual language models are not only similar across languages, but can also causally influence text produced in different languages.
Language model pretraining generally targets a broad range of use cases and incorporates data from diverse sources. However, there are instances where we desire a model that excels in specific areas without markedly c...
The prolific use of Large Language Models (LLMs) as an alternate knowledge base requires them to be factually consistent, necessitating both correctness and consistency traits for paraphrased queries. Recently, signif...
ISBN: (Print) 9798891760608
In spite of the potential for ground-breaking achievements offered by large language models (LLMs) (e.g., GPT-3) via in-context learning (ICL), they still lag significantly behind fully-supervised baselines (e.g., fine-tuned BERT) in relation extraction (RE). This is due to two major shortcomings of ICL for RE: (1) low relevance regarding entity and relation in existing sentence-level demonstration retrieval approaches for ICL; and (2) the lack of explanation of input-label mappings in demonstrations, leading to poor ICL effectiveness. In this paper, we propose GPT-RE to address these issues by (1) incorporating task-aware representations in demonstration retrieval; and (2) enriching the demonstrations with gold label-induced reasoning logic. We evaluate GPT-RE on four widely used RE datasets and observe that GPT-RE achieves improvements over not only existing GPT-3 baselines but also fully-supervised baselines, as in Figure 1. Specifically, GPT-RE achieves SOTA performance on the SemEval and SciERC datasets, and competitive performance on the TACRED and ACE05 datasets. Additionally, a critical issue of LLMs revealed by previous work, the strong inclination to wrongly classify NULL examples into other pre-defined labels, is substantially alleviated by our method. We provide an empirical analysis.
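The first ingredient described in the abstract, task-aware demonstration retrieval, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses a toy bag-of-words embedding with entity tokens upweighted to stand in for the learned task-aware representations, and the function names (`embed`, `retrieve_demonstrations`) and the example pool are hypothetical.

```python
import math
from collections import Counter

def embed(sentence, head, tail, entity_weight=3.0):
    # Toy "task-aware" representation: bag-of-words counts with the
    # head/tail entity tokens upweighted, so retrieval favors examples
    # that overlap on entities rather than on sentence topic alone.
    vec = dict(Counter(sentence.lower().split()))
    for entity in (head.lower(), tail.lower()):
        for tok in entity.split():
            vec[tok] = vec.get(tok, 0) + entity_weight
    return vec

def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v.get(k, 0) for k, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve_demonstrations(query, pool, k=1):
    # Rank labeled (sentence, head, tail, relation) examples by
    # similarity of task-aware embeddings; the top-k become the
    # in-context demonstrations for the query instance.
    q = embed(*query)
    ranked = sorted(pool,
                    key=lambda ex: cosine(q, embed(ex[0], ex[1], ex[2])),
                    reverse=True)
    return ranked[:k]

# Hypothetical labeled pool and an unlabeled query instance.
pool = [
    ("Marie Curie was born in Warsaw .", "Marie Curie", "Warsaw", "born_in"),
    ("Apple acquired Beats in 2014 .", "Apple", "Beats", "acquired"),
]
query = ("Chopin was born in Zelazowa Wola .", "Chopin", "Zelazowa Wola")
demos = retrieve_demonstrations(query, pool, k=1)
```

Here the retrieved demonstration shares the relation-bearing context ("was born in") with the query, which is the behavior entity-aware retrieval is meant to encourage over plain sentence-level similarity.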