Role-playing agents (RPA) have been a popular application area for large language models (LLMs), attracting significant interest from both industry and academia. While existing RPAs well portray the characters' kn...
ISBN: 9798891760608 (print)
Accurate knowledge selection is critical in knowledge-grounded dialogue systems. To take a closer look at it, we offer a novel perspective for organizing the existing literature: knowledge selection coupled with, after, and before generation. We focus on the third, underexplored category, which not only selects knowledge accurately in advance but also reduces the learning, adjustment, and interpretation burden of subsequent response generation models, especially LLMs. We propose GATE, a generator-agnostic knowledge selection method that prepares knowledge for subsequent response generation models by selecting context-related knowledge among different knowledge structures and variable knowledge requirements. Experimental results demonstrate the superiority of GATE and indicate that knowledge selection before generation is a lightweight yet effective way to help LLMs (e.g., ChatGPT) generate more informative responses.
Large language models (LLMs) trained on extensive corpora inevitably retain sensitive data, such as personal privacy information and copyrighted material. Recent advancements in knowledge unlearning involve updating L...
In translation, a concept represented by a single word in a source language can have multiple variations in a target language. The task of lexical selection requires using context to identify which variation is most a...
Developing robust and reliable models for Named Entity Recognition (NER) in the Russian language presents significant challenges due to the linguistic complexity of Russian and the limited availability of suitable training datasets. This study introduces a semi-automated methodology for building a customized Russian NER dataset designed specifically for literary purposes. The paper describes in detail how the dataset was collected and proofread, and outlines the pipeline used for processing and annotating its contents. A comprehensive analysis highlights the dataset's richness and diversity. Central to the proposed approach is a voting system that makes the elicitation of entities efficient, enabling significant time and cost savings compared to traditional methods of constructing NER datasets. The voting system is described theoretically and mathematically to highlight its effect on the annotation process. Tests of the voting system with various thresholds show that it increases overall precision by 28% compared to using only the state-of-the-art model for auto-annotation. The dataset is meticulously annotated and thoroughly proofread, ensuring its value as a high-quality resource for training and evaluating NER models. Empirical evaluations using multiple NER models underscore the dataset's importance and its potential to enhance the robustness and reliability of NER models in the Russian language.
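The abstract does not spell out how the voting system works, so the following is only a minimal sketch of one common form of threshold voting over NER predictions, not the authors' formulation. It assumes each annotator (model) returns a set of `(start, end, label)` spans and that a span is accepted when at least `threshold` annotators propose it exactly; the function name `vote_entities` and the sample spans are hypothetical.

```python
from collections import Counter
from typing import Iterable

# Hypothetical span representation: (start offset, end offset, entity label).
Span = tuple[int, int, str]


def vote_entities(predictions: Iterable[set[Span]], threshold: int) -> set[Span]:
    """Keep the spans proposed by at least `threshold` annotators.

    Assumption: two annotators agree only if offsets and label match exactly.
    """
    votes = Counter(span for spans in predictions for span in spans)
    return {span for span, count in votes.items() if count >= threshold}


# Illustrative predictions from three hypothetical models on the same text.
model_a = {(0, 4, "PER"), (10, 16, "LOC")}
model_b = {(0, 4, "PER"), (20, 25, "ORG")}
model_c = {(0, 4, "PER"), (10, 16, "LOC"), (30, 33, "PER")}

accepted = vote_entities([model_a, model_b, model_c], threshold=2)
print(sorted(accepted))  # [(0, 4, 'PER'), (10, 16, 'LOC')]
```

Raising the threshold trades recall for precision: with `threshold=3` only the span all three models agree on survives, which is the kind of trade-off the reported threshold experiments would explore.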
Although large language models (LLMs) like ChatGPT (OpenAI et al., 2024) have demonstrated considerable capabilities in general domains, they often lack proficiency in specialized fields. Enhancing a model's perfo...
Predictions of word-by-word conditional probabilities from Transformer-based language models are often evaluated to model the incremental processing difficulty of human readers. In this paper, we argue that there is a...
ISBN: 9798891760608 (print)
Language models have graduated from being research prototypes to commercialized products offered as web APIs, and recent works have highlighted the multilingual capabilities of these products. The API vendors charge their users based on usage, more specifically on the number of "tokens" processed or generated by the underlying language models. What constitutes a token, however, depends on the training data and the model, and the number of tokens required to convey the same information varies widely across languages. In this work, we analyze the effect of this non-uniformity on the fairness of an API's pricing policy across languages. We conduct a systematic analysis of the cost and utility of OpenAI's language model API on multilingual benchmarks in 22 typologically diverse languages. We show evidence that speakers of a large number of the supported languages are overcharged while obtaining poorer results. These speakers also tend to come from regions where the APIs are less affordable to begin with. Through these analyses, we aim to increase transparency around language model APIs' pricing policies and encourage the vendors to make them more equitable.
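The pricing argument above can be illustrated with a small sketch. It assumes a flat per-token price, so the cost of conveying the same content in a language is proportional to its token count; the function name `relative_cost`, the language keys, and the token counts are made-up placeholders, not figures from the paper.

```python
def relative_cost(token_counts: dict[str, int], baseline: str) -> dict[str, float]:
    """Return each language's cost relative to `baseline` for the same content.

    Assumption: a flat per-token price, so the cost ratio equals the token ratio.
    """
    base = token_counts[baseline]
    if base <= 0:
        raise ValueError("baseline token count must be positive")
    return {lang: count / base for lang, count in token_counts.items()}


# Placeholder token counts for one parallel sentence (illustrative only).
counts = {"lang_a": 100, "lang_b": 250, "lang_c": 400}

ratios = relative_cost(counts, baseline="lang_a")
for lang, ratio in ratios.items():
    print(f"{lang}: {ratio:.1f}x the baseline cost")
```

In practice the counts would come from running the vendor's tokenizer over parallel text, which is the quantity whose variance across languages the abstract describes.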
The visual representation of a concept varies significantly depending on its meaning and the context where it occurs; this poses multiple challenges both for vision and multimodal models. Our study focuses on concreten...
Large language models (LLMs) are widely used in question-answering (QA) systems but often generate information with hallucinations. Retrieval-augmented generation (RAG) offers a potential remedy, yet the uneven retrie...