ISBN (Print): 9798891760608
Warning: This paper contains content and language that may be considered offensive to some readers. While biases disadvantaging African American Language (AAL) have been uncovered in models for tasks such as speech recognition and toxicity detection, there has been little investigation of these biases for language generation models like ChatGPT. We evaluate how well LLMs understand AAL in comparison to White Mainstream English (WME), the encouraged "standard" form of English taught in American classrooms. We measure large language model performance on two tasks: a counterpart generation task, where a model generates AAL given WME and vice versa, and a masked span prediction (MSP) task, where models predict a phrase hidden from their input. Using a novel dataset of AAL texts from a variety of regions and contexts, we present evidence of dialectal bias for six pre-trained LLMs through performance gaps on these tasks.
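The masked span prediction task can be illustrated with a small probing script. A minimal sketch, assuming a RoBERTa-style masked LM from Hugging Face transformers; the dialect-paired sentences and the single-mask scoring approximation are illustrative stand-ins, not the paper's data or metric.

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

def span_log_prob(text: str, span: str) -> float:
    """Log-probability the model assigns to the first subword of `span`
    when the span is masked out of `text` (a single-mask approximation)."""
    masked = text.replace(span, tokenizer.mask_token)
    inputs = tokenizer(masked, return_tensors="pt")
    # Leading space so the BPE tokenization matches the span's in-context form.
    span_ids = tokenizer(" " + span, add_special_tokens=False)["input_ids"]
    with torch.no_grad():
        logits = model(**inputs).logits
    mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
    return logits[0, mask_pos].log_softmax(-1)[span_ids[0]].item()

# Hypothetical WME/AAL counterpart pair; a persistent score gap across many
# such pairs is the kind of evidence this task is designed to surface.
wme = "They are always talking about music."
aal = "They always be talking about music."
print(span_log_prob(wme, "talking"), span_log_prob(aal, "talking"))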
Authors: Ye, Xin; Lin, Hugo
Affiliations: Fudan Univ, Inst Global Publ Policy, 220 Handan Rd, Shanghai 200433, Peoples R China; Fudan Univ, LSE-Fudan Res Ctr Global Publ Policy, 220 Handan Rd, Shanghai 200433, Peoples R China; Paris Saclay Univ, Cent Supelec, F-91192 Paris, France
Purpose of Review: This review aimed to systematically synthesize the global evidence base for natural disasters and human health using natural language processing (NLP) techniques. Recent Findings: We searched Embase, PubMed, Scopus, PsycInfo, and Web of Science Core Collection, using titles, abstracts, and keywords, and included only literature indexed in English. NLP techniques, including text classification, topic modeling, and geoparsing methods, were used to systematically identify and map scientific literature on natural disasters and human health published between January 1, 2012, and April 3, 2022. We predicted 6105 studies to fall within the area of natural disasters and human health. Earthquakes, hurricanes, and tsunamis were the most frequent natural disasters; posttraumatic stress disorder (PTSD) and depression were the most frequently studied health outcomes; mental health services were the most common way of coping. Geographically, the evidence base was dominated by studies from high-income countries. Co-occurrence of natural disasters and psychological distress was common. Psychological distress was one of the top three most frequent topics on all continents except Africa, where infectious diseases were the most prevalent topic. Summary: Our findings demonstrate the importance and feasibility of using NLP to comprehensively map the growing literature on natural disasters and human health. The review identifies clear topics for future clinical and public health research and can provide an empirical basis for reducing the negative health effects of natural disasters.
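As a concrete illustration of the screening-plus-topic-modeling pipeline the review describes, here is a minimal sketch assuming scikit-learn: a relevance classifier to screen records, then topic modeling over the predicted-relevant subset. The sample texts, labels, and model choices are assumptions for illustration, not the review's actual setup (geoparsing is omitted).

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.decomposition import LatentDirichletAllocation

abstracts = [
    "Earthquake survivors showed elevated PTSD and depression rates.",
    "Hurricane evacuation and use of mental health services.",
    "Tsunami exposure and long-term psychological distress.",
    "A new sorting algorithm with improved cache behavior.",
]
labels = [1, 1, 1, 0]  # 1 = relevant to natural disasters and human health

# Step 1: screen records with a text classifier.
tfidf = TfidfVectorizer()
clf = LogisticRegression().fit(tfidf.fit_transform(abstracts), labels)
relevant = [a for a, keep in zip(abstracts, clf.predict(tfidf.transform(abstracts))) if keep]

# Step 2: surface themes in the relevant subset with LDA topic modeling.
counts = CountVectorizer(stop_words="english")
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts.fit_transform(relevant))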
ISBN (Print): 9798891760608
A crucial challenge for generative large language models (LLMs) is diversity: when a user's prompt is under-specified, models may follow implicit assumptions while generating a response, which may result in homogenization of the responses, as well as certain demographic groups being under-represented or even erased from the generated responses. In this paper, we formalize diversity of representation in generative LLMs. We present evaluation datasets and propose metrics to measure diversity in generated responses along people and culture axes. We find that LLMs understand the notion of diversity and can reason about and critique their own responses toward that goal. This finding motivated a new prompting technique called collective-critique and self-voting (CCSV), which self-improves the people diversity of LLMs by tapping into their diversity reasoning capabilities, without relying on handcrafted examples or prompt tuning. Extensive empirical experiments with both human and automated evaluations show that our proposed approach is effective at improving people and culture diversity, and outperforms all baseline methods by a large margin.
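The collective-critique and self-voting loop can be sketched as follows; this is a hedged reconstruction from the description above, where llm is a hypothetical text-completion function and the prompts, draft count, and round count are assumptions, not the paper's actual procedure.

def ccsv(llm, prompt: str, n_drafts: int = 3, rounds: int = 2) -> str:
    # Draft several candidate responses.
    drafts = [llm(prompt) for _ in range(n_drafts)]
    for _ in range(rounds):
        # Collective critique: the model critiques every draft for diversity.
        critiques = "\n".join(
            llm(f"Critique this response to '{prompt}' for how diversely it "
                f"represents people and cultures:\n{d}")
            for d in drafts
        )
        # Revise each draft against the pooled critiques.
        drafts = [
            llm(f"Rewrite to address these critiques:\n{critiques}\n\nResponse:\n{d}")
            for d in drafts
        ]
    # Self-voting: the model picks the most diverse final draft.
    ballot = "\n\n".join(f"[{i}] {d}" for i, d in enumerate(drafts))
    choice = llm(f"Which response is most diverse? Answer with its index.\n{ballot}")
    index = next((c for c in choice if c.isdigit()), "0")
    return drafts[min(int(index), len(drafts) - 1)]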
Recent research has shown that smaller language models can acquire substantial reasoning abilities when fine-tuned with reasoning exemplars crafted by a significantly larger teacher model. We explore this paradigm for...
ISBN (Print): 9798891760608
Large language models (LLMs) make natural interfaces to factual knowledge, but their usefulness is limited by their tendency to deliver inconsistent answers to semantically equivalent questions. For example, a model might predict both "Anne Redpath passed away in Edinburgh." and "Anne Redpath's life ended in London." In this work, we identify potential causes of inconsistency and evaluate the effectiveness of two mitigation strategies: up-scaling and augmenting the LM with a retrieval corpus. Our results on the LLaMA and Atlas models show that both strategies reduce inconsistency, with retrieval augmentation being considerably more efficient. We further consider and disentangle the consistency contributions of different components of Atlas. For all LMs evaluated, we find that syntactic form and other evaluation task artifacts impact consistency. Taken together, our results provide a better understanding of the factors affecting the factual consistency of language models.
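A minimal sketch of the kind of consistency measurement described above, assuming a hypothetical completion function llm; the paraphrase templates reuse the abstract's example, and the exact-match pairwise metric is an illustrative simplification.

paraphrases = [
    "Anne Redpath passed away in [MASK].",
    "Anne Redpath's life ended in [MASK].",
    "The place where Anne Redpath died is [MASK].",
]

def consistency(llm, templates: list[str]) -> float:
    """Fraction of paraphrase pairs for which the model gives the same answer."""
    answers = [llm(f"Fill in [MASK] with one word: {t}").strip() for t in templates]
    pairs = [(a, b) for i, a in enumerate(answers) for b in answers[i + 1:]]
    return sum(a == b for a, b in pairs) / len(pairs)

# consistency(llm, paraphrases) == 1.0 only if all paraphrases agree.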
While large visual-language models (LVLM) have shown promising results on traditional visual question answering benchmarks, it is still challenging for them to answer complex VQA problems which require diverse world ...
Automatic generation of graphical layouts is crucial for many real-world applications, including designing posters, flyers, advertisements, and graphical user interfaces. Given the incredible ability of large language...
ISBN (Print): 9798891760608
In recent years, the injection of factual knowledge has been observed to have a significant positive correlation with the downstream task performance of pre-trained language models. However, existing work neither demonstrates that pre-trained models successfully learn the injected factual knowledge nor proves that there is a causal relation between injected factual knowledge and downstream performance improvements. In this paper, we introduce a counterfactual-based analysis framework to explore the causal effects of factual knowledge injection on the performance of language models within the pretrain-finetune paradigm. Instead of directly probing the language model or exhaustively enumerating potential confounding factors, we analyze this issue by perturbing the factual knowledge sources at different scales and comparing the performance of pre-trained language models before and after the perturbation. Surprisingly, throughout our experiments, we find that although the knowledge seems to be successfully injected, the correctness of the injected knowledge has only a very limited effect on the models' downstream performance. This finding strongly challenges the previous assumption that injected factual knowledge is the key to language models achieving performance improvements on downstream tasks in the pretrain-finetune paradigm.
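The perturbation-and-compare design can be made concrete with a small sketch; the triple format, entity pool, and corruption scheme here are assumptions for illustration, not the paper's actual knowledge sources.

import random

def perturb_triples(triples, fraction, entities, seed=0):
    """Replace the object of a random `fraction` of (subject, relation, object)
    triples with a wrong entity, yielding a counterfactual knowledge source."""
    rng = random.Random(seed)
    corrupted = list(triples)
    for i in rng.sample(range(len(corrupted)), int(fraction * len(corrupted))):
        s, r, o = corrupted[i]
        corrupted[i] = (s, r, rng.choice([e for e in entities if e != o]))
    return corrupted

facts = [("Paris", "capital_of", "France"), ("Tokyo", "capital_of", "Japan")]
entities = ["France", "Japan", "Brazil"]
# Pre-train once on `facts` and once on the perturbed variant, then compare
# downstream scores; a small gap would suggest the correctness of the injected
# knowledge matters little, which is the finding the abstract reports.
print(perturb_triples(facts, fraction=0.5, entities=entities))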
ISBN (Print): 9798891760608
This work investigates the computational expressivity of language models (LMs) based on recurrent neural networks (RNNs). Siegelmann and Sontag (1992) famously showed that RNNs with rational weights and hidden states and unbounded computation time are Turing complete. However, LMs define weightings over strings in addition to just (unweighted) language membership, and the analysis of the computational power of RNN LMs (RLMs) should reflect this. We extend the Turing completeness result to the probabilistic case, showing how a rationally weighted RLM with unbounded computation time can simulate any deterministic probabilistic Turing machine (PTM) with rationally weighted transitions. Since, in practice, RLMs work in real time, processing a symbol at every time step, we treat the above result as an upper bound on the expressivity of RLMs. We also provide a lower bound by showing that under the restriction to real-time computation, such models can simulate deterministic real-time rational PTMs. Code: https://***/rycolab/rnn-turing-completeness
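To make the weighted-string setting precise: an autoregressive RLM defines a (sub)probability distribution over strings, not just a membership predicate, and this distribution is the object the Turing-completeness result must match. The notation below is standard autoregressive LM notation, not taken from the paper.

\[
  p(\boldsymbol{w}) \;=\; p(\mathrm{EOS} \mid \boldsymbol{w})
  \prod_{t=1}^{|\boldsymbol{w}|} p(w_t \mid w_{<t}),
  \qquad
  \sum_{\boldsymbol{w} \in \Sigma^*} p(\boldsymbol{w}) \le 1.
\]

The upper-bound claim is then that a rationally weighted RLM with unbounded computation time can realize the string distribution of any deterministic PTM with rational transition probabilities.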
ISBN (Print): 9798891760608
Factual consistency evaluation is often conducted using natural language inference (NLI) models, yet these models exhibit limited success in evaluating summaries. Previous work improved such models with synthetic training data. However, the data is typically based on perturbed human-written summaries, which often differ in their characteristics from real model-generated summaries and have limited coverage of possible factual errors. Alternatively, large language models (LLMs) have recently shown promising results in directly evaluating generative tasks, but are too computationally expensive for practical use. Motivated by these limitations, we introduce TrueTeacher, a method for generating synthetic data by annotating diverse model-generated summaries using an LLM. Unlike prior work, TrueTeacher does not rely on human-written summaries and is multilingual by nature. Experiments on the TRUE benchmark show that a student model trained using our data substantially outperforms both the state-of-the-art model with similar capacity and the LLM teacher. In a systematic study, we compare TrueTeacher to existing synthetic data generation methods and demonstrate its superiority and robustness to domain shift. We also show that our method generalizes to multilingual scenarios. Lastly, we release our large-scale synthetic dataset (1.4M examples), generated using TrueTeacher, and a checkpoint trained on this data.
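A hedged sketch of the TrueTeacher-style recipe as described above: label model-generated summaries for factual consistency with a teacher LLM, then train a student NLI model on the results. summarizer, teacher_llm, and the prompt are hypothetical stand-ins, not the paper's actual models or prompt.

def generate_synthetic_data(documents, summarizer, teacher_llm):
    """Build NLI-style (premise, hypothesis, label) examples from raw documents."""
    examples = []
    for doc in documents:
        summary = summarizer(doc)
        verdict = teacher_llm(
            "Does the summary contain only facts supported by the document? "
            f"Answer yes or no.\n\nDocument: {doc}\n\nSummary: {summary}"
        )
        label = 1 if verdict.strip().lower().startswith("yes") else 0
        # premise = source document, hypothesis = generated summary.
        examples.append({"premise": doc, "hypothesis": summary, "label": label})
    return examples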