ISBN: (print) 9798891760608
Our physical world is constantly evolving over time, posing challenges for pre-trained language models to understand and reason over the temporal contexts of texts. Existing work focuses on strengthening the direct association between a piece of text and its time-stamp. However, the knowledge-time association is usually insufficient for downstream tasks that require reasoning over temporal dependencies between pieces of knowledge. In this work, we make use of the underlying nature of time: all temporally scoped sentences are strung together along a one-dimensional time axis. We therefore suggest creating a graph structure based on the relative placements of events along this axis. Inspired by the graph view, we propose REMEMO (Relative Time Modeling), which explicitly connects all temporally scoped facts by modeling the time relation between any two sentences. Experimental results show that REMEMO outperforms the baseline T5 on multiple temporal question answering datasets under various settings. Further analysis suggests that REMEMO is especially good at modeling long-range, complex temporal dependencies. We release our code and pretrained checkpoints at https://***/DAMO-NLP-SG/RemeMo.
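As a rough, hypothetical illustration of the relative-time idea (not REMEMO's actual implementation), the sketch below derives a relation label (before / after / overlapping) for every pair of time-stamped sentences; the TimedSentence class, the example facts, and the years are all invented for this example.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass
class TimedSentence:
    text: str
    start: int  # year at which the fact begins to hold (hypothetical)
    end: int    # year at which the fact stops holding (hypothetical)

def relation(a: TimedSentence, b: TimedSentence) -> str:
    """Label the relative placement of two facts on the time axis."""
    if a.end < b.start:
        return "before"
    if b.end < a.start:
        return "after"
    return "overlapping"

facts = [
    TimedSentence("X served as CEO.", 1998, 2004),
    TimedSentence("Y acquired the company.", 2006, 2006),
    TimedSentence("Z sat on the board.", 2003, 2010),
]

# Connecting every pair of temporally scoped sentences with a relation
# edge yields the graph structure the abstract describes.
for a, b in combinations(facts, 2):
    print(f"{a.text!r} -> {b.text!r}: {relation(a, b)}")
```

Because the time axis is one-dimensional, such pairwise labels come essentially for free once each sentence is time-stamped, which is what makes them attractive as a pretraining signal.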
The performance of large language models (LLMs) is substantially influenced by the pretraining corpus, which consists of vast quantities of unsupervised data processed by the models. Despite its critical role in model...
State-of-the-art large language models (LLMs) are credited with an increasing number of different capabilities, ranging from reading comprehension through advanced mathematical and reasoning skills to possessing scient...
ISBN: (print) 9798891760608
Subject-verb agreement in the presence of an attractor noun located between the main noun and the verb elicits complex behavior: judgments of grammaticality are modulated by the grammatical features of the attractor. For example, in the sentence "The girl near the boys likes climbing", the attractor (boys) disagrees in grammatical number with the verb (likes), creating a locally implausible transition probability. Here, we parametrically modulate the distance between the attractor and the verb while keeping the length of the sentence constant. We evaluate the performance of humans and of two artificial neural network models: both make more mistakes when the attractor is closer to the verb, but the neural networks fall to near chance level while humans are mostly able to overcome the attractor interference. Additionally, we report a linear effect of attractor distance on reaction times. We hypothesize that a possible reason for the proximity effect is the calculation of transition probabilities between adjacent words. Nevertheless, classical models of attraction such as the cue-based model might suffice to explain this phenomenon, paving the way for new research. Data and analyses available at https://***/d4g6k
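To make the transition-probability hypothesis concrete, here is a toy bigram sketch; the corpus and its counts are made up purely for illustration, and real models are trained on far larger data. Under this toy estimate, the attractor-adjacent transition "boys likes" receives zero probability even though the full sentence is grammatical.

```python
from collections import Counter

# A tiny made-up corpus standing in for whatever training data a model saw.
corpus = (
    "the boys like climbing . the girl likes climbing . "
    "the boys like running . the girls like running ."
).split()

bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def p_next(w1: str, w2: str) -> float:
    """Maximum-likelihood estimate of P(w2 | w1)."""
    return bigrams[(w1, w2)] / unigrams[w1] if unigrams[w1] else 0.0

# The locally implausible transition created by the attractor:
print(p_next("boys", "like"))   # 1.0 under this toy corpus
print(p_next("boys", "likes"))  # 0.0: "boys likes" never occurs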
Pretrained language models (PLMs) have been shown to encode the binary gender of text authors, raising the risk of skewed representations and downstream harms. This effect is yet to be examined for transgender...
Machine translation eliminates the obstacles caused by linguistic disparities around the world. The automatic translation of natural languages using machine translation methods breaks communication barriers and brings people closer together, regardless of language differences. Over the years, neural-based automatic natural language translation has achieved tremendous success. Despite this success, the neural-based approach is corpus-based, meaning that prediction accuracy depends on the volume of input data. Recent years have witnessed enormous growth in research on machine translation of various Indian languages. However, several Indian languages remain under-investigated because their resources are insufficient for machine translation methods. In our experiment, we used a variety of neural-based techniques to assess translation performance on Nyishi-to-English corpora as a function of sentence length; Nyishi is a language with extremely limited online and offline resources. The experiment aims to identify each model's performance across sentence lengths under limited resources. We use BLEU with up to 4-gram precision, chrF, sacreBLEU, and TER to judge prediction quality, and human evaluation to assess prediction errors. We use BPE tokenization to handle the rare-word difficulties of low-resource tonal languages. With BPE, all models perform flawlessly in BLEU score on sentences of 1-10 words.
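For readers unfamiliar with these metrics, below is a minimal sketch of how corpus-level BLEU, chrF, and TER can be computed with the sacrebleu library; the hypothesis and reference sentences are placeholders, not the Nyishi-English data.

```python
import sacrebleu  # pip install sacrebleu

# Placeholder system outputs and references; the actual experiment
# evaluated on the Nyishi-to-English corpora.
hypotheses = [
    "the farmers went to the field at dawn",
    "she sang an old song for the children",
]
references = [
    "the farmers went to the fields at dawn",
    "she sang an old song to the children",
]

bleu = sacrebleu.corpus_bleu(hypotheses, [references])  # 4-gram precision by default
chrf = sacrebleu.corpus_chrf(hypotheses, [references])
ter = sacrebleu.corpus_ter(hypotheses, [references])

print(f"BLEU: {bleu.score:.2f}  chrF: {chrf.score:.2f}  TER: {ter.score:.2f}")
```

Note that BLEU and chrF are higher-is-better while TER counts edits, so lower is better; reporting all three, as the abstract does, guards against the blind spots of any single metric.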
ISBN: (print) 9798891760615
By grounding natural language inference in code (and vice versa), researchers aim to create programming assistants that explain their work, are "coachable", and can surface any gaps in their reasoning. Can we automatically deduce interesting properties of programs from their syntax and common-sense annotations alone, without resorting to static analysis? How much of program logic and behaviour can be captured in natural language? To stimulate research in this direction and to attempt to answer these questions, we propose HTL, a dataset and protocol for annotating programs with natural-language predicates at a finer granularity than code comments and without relying on internal compiler representations. The dataset is available at the following address: https://***/10.5281/zenodo.7893113
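As a purely hypothetical illustration of what finer-than-comment annotation might look like (the actual HTL schema lives in the Zenodo archive linked above), a record could pair character spans of a snippet with natural-language predicates:

```python
import json

# One invented annotation record: a code span paired with
# natural-language predicates at sub-comment granularity.
# This layout is illustrative only, not the HTL format.
record = {
    "snippet": "def mean(xs):\n    return sum(xs) / len(xs)",
    "predicates": [
        {"span": [0, 12], "text": "defines a function of one argument"},
        {"span": [18, 43], "text": "divides the sum of xs by its length"},
        {"span": [18, 43], "text": "fails when xs is empty"},
    ],
}

print(json.dumps(record, indent=2))
```

Predicates attached to spans rather than whole files are what would let an assistant point at exactly which part of its reasoning a given claim (such as the empty-input failure above) is grounded in.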
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks. However, such a paradigm fails to comprehensively differentiate the fine-grained language...
Despite their success at many natural language processing (NLP) tasks, large language models (LLMs) still struggle to effectively leverage knowledge for knowledge-intensive tasks, manifesting limitations such as gener...
Gender-fair language, an evolving German linguistic variation, fosters inclusion by addressing all genders or using neutral forms. Nevertheless, there is a significant lack of resources to assess the impact of this li...