We present LOCOVQA, a dynamic benchmark generator for evaluating long-context extractive reasoning in vision-language models (VLMs). LOCOVQA augments test examples for mathematical reasoning, VQA, and character recogn...
This paper introduces a novel generalized self-imitation learning (GSIL) framework, which effectively and efficiently aligns large language models with offline demonstration data. We develop GSIL by deriving a surroga...
Scaling laws in language modeling traditionally quantify training loss as a function of dataset size and model parameters, providing compute-optimal estimates but often neglecting the impact of data quality on model g...
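For context, such scaling laws are conventionally written in the parametric form below (the Chinchilla form of Hoffmann et al., 2022); the data-quality extension this abstract alludes to is not visible in the truncated listing.

```latex
% Canonical compute-optimal scaling law: expected loss as a function of
% parameter count N and training tokens D; E, A, B, \alpha, \beta are
% constants fit to empirical training runs.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```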
In recent studies, researchers have used large language models (LLMs) to explore semantic representations in the brain; however, they have typically assessed different levels of semantic content, such as speech, object...
Vision-language models (VLMs) have gained widespread adoption in both industry and academia. In this study, we propose a unified framework for systematically evaluating gender, race, and age biases in VLMs with respec...
Sign words are the building blocks of any sign language. In this work, we present wSignGen, a word-conditioned 3D American Sign Language (ASL) generation model dedicated to synthesizing realistic and grammatically accurate mot...
ISBN (Print): 9798891760608
Recent pre-trained language models (PLMs) achieve promising results in existing abstractive summarization datasets. However, existing summarization benchmarks overlap in time with the standard pre-training corpora and fine-tuning datasets. Hence, the strong performance of PLMs may rely on the parametric knowledge that is memorized during pre-training and fine-tuning. Moreover, the knowledge memorized by PLMs may quickly become outdated, which affects the generalization performance of PLMs on future data. In this work, we propose TEMPOSUM, a novel benchmark that contains data samples from 2010 to 2022, to understand the temporal generalization ability of abstractive summarization models. Through extensive human evaluation, we show that parametric knowledge stored in summarization models significantly affects the faithfulness of the generated summaries on future data. Moreover, existing faithfulness enhancement methods cannot reliably improve the faithfulness of summarization models on future data. Finally, we discuss several recommendations to the research community on how to evaluate and improve the temporal generalization capability of text summarization models.
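The protocol behind such a benchmark is easy to picture: fine-tune on articles from before a model's pre-training cut-off, then test on strictly later articles, so memorized parametric knowledge cannot stand in for faithful extraction. Below is a minimal sketch of that temporal split; the record format, field names, and cut-off date are hypothetical, not TEMPOSUM's actual schema.

```python
# Hypothetical sketch of a temporal-generalization split in the spirit of
# TEMPOSUM; the benchmark's real schema and cut-offs may differ.
from datetime import date

def temporal_split(samples, cutoff=date(2020, 1, 1)):
    """Train on articles published before the assumed pre-training
    cut-off; test on strictly later ones, where memorization cannot help."""
    train = [s for s in samples if s["published"] < cutoff]
    future = [s for s in samples if s["published"] >= cutoff]
    return train, future

samples = [
    {"published": date(2012, 5, 1), "article": "...", "summary": "..."},
    {"published": date(2022, 3, 9), "article": "...", "summary": "..."},
]
train, future = temporal_split(samples)  # one past sample, one future sample
```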
The role of the Arabic language in today's global affairs calls for sophisticated natural language processing techniques, especially in text classification. This paper presents Tasneef, a novel hybrid approach that tackles the computational challenges of practical Arabic text classification (ATC) by reducing memory usage and runtime overhead. Tasneef integrates a distance-based meta-feature (DBMF) representation with word embeddings. This integration is useful because a single text representation technique can fail to capture the full range of features necessary for effective classification, especially in a complex language like Arabic. By addressing the high dimensionality and sparsity inherent in the Term Frequency-Inverse Document Frequency (TF-IDF) representation, DBMFs offer a promising solution: they rely on document labels and statistical features to establish meaningful distance relationships between documents, thereby enabling effective dimensionality reduction, while word embeddings encapsulate semantic attributes. Empirical assessments reveal a reduction of two orders of magnitude in both memory usage and runtime: memory savings range from 158x to 361x and runtime reductions from 120x to 524x across three popular datasets, while MicroF1 and MacroF1 values remain comparable and learning time is notably reduced. Moreover, Tasneef outperforms ten state-of-the-art deep learning models and seven dimension reduction methods in accuracy, with improvements ranging from 0.3% to 39.6%, and in F-Measure, with improvements from 4.6% to 26.8%, across four additional datasets. These findings highlight Tasneef as a promising solution for diverse real-world ATC applications, offering concise and rapid classification with reduced computational learning costs.
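The core of the reduction is easiest to see in miniature: instead of a vocabulary-sized TF-IDF vector, each document is described by a handful of distances to labeled training neighbors. The sketch below is a hypothetical Python illustration of a kNN-per-class DBMF construction, not the paper's exact formulation; all names and parameters are illustrative.

```python
# Illustrative distance-based meta-features (DBMFs): per class, keep the
# cosine distances to the k nearest labeled training documents.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_distances

def dbmf_features(train_docs, train_labels, docs, k=3):
    vec = TfidfVectorizer()
    X_train = vec.fit_transform(train_docs)   # sparse, vocabulary-sized
    X = vec.transform(docs)
    labels = np.asarray(train_labels)
    feats = []
    for c in np.unique(labels):
        # assumes every class has at least k training documents
        D = cosine_distances(X, X_train[labels == c])
        D.sort(axis=1)            # nearest neighbours of class c first
        feats.append(D[:, :k])    # keep k distances per class
    return np.hstack(feats)       # dense, shape (n_docs, k * n_classes)
```

With k = 3 and ten classes, for example, every document collapses to 30 dense features regardless of vocabulary size, which is the kind of label-aware compression that makes two-orders-of-magnitude memory and runtime savings plausible.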
Instruction tuning aims to align large language models (LLMs) with open-domain instructions and human-preferred responses. While several studies have explored autonomous approaches to distilling and annotating instruc...
ISBN (Print): 9783031519789; 9783031519796
Natural language processing (NLP) is a common application of Artificial Intelligence. The goal of this project is to provide language teachers with a simple-to-apply tool for topic model analyses that they can integrate into their classrooms. The project also involves project-based learning for the students who program the actual AI web application. The original notion is to give language teachers access to AI methodology without requiring any technical knowledge of AI or any programming skills. Natural language processing provides various tools for word frequencies, but also for topic modelling, which allows tracking the relevance of topics over time in the media or in literature. In collaboration with linguists at the University of Technology, we intend to provide a corpus of classical English and German literature, as well as the option of uploading one's own corpus, which can be obtained from web scraping or other sources. A team of students of the vocational high school TGM Wien, specialised in IT and Software Development, is working on the design of the interactive GUI for this NLP application, learning the methods of natural language processing and Artificial Intelligence in a project-based setting. For this, the statistical programming language R is used, since it already provides packages implementing natural language processing, as well as the shiny package, which allows developing interactive web apps without additional web and app programming. A team of teachers supervises and supports the students during the development process, providing expertise in AI and NLP, in web and app programming, and in server management. Two intended outcomes exist. On the one hand, we want our students to learn natural language processing first-hand through the development of this application. On the other hand, we intend to obtain an interactive AI tool which can assist language teachers and their students in the classroom in the long term. In times of GPT3 and GPT4 dominating the media and per
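The topic-model core such a tool exposes fits in a few lines. The project itself uses R and shiny; the Python/scikit-learn sketch below is an equivalent, hypothetical pipeline, with the corpus data and parameters invented for illustration.

```python
# Minimal topic-modelling pipeline: bag-of-words counts -> LDA topics,
# then the top words per topic, roughly what a teacher would see in a GUI.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the whale and the sea and the ship",
    "the ship sailed the stormy sea",
    "love and loss in the old garden",
]
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)   # per-document topic weights

terms = vec.get_feature_names_out()
for t, comp in enumerate(lda.components_):
    top = [terms[i] for i in comp.argsort()[-5:][::-1]]
    print(f"topic {t}: {', '.join(top)}")
```

Aggregating the per-document topic weights by publication year is what would let such a tool track the relevance of topics over time, as described above.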