Mixture of experts (MoE) has become the standard for constructing production-level large language models (LLMs) due to its promise to boost model capacity without causing significant overheads. Nevertheless, existing ...
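The capacity-versus-compute trade-off mentioned here comes from sparse activation: a router sends each token to only a few experts, so parameter count grows while per-token compute stays roughly constant. The sketch below is a generic top-k routing layer for illustration only; the layer sizes and routing scheme are assumptions, not the architecture studied in this entry.

```python
# Minimal sketch of top-k mixture-of-experts routing (illustrative only;
# dimensions, expert count, and gating are assumptions, not the paper's design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)   # gating network scores each expert
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):                                # x: (tokens, d_model)
        gate_logits = self.router(x)                     # (tokens, num_experts)
        weights, idx = gate_logits.topk(self.top_k, dim=-1)   # keep only k experts per token
        weights = F.softmax(weights, dim=-1)             # renormalize over the kept experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):                   # only the selected experts run
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

tokens = torch.randn(16, 512)
print(TopKMoE()(tokens).shape)   # torch.Size([16, 512])
```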
With the recent advancement of Large Language Models (LLMs), efforts have been made to leverage LLMs in crucial social science study methods, including predicting human features of social life such as presidential vot...
Perceiving and understanding non-speech sounds and non-verbal speech is essential to making decisions that help us interact with our surroundings. In this paper, we propose GAMA, a novel General-purpose Large AudioLan...
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs). We start by making a key observation that data is instrumental in the developmental (e.g., pretraining an...
Few studies on legitimation of new technologies were able to provide insights into the longitudinal changes in legitimacy outcomes and the social dynamics that underpin such outcomes. Using a novel mixed-methods approach, combining natural language processing with a qualitative text analysis, and drawing on the concept of social cohesion to investigate the social relations among actors, the study offers new insights into the legitimation of cultured meat in Germany. Using 424 newspaper articles, we identify four topics in the public discourse related to cultured meat and positive average sentiment on each topic over the period 2011-2021. Furthermore, we find the actors, groups, and social relations that shape the observed legitimacy outcomes. The empirical findings are used to develop propositions about the role of social cohesion in legitimacy creation. The study paves the way for future studies on social cohesion dynamics in socio-technical change.
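As a rough illustration of the quantitative half of this mixed-methods design, the sketch below fits a four-topic model over a small corpus and averages a sentiment score within each topic. LDA, VADER, and the toy articles are assumptions made for the sketch; the abstract does not specify the study's actual NLP tooling.

```python
# Hypothetical sketch of the quantitative NLP step described above:
# topic modelling plus per-topic average sentiment on a newspaper corpus.
# LDA and VADER are assumed tools, not necessarily those used in the study.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from nltk.sentiment import SentimentIntensityAnalyzer  # needs nltk.download("vader_lexicon")

articles = [
    "Cultured meat start-up announces pilot production facility ...",
    "Regulators debate approval pathways for cell-based meat ...",
    # ... the study uses 424 newspaper articles from 2011-2021
]

# 1) Topic model with four components, mirroring the four discourse topics reported.
vectorizer = CountVectorizer(stop_words="english", max_features=5000)
doc_term = vectorizer.fit_transform(articles)
lda = LatentDirichletAllocation(n_components=4, random_state=0).fit(doc_term)
doc_topics = lda.transform(doc_term)              # (n_articles, 4) topic weights

# 2) Sentiment per article, averaged within each article's dominant topic.
sia = SentimentIntensityAnalyzer()
scores = [sia.polarity_scores(a)["compound"] for a in articles]
for t in range(4):
    members = [s for s, w in zip(scores, doc_topics) if w.argmax() == t]
    if members:
        print(f"topic {t}: mean sentiment {sum(members) / len(members):+.2f}")
```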
Multimodal Large Language Models (MLLMs) extend the capacity of LLMs to understand multimodal information comprehensively, achieving remarkable performance in many vision-centric tasks. Despite that, recent studies ha...
Recent approaches to zero-shot commonsense reasoning have enabled Pre-trained Language Models (PLMs) to learn a broad range of commonsense knowledge without being tailored to specific situations. However, they often s...
The global escalation in emergency department patient visits poses significant challenges to efficient clinical management, particularly in clinical triage. Traditionally managed by human professionals, clinical triag...
Recently, mobile AI agents based on VLMs have gained increasing *** works typically utilize VLM pre-trained on general-domain data as a foundation, fine-tuning it on instruction-based mobile ***, the proportion of mob...
ISBN (print): 9789819794362; 9789819794379
The rise of voice interface applications has renewed interest in improving the robustness of spoken language understanding (SLU). Many advances have come from end-to-end speech-language joint training, such as inferring semantics directly from speech signals and post-editing automatic speech recognition (ASR) output. Despite their performance achievements, these methods either suffer from the unavailability of large amounts of paired error-prone ASR transcriptions and ground-truth annotations or are computationally costly. To mitigate these issues, we propose an ASR-robust pre-trained language model (ASRLM), which involves a generator that produces simulated ASR transcriptions from ground-truth annotations and a sample-efficient discriminator that distinguishes reasonable ASR errors from unrealistic ones. Experimental results demonstrate that ASRLM improves performance on a wide range of SLU tasks in the presence of ASR errors while saving 27% of the computation cost compared to baselines. Analysis also shows that our proposed generator simulates real-world ASR error patterns better than other simulation methods, including both BERT- and GPT-4-based ones.
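The generator/discriminator setup reads like an ELECTRA-style objective adapted to ASR noise: corrupt clean text with plausible recognition errors, then train a token-level classifier to spot the corrupted positions. The toy sketch below illustrates only that general idea; the confusion table, model, and hyperparameters are invented stand-ins, not ASRLM's actual generator or discriminator.

```python
# Toy sketch of the generator/discriminator idea behind ASRLM (illustrative only):
# inject ASR-like word confusions into clean text, then train a token-level
# classifier to flag which tokens were corrupted. All names and values are assumptions.
import random
import torch
import torch.nn as nn

# Toy "generator": substitute words with acoustically confusable alternatives.
CONFUSIONS = {"flight": ["fright", "light"], "book": ["look", "brook"], "two": ["to", "too"]}

def simulate_asr_errors(tokens, p=0.3):
    corrupted, labels = [], []
    for tok in tokens:
        if tok in CONFUSIONS and random.random() < p:
            corrupted.append(random.choice(CONFUSIONS[tok]))
            labels.append(1)          # 1 = token replaced by an ASR-like error
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels

# Toy token-level discriminator: embedding + linear head (stands in for a PLM encoder).
class ErrorDiscriminator(nn.Module):
    def __init__(self, vocab_size, d=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, d)
        self.head = nn.Linear(d, 2)
    def forward(self, ids):
        return self.head(self.emb(ids))              # (seq_len, 2) logits per token

sentence = "please book a flight for two".split()
noisy, labels = simulate_asr_errors(sentence)
vocab = {w: i for i, w in enumerate(sorted(set(sentence) | set(noisy)))}
ids = torch.tensor([vocab[w] for w in noisy])
model = ErrorDiscriminator(len(vocab))
loss = nn.CrossEntropyLoss()(model(ids), torch.tensor(labels))
loss.backward()                                      # one discriminator training step
print(noisy, labels, float(loss))
```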