ISBN (print): 9798891760608
Research in psychopathology has shown that, at an aggregate level, the patterns of emotional change over time (emotion dynamics) are indicators of one's mental health. One's patterns of emotion change have traditionally been determined through self-reports of emotions; however, there are known issues with accuracy, bias, and ease of data collection. Recent approaches to determining emotion dynamics from one's everyday utterances address many of these concerns, but it is not yet known whether these measures of utterance emotion dynamics (UED) correlate with mental health diagnoses. Here, for the first time, we study the relationship between tweet emotion dynamics and mental health disorders. We find that each of the UED metrics studied varied by the user's self-disclosed diagnosis. For example, average valence was significantly higher (i.e., more positive text) in the control group than among users with ADHD, MDD, and PTSD. Valence variability was significantly lower in the control group than for ADHD, depression, bipolar disorder, MDD, PTSD, and OCD, but not PPD. Rise and recovery rates of valence also exhibited significant differences from the control. This work provides important early evidence for how linguistic cues pertaining to emotion dynamics can play a crucial role as biosocial markers for mental illnesses and aid in the understanding, diagnosis, and management of mental health disorders.
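The abstract names several UED metrics (average valence, valence variability, rise and recovery rates) without giving formulas. Below is a minimal sketch of how such metrics might be computed from a per-tweet valence series; the "home base" band and the step-count approximation of rise/recovery rates are illustrative simplifications of the UED framework, and all names are hypothetical, not the authors' code.

```python
import numpy as np

def ued_metrics(valence, k=1.0):
    """Illustrative utterance-emotion-dynamics metrics for one user.

    valence : per-utterance valence scores, in posting order.
    k       : half-width of the 'home base' band, in standard deviations.
    """
    valence = np.asarray(valence, dtype=float)
    mean, std = valence.mean(), valence.std()
    lo, hi = mean - k * std, mean + k * std          # home-base band

    rise_lengths, recovery_lengths = [], []
    i = 0
    while i < len(valence):
        if not (lo <= valence[i] <= hi):             # displaced from home base
            start = peak = i
            while i < len(valence) and not (lo <= valence[i] <= hi):
                if abs(valence[i] - mean) > abs(valence[peak] - mean):
                    peak = i                         # most extreme point so far
                i += 1
            rise_lengths.append(peak - start + 1)    # steps out to the peak
            recovery_lengths.append(i - peak)        # steps back to the band
        else:
            i += 1

    return {
        "average_valence": mean,
        "valence_variability": std,
        "mean_rise_steps": float(np.mean(rise_lengths)) if rise_lengths else np.nan,
        "mean_recovery_steps": float(np.mean(recovery_lengths)) if recovery_lengths else np.nan,
    }

# Toy valence series for one user's tweets
print(ued_metrics([0.2, 0.3, 0.9, 0.7, 0.3, 0.1, -0.6, 0.2]))
```

In the actual UED framework the rise and recovery rates are slopes of the valence curve; step counts are used here only to keep the sketch short.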
LLMs acquire a wide range of abilities during pre-training, but aligning LLMs under Reinforcement Learning with Human Feedback (RLHF) can lead to forgetting pretrained abilities, which is also known as the alignment t...
Using large language models (LLMs) for automatic evaluation has become an important evaluation method in NLP research. However, it is unclear whether these LLM-based evaluators can be applied in real-world classrooms ...
How novel are texts generated by language models (LMs) relative to their training corpora? In this work, we investigate the extent to which modern LMs generate n-grams from their training data, evaluating both (i) the...
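The snippet above concerns how often LM generations reproduce training-data n-grams. A minimal sketch of such a novelty measurement follows, under simplifying assumptions: whitespace tokenization and a toy in-memory corpus stand in for the paper's actual tokenizer and training data.

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def novelty_rate(generated, training_corpus, n=4):
    """Fraction of n-grams in `generated` that never appear in the corpus."""
    train_set = set()
    for doc in training_corpus:
        train_set.update(ngrams(doc.split(), n))
    gen = ngrams(generated.split(), n)
    if not gen:
        return float("nan")
    return sum(1 for g in gen if g not in train_set) / len(gen)

# Toy example: half of the generated 4-grams are novel
corpus = ["the cat sat on the mat", "a dog ran in the park"]
print(novelty_rate("the cat sat on the grass today", corpus, n=4))
```

At realistic scale the training-set n-grams would be held in a suffix array or Bloom filter rather than a Python set, but the measured quantity is the same.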
We propose misogyny detection as an Argumentative Reasoning task and we investigate the capacity of large language models (LLMs) to understand the implicit reasoning used to convey misogyny in both Italian and English...
As the utilization of Large Language Models (LLMs) becomes more widespread, there is a growing demand for their ability to handle more complex and longer external knowledge across various use cases. Most existing eval...
ISBN (print): 9798891760608
Language models (LMs) with fewer than 100B parameters are known to perform poorly on chain-of-thought (CoT) reasoning, in contrast to large LMs, when solving unseen tasks. In this work, we aim to equip smaller LMs with the step-by-step reasoning capability by instruction tuning with CoT rationales. To achieve this goal, we first introduce a new instruction-tuning dataset called the COT COLLECTION, which augments the existing Flan Collection (including only 9 CoT tasks) with an additional 1.84 million rationales across 1,060 tasks. We show that CoT fine-tuning of Flan-T5 (3B & 11B) with the COT COLLECTION enables smaller LMs to have better CoT capabilities on unseen tasks. On the BIG-Bench-Hard (BBH) benchmark, we report an average improvement of +4.34% (Flan-T5 3B) and +2.60% (Flan-T5 11B) in zero-shot task accuracy. Furthermore, we show that instruction tuning with the COT COLLECTION gives LMs stronger few-shot learning capabilities on 4 domain-specific tasks, resulting in an improvement of +2.24% (Flan-T5 3B) and +2.37% (Flan-T5 11B), even outperforming ChatGPT provided with demonstrations up to the maximum input length, by a +13.98% margin. Our code, the COT COLLECTION data, and model checkpoints are publicly available.
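As a rough illustration of the CoT instruction tuning described above, here is a minimal training-loop sketch using Hugging Face Transformers. The file name cot_collection.jsonl, its "source"/"rationale" field names, and the small flan-t5-base checkpoint are assumptions made for the example, not the authors' released code.

```python
# Minimal sketch of CoT instruction tuning for a small seq2seq LM.
import json
import torch
from torch.utils.data import Dataset, DataLoader
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

class CoTDataset(Dataset):
    """JSONL rows with hypothetical fields: 'source' (instruction + input)
    and 'rationale' (step-by-step rationale ending in the answer)."""
    def __init__(self, path):
        self.rows = [json.loads(line) for line in open(path)]
    def __len__(self):
        return len(self.rows)
    def __getitem__(self, i):
        row = self.rows[i]
        x = tokenizer(row["source"], truncation=True, max_length=512,
                      padding="max_length", return_tensors="pt")
        y = tokenizer(row["rationale"], truncation=True, max_length=256,
                      padding="max_length", return_tensors="pt")
        labels = y.input_ids.squeeze(0)
        labels[labels == tokenizer.pad_token_id] = -100  # ignore pad in the loss
        return {"input_ids": x.input_ids.squeeze(0),
                "attention_mask": x.attention_mask.squeeze(0),
                "labels": labels}

loader = DataLoader(CoTDataset("cot_collection.jsonl"), batch_size=8, shuffle=True)
optim = torch.optim.AdamW(model.parameters(), lr=1e-4)

model.train()
for batch in loader:
    loss = model(**batch).loss  # standard seq2seq LM loss on the rationale
    loss.backward()
    optim.step()
    optim.zero_grad()
```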
Evaluations of creative stories generated by large language models (LLMs) often focus on objective properties of the text, such as its style, coherence, and diversity. While these metrics are indispensable, they do no...
Tool-augmented large language models (LLMs) are rapidly being integrated into real-world applications. Due to the lack of benchmarks, the community has yet to fully understand the hallucination issues within these mod...
Vision-language models (VLMs) like CLIP have demonstrated remarkable applicability across a variety of downstream tasks, including zero-shot image classification. Recently, the use of prompts or adapters for efficient...
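The snippet above mentions zero-shot image classification with CLIP, the baseline that prompt- and adapter-based methods build on. A minimal sketch of that baseline follows; the class names and image path are placeholders.

```python
# Zero-shot image classification with CLIP via Hugging Face Transformers.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["cat", "dog", "bird"]                    # placeholder class names
prompts = [f"a photo of a {c}" for c in labels]    # standard prompt template
image = Image.open("example.jpg")                  # placeholder image path

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image      # image-text similarity scores
probs = logits.softmax(dim=-1)                     # distribution over class names

print({c: round(p.item(), 3) for c, p in zip(labels, probs[0])})
```

Prompt-tuning methods learn the text template (or soft tokens) instead of hand-writing it; adapter methods add small trainable layers on top of these frozen embeddings.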