ISBN: 9798891760608 (print)
Large language models (LLMs) like ChatGPT can be expensive to train, deploy, and use for specific natural language generation tasks, such as text summarization, and for certain domains. A promising alternative is to fine-tune relatively smaller language models (LMs) on a particular task using high-quality, in-domain datasets. However, it can be prohibitively expensive to obtain such high-quality training data. This issue has been mitigated by generating weakly supervised data via knowledge distillation (KD) of LLMs. We propose a three-step approach to distill ChatGPT and fine-tune smaller LMs for summarizing forum conversations. More specifically, we design a method to selectively sample a large unannotated corpus of forum conversations using a semantic similarity metric. Then, we use the same metric to retrieve suitable prompts for ChatGPT from a small annotated validation set in the same domain. The generated dataset is then filtered to remove low-quality instances. Our proposed select-prompt-filter KD approach leads to significant improvements of up to 6.6 ROUGE-2 points over a standard KD approach given the same amount of training data, by leveraging sufficient in-domain pseudo-labelled data.
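To make the three steps concrete, the following minimal sketch implements select, prompt, and filter with sentence-transformers cosine similarity as the semantic metric; the encoder name, top_k, threshold, and prompt template are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch of a select-prompt-filter pipeline; all hyperparameters
# and the prompt template are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed metric backbone

def select_conversations(unlabeled, validation, top_k=1000):
    """Step 1: keep the unlabeled conversations most similar to the
    annotated in-domain validation set."""
    val_emb = encoder.encode(validation, convert_to_tensor=True)
    unl_emb = encoder.encode(unlabeled, convert_to_tensor=True)
    # score each unlabeled conversation by its max similarity to any
    # validation conversation
    sims = util.cos_sim(unl_emb, val_emb).max(dim=1).values
    ranked = sims.argsort(descending=True)[:top_k]
    return [unlabeled[i] for i in ranked]

def retrieve_prompt(conversation, validation_pairs):
    """Step 2: use the same metric to pick the most similar annotated
    (conversation, summary) pair as an in-context example for ChatGPT."""
    conv_emb = encoder.encode(conversation, convert_to_tensor=True)
    val_emb = encoder.encode([c for c, _ in validation_pairs],
                             convert_to_tensor=True)
    best = util.cos_sim(conv_emb, val_emb).argmax().item()
    demo_conv, demo_summary = validation_pairs[best]
    return (f"Conversation:\n{demo_conv}\nSummary:\n{demo_summary}\n\n"
            f"Conversation:\n{conversation}\nSummary:")

def filter_instances(pairs, threshold=0.5):
    """Step 3: drop pseudo-labelled pairs whose generated summary is not
    semantically close to its conversation."""
    kept = []
    for conv, summary in pairs:
        score = util.cos_sim(encoder.encode(conv, convert_to_tensor=True),
                             encoder.encode(summary, convert_to_tensor=True)).item()
        if score >= threshold:
            kept.append((conv, summary))
    return kept
```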
ISBN: 9798891760608 (print)
As commonly used methods for debiasing natural language understanding (NLU) models, dataset refinement approaches heavily rely on manual data analysis and thus may be unable to cover all potential biased features. In this paper, we propose IBADR, an Iterative Bias-Aware Dataset Refinement framework, which debiases NLU models without predefining biased features. We maintain an iteratively expanded sample pool. Specifically, at each iteration, we first train a shallow model to quantify the bias degree of samples in the pool. Then, we pair each sample with a bias indicator representing its bias degree and use these extended samples to train a sample generator. In this way, the generator can effectively learn the correspondence between bias indicators and samples. Furthermore, we employ the generator to produce pseudo samples with fewer biased features by feeding it specific bias indicators. Finally, we incorporate the generated pseudo samples into the pool. Experimental results and in-depth analyses on two NLU tasks show that IBADR not only significantly outperforms existing dataset refinement approaches, achieving state-of-the-art results, but is also compatible with model-centric methods.
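The iteration loop reads roughly as below; the ShallowModel and Generator interfaces, the confidence-based bias degree, and the discretized indicator tokens are our assumptions for illustration, not the authors' implementation.

```python
# Sketch of one IBADR iteration. ShallowModel/Generator are hypothetical
# interfaces standing in for whatever models the framework instantiates.
from dataclasses import dataclass
from typing import Protocol

@dataclass
class Sample:
    text: str
    label: int

class ShallowModel(Protocol):
    def fit(self, pool: list[Sample]) -> None: ...
    def confidence(self, text: str, label: int) -> float: ...

class Generator(Protocol):
    def fit(self, tagged: list[tuple[str, Sample]]) -> None: ...
    def sample(self, prefix: str) -> Sample: ...

def ibadr_iteration(pool: list[Sample], shallow: ShallowModel,
                    generator: Generator, num_pseudo: int) -> list[Sample]:
    # 1. Train a shallow model on the pool; samples it fits easily are
    #    treated as carrying biased (shortcut) features.
    shallow.fit(pool)

    # 2. Pair each sample with a bias indicator derived from the shallow
    #    model's confidence in the gold label (assumed proxy for bias degree).
    tagged = []
    for s in pool:
        degree = shallow.confidence(s.text, s.label)
        tagged.append((f"<bias={round(degree, 1)}>", s))  # discretized tag

    # 3. Train the generator on indicator-prefixed samples so it learns
    #    how indicators correspond to (biased) surface features.
    generator.fit(tagged)

    # 4. Condition on a low-bias indicator to produce pseudo samples with
    #    fewer biased features, then expand the pool with them.
    pseudo = [generator.sample(prefix="<bias=0.0>") for _ in range(num_pseudo)]
    return pool + pseudo
```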
ISBN: 9798891760608 (print)
Inexhaustible web content carries abundant perceptible information beyond text. Unfortunately, most prior efforts in pre-trained language models (LMs) ignore such cyber-richness, and the few that do not employ only plain HTML, excluding crucial information in the rendered web page such as visual, layout, and style features. Intuitively, this perceptible web information can provide essential signals for content understanding tasks. This study presents a Gestalt Enhanced Markup (GEM) language model, inspired by Gestalt psychological theory, which hosts heterogeneous visual information from the render tree in the language model without requiring additional visual input. Comprehensive experiments on multiple downstream tasks, i.e., web question answering and web information extraction, validate GEM's superiority.
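One plausible reading of "hosting" render-tree signals without extra visual input is to discretize each node's bounding box and style into IDs whose embeddings are added to the token embeddings. The bucketing scheme and layer below are our own sketch, not the paper's architecture.

```python
# Illustrative sketch (not the authors' code): render-tree visual
# attributes become discrete IDs embedded alongside word embeddings.
import torch.nn as nn

def bucket_box(x0, y0, x1, y1, page_w, page_h, n_bins=100):
    """Discretize a render-tree node's bounding box into bucket IDs 0..n_bins."""
    rel = (x0 / page_w, y0 / page_h, x1 / page_w, y1 / page_h)
    return [min(n_bins, max(0, int(n_bins * r))) for r in rel]

class VisualTokenEmbedding(nn.Module):
    """Adds layout (bounding-box) and style embeddings to token embeddings,
    so the LM sees render-tree signals without a separate image encoder."""
    def __init__(self, hidden=768, n_bins=100, n_styles=64):
        super().__init__()
        self.coord = nn.Embedding(n_bins + 1, hidden)
        self.style = nn.Embedding(n_styles, hidden)  # e.g. font-size/color buckets

    def forward(self, token_emb, box_ids, style_ids):
        # token_emb: (batch, seq, hidden); box_ids: (batch, seq, 4);
        # style_ids: (batch, seq)
        layout = self.coord(box_ids).sum(dim=2)  # sum the 4 coordinate embeddings
        return token_emb + layout + self.style(style_ids)
```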
ISBN: 9798891760608 (print)
NLP models are used in a variety of critical social computing tasks, such as detecting sexist, racist, or otherwise hateful content. It is therefore imperative that these models are robust to spurious features. Past work has attempted to tackle such spurious features using training data augmentation, including Counterfactually Augmented Data (CADs). CADs introduce minimal changes to existing training data points and flip their labels; training on them may reduce model dependency on spurious features. However, manually generating CADs can be time-consuming and expensive. Hence, in this work, we assess whether this task can be automated using generative NLP models. We automatically generate CADs using Polyjuice, ChatGPT, and Flan-T5, and evaluate their usefulness in improving model robustness compared to manually generated CADs. By testing both model performance on multiple out-of-domain test sets and individual data point efficacy, our results show that while manual CADs are still the most effective, CADs generated by ChatGPT come a close second. One key reason for the lower performance of automated methods is that the changes they introduce are often insufficient to flip the original label. Warning: This paper contains instances of hateful and sexist language that serve as examples.
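For the automated route, a minimal sketch with Flan-T5 via Hugging Face transformers might look as follows; the prompt wording is our assumption, and since the changes often fail to flip the label (as noted above), the output would still need a label-flip check, e.g., with a trained classifier, before being used as a CAD.

```python
# Hedged sketch of automated CAD generation with Flan-T5; the prompt
# template and decoding settings are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/flan-t5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-large")

def generate_cad(text: str, label: str, target: str) -> str:
    """Ask the model for a minimal edit that flips `label` to `target`."""
    prompt = (f"Make minimal changes to the following text so that its "
              f"label changes from '{label}' to '{target}'. "
              f"Keep everything else the same.\nText: {text}\nRewritten text:")
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=128, do_sample=True, top_p=0.9)
    # NOTE: verify with a classifier that the label actually flipped
    # before adding the output to the training data.
    return tok.decode(out[0], skip_special_tokens=True)
```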
Medical coding, the translation of unstructured clinical text into standardized medical codes, is a crucial but time-consuming healthcare practice. Though large language models (LLMs) could automate the coding process...
This paper explores an intriguing observation: fine-tuning a large language model (LLM) with responses generated by an LLM often yields better results than using responses generated by humans, particularly in reasoning...
Large language models (LLMs) have recently gained significant interest due to their impressive results in various natural language tasks. However, their application to sentence embeddings is still under active research...
Improving the effectiveness and efficiency of large language models (LLMs) simultaneously is a critical yet challenging research goal. In this paper, we find that low-rank pre-training, normally considered efficient...
Large language models (LLMs) have emerged as the dominant paradigm in natural language processing owing to their remarkable performance across various target tasks. However, naively fine-tuning them for specific downstream...
ISBN: 9798891760608 (print)
We explore the use of large language models (LLMs) for zero-shot semantic parsing. Semantic parsing involves mapping natural language utterances to task-specific meaning representations. LLMs are generally trained on publicly available text and code and cannot be expected to directly generalize to domain-specific parsing tasks in a zero-shot setting. In this work, we propose ZEROTOP, a zero-shot task-oriented parsing method that decomposes the semantic parsing problem into a set of abstractive and extractive question-answering (QA) problems. For each utterance, we prompt the LLM with questions corresponding to its top-level intent and a set of slots, and use the LLM's generations to construct the target meaning representation. We observe that current LLMs fail to detect unanswerable questions and, as a result, cannot handle questions corresponding to missing slots. We address this by fine-tuning a language model on public QA datasets using synthetic negative samples. Experimental results show that our QA-based decomposition paired with the fine-tuned LLM can zero-shot parse approximately 16% of utterances in the MTOP dataset.
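A minimal sketch of the QA decomposition follows, where `llm` stands for any prompt-to-text callable; the question templates and the intent/slot lists are illustrative assumptions, not the paper's exact prompts.

```python
# Hedged sketch of QA-based zero-shot semantic parsing: one abstractive
# question for the intent, one extractive question per slot.
from typing import Callable

def zerotop_parse(utterance: str, intents: list[str], slots: list[str],
                  llm: Callable[[str], str]) -> dict:
    """Map an utterance to {intent, slot: value} via QA prompts."""
    # Abstractive QA for the top-level intent.
    intent_q = (f"Utterance: {utterance}\n"
                f"Which of these intents does the user express: "
                f"{', '.join(intents)}?\nAnswer:")
    parse = {"intent": llm(intent_q).strip(), "slots": {}}

    # Extractive QA per slot; an "unanswerable" reply marks a missing slot,
    # which is where the fine-tuned QA model described above is needed.
    for slot in slots:
        slot_q = (f"Utterance: {utterance}\n"
                  f"What is the {slot}? Answer 'unanswerable' if it is "
                  f"not mentioned.\nAnswer:")
        answer = llm(slot_q).strip()
        if answer.lower() != "unanswerable":
            parse["slots"][slot] = answer
    return parse
```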