ISBN (print): 9798350344868; 9798350344851
Model adaptation is crucial for handling the discrepancy between proxy training data and the actual data received from users. To perform adaptation effectively, users' textual data is typically stored on servers or their local devices, where downstream natural language processing (NLP) models can be trained directly on such in-domain data. However, this may raise privacy and security concerns due to the additional risk of exposing user information to adversaries. Replacing identifying information in textual data with a generic marker has recently been explored. In this work, we leverage large language models (LLMs) to suggest substitutes for masked tokens and evaluate their effectiveness on downstream language modeling tasks. Specifically, we propose multiple pre-trained and fine-tuned LLM-based approaches and conduct empirical studies on various datasets to compare these methods. Experimental results show that models trained on the obfuscated corpora achieve performance comparable to those trained on the original data without privacy-preserving token masking.
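The abstract describes the pipeline at a high level rather than a concrete implementation. As a rough illustration of the general idea only (not the authors' method), the sketch below masks simple PII patterns and asks a masked language model for substitutes; the model choice, the regex-based PII detector, and the top-1 selection heuristic are all assumptions.

```python
import re
from transformers import pipeline

# Masked language model used to propose substitutes for masked spans.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
MASK = fill_mask.tokenizer.mask_token  # "[MASK]" for BERT-style models

# Toy patterns standing in for a real PII detector.
PII_PATTERNS = [
    re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b"),   # person-like names
    re.compile(r"\b[\w.]+@[\w.]+\.\w+\b"),        # email addresses
    re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),         # phone numbers
]

def obfuscate(text: str) -> str:
    """Mask PII spans and replace each mask with the model's top suggestion."""
    for pattern in PII_PATTERNS:
        match = pattern.search(text)
        while match is not None:
            masked = text[: match.start()] + MASK + text[match.end():]
            best = fill_mask(masked)[0]["token_str"]  # top-scoring substitute
            text = text[: match.start()] + best + text[match.end():]
            # Resume the search after the substitute to avoid re-matching it.
            match = pattern.search(text, match.start() + len(best))
    return text

print(obfuscate("John Smith emailed jane.doe@mail.com about the report."))
```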
Large Language Models (LLMs) and Large Vision-Language Models (LVLMs) exhibit advanced proficiency in language reasoning and comprehension across a wide array of languages. While their performance is notably robust in...
We introduce a task and dataset for referring expression generation and comprehension in multi-agent embodied environments. In this task, two agents in a shared scene must take into account one another's visual perspective, wh...
GraphQL is a powerful query language for APIs that allows clients to fetch precise data efficiently and flexibly, querying multiple resources with a single request. However, crafting complex GraphQL query operations c...
Language style is necessary for AI systems to understand and generate diverse human language styles. However, previous text style transfer has primarily focused on sentence-level, data-driven approaches, limiting exploration of potent...
ISBN (print): 9798891760608
Using a vocabulary that is shared across languages is common practice in Multilingual Neural Machine Translation (MNMT). In addition to its simple design, shared tokens play an important role in positive knowledge transfer, assuming that shared tokens refer to similar meanings across languages. However, when word overlap is small, especially due to different writing systems, transfer is inhibited. In this paper, we define word-level information transfer pathways via word equivalence classes and rely on graph networks to fuse word embeddings across languages. Our experiments demonstrate the advantages of our approach: 1) embeddings of words with similar meanings are better aligned across languages, 2) our method achieves consistent BLEU improvements of up to 2.3 points for high- and low-resource MNMT, and 3) less than 1.0% additional trainable parameters are required with a limited increase in computational costs, while inference time remains identical to the baseline. We release the codebase to the community.
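As a rough, self-contained illustration of fusing embeddings of words that share a cross-lingual equivalence class (not the architecture from the paper), the PyTorch sketch below averages each word's embedding with those of its equivalence-class neighbours through a row-normalized adjacency matrix and a small trainable projection. The vocabulary size, dimensions, and single-layer aggregation are assumptions.

```python
import torch
import torch.nn as nn

class EquivalenceFusion(nn.Module):
    def __init__(self, vocab_size: int, dim: int, edges: list[tuple[int, int]]):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        # Adjacency with self-loops, row-normalized so each word averages
        # itself with its equivalence-class neighbours.
        adj = torch.eye(vocab_size)
        for i, j in edges:
            adj[i, j] = adj[j, i] = 1.0
        self.register_buffer("adj", adj / adj.sum(dim=1, keepdim=True))
        self.proj = nn.Linear(dim, dim)  # small trainable fusion layer

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Fuse the whole embedding table, then look up the requested tokens.
        fused_table = torch.tanh(self.proj(self.adj @ self.embed.weight))
        return fused_table[token_ids]

# Toy usage: word 0 and word 5 (its translation) share an equivalence class.
fusion = EquivalenceFusion(vocab_size=10, dim=8, edges=[(0, 5)])
out = fusion(torch.tensor([[0, 5, 3]]))
print(out.shape)  # torch.Size([1, 3, 8])
```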
With the widespread application of Large Language Models (LLMs) in natural language interfaces to databases (NLIDBs), concerns about security issues in NLIDBs have been growing. However, research on sensi...
ISBN (print): 9798891760608
The increasing use of foundation models highlights the urgent need to address and eliminate implicit biases present in them that arise during pre-training. In this paper, we introduce PEFTDebias, a novel approach that employs parameter-efficient fine-tuning (PEFT) to mitigate the biases within foundation models. PEFTDebias consists of two main phases: an upstream phase for acquiring debiasing parameters along a specific bias axis, and a downstream phase where these parameters are incorporated into the model and frozen during the fine-tuning process. By evaluating on four datasets across two bias axes, namely gender and race, we find that downstream biases can be effectively reduced with PEFTs. In addition, we show that these parameters possess axis-specific debiasing characteristics, enabling their effective transferability in mitigating biases in various downstream tasks. To ensure reproducibility, we release the code for our experiments.
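The two-phase recipe from the abstract can be sketched with an off-the-shelf PEFT method; the snippet below uses LoRA adapters as a stand-in for the debiasing parameters. The model, hyperparameters, and the omitted training loops are illustrative assumptions, not the configuration used in the paper.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSequenceClassification

base = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Phase 1 (upstream): train only the adapter on a debiasing objective for one
# bias axis (e.g. gender), leaving the backbone frozen.
debias_cfg = LoraConfig(r=8, lora_alpha=16, task_type="SEQ_CLS")
model = get_peft_model(base, debias_cfg)
model.print_trainable_parameters()
# ... run the upstream debiasing objective here ...

# Phase 2 (downstream): keep the acquired debiasing parameters frozen and
# fine-tune the rest of the model on the target task.
for name, param in model.named_parameters():
    if "lora_" in name:
        param.requires_grad = False   # freeze the debiasing parameters
    else:
        param.requires_grad = True    # fine-tune the backbone on the task
# ... run downstream task fine-tuning here ...
```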
To enhance the performance of large language models (LLMs) on downstream tasks, one solution is to fine-tune certain LLM parameters so that the model better aligns with the characteristics of the training dataset. This proces...
WALLEDEVAL is a comprehensive AI safety testing toolkit designed to evaluate large language models (LLMs). It accommodates a diverse range of models, including both open-weight and API-based ones, and features over 35...