Search Results - Inner Mongolia University Library

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Authors: Baek, Jinheon; Jeong, Soyeong; Kang, Minki; Park, Jong C.; Hwang, Sung Ju (Korea Adv Inst Sci & Technol, Daejeon, South Korea)

ISBN: (print) 9798891760608

Recent Language Models (LMs) have shown impressive capabilities in generating texts with the knowledge internalized in parameters. Yet, LMs often generate the factually incorrect responses to the given queries, since their knowledge may be inaccurate, incomplete, and outdated. To address this problem, previous works propose to augment LMs with the knowledge retrieved from an external knowledge source. However, such approaches often show suboptimal text generation performance due to two reasons: 1) the model may fail to retrieve the knowledge relevant to the given query, or 2) the model may not faithfully reflect the retrieved knowledge in the generated text. To overcome these, we propose to verify the output and the knowledge of the knowledge-augmented LMs with a separate verifier, which is a small LM that is trained to detect those two types of errors through instruction-finetuning. Then, when the verifier recognizes an error, we can rectify it by either retrieving new knowledge or generating new text. Further, we use an ensemble of the outputs from different instructions with a single verifier to enhance the reliability of the verification processes. We validate the effectiveness of the proposed verification steps on multiple question answering benchmarks, whose results show that the proposed verifier effectively identifies retrieval and generation errors, allowing LMs to provide more factually correct outputs. Our code is available at https://***/JinheonBaek/KALMV.

Keywords: Errors

Sequential Text-based Knowledge Update with Self-Supervised Learning for Generative Language Models

32nd ACM International Conference on Information and Knowledge Management (CIKM)

Authors: Sung, Hao-Ru; Tang, Ying-Jhe; Cheng, Yu-Chung; Chen, Pai-Lin; Li, Tsai-Yen; Huang, Hen-Hsen (Natl Chengchi Univ, Taipei, Taiwan; Acad Sinica, Taipei, Taiwan)

ISBN: (print) 9798400701245

This work proposes a new natural language processing (NLP) task to tackle the issue of multi-round, sequential text-based knowledge update. The study introduces a hybrid learning architecture and a novel self-supervised training strategy to enable generative language models to consolidate knowledge in the same way as humans. A dataset was also created for evaluation and results showed the effectiveness of our methodology. Experimental results confirm the superiority of the proposed approach over existing models and large language models (LLMs). The proposed task and model framework have the potential to significantly improve the automation of knowledge organization, making text-based knowledge an increasingly crucial resource for powerful LLMs to perform various tasks for humans.

Keywords: natural language generation; temporal knowledge modeling; update summarization; self-supervision

Prompt-based Generation of Natural Language Explanations of Synthetic Lethality for Cancer Drug Discovery

Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024

Authors: Zhang, Ke; Feng, Yimiao; Zheng, Jie (School of Information Science and Technology, ShanghaiTech University, Shanghai, China; Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai, China; University of Chinese Academy of Sciences, Beijing, China; Lingang Laboratory, Shanghai, China; Shanghai Engineering Research Center of Intelligent Vision and Imaging, ShanghaiTech University, Shanghai, China)

ISBN: (print) 9782493814104

Synthetic lethality (SL) offers a promising approach for targeted anti-cancer therapy. Deeply understanding SL gene pair mechanisms is vital for anti-cancer drug discovery. However, current wet-lab and machine learning-based SL prediction methods lack user-friendly and quantitatively evaluable explanations. To address these problems, we propose a prompt-based pipeline for generating natural language explanations. We first construct a natural language dataset named NexLeth. This dataset is derived from New Bing through prompt-based queries and expert annotations and contains 707 instances. NexLeth enhances the understanding of SL mechanisms and it is a benchmark for evaluating SL explanation methods. For the task of natural language generation for SL explanations, we combine subgraph explanations from an SL knowledge graph (KG) with instructions to construct novel personalized prompts, so as to inject the domain knowledge into the generation process. We then leverage the prompts to fine-tune pre-trained biomedical language models on our dataset. Experimental results show that the fine-tuned model equipped with designed prompts performs better than existing biomedical language models in terms of text quality and explainability, suggesting the potential of our dataset and the fine-tuned model for generating understandable and reliable explanations of SL mechanisms. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Keywords: knowledge graph

SCF-Stega: Controllable Linguistic Steganography Based on Semantic Communications Framework

2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025

Authors: Long, Yilin; Yang, Zhongliang; Wang, Zhuang; Zhou, Zhili; Huang, Yongfeng; Zhou, Linna (School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing, China; Department of Electronic Engineering, Tsinghua University, Beijing, China; School of Artificial Intelligence, Guangzhou University, Guangzhou, China)

ISBN: (print) 9798350368741

Linguistic steganography is a key information hiding technique but faces challenges like abrupt content shifts, detection risks, and high training resource demands. To address these, this paper introduces SCF-Stega, a controllable method based on Semantic Communications Framework. By using a knowledge graph to guide secret encoding and dynamically adjusting large language model outputs, SCF-Stega enhances text imperceptibility and semantic coherence. Experiments show improved text quality and strong resistance to steganalysis, without needing additional training data. © 2025 IEEE.

Keywords: knowledge graph; large language models; Linguistic steganography; semantic communications

Semi-Supervised Knowledge Distillation Framework towards Lightweight Large Language Model for Spoken Language Translation

2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025

Authors: Rajkhowa, Tonmoy; Chowdhury, Amartya Roy; Tripathi, Achyut Mani; Sharma, Sanjeev; Pandey, Om Jee (Varanasi, India; Indian Institute of Technology Dharwad, Karnataka, India)

ISBN: (print) 9798350368741

Even though large language models (LLMs) have demonstrated remarkable performance across various natural language processing tasks, their application in speech-related tasks has largely remained underexplored. This work addresses this gap by incorporating acoustic features into an LLM which can be fine-tuned for downstream direct speech-to-text translation and automatic speech recognition tasks. To address the computational demands associated with fine-tuning LLMs, a novel self and semi-supervised knowledge distillation technique is proposed to implement a lightweight LLM having 50% fewer parameters. Validated on the MuST-C and Librispeech datasets, this technique achieves over 92% of the performance of the larger LLM, demonstrating both robust performance and computational efficiency. © 2025 IEEE.

Keywords: Direct Speech-to-Text Translation; Knowledge Distillation; Large Language Model

Towards Understanding Contracts Grammar: A Large Language Model-based Extractive Question-Answering Approach

32nd IEEE International Requirements Engineering Conference (RE)

Authors: Rejithkumar, Gokul; Anish, Preethu Rose; Ghaisas, Smita (TCS Res, Pune, India)

ISBN: (print) 9798350395129; 9798350395112

Software engineering (SE) contracts play a pivotal role in Information Technology Outsourcing (ITO) projects. The obligations in SE contracts are known to be a useful source for deriving software requirements, thereby contributing to the overall Software Development Life Cycle (SDLC). Making sense of contractual obligations is an important first step in successfully executing software projects. This includes building compliant systems, meeting delivery deadlines, avoiding heavy penalties, and steering clear of expensive litigations. In this work, we present an approach to capture the essence of a contractual clause by extracting its Contracts Grammar. Through an exploratory study, we first identify the constituents of Contracts Grammar. Subsequently, we experiment with multiple approaches for the automated extraction of these constituents, including extractive question-answering, token classification, text-to-text generation, prompting, and regular expressions. The question-answering based approach performed the best in terms of high average ROUGE-L score of 0.81, and faster inference times. The work presented in this paper is a part of the Contracts Governance System (CGS) and is in the process of deployment within a large IT vendor organization.

Keywords: text extraction; deep learning; natural language processing; large language models; question-answering; token classification; text-to-text generation; prompting; empirical research

Explicable knowledge graph (X-KG): generating knowledge graphs for explainable artificial intelligence and querying them by translating natural language queries to SPARQL

International Journal of Information Technology (Singapore), 2024, Vol. 16, No. 3, pp. 1605-1615

Authors: Shaikh, Numair; Chauhan, Tavishee; Patil, Jayesh; Sonawane, Sheetal (Department of Computer Engineering, SCTR’s Pune Institute of Computer Technology, Maharashtra, Dhankawdi, Pune 411043, India)

Knowledge graphs represent a potent instrument for the classification and exhibition of data, as they encompass a systematic approach for the containment and retrieval of multifarious datasets. In finance, the utilization of knowledge graphs for the organization of company-oriented data constitutes an invaluable source of insights, thus enabling informed decision-making. In a parallel, knowledge graph systems centered on COVID-19 within the healthcare sphere may assist medical professionals in the making of resolute choices. These applications highlight knowledge graphs’ ability to revolutionize decision-making procedures by providing a comprehensive comprehension of the given subject. To tackle this, we propose a solution that begets and implements knowledge graphs in two separate domains: finance and healthcare. To ensure the creation of explicable AI systems and improve the accessibility of information within these knowledge graphs, we introduce the conversion of natural language queries into SPARQL queries. By fine-tuning our model, we illustrate the system’s superior performance. Furthermore, we appraise the adequacy of the constructed knowledge graphs and contrast them with widely employed alternatives. Our work accentuates the adaptability of the proposed solution, as it can operate seamlessly with diverse datasets requiring minimal modifications. © 2024, The Author(s), under exclusive licence to Bharati Vidyapeeth's Institute of Computer Applications and Management.

Keywords: Explainable artificial intelligence; knowledge graphs; natural language processing; Q&A systems; Query translation

Enhancing Multi-Person Dialogue with Large Language Models: A Structured Approach to Natural Communication

8th International Conference on Natural Language Processing and Information Retrieval, NLPIR 2024

Authors: Murogaki, Takumi; Nishimura, Toshikazu (Graduate School of Information Science and Engineering, Ritsumeikan University, Ibaraki, Japan; College of Information Science and Engineering, Ritsumeikan University, Ibaraki, Japan)

ISBN: (print) 9798400717383

The rise of social networking services has increased text-based communication, often leading to misunderstandings. This study aims to develop a system using large language models (LLMs) like ChatGPT to provide real-time support in human dialogues. Traditional LLM chatbots, designed for one-on-one interactions, struggle with multi-person conversations, often leading to unnatural responses. This research proposes methods to enhance LLM's ability to distinguish between different speakers and improve its reasoning capabilities. By implementing "conversation structure tags" and simulating multi-person arguments, the system aims to generate natural, context-aware responses, enhancing the dialogue's quality and engagement. © 2024 Copyright held by the owner/author(s).

Keywords: Speech enhancement

NLP Research: A Historical Survey and Current Trends in Global, Indic, and Gujarati Languages

4th International Conference on Ubiquitous Computing and Intelligent Information Systems, ICUIS 2024

Authors: Panchal, Brijeshkumar Y.; Shah, Apurva (The Maharaja Sayajirao University of Baroda, Faculty of Technology and Engineering, Computer Science and Engineering Department, Gujarat, Vadodara, India; Computer Engineering Department, Gujarat, Anand, India)

ISBN: (print) 9798331529635

This research study presents a comprehensive survey of natural language processing (NLP) research, tracing its historical evolution from its inception to the present. The survey explores the key milestones and advancements in NLP, focusing on global trends, as well as specific contributions from Indic languages, with a particular focus on Gujarati. The paper analyzes the current state of Gujarati NLP research, identifying existing gaps and challenges. By providing a comprehensive overview of NLP research, this study aims to guide future research directions and foster advancements in the field, particularly in the context of Indic languages. © 2024 IEEE.

Keywords: natural language processing systems

SEPLN-P 2024 - Poster Proceedings of the 40th Annual Conference of the Spanish Association for Natural Language Processing 2024, co-located with the 40th International Conference of the Spanish Society for Natural Language Processing, SEPLN 2024

40th Annual Conference of the Spanish Association for Natural Language Processing, SEPLN-P 2024

The proceedings contain 17 papers. The topics discussed include: on the relationship of social gender equality and grammatical gender in pre-trained large language models; the difficulty of misinformation labelling: a case study for radon gas-related searches; findings of a machine translation shared task focused on Covid-19 related documents; COCOTEROS: a Spanish corpus with contextual knowledge for natural language generation; emotions and news structure: an analysis of the language of fake news in Spanish; towards multi-class smishing detection: a novel feature vector approach and the Smishing-4C Dataset; synthetic annotated data for named entity recognition in computed tomography scan reports; Spanish FatPhoCorpus 2023: combating fatphobia in social media in Spanish using transformers; Spanish-language platform for drug-disease evidence search based on scientific articles; and automatic pathology detection in Spanish clinical notes combining language models and medical ontologies.
