ISBN:
(Print) 9549174336
The use of ever larger corpora for NLP research seems to reflect the folk theorem that increasing the size of the training data for supervised, and definitely for unsupervised, machine learning approaches will (always) improve the quality of the learning results for various NLP tasks. We challenge this general assumption in the light of empirical counterevidence. Following up on work in machine translation and word sense disambiguation, we wanted to estimate the necessary and sufficient, and hence fully adequate, size of the underlying training corpora. We conducted various experimental studies on the unsupervised disambiguation of ambiguous prepositional phrase attachments for English and German. Based on this evidence, we are able to estimate reasonable upper bounds on the sufficient size of a proper training corpus, for this task at least.
We present Legal Argument Reasoning (LAR), a novel task designed to evaluate the legal reasoning capabilities of Large Language Models (LLMs). The task requires selecting the correct next statement (from multiple choi...
We present experiments that analyze the necessity of using a highly interconnected word/sense graph for unsupervised all-words word sense disambiguation. We show that allowing only grammatically related words to influ...
ISBN:
(Print) 9781728150147
It is difficult for a language model (LM) to perform well with limited in-domain transcripts in low-resource speech recognition. In this paper, we summarize and extend several effective methods to make the most of out-of-domain data to improve LMs. These methods include data selection, vocabulary expansion, lexicon augmentation, and multi-model fusion, among others. The methods are integrated into a systematic procedure, which proves effective for improving both n-gram and neural network LMs. Additionally, word vectors pre-trained on out-of-domain data are utilized to improve the performance of RNN/LSTM LMs for rescoring first-pass decoding results. Experiments on five Asian languages from Babel Build Packs show that, after improving the LMs, a 5.4-7.6% relative reduction in word error rate (WER) is generally achieved compared to the baseline ASR systems. For some languages, we achieve lower WER than newly published results on the same data sets.
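The multi-model fusion idea in this abstract can be illustrated by the simplest case: linearly interpolating an in-domain LM with one estimated on out-of-domain data. The sketch below is a minimal, hypothetical unigram version, not the paper's actual procedure; the interpolation weight `lam` and the toy corpora are assumptions for illustration.

```python
from collections import Counter

def unigram_lm(tokens):
    """Maximum-likelihood unigram probabilities from a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def interpolate(p_in, p_out, lam):
    """Linear interpolation: lam * in-domain + (1 - lam) * out-of-domain."""
    vocab = set(p_in) | set(p_out)
    return {w: lam * p_in.get(w, 0.0) + (1 - lam) * p_out.get(w, 0.0)
            for w in vocab}

# Toy in-domain and out-of-domain corpora (hypothetical).
in_domain = "call mom call dad".split()
out_domain = "call the office now".split()
mixed = interpolate(unigram_lm(in_domain), unigram_lm(out_domain), lam=0.7)
```

The interpolated model remains a proper distribution (probabilities sum to one) while gaining coverage of out-of-domain words such as "office" that the in-domain LM never saw.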
Question detection plays an important role in the cQA question retrieval task. While detecting questions in a standard-language corpus is relatively easy, it becomes a great challenge for online content. Online questions are usually long and informal, and standard cues such as a question mark or 5W1H words are often absent. In this paper, we explore question characteristics in cQA services and propose an automated approach to detect question sentences based on lexical and syntactic features. Our model is capable of handling informal online language. The empirical evaluation further demonstrates that our model significantly outperforms traditional methods in detecting online question sentences, and that it considerably boosts question retrieval performance in cQA.
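The lexical cues the abstract mentions (question mark, 5W1H words) can be sketched as feature extractors feeding a simple rule. This is a minimal illustration under assumed cue lists, not the paper's learned model, which also uses syntactic features.

```python
import re

# 5W1H cue words plus a few auxiliaries that signal subject-aux inversion.
WH_WORDS = {"who", "what", "when", "where", "why", "how"}
AUX_WORDS = {"do", "does", "did", "is", "are", "can", "could", "would", "should"}

def question_features(sentence):
    """Extract simple lexical cues for question detection."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return {
        "has_qmark": sentence.rstrip().endswith("?"),
        "starts_with_wh": bool(tokens) and tokens[0] in WH_WORDS,
        "has_wh": any(t in WH_WORDS for t in tokens),
        "starts_with_aux": bool(tokens) and tokens[0] in AUX_WORDS,
    }

def looks_like_question(sentence):
    """Rule-based stand-in for a trained classifier over these features."""
    f = question_features(sentence)
    return f["has_qmark"] or f["starts_with_wh"] or f["starts_with_aux"]
```

Note that the rule fires on "how do i reset my router" even without a question mark, which is exactly the informal-online-text case the paper targets; a learned classifier would weight such features rather than OR them together.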
ISBN:
(Print) 9789811016752; 9789811016745
In natural language processing (NLP), language identification is the problem of determining which natural language(s) are used in a written script. This paper presents a methodology for language identification from multilingual documents written in Indian language(s). The main objective of this research is to automatically, quickly, and accurately recognize the language(s) in a multilingual document written in Indian language(s) and then separate the content by language, using the Unicode Transformation Format (UTF). The proposed methodology is applicable as a preprocessing step in document classification and in a number of applications such as POS tagging, information retrieval, search engine optimization, and machine translation for Indian languages. Sixteen different Indian languages were used for the empirical study. The corpus texts were collected randomly from the web, and 822 documents were prepared, comprising 300 Portable Document Format (PDF) files and 522 text files. Each of the 822 documents contained more than 800 words written in different and multiple Indian languages at the sentence level. The proposed methodology has been implemented using UTF-8 through the free and open-source Java Server Pages (JSP) technology. Execution on the 522 text documents yielded an accuracy of 99.98%, whereas the 300 PDF documents yielded an accuracy of 99.28%. The accuracy on text files exceeds that on PDF files by 0.70%, due to corrupted text appearing in the PDF files.
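Because most Indian languages use distinct scripts with dedicated Unicode blocks, the UTF-based identification the abstract describes can be sketched by counting code points per block. The block ranges below come from the Unicode standard; treating script as a proxy for language, and the majority-count rule, are simplifying assumptions rather than the paper's exact algorithm.

```python
# Unicode block ranges for a few Indian scripts (per the Unicode standard).
SCRIPT_RANGES = {
    "Devanagari": (0x0900, 0x097F),
    "Bengali":    (0x0980, 0x09FF),
    "Tamil":      (0x0B80, 0x0BFF),
    "Telugu":     (0x0C00, 0x0C7F),
}

def identify_script(text):
    """Return the script whose Unicode block covers the most characters in text."""
    counts = {name: 0 for name in SCRIPT_RANGES}
    for ch in text:
        cp = ord(ch)
        for name, (lo, hi) in SCRIPT_RANGES.items():
            if lo <= cp <= hi:
                counts[name] += 1
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "Unknown"
```

Running the same per-character check at sentence granularity is what allows a multilingual document to be split into per-language segments, as the paper proposes.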
ISBN:
(Digital) 9781510623118
ISBN:
(Print) 9781510623118
Automatically generating rich natural language descriptions for open-domain videos is among the most challenging tasks in computer vision, natural language processing, and machine learning. Based on the general encoder-decoder framework, we propose a bidirectional long short-term memory network with spatial-temporal attention over multiple features of objects, activities, and scenes, which can learn valuable and complementary high-level visual representations and dynamically focus on the most important context information across diverse frames within different subsets of videos. Experimental results show that our proposed methods achieve performance competitive with or better than the state of the art on the MSVD video dataset.
ISBN:
(Print) 9798891760615
Multi-modal open-domain question answering typically requires evidence retrieval from databases across diverse modalities, such as images, tables, and passages. Even Large Language Models (LLMs) like GPT-4 fall short in this task. To enable LLMs to tackle the task in a zero-shot manner, we introduce MOQAGPT, a straightforward and flexible framework. Using a divide-and-conquer strategy that bypasses intricate multi-modality ranking, our framework can accommodate new modalities and seamlessly transition to new models for the task. Built upon LLMs, MOQAGPT retrieves and extracts answers from each modality separately, then fuses this multi-modal information using LLMs to produce a final answer. Our methodology boosts performance on the MMCoQA dataset, improving F1 by +37.91 points and EM by +34.07 points over the supervised baseline. On the MultiModalQA dataset, MOQAGPT surpasses the zero-shot baseline, improving F1 by 9.5 points and EM by 10.1 points, and significantly closes the gap with supervised methods.
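The divide-and-conquer structure described above (answer per modality, then fuse) can be sketched in a few lines. The per-modality answerers and the majority-vote fusion below are toy stand-ins for MOQAGPT's actual retrievers and LLM-based fusion; all names here are hypothetical.

```python
from collections import Counter

def answer_question(question, modality_answerers, fuse):
    """Divide-and-conquer: query each modality independently, then fuse."""
    candidates = {name: answerer(question)
                  for name, answerer in modality_answerers.items()}
    return fuse(question, candidates)

def majority_fuse(question, candidates):
    """Pick the most frequent non-'unknown' candidate (a crude proxy for LLM fusion)."""
    votes = Counter(a for a in candidates.values() if a != "unknown")
    return votes.most_common(1)[0][0] if votes else "unknown"

# Toy stand-ins for real per-modality retriever/extractor pipelines.
modality_answerers = {
    "text":  lambda q: "Paris",
    "table": lambda q: "Paris",
    "image": lambda q: "unknown",
}
```

Because each modality is handled independently, adding a new modality only means adding one more entry to the dictionary, which mirrors the extensibility claim in the abstract.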
ISBN:
(Print) 1424407281
Language recognition is typically performed with methods that exploit phonotactics: a phone recognition language modeling (PRLM) system. A PRLM system converts speech to a lattice of phones and then scores a language model. A standard extension of this scheme uses multiple parallel phone recognizers (PPRLM). In this paper, we modify this approach in two distinct ways. First, we replace the phone tokenizer with a powerful speech-to-text system. Second, we use a discriminative support vector machine for language modeling. Our goals are twofold. First, we explore the ability of a single speech-to-text system to distinguish multiple languages. Second, we fuse the new system with an SVM PRLM system to see if it complements current approaches. Experiments on the 2005 NIST language recognition corpus show that the new word system accomplishes these goals and has significant potential for language recognition.
The construction of a speech understanding application requires a method for extracting language models of appropriate size and perplexity from the application grammar. We describe a method for approximating context-f...