ISBN (Print): 9781728186535
Fake news is a recent phenomenon in which false or fraudulent information spreads through online social media or traditional news media. Today, fake news can be created and distributed easily across many social media platforms and has a widespread impact on the real world. It is critical to develop efficient algorithms and tools that detect early how false information is disseminated on social media platforms and why it succeeds in deceiving users. Most current research methods rely on machine learning, deep learning, feature engineering, graph mining, image and video analysis, and newly developed datasets and web services for detecting deceptive content; hence there is a strong need for a method that can detect false information reliably. This study proposes a hybrid approach that combines a CNN model and an RNN-LSTM model to detect false information. First, the NLTK toolkit is used to remove stop words, punctuation, and special characters from the text; the same toolkit then tokenizes and preprocesses the text. GloVe word embeddings are then applied to the preprocessed text. Higher-level features of the input text are extracted by the CNN model using convolutional and max-pooling layers, while long-term dependencies between word sequences are captured by the RNN-LSTM model. The model also applies dropout together with Dense layers to improve the efficiency of the hybrid architecture. Trained with the Adam optimizer and the binary cross-entropy loss function, the proposed CNN/RNN-LSTM hybrid achieves an accuracy of 92%, surpassing most of the classical models in use today.
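A minimal sketch of the CNN + RNN-LSTM hybrid described above, written with tf.keras. The framework, layer sizes, sequence length, and vocabulary size are assumptions; the abstract does not specify them. A real run would load pretrained GloVe vectors into `embedding_matrix`; a random placeholder is used here.

```python
# Sketch only: hyperparameters are illustrative, not the paper's actual settings.
import numpy as np
from tensorflow.keras import layers, models

VOCAB_SIZE, MAX_LEN, EMBED_DIM = 20_000, 300, 100   # assumed preprocessing choices
embedding_matrix = np.random.normal(size=(VOCAB_SIZE, EMBED_DIM)).astype("float32")

embedding = layers.Embedding(VOCAB_SIZE, EMBED_DIM, trainable=False)  # frozen GloVe embeddings
model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    embedding,
    layers.Conv1D(128, kernel_size=5, activation="relu"),  # local n-gram features
    layers.MaxPooling1D(pool_size=2),                      # down-sample feature maps
    layers.LSTM(64),                                       # long-range dependencies
    layers.Dropout(0.5),                                   # regularization
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),                 # real vs. fake
])
embedding.set_weights([embedding_matrix])                  # load the (placeholder) GloVe matrix

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```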
ISBN (Print): 9781618392473
This paper explores two different methods of learning dialectal morphology from a small parallel corpus of standard and dialect-form text, given that a computational description of the standard morphology is available. The goal is to produce a model that translates individual lexical dialectal items to their standard dialect counterparts in order to facilitate dialectal use of available NLP tools that only assume standard-form input. The results show that a learning method based on inductive logic programming quickly converges to the correct model with respect to many phonological and morphological differences that are regular in nature.
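The abstract does not detail the inductive logic programming learner itself, so the following is only a drastically simplified stand-in: a naive suffix-rewrite inducer that learns dialect-to-standard rules from a small parallel word list and applies the most frequent matching rule to unseen forms. The word pairs are invented for illustration.

```python
# Naive suffix-rewrite rule induction from (dialect, standard) word pairs.
# A toy simplification of the paper's ILP-based learning; data is hypothetical.
from collections import Counter

def common_prefix_len(a, b):
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

pairs = [  # invented dialect/standard pairs
    ("talat", "talade"), ("kastat", "kastade"), ("hoppat", "hoppade"),
    ("huse", "huset"), ("barne", "barnet"),
]

rules = Counter()
for dial, std in pairs:
    k = common_prefix_len(dial, std)
    rules[(dial[k:], std[k:])] += 1          # (dialect suffix -> standard suffix)

def to_standard(word):
    """Apply the most frequent (then longest-matching) induced rule."""
    candidates = [(count, d_suf, s_suf) for (d_suf, s_suf), count in rules.items()
                  if word.endswith(d_suf)]
    if not candidates:
        return word
    count, d_suf, s_suf = max(candidates, key=lambda c: (c[0], len(c[1])))
    return word[:len(word) - len(d_suf)] + s_suf

print(to_standard("ropat"))   # -> "ropade" via the induced "t" -> "de" rule
```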
Graph-based text representation focuses on how text documents are represented as graphs for exploiting dependency information between tokens and documents within a corpus. Despite the increasing interest in graph repr...
Many natural language processing applications require semantic knowledge about topics in order to be possible or efficient. We therefore developed a system, SEGAPSITH, that acquires such knowledge automatically from text segments using an unsupervised, incremental clustering method. In such an approach, an important problem is the validation of the learned classes. To address it, we applied a second clustering method, which only needs to know the number of classes to build, to the same subset of text segments, and we recast our evaluation problem as a comparison of the two classifications. We established several criteria to compare them, based either on the words used as class descriptors or on the thematic units. Our first results show a strong correlation between the two classifications.
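The abstract does not give the exact comparison criteria, so here is a minimal sketch of how two clusterings of the same segments could be compared: partition-level agreement via scikit-learn's adjusted Rand index, and descriptor-level agreement via Jaccard overlap of class descriptor words. The labels, descriptor sets, and class alignment below are hypothetical.

```python
# Comparing two clusterings of the same text segments (toy data).
from sklearn.metrics import adjusted_rand_score

labels_a = [0, 0, 1, 1, 2, 2, 2, 1]   # unsupervised incremental clustering
labels_b = [1, 1, 0, 0, 2, 2, 2, 0]   # fixed-k reference clustering
print("adjusted Rand index:", adjusted_rand_score(labels_a, labels_b))  # 1.0 here

# Descriptor-word comparison between aligned classes (hypothetical descriptors).
descriptors_a = {0: {"price", "market", "share"}, 1: {"match", "goal", "team"}}
descriptors_b = {1: {"price", "market", "stock"}, 0: {"match", "team", "league"}}

def jaccard(s, t):
    return len(s & t) / len(s | t)

alignment = {0: 1, 1: 0}   # assumed class correspondence a_class -> b_class
for a_cls, b_cls in alignment.items():
    print(f"class {a_cls} vs {b_cls}:",
          round(jaccard(descriptors_a[a_cls], descriptors_b[b_cls]), 2))
```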
Neural networks are a family of powerful machine learning models. This book focuses on the application of neural network models to natural language data. The first half of the book (Parts I and II) covers the basics o...
A tool to automatically generate natural language documentation summaries for methods is presented. The approach uses prior work by the authors on stereotyping methods along with the source code analysis framework srcML. First, each method is automatically assigned one or more stereotypes based on static analysis and a set of heuristics. Then, the approach uses the stereotype information, static analysis, and predefined templates to generate a natural-language summary for each method. This summary is automatically added to the code base as a comment for the method. The predefined templates are designed to produce a generic summary for specific method stereotypes.
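The tool's actual stereotype taxonomy and templates are not given in the abstract; the sketch below only illustrates the template idea, mapping a few commonly cited method stereotypes to fill-in phrases. Stereotype names, templates, and the facts passed in are invented.

```python
# Template-based summary generation keyed on method stereotype (illustrative only).
TEMPLATES = {
    "get":       "Returns the value of the {field} field.",
    "set":       "Sets the {field} field to the given value.",
    "predicate": "Returns true if {condition}, false otherwise.",
    "command":   "Performs the {action} operation on this object.",
}

def summarize(stereotype, **facts):
    """Render a one-line documentation comment from static-analysis facts."""
    template = TEMPLATES.get(stereotype, "Implements the {name} behavior.")
    return "/** " + template.format(**facts) + " */"

# Example: a getter identified by static analysis.
print(summarize("get", field="balance"))
# -> /** Returns the value of the balance field. */
```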
Recently, graph convolutional networks (GCNs) for text classification have received considerable attention in natural language processing. However, most current methods use only the original documents and words in the corpus to construct the graph topology, which may lose some useful information. In this paper, we propose a Multi-Stream Graph Convolutional Network (MS-GCN) for text classification via Representative-Word Document (RWD) mining, implemented in PyTorch. In the proposed method, we first introduce temporary labels and mine the RWDs, which are treated as additional documents in the corpus. Then, we build a heterogeneous graph based on the relations among a Group of RWDs (GRWDs), words, and original documents. Furthermore, we construct the MS-GCN from multiple heterogeneous graphs corresponding to different GRWDs. Finally, we optimize our MS-GCN model through an update mechanism for the GRWDs. We evaluate the proposed approach on six text classification datasets: 20NG, R8, R52, Ohsumed, MR, and Pheme. Extensive experiments on these datasets show that our proposed approach outperforms state-of-the-art methods for text classification.
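The abstract names the building blocks (one GCN per heterogeneous graph, with the streams combined) but not the layer equations or fusion rule. Below is a minimal dense-matrix PyTorch sketch under those assumptions: a two-layer GCN per stream and simple mean fusion of the per-stream logits; dimensions, the number of streams, and the fusion rule are illustrative.

```python
# Multi-stream GCN sketch: one small GCN per adjacency matrix, logits averaged.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, a_norm, h):
        # a_norm: normalized adjacency (N x N), h: node features (N x in_dim)
        return self.linear(a_norm @ h)

class StreamGCN(nn.Module):
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.gc1 = GCNLayer(in_dim, hid_dim)
        self.gc2 = GCNLayer(hid_dim, n_classes)

    def forward(self, a_norm, x):
        h = F.relu(self.gc1(a_norm, x))
        return self.gc2(a_norm, h)           # per-node class logits

class MultiStreamGCN(nn.Module):
    def __init__(self, n_streams, in_dim, hid_dim, n_classes):
        super().__init__()
        self.streams = nn.ModuleList(
            [StreamGCN(in_dim, hid_dim, n_classes) for _ in range(n_streams)])

    def forward(self, adjs, x):
        # adjs: one normalized adjacency per stream (one heterogeneous graph each)
        logits = [gcn(a, x) for gcn, a in zip(self.streams, adjs)]
        return torch.stack(logits).mean(dim=0)   # mean fusion (assumed)

# Toy usage: 10 nodes (documents + words), 3 streams, 4 classes.
N, n_streams, n_classes = 10, 3, 4
x = torch.eye(N)                                  # one-hot node features
adjs = [torch.eye(N) for _ in range(n_streams)]   # placeholder adjacencies
model = MultiStreamGCN(n_streams, in_dim=N, hid_dim=16, n_classes=n_classes)
print(model(adjs, x).shape)                       # torch.Size([10, 4])
```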
Opinion summarization can facilitate users' decision-making by mining the salient information in reviews. However, due to the lack of sufficient annotated data, most early works are based on extractive methods, which limits the performance of opinion summarization. In this work, we aim to improve the informativeness of opinion summarization to provide better guidance to users. We consider the setting with only reviews and no corresponding summaries, and propose an aspect-augmented model for unsupervised abstractive opinion summarization, denoted AsU-OSum. We first employ an aspect-based sentiment analysis system to extract opinion phrases from reviews. Then, we construct a heterogeneous graph with reviews and opinion clusters as nodes, which is used to enhance the Transformer-based encoder-decoder framework. Furthermore, we design a novel cascaded attention mechanism that prompts the decoder to pay more attention to the aspects that are most likely to appear in the summary. During training, we introduce a sentiment accuracy reward that further enhances the learning ability of our model. We conduct comprehensive experiments on the Yelp, Amazon, and Rotten Tomatoes datasets. Automatic evaluation results show that our model is competitive and performs better than state-of-the-art (SOTA) models on some ROUGE metrics. Human evaluation results further verify that our model generates more informative summaries and reduces redundancy.
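One plausible reading of the cascaded attention is sketched below: the decoder state first attends over opinion-cluster vectors, and the resulting cluster weights then rescale the token-level attention over encoder states. This is not the paper's exact formulation; the dimensions, tensors, and token-to-cluster mapping are all invented for illustration.

```python
# Toy "cascaded" attention step (illustrative reading, not the paper's mechanism).
import torch
import torch.nn.functional as F

d, n_tokens, n_clusters = 64, 12, 3
dec_state = torch.randn(d)                    # current decoder hidden state
enc_states = torch.randn(n_tokens, d)         # encoder token representations
cluster_vecs = torch.randn(n_clusters, d)     # opinion-cluster representations
token2cluster = torch.randint(0, n_clusters, (n_tokens,))  # cluster of each token

# Stage 1: attention over opinion clusters.
cluster_weights = F.softmax(cluster_vecs @ dec_state, dim=0)          # (n_clusters,)

# Stage 2: token-level attention, biased by each token's cluster weight.
token_scores = enc_states @ dec_state                                  # (n_tokens,)
token_scores = token_scores + torch.log(cluster_weights[token2cluster] + 1e-9)
token_weights = F.softmax(token_scores, dim=0)

context = token_weights @ enc_states          # context vector fed to the decoder
print(context.shape)                          # torch.Size([64])
```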
Topic modeling is a key research area in natural language processing and has inspired innovative studies in a wide array of social-science disciplines. Yet the use of topic modeling in computational social science has been hampered by two critical issues. First, social scientists tend to focus on a few standard ways of topic modeling, so our understanding of semantic patterns has not been informed by rapid methodological advances in topic modeling; moreover, a systematic comparison of the performance of different methods in this field is warranted. Second, the choice of the optimal number of topics remains a challenging task: comparisons of topic-modeling techniques have rarely been situated in a social-science context, and the choice appears arbitrary to most social scientists. Based on about 120,000 Canadian newspaper articles published since 1977, we review and compare eight traditional, generative, and neural methods for topic modeling (Latent Semantic Analysis, Principal Component Analysis, Factor Analysis, Non-negative Matrix Factorization, Latent Dirichlet Allocation, Neural Autoregressive Topic Model, Neural Variational Document Model, and Hierarchical Dirichlet Process). Three measures (coherence statistics, held-out likelihood, and graph-based dimensionality selection) are then used to assess the performance of these methods. Findings are presented and discussed to guide the choice of topic-modeling methods, especially in social-science research. (C) 2020 Elsevier Inc. All rights reserved.
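A small sketch of the comparison idea, using only one of the three measures (a coherence statistic). It fits scikit-learn's NMF and LDA on a toy corpus and scores each model's top words with a simple UMass-style coherence computed from document co-occurrence. The corpus, topic count, and top-word count are illustrative choices, not the study's setup.

```python
# Comparing two topic models on a toy corpus with a UMass-style coherence score.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import NMF, LatentDirichletAllocation

docs = [
    "the government passed a new climate policy",
    "parliament debated the climate policy bill",
    "the team won the hockey game last night",
    "the hockey season opens with a home game",
    "new policy on carbon emissions announced by government",
    "fans celebrated the game winning goal",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)
vocab = vec.get_feature_names_out()

def umass_coherence(components, X, top_n=5):
    """Average UMass coherence over topics: sum of log((D(wi, wj) + 1) / D(wj))."""
    D = (X.toarray() > 0).astype(int)          # document-word incidence
    scores = []
    for topic in components:
        top = np.argsort(topic)[::-1][:top_n]
        s = 0.0
        for i in range(1, len(top)):
            for j in range(i):
                co = np.sum(D[:, top[i]] * D[:, top[j]])   # co-document frequency
                s += np.log((co + 1) / np.sum(D[:, top[j]]))
        scores.append(s)
    return float(np.mean(scores))

for name, model in [("NMF", NMF(n_components=2, random_state=0)),
                    ("LDA", LatentDirichletAllocation(n_components=2, random_state=0))]:
    model.fit(X)
    top_words = [[vocab[i] for i in np.argsort(t)[::-1][:5]] for t in model.components_]
    print(name, "topics:", top_words,
          "coherence:", round(umass_coherence(model.components_, X), 3))
```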