Search Results - Inner Mongolia University Library

计算机系统应用 (Computer Systems & Applications), 2023, Vol. 32, No. 10, pp. 293-300

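Authors: 罗松 (Luo Song); 汪春梅 (Wang Chunmei); 袁非牛 (Yuan Feiniu); 戴维 (Dai Wei). College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 201418, China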

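Current English grammatical error correction (GEC) models often ignore the syntactic knowledge in text that benefits error correction, which limits their correction ability. To address this problem, an English GEC model based on differential fusion of syntactic features is proposed. First, the proposed syntactic encoder not only generates dependency graphs and constituency tree information directly from text in an unsupervised manner, but also fuses these two heterogeneous syntactic structures and encodes them into a high-dimensional syntactic representation. Second, to exploit both the semantic and the syntactic information in the text, the differential fusion module first uses differential regularization to strengthen the semantic encoder's capture of semantic features that the syntactic encoder fails to generate, and then uses co-attention to further fuse the syntactic and semantic representations. The fused representation serves as the output feature of the Transformer encoder and is finally fed to the decoder to generate grammatically correct text. Comparative experiments on the CoNLL-2014 English error correction dataset show that the method's precision and F0.5 exceed those of a GEC model based on the Copy-Augmented Transformer, with F0.5 improved by 5.2 percentage points; the syntactic knowledge also avoids the problem of too little annotated data, giving better text correction results.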
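The F0.5 value reported in the abstract above is the standard F-beta score with beta = 0.5, which weights precision more heavily than recall. The sketch below only illustrates that formula; the function name `f_beta` and the sample precision and recall values are illustrative assumptions, not figures from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 favours precision over recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is correct
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical values, chosen only to show the effect of beta.
p, r = 0.8, 0.4
f05 = f_beta(p, r, beta=0.5)  # precision-weighted
f1 = f_beta(p, r, beta=1.0)   # balanced
```

With precision above recall, as in this made-up example, F0.5 comes out higher than F1, which is why the metric rewards systems that make fewer but more reliable corrections.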

Keywords: natural language processing; grammatical error correction; syntactic knowledge; co-attention; differential fusion

What Should I Learn First: Introducing Lecture Bank for NLP Education and Prerequisite Chain Learning

33rd AAAI Conference on Artificial Intelligence / 31st Innovative Applications of Artificial Intelligence Conference / 9th AAAI Symposium on Educational Advances in Artificial Intelligence

Authors: Li, Irene; Fabbri, Alexander R.; Tung, Robert R.; Radev, Dragomir R. Yale Univ, Dept Comp Sci, New Haven, CT 06520, USA

ISBN: (print) 9781577358091

Recent years have witnessed the rising popularity of natural language processing (NLP) and related fields such as Artificial Intelligence (AI) and Machine Learning (ML). Many online courses and resources are available even for those without a strong background in the field. Often the student is curious about a specific topic but does not quite know where to begin studying. To answer the question of "what should one learn first," we apply an embedding-based method to learn prerequisite relations for course concepts in the domain of NLP. We introduce Lecture Bank, a dataset containing 1,352 English lecture files collected from university courses which are each classified according to an existing taxonomy as well as 208 manually-labeled prerequisite relation topics, which is publicly available. The dataset will be useful for educational purposes such as lecture preparation and organization as well as applications such as reading list generation. Additionally, we experiment with neural graph-based networks and non-neural classifiers to learn these prerequisite relations from our dataset.

Keywords: graphic methods

Medical knowledge infused convolutional neural networks for cohort selection in clinical trials

Journal of the American Medical Informatics Association, 2019, Vol. 26, No. 11, pp. 1227-1236

Authors: Chen, Chi-Jen; Warikoo, Neha; Chang, Yung-Chun; Chen, Jin-Hua; Hsu, Wen-Lian. Taipei Med Univ, Coll Management, Grad Inst Data Sci, 11F, 172-1, Sec 2, Keelung Rd, Taipei, Taiwan; Natl Yang Ming Univ, Inst Biomed Informat, Taipei, Taiwan; Acad Sinica, Inst Informat Sci, Taiwan Int Grad Program, Bioinformat Program, Taipei, Taiwan; Taipei Med Univ Hosp, Clin Big Data Res Ctr, Taipei, Taiwan; Minist Sci & Technol, Pervas AI Res Labs, Taipei, Taiwan; Acad Sinica, Inst Informat Sci, Taipei, Taiwan

Objective: In this era of digitized health records, there has been a marked interest in using de-identified patient records for conducting various health related surveys. To assist in this research effort, we developed a novel clinical data representation model entitled medical knowledge-infused convolutional neural network (MKCNN), which is used for learning the clinical trial criteria eligibility status of patients to participate in cohort studies. Materials and methods: In this study, we propose a clinical text representation infused with medical knowledge (MK). First, we isolate the noise from the relevant data using a medically relevant description extractor; then we utilize log-likelihood ratio based weights from selected sentences to highlight "met" and "not-met" knowledge-infused representations in a bichannel setting for each instance. The combined medical knowledge-infused representation (MK) from these modules helps identify significant clinical criteria semantics, which in turn renders effective learning when used with a convolutional neural network architecture. Results: MKCNN outperforms other Medical Knowledge (MK) relevant learning architectures by approximately 3%; notably SVM and XGBoost implementations developed in this study. MKCNN scored 86.1% on the F1 metric, a gain of 6% above the average performance assessed from the submissions for the n2c2 task. Although pattern/rule-based methods show a higher average performance for the n2c2 clinical data set, MKCNN significantly improves performance of machine learning implementations for clinical datasets. Conclusion: MKCNN scored 86.1% on the F1 score metric. In contrast to many of the rule-based systems introduced during the n2c2 challenge workshop, our system presents a model that heavily draws on machine-based learning. In addition, the MK representations add more value to clinical comprehension and interpretation of natural texts.

Keywords: natural language processing; cohort selection; clinical trials; convolutional neural network; medical records

Advanced Analysis Methods for Large-scale Structured Data

Author: Zhou, Fan. The University of North Carolina at Chapel Hill

Degree: Ph.D.

In the era of 'big data', advanced storage and computing technologies allow people to build and process large-scale datasets, which promote the development of many fields such as speech recognition, natural language processing and computer vision. Traditional approaches cannot handle the heterogeneity and complexity of some novel data structures. In this dissertation, we want to explore how to combine different tools to develop new methodologies in analyzing certain kinds of structured data, motivated by real-world problems. Multi-group design, such as the Alzheimer’s Disease Neuroimaging Initiative (ADNI), has been undertaken by recruiting subjects based on their multi-class primary disease status, while some extensive secondary outcomes are also collected. Analysis by standard approaches is usually distorted because of the unequal sampling rates of different classes. In the first part of the dissertation, we develop a general regression framework for the analysis of secondary phenotypes collected in multi-group association studies. Our regression framework is built on a conditional model for the secondary outcome given the multi-group status and covariates and its relationship with the population regression of interest of the secondary outcome given the covariates. Then, we develop generalized estimation equations to estimate the parameters of interest. We use simulations and a large-scale imaging genetic data analysis of the ADNI data to evaluate the effect of the multi-group sampling scheme on standard genome-wide association analyses based on linear regression methods, while comparing it with our statistical methods that appropriately adjust for the multi-group sampling scheme. In the past few decades, network data has been increasingly collected and studied in diverse areas, including neuroimaging, social networks and knowledge graphs. In the second part of the dissertation, we investigate the graph-based semi-supervised learning problem with nonignorable non

Keywords: Biostatistics; Artificial intelligence

Scientific Discovery as Link Prediction in Influence and Citation Graphs

12th Workshop on Graph-Based Methods for Natural Language Processing, TextGraphs 2018, in conjunction with the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018

Authors: Luo, Fan; Valenzuela-Escárcega, Marco; Hahn-Powell, Gus; Surdeanu, Mihai. University of Arizona, Tucson, AZ, United States

ISBN: (print) 9781948087254

We introduce a machine learning approach for the identification of "white spaces" in scientific knowledge. Our approach addresses this task as link prediction over a graph that contains over 2M influence statements such as "CTCF activates FOXA1", which were automatically extracted using open-domain machine reading. We model this prediction task using graph-based features extracted from the above influence graph, as well as from a citation graph that captures scientific communities. We evaluated the proposed approach through backtesting. Although the data is heavily unbalanced (50 times more negative examples than positives), our approach predicts which influence links will be discovered in the "near future" with an F1 score of 27 points, and a mean average precision of 68%. © 2018 Association for Computational Linguistics.
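The abstract does not list the graph-based features it uses, so the following is only a minimal sketch of one common link-prediction feature, the Jaccard coefficient of two nodes' neighbor sets. The function name `jaccard_scores`, the toy edge list, and the simplification of treating the influence graph as undirected are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard_scores(edges):
    """Score every unlinked node pair by the Jaccard overlap of its neighbor sets.

    `edges` is an iterable of (u, v) pairs, treated here as undirected.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already linked, so not a candidate "white space"
        union = neighbors[u] | neighbors[v]
        shared = neighbors[u] & neighbors[v]
        scores[(u, v)] = len(shared) / len(union) if union else 0.0
    return scores


# Hypothetical toy graph: A and D are unlinked but share both neighbors.
toy_edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
candidates = jaccard_scores(toy_edges)
```

In a setup like the one described, scores of this kind would be one input among several to a classifier, and the heavy class imbalance is why ranking metrics such as mean average precision are reported alongside F1.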

Keywords: graphic methods

A Graph-theoretic Summary Evaluation for ROUGE

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Authors: ShafieiBavani, Elaheh; Ebrahimi, Mohammad; Wong, Raymond; Chen, Fang. Univ New South Wales, Sydney, NSW, Australia; Data61, CSIRO, Sydney, NSW, Australia

ISBN: (print) 9781948087841

ROUGE is one of the first and most widely used evaluation metrics for text summarization. However, its assessment merely relies on surface similarities between peer and model summaries. Consequently, ROUGE is unable to fairly evaluate summaries including lexical variations and paraphrasing. We propose a graph-based approach adopted into ROUGE to evaluate summaries based on both lexical and semantic similarities. Experiment results over TAC AESOP datasets show that exploiting the lexico-semantic similarity of the words used in summaries would significantly help ROUGE correlate better with human judgments.
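To make the surface-similarity limitation concrete, here is a minimal sketch of plain ROUGE-N recall (clipped n-gram overlap divided by the number of n-grams in the model summary). It is not the graph-based variant the paper proposes; the function names, whitespace tokenization, and example sentences are illustrative assumptions.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(peer: str, model: str, n: int = 1) -> float:
    """Clipped n-gram overlap between peer and model, divided by model n-grams."""
    peer_counts = ngram_counts(peer.lower().split(), n)
    model_counts = ngram_counts(model.lower().split(), n)
    total = sum(model_counts.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((peer_counts & model_counts).values())
    return overlap / total


model_summary = "the cat sat"
exact = rouge_n_recall("the cat sat", model_summary)
paraphrase = rouge_n_recall("the feline sat", model_summary)
```

The paraphrased peer summary scores lower than the exact copy even though "feline" and "cat" mean the same thing, which is the kind of lexical variation the abstract says ROUGE cannot credit.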

Keywords: graphic methods

Graph-Based Deep-Tree Recursive Neural Network (DTRNN) for Text Classification

2018 IEEE Spoken Language Technology Workshop, SLT 2018

Authors: Chen, Fenxiao; Wang, Bin; Jay Kuo, C.-C. University of Southern California, Los Angeles, CA, United States

ISBN: (print) 9781538643341

A novel graph-to-tree conversion mechanism called the deep-tree generation (DTG) algorithm is first proposed to predict text data represented by graphs. The DTG method can generate a richer and more accurate representation for nodes (or vertices) in graphs. It adds flexibility in exploring the vertex neighborhood information to better reflect the second order proximity and homophily equivalence in a graph. Then, a Deep-Tree Recursive Neural Network (DTRNN) method is presented and used to classify vertices that contain text data in graphs. To demonstrate the effectiveness of the DTRNN method, we apply it to three real-world graph datasets and show that the DTRNN method outperforms several state-of-the-art benchmarking methods. © 2018 IEEE.

Keywords: Trees (mathematics)

A Two-Stage Overlapping Community Detection Based on Structure and Node Attributes in Online Social Networks

1st International Workshop on Human Brain and Artificial Intelligence, HBAI 2019, held in conjunction with the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019

Authors: Zhang, Xinmeng; Li, Xinguang; Jiang, Shengyi; Li, Xia; Xie, Bolin. Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510006, China; Non-universal Language Intelligent Processing Laboratory, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510006, China; School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510006, China

ISBN: (print) 9789811513978

Traditional community detection algorithms are mainly based on network structure, while ignoring a large number of node attributes. In this paper, we propose a two-stage overlapping community detection method which combines structure and attributes (tsocd-SA). First, a set of non-overlapping communities is identified by using existing community detection methods, and community attribute summaries, which represent the highly homogeneous attribute values of a community, are constructed according to the attributes of the special nodes in the community. Then, we propose a similarity measure between node and community based on network structure and community attribute summary. For connector nodes which connect more than one community, each node is divided into one or more communities based on the similarity and a specific threshold r. Experimental results on online social network datasets show that our proposed method is more effective than methods that focus solely on structural information. © 2019, Springer Nature Singapore Pte Ltd.
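The second stage described above can be sketched as follows. The abstract does not define the node-to-community similarity, so the measure used here (an equally weighted mix of neighbor overlap and attribute overlap), the fallback to the best-scoring community, and all names and data are hypothetical stand-ins rather than the authors' method.

```python
def node_community_similarity(node_neighbors, node_attrs, members, summary, alpha=0.5):
    """Hypothetical similarity: weighted mix of structural and attribute overlap."""
    structural = (
        len(node_neighbors & members) / len(node_neighbors) if node_neighbors else 0.0
    )
    attribute = len(node_attrs & summary) / len(summary) if summary else 0.0
    return alpha * structural + (1 - alpha) * attribute


def assign_connector(node_neighbors, node_attrs, communities, r):
    """Assign a connector node to every community whose similarity reaches r.

    `communities` maps a name to (member set, attribute summary set).
    If no community reaches r, fall back to the single best one (an assumption).
    """
    scores = {
        name: node_community_similarity(node_neighbors, node_attrs, members, summary)
        for name, (members, summary) in communities.items()
    }
    if not scores:
        return []
    chosen = [name for name, score in scores.items() if score >= r]
    if not chosen:
        chosen = [max(scores, key=scores.get)]
    return sorted(chosen)


# Hypothetical data: the node links into both communities but matches c1's attributes better.
communities = {
    "c1": ({"a", "b"}, {"music"}),
    "c2": ({"c", "d"}, {"music", "sports"}),
}
overlapping = assign_connector({"a", "c"}, {"music"}, communities, r=0.5)
single = assign_connector({"a", "c"}, {"music"}, communities, r=0.6)
```

Lowering the threshold r lets a connector node join more communities, which is how overlap arises in this kind of two-stage scheme.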

Keywords: Social networking (online)

Efficient Graph-based Word Sense Induction by Distributional Inclusion Vector Embeddings

Authors: Chang, Haw-Shiuan; Agrawal, Amol; Ganesh, Ananya; Desai, Anirudha; Mathur, Vinayak; Hough, Alfred; McCallum, Andrew. CICS, University of Massachusetts, 140 Governors Dr., Amherst, MA 01003, United States; Lexalytics, 320 Congress St, Boston, MA 02210, United States

ISBN: (print) 9781948087254

Word sense induction (WSI), which addresses polysemy by unsupervised discovery of multiple word senses, resolves ambiguities for downstream NLP tasks and also makes word representations more interpretable. This paper proposes an accurate and efficient graph-based method for WSI that builds a global non-negative vector embedding basis (which are interpretable like topics) and clusters the basis indexes in the ego network of each polysemous word. By adopting distributional inclusion vector embeddings as our basis formation model, we avoid the expensive step of nearest neighbor search that plagues other graph-based methods without sacrificing the quality of sense clusters. Experiments on three datasets show that our proposed method produces similar or better sense clusters and embeddings compared with previous state-of-the-art methods while being significantly more efficient. © 2018 Association for Computational Linguistics.

Keywords: Embeddings

Semantics as a Foreign Language

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Authors: Stanovsky, Gabriel; Dagan, Ido. Bar Ilan Univ, Comp Sci Dept, Ramat Gan, Israel; Univ Washington, Paul G Allen Sch Comp Sci Engn, Seattle, WA 98195, USA; Allen Inst Artificial Intelligence, Seattle, WA 98195, USA

ISBN: (print) 9781948087841

We propose a novel approach to semantic dependency parsing (SDP) by casting the task as an instance of multi-lingual machine translation, where each semantic representation is a different foreign dialect. To that end, we first generalize syntactic linearization techniques to account for the richer semantic dependency graph structure. Following this, we design a neural sequence-to-sequence framework which can effectively recover our graph linearizations, performing almost on par with the previous SDP state of the art while requiring fewer parallel training annotations. Beyond SDP, our linearization technique opens the door to integration of graph-based semantic representations as features in neural models for downstream applications.

Keywords: graphic methods
