ISBN: 9781955917018 (print)
The proceedings contain 17 papers. The topics discussed include: low resource quadratic forms for knowledge graph embeddings; evaluating the carbon footprint of NLP methods: a survey and analysis of existing tools; limitations of knowledge distillation for zero-shot transfer learning; countering the influence of essay length in neural essay scoring; memory-efficient transformers via top-k attention; combining lexical and dense retrieval for computationally efficient multi-hop question answering; learning to rank in the age of Muppets: effectiveness–efficiency tradeoffs in multi-stage ranking; improving synonym recommendation using sentence context; semantic categorization of social knowledge for commonsense question answering; speeding up transformer training by using dataset subsampling - an exploratory analysis; and hyperparameter power impact in transformer language model training.
Event Extraction is a complex and interesting topic in Information Extraction that includes methods for identifying an event's type, participants, location, and date from free text or web data. The result o...
ISBN: 9781665462310 (print)
Estimating the effort of software projects developed with agile methods is important for project managers and technical leads. It gives a first view of how many hours and developers are required to complete the tasks. There is research on automatically predicting software effort, including Term Frequency - Inverse Document Frequency (TF-IDF) as the traditional approach to this problem. Graph Neural Network (GNN) is a newer approach that has been applied in natural language processing for text classification. The advantages of GNNs come from their ability to learn from a graph data structure, which captures more information, such as the relationships between words, than approaches that vectorize a sequence of words. In this paper, we show the potential and possible challenges of GNN text classification in story point level estimation. Our experiments show that GNN Text Level Classification can achieve an accuracy of about 80% for story point level classification, which is comparable to the traditional approach. We also analyze the GNN approach and point out several current disadvantages that could be improved for this problem or other problems in software engineering.
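The TF-IDF weighting named in the abstract above as the traditional approach can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the function name `tfidf_vectors`, the sample user stories, and the specific variant (relative term frequency times log(N / document frequency), without smoothing) are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    # Hypothetical tokenizer: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical user stories, not taken from the paper's dataset.
stories = ["add login page", "fix login bug"]
vectors = tfidf_vectors(stories)
```

In this variant a term that occurs in every story, such as "login" here, receives weight zero, so only terms that distinguish one story from another contribute to the vector passed to a classifier.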
ISBN: 9781954085527 (print)
Prior work infers the causation between events mainly based on the knowledge induced from the annotated causal event pairs. However, additional evidence information intermediate to the cause and effect remains unexploited. By incorporating such information, the logical law behind the causality can be unveiled, and the interpretability and stability of the causal reasoning system can be improved. To facilitate this, we present an Event graph knowledge enhanced explainable CAusal Reasoning framework (ExCAR). ExCAR first acquires additional evidence information from a large-scale causal event graph as logical rules for causal reasoning. To learn the conditional probabilities of logical rules, we propose the Conditional Markov Neural Logic Network (CMNLN) that combines the representation learning and structure learning of logical rules in an end-to-end differentiable manner. Experimental results demonstrate that ExCAR outperforms previous state-of-the-art methods. Adversarial evaluation shows the improved stability of ExCAR over baseline systems. Human evaluation shows that ExCAR can achieve a promising explainable performance.
This paper considers the important problem of formalizing and building the terminological ontology of subject domains based on content-related text data. As an ontological model, we propose to use a linguistic network model of text representation, the so-called network of key terms. In this network, the nodes are keywords and phrases that appear in the text corpus, and the links between them are semantic-syntactic links between these terms in the text. Input sets of text data were prepared using systems that aggregate thematic information flows from freely available information resources distributed in global computer networks. In particular, this paper addresses the important and urgent problem of computerized processing of legal information. The task of computerized processing of natural language texts lies at the intersection of linguistic theory and the mathematical sciences. Therefore, natural language processing based on Part-of-Speech tagging was used to extract the key terms. After the extraction, a statistical weighting of the formed words and phrases was performed. The horizontal visibility graph algorithm was used to build undirected links between key terms. This paper also considers a new method for determining the direction of links between terms and weighting these links in the undirected network of words and phrases. This method takes the part-of-speech tags into account and also obeys the principle that a word or phrase is included in its corresponding extended phrases with more words. The proposed method was tested on a freely available legal document, the "Universal Declaration of Human Rights". After extracting the key terms from this legal document and determining the direction and weight of links between words or phrases using the proposed methods, the directed weighted network of terms was built. The method considered in this work for building the terminological networks ca
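The horizontal visibility graph step mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the standard construction, not the authors' code: it assumes the input is the sequence of statistical weights of the key terms in their order of occurrence, and it links positions i < j whenever every weight strictly between them is lower than both endpoints. The function name and the sample weights are hypothetical.

```python
def horizontal_visibility_edges(weights):
    """Return undirected edges (i, j), i < j, of the horizontal visibility graph.

    Positions i and j are linked when every value strictly between them
    is smaller than min(weights[i], weights[j]).
    """
    edges = []
    n = len(weights)
    for i in range(n):
        # Highest value seen so far strictly between i and j.
        highest_between = float("-inf")
        for j in range(i + 1, n):
            if highest_between < min(weights[i], weights[j]):
                edges.append((i, j))
            highest_between = max(highest_between, weights[j])
            # A value at least as high as weights[i] blocks everything beyond it.
            if highest_between >= weights[i]:
                break
    return edges


# Hypothetical term weights in order of occurrence in the text.
term_weights = [3, 1, 2, 4]
edges = horizontal_visibility_edges(term_weights)
```

Neighbouring terms are always linked, and a highly weighted term additionally "sees" distant terms across any run of lower-weighted ones, which is how the undirected skeleton of the term network arises before directions and link weights are assigned.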
Knowledge graphs (KGs) are widely used to store and access information about entities and their relationships. Given a query, the task of entity retrieval from a KG aims at presenting a ranked list of entities relevan...
As the downstream task of building a knowledge graph, Chinese entity relationship extraction from unstructured texts plays an important role in the field of natural language processing. There are two main ways for Chi...
ISBN: 9781954085701 (print)
This paper describes the system we built as the YNU-HPCC team in the SemEval-2021 Task 11: NLPContributionGraph. This task involves first identifying sentences in the given natural language processing (NLP) scholarly articles that reflect research contributions through binary classification; then identifying the core scientific terms and their relation phrases from these contribution sentences by sequence labeling; and finally categorizing, identifying, and organizing these scientific terms and relation phrases into subject-predicate-object triples to form a knowledge graph with the help of multiclass classification and multilabel classification. We developed a system for this task using a pre-trained language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers, and achieved good results. The average F1-score for Evaluation Phase 2, Part 1 was 0.4562 and ranked 7th, and the average F1-score for Evaluation Phase 2, Part 2 was 0.6541 and also ranked 7th.
ISBN: 9781955917094 (print)
The encoder-decoder framework achieves state-of-the-art results in keyphrase generation (KG) tasks by predicting both present keyphrases that appear in the source document and absent keyphrases that do not. However, relying solely on the source document can result in generating uncontrollable and inaccurate absent keyphrases. To address these problems, we propose a novel graph-based method that can capture explicit knowledge from related references. Our model first retrieves some document-keyphrase pairs similar to the source document from a pre-defined index as references. Then a heterogeneous graph is constructed to capture relationships of different granularities between the source document and its references. To guide the decoding process, a hierarchical attention and copy mechanism is introduced, which directly copies appropriate words from both the source document and its references based on their relevance and significance. The experimental results on multiple KG benchmarks show that the proposed model achieves significant improvements over other baseline models, especially with regard to absent keyphrase prediction.