Keyword extraction is the process of identifying the words or phrases that express the main concepts of text to the best of one’s ability. Electronic infrastructure creates a considerable amount of text every day and...
Relation extraction is an important task in natural language processing (NLP). Existing methods generally focus on extracting textual semantic information from text but ignore the relation contextual information carried by relations that already exist in the dataset, which is very important for the performance of relation extraction. In this paper, we represent each entity as an embedding learned from a knowledge graph of entities and relations, which encodes the relation contextual information between the given entity pairs and relations. In addition, inspired by the recent impressive performance of language models, we use a language model to capture word semantic information, which it does better than word embeddings. Experimental results on the SemEval-2010 Task 8 dataset show that the F1-score of the proposed method improves by nearly 3% over previous methods.
Explainable AI aims at building intelligent systems that are able to provide a clear, and human understandable, justification of their decisions. This holds for both rule-based and data-driven methods. In management o...
ISBN: (print) 9781728119274
Post-filtering is a popular technique in multichannel speech enhancement systems for further improving speech quality and intelligibility after beamforming. This paper presents a novel post-filter for a minimum variance distortionless response (MVDR) beamformer: a single-channel modified complementary joint sparse representations (M-CJSR) method. First, the MVDR beamformer is used to suppress interference and noise. Subsequently, the proposed M-CJSR approach, based on joint dictionary learning, is applied as a single-microphone post-filter to process the beamformer output. Unlike existing post-filtering techniques, which rely on assumptions about the noise field, this algorithm considers a more general signal model that includes ambient noise, such as diffuse or white noise, as well as point-source interference. Moreover, the original CJSR method is extended to jointly learn dictionaries not only for the mappings from mixture to speech and noise, but also for the mapping from mixture to interference. To exploit the complementary advantages of the different sparse representations, we design the weighting parameters based on the residual components of the estimated signals. An experimental study consisting of objective evaluations under various conditions verifies the superiority of the proposed algorithm over other state-of-the-art methods.
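The first stage described above, the MVDR beamformer, can be illustrated with a minimal sketch. It uses the textbook weight formula w = R^{-1} d / (d^H R^{-1} d) for a single frequency bin and is not the paper's implementation; the M-CJSR post-filter is not shown. The function names, the three-microphone noise covariance, and the steering phases are all assumptions made for illustration.

```python
import cmath


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot magnitude for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mvdr_weights(noise_cov, steering):
    """Textbook MVDR weights: w = R^{-1} d / (d^H R^{-1} d)."""
    r_inv_d = solve(noise_cov, steering)
    denom = sum(d.conjugate() * x for d, x in zip(steering, r_inv_d))
    return [x / denom for x in r_inv_d]


def beamform(weights, snapshot):
    """Beamformer output for one microphone snapshot: y = w^H x."""
    return sum(w.conjugate() * x for w, x in zip(weights, snapshot))


# Hypothetical 3-microphone example for a single frequency bin.
noise_cov = [
    [1.0, 0.2, 0.0],
    [0.2, 1.0, 0.2],
    [0.0, 0.2, 1.0],
]  # assumed Hermitian, positive-definite noise covariance
steering = [cmath.exp(-1j * 0.5 * m) for m in range(3)]  # assumed phase delays

weights = mvdr_weights(noise_cov, steering)
# Distortionless constraint: a signal arriving from the look direction has unit gain.
look_direction_gain = beamform(weights, steering)
```

In a full system the weights would be computed per frequency bin from an estimated noise covariance, and a single-channel post-filter would then process the beamformer output.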
Corpus2graph is an open-source NLP-application-oriented Python package that generates a word co-occurrence network from a large corpus. It not only contains different built-in methods to preprocess words, analyze sent...
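As a rough illustration of what a word co-occurrence network is, the sketch below counts undirected word pairs that appear within a small window in the same sentence. It is a plain-Python toy and does not use or reproduce the Corpus2graph API; the function name `cooccurrence_edges`, the `max_distance` parameter, and the whitespace tokenization are assumptions.

```python
from collections import Counter


def cooccurrence_edges(sentences, max_distance=2):
    """Count undirected word pairs that occur within max_distance tokens of each other."""
    edges = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()  # naive whitespace tokenization (assumption)
        for i, left in enumerate(tokens):
            # Pair each token with the next max_distance tokens in the same sentence.
            for right in tokens[i + 1 : i + 1 + max_distance]:
                if left != right:
                    edges[tuple(sorted((left, right)))] += 1
    return edges


edges = cooccurrence_edges(["the cat sat on the mat", "the cat ate"])
```

Each key of `edges` is an edge between two word nodes and each value is the edge weight; a real pipeline would add proper tokenization, filtering, and parallel processing for large corpora.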
There is an increasing interest in exploiting the content of electronic health records by means of natural language processing and text-mining technologies, as they can yield resources for improving patient health and safety, aid clinical decision making, and facilitate drug repurposing or precision medicine. To share, redistribute, and make clinical narratives accessible for text-mining research, it is key to fulfill legal conditions and address restrictions related to data protection and patient privacy. Thus, clinical records cannot be shared directly "as is". A necessary precondition for accessing clinical records outside of hospitals is their de-identification, that is, the exhaustive removal or replacement of all mentioned privacy-related protected health information phrases. Providing a proper evaluation scenario for automatic anonymization tools is key for approval of data redistribution. The construction of manually de-identified medical records is currently the main rate- and cost-limiting step for secondary-use applications of clinical data. This paper summarizes the settings, data, and results of the first shared track on anonymization of medical documents in Spanish, the MEDDOCAN (Medical Document Anonymization) track. This track relied on a carefully constructed synthetic corpus of clinical case documents, the MEDDOCAN corpus, following annotation guidelines for sensitive data based on the analysis of the EU General Data Protection Regulation. A total of 18 teams (from the 51 registrations) submitted 63 runs for the first sub-track and 61 systems for the second sub-track. The top-scoring systems were based on sophisticated deep learning approaches, representing strategies that can significantly reduce the time and costs associated with accessing textual data containing privacy-related sensitive information. The results of this track might help in lowering the clinical data access hurdle for Spanish language technology developers, showing also potentials for similar setti
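One common way to evaluate a de-identification system is strict span-level precision, recall, and F1 against gold annotations. The sketch below shows that calculation under stated assumptions: it is not the official MEDDOCAN evaluation script, and the function name `span_prf`, the `(start, end, label)` span encoding, and the labels `NAME`, `DATE`, and `ID` are hypothetical.

```python
def span_prf(gold, predicted):
    """Strict-match precision, recall and F1 over (start, end, label) spans."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical character-offset annotations for one document.
gold_spans = [(0, 10, "NAME"), (25, 35, "DATE"), (50, 59, "ID")]
predicted_spans = [(0, 10, "NAME"), (25, 35, "DATE"), (60, 70, "ID"), (80, 85, "NAME")]

precision, recall, f1 = span_prf(gold_spans, predicted_spans)
```

Here a predicted span counts as correct only if its offsets and label both match a gold span exactly; more lenient schemes, such as overlap-based matching, would give different numbers.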
Scholarly practices within the humanities have historically been perceived as distinct from the natural sciences. We look at literary studies, a discipline strongly anchored in the humanities, and hypothesize that ove...
We introduce a machine learning approach for the identification of "white spaces" in scientific knowledge. Our approach addresses this task as link prediction over a graph that contains over 2M influence sta...
Toponym resolution is an important and challenging task in the natural language processing field, and has wide applications such as emergency response and social media geographical event analysis. Toponym resolution ca...
processing and text mining are becoming increasingly possible thanks to the development of computer technology, as well as the development of artificial intelligence (machine learning). This article describes approach...