ISBN: 9781941643730 (print)
Statistical models for reordering source words have been used to enhance hierarchical phrase-based statistical machine translation systems. Existing word reordering models learn the reordering for any two source words in a sentence or only for two consecutive words. This paper proposes a series of separate sub-models to learn reorderings for word pairs at different distances. Our experiments demonstrate that reordering sub-models for word pairs whose distance is below a specific threshold help improve translation quality. Compared with previous work, our method may exploit helpful word reordering information more effectively and efficiently.
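To make the idea of distance-specific sub-models concrete, here is a minimal sketch of how source word pairs could be grouped by their distance, so that each group could feed its own sub-model. The function name `pairs_by_distance` and the inclusive `max_distance` parameter are illustrative assumptions; the abstract does not describe the paper's actual data preparation or its threshold.

```python
def pairs_by_distance(words, max_distance):
    """Group ordered source word pairs by the distance between their positions.

    Hypothetical helper: one bucket per distance 1..max_distance (inclusive),
    so that a separate reordering sub-model could be trained on each bucket.
    """
    buckets = {d: [] for d in range(1, max_distance + 1)}
    for i in range(len(words)):
        for d in range(1, max_distance + 1):
            if i + d < len(words):
                buckets[d].append((words[i], words[i + d]))
    return buckets


if __name__ == "__main__":
    # Toy sentence; pairs farther apart than max_distance are ignored.
    for distance, pairs in pairs_by_distance(["a", "b", "c", "d"], 2).items():
        print(distance, pairs)
```

Pairs beyond the threshold are simply dropped here, which mirrors the abstract's finding that only short-distance sub-models were useful.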
ISBN: 9782951740860 (print)
Automatically translating natural language into machine-readable instructions is one of the major interesting and challenging tasks in natural language (NL) processing. This problem can be addressed by using machine learning algorithms to generate a function that finds mappings between natural language and programming language semantics. For this purpose, suitably annotated and structured data are required. In this paper, we describe our method to construct and semi-automatically annotate such data, consisting of pairs of NL questions and SQL queries. Additionally, we describe two different datasets obtained by applying our annotation method to two well-known corpora, GEOQUERIES and RESTQUERIES. Since we believe that syntactic levels are important, we also generate and make available relational pairs represented by means of their syntactic trees whose lexical content has been generalized. We validate the quality of our corpora by experimenting with them and our machine learning models to derive automatic NL/SQL translators. Our promising results suggest that our corpora can be effectively used to carry out research in the field of natural language interfaces to databases.
ISBN: 9781728132990 (print)
Nowadays, Artificial Intelligence (AI) technologies are widely applied in many different areas to assist knowledge acquisition and decision-making tasks. Health information systems in particular can benefit greatly from AI. Symptom-based disease prediction research and products have recently become increasingly popular in the healthcare sector. Various researchers and organizations have turned their interest to using modern computational techniques to analyze and develop new approaches that can efficiently predict diseases with reasonable accuracy. In this paper, we propose a framework to evaluate the efficiency of applying both machine learning (ML) and natural language processing (NLP) technologies to disease prediction systems. As an example, we scraped a disease-symptom dataset with NLP features from one of the UK's most trusted websites, that of the National Health Service (NHS). In addition, we examine our data in depth through symptom frequency, similarity and clustering analysis. The results show that prediction can achieve a very positive efficiency rate, but open issues still need to be addressed.
This research study intends to assist the examination board in evaluating the responses based on the level of question difficulty determined by the board. As an initial step, this study reviews numerous approaches and...
ISBN: 9798400709227 (print)
Amidst the continuous stream of diverse data on the Bloomberg terminal, distinguishing editorial news articles from regular articles is critical to help its users tailor their news experience and further analyze the impact of news on global financial markets. In this paper, we propose various Artificial Intelligence and Neural Network models for developing an editorial classifier that generalizes well across various news sources. The training set comprises articles published by news sources from the US. We compare the performance of these models using the Aggregate F1-measure and Binary Classification Performance Metric as evaluation metrics to account for the presence of class imbalance in our data. Further, we gauge our models by comparing their performance on a zero-shot dataset comprising 1805 news articles published by Metro Winnipeg, a Canadian news source.
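As a brief illustration of why an F1-based metric is preferred under class imbalance, the sketch below computes the standard binary F1 score from scratch. This is the textbook definition, not necessarily the "Aggregate F1-measure" used in the paper, and the function name and label encoding (1 for the rare positive class) are assumptions.

```python
def f1_score(y_true, y_pred, positive=1):
    """Standard binary F1: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Imbalanced toy data: one positive article among ten.
    y_true = [1] + [0] * 9
    always_negative = [0] * 10
    accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
    print("accuracy:", accuracy)
    print("F1:", f1_score(y_true, always_negative))
```

A classifier that never predicts the rare class still scores high accuracy on such data, while its F1 is zero, which is the behaviour that makes F1 informative here.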
ISBN: 9781424420957 (print)
One of the difficult tasks in natural language processing (NLP) is to resolve the sense ambiguity of characters or words in text, such as polyphones, homonyms, and homographs. This paper addresses the ambiguity of Chinese character polyphones and approaches to disambiguating them. Three methods, dictionary matching, language models and a voting scheme, are used to disambiguate the prediction of polyphones. The best precision rate achieved by these methods is 92.65%. Furthermore, we propose a unified approach to improve performance with respect to various threshold values. Compared with the well-known MS Word 2007, our approach is superior and raises the final precision rate to 93.32%.
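The abstract does not specify how the voting scheme works. As a minimal sketch, assuming it is a simple majority vote over the readings proposed by several predictors, it could look like the following. The function name `vote`, the tie-breaking rule, and the pinyin strings are all hypothetical.

```python
from collections import Counter


def vote(predictions):
    """Return the most frequent prediction.

    Assumption: ties are broken in favour of the earliest predictor in the
    list, so predictors should be ordered by how much they are trusted.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for candidate in predictions:
        if counts[candidate] == best:
            return candidate


if __name__ == "__main__":
    # Hypothetical readings proposed for one polyphonic character
    # by three different predictors.
    print(vote(["hang2", "xing2", "hang2"]))
```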
ISBN: 9781941643723 (print)
We propose an abstraction-based multi-document summarization framework that can construct new sentences by exploring more fine-grained syntactic units than sentences, namely, noun/verb phrases. Unlike existing abstraction-based approaches, our method first constructs a pool of concepts and facts represented by phrases from the input documents. Then new sentences are generated by selecting and merging informative phrases to maximize the salience of phrases while satisfying the sentence construction constraints. We employ integer linear optimization to conduct phrase selection and merging simultaneously in order to achieve the globally optimal solution for a summary. Experimental results on the benchmark data set TAC 2011 show that our framework outperforms the state-of-the-art models under the automated pyramid evaluation metric, and achieves reasonably good results on manual linguistic quality evaluation.
ISBN: 9781538646922 (print)
In 2016, attention to the fake news phenomenon increased drastically. Mobile devices such as cellular phones and sources of information such as social networks are instruments that enable individuals to receive news, publish posts, communicate with peers, watch videos, listen to music, etc. In today's highly mobile society, this is a current trend. The uncontrolled freedom and ease of publishing on the Internet result in users being overwhelmed with fake news and hoaxes. Detecting and filtering such information is a challenging problem. This paper discusses different approaches to combat fake news. They are used to a) determine text features utilizing linguistic natural language processing methods (it is necessary to create a profile of the text document), b) detect spam bots in social networks and isolate them using machine-learning methods (it is crucial to reduce the number of analyzed documents), and c) confirm the facts in online documents by applying techniques used in search engines (it is very important to select trusted documents). A system combining these mechanisms may demonstrate a high level of accuracy in filtering fake news.
Next-word prediction, also known as language modeling, is an application of natural language processing. In the past, several studies employed various models to predict the ...
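For readers unfamiliar with next-word prediction, the following is a minimal sketch using bigram counts. It is a generic textbook illustration, not one of the models examined in the study (the abstract is truncated before naming them); the function names and the toy sentence are assumptions.

```python
from collections import Counter, defaultdict


def train_bigrams(tokens):
    """Count, for every word, which words follow it and how often."""
    following = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        following[prev][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in following:
        return None
    return following[word].most_common(1)[0][0]


if __name__ == "__main__":
    tokens = "the cat sat on the mat the cat ran".split()
    model = train_bigrams(tokens)
    print(predict_next(model, "the"))
```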
ISBN: 9781941643747 (print)
The Abstract Meaning Representation (AMR) is a semantic representation language used to capture the meaning of English sentences. In this work, we propose an AMR parser based on dependency parse rewrite rules. This approach converts dependency parses into AMRs by integrating syntactic dependencies, semantic arguments, named entities and co-reference information. A dependency-parse-to-AMR-graph aligner is also introduced as a preliminary step for designing the parser.