Metadata provides the information that describes, explains, or makes it easier to retrieve, use, or manage information resources. In the marine field, a metadata model is defined and used for exchange, retrieval, inte...
The FP7 FUPOL project aims at a completely new approach to traditional policy modeling, providing complex domain use case verification with the FUPOL Simulator and visualisation of the results in a form suitable for be...
This paper describes our study on developing text and speech databases for automatic speech recognition of Vietnamese using a readily available source of linguistic data: the Internet. First, a two-stage procedure is applied to extract a general text corpus that can be used for research on the Vietnamese language, such as speech recognition, audio-visual speech recognition, and natural language processing... We also collect another, more specific text corpus in the field of news and literature using resources from several major Vietnamese web sites. The total text corpus, containing 8,681,869 sentences with more than 124 million syllables, is then used to build and test the language model for the speech recognizer. In addition, the collection of speech corpora for experiments on continuous speech recognition and audio-visual speech recognition of Vietnamese is described.
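As a rough illustration of how corpus figures such as sentence and syllable counts can be obtained, the sketch below counts both for a small text sample. It relies on the fact that written Vietnamese separates syllables with spaces; the sentence splitter and the function name `corpus_stats` are simplifying assumptions for illustration, not the procedure used in the paper.

```python
import re


def corpus_stats(text):
    """Return (sentence_count, syllable_count) for a plain-text sample.

    Assumption: sentences end with '.', '!' or '?' followed by whitespace.
    Assumption: every whitespace-separated token containing at least one
    letter counts as one syllable, which approximates written Vietnamese.
    """
    # Split after sentence-final punctuation, dropping empty pieces.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Keep only tokens that contain a letter (skips bare punctuation or digits).
    syllables = [
        tok
        for sentence in sentences
        for tok in sentence.split()
        if any(ch.isalpha() for ch in tok)
    ]
    return len(sentences), len(syllables)


sample = "Tôi học tiếng Việt. Hôm nay trời đẹp!"
n_sentences, n_syllables = corpus_stats(sample)
```

A real web-scale corpus would also need markup removal, deduplication, and handling of abbreviations and numbers before counts like these are meaningful.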
To save time, healthcare providers frequently use abbreviations while authoring clinical documents. Nevertheless, abbreviations that authors deem unambiguous often confuse other readers, including clinicians, patients...
The present paper deals with a web-based notation that implements a natural language representation of mathematical texts and preserves their semantics. We have developed a visualization of the notation for browsers and its export to the standard formats TeX, Content MathML, and PDF. The suggested web notation provides interactive communication over the Internet, as well as compatibility and interoperability of prepared texts with other applications.
Unsupervised Relation Extraction (URE) methods automatically discover semantic relations in text corpora of unknown content and extract for each discovered relation a set of relation instances. Due to the sparsity of ...
We present a novel approach to learning phrasal inversion transduction grammars via Bayesian MAP (maximum a posteriori) or information-theoretic MDL (minimum description length) model optimization so as to incorporate...
Named Entity Recognition (NER) is a well-studied area in natural language processing (NLP), and the results reported in the literature are generally very high (~>95%) for most languages. Today, the focus of most practical natural language applications (e.g. web mining, sentiment analysis, machine translation) is real natural language data such as Web 2.0 or speech data. Nevertheless, the NER task is rarely investigated on this type of data, which differs severely from formal written text. In this paper, we present 3 new Turkish data sets from different domains in this focus area (namely Twitter, a Speech-to-Text Interface, and a Hardware Forum), annotated specifically for NER, and report our first results on them. We believe the paper sheds light on the difficulty of these new domains for NER and on possible future work.
ISBN (digital): 9783642403811
ISBN (print): 9783642403804; 9783642403811
This book constitutes the refereed proceedings of the 7th International Conference on Scalable Uncertainty Management, SUM 2013, held in Washington, DC, USA, in September 2013. The 26 revised full papers and 3 revised short papers were carefully reviewed and selected from 57 submissions. The papers cover topics in all areas of managing and reasoning with substantial and complex kinds of uncertain, incomplete, or inconsistent information, including applications in decision support systems, machine learning, negotiation technologies, semantic web applications, search engines, ontology systems, information retrieval, natural language processing, information extraction, image recognition, vision systems, and data and text mining, as well as the consideration of issues such as provenance, trust, heterogeneity, and complexity of data and knowledge.
Hidden Markov models are a powerful statistical tool and have been used in many areas of speech and natural language processing. In this work, we attempt to detect sentence-level subjectivity by means of a hidden Markov model, an approach that has not been thoroughly investigated for subjectivity analysis. Our feature extraction algorithm calculates a feature vector based on the statistical occurrences of words in a corpus without any linguistic knowledge except tokenization. For this reason, the model can be applied to any language; i.e., no lexical, grammatical, or syntactic analysis is used in the classification process.
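To make the idea of occurrence-based, language-independent features concrete, here is a minimal sketch that scores each token by how often it appears in subjective versus objective training sentences. The abstract does not specify the actual feature definition, so the labels `subj`/`obj`, the smoothing constant `alpha`, and both function names are hypothetical choices for illustration only; such per-token scores could serve as an observation sequence for a sequence model.

```python
from collections import Counter


def word_class_counts(labeled_sentences):
    """Count token occurrences per class from (tokens, label) pairs.

    Labels are assumed to be 'subj' or 'obj' (hypothetical naming).
    """
    counts = {"subj": Counter(), "obj": Counter()}
    for tokens, label in labeled_sentences:
        counts[label].update(tokens)
    return counts


def subjectivity_scores(tokens, counts, alpha=1.0):
    """Map each token to a smoothed share of its occurrences in 'subj' text.

    Add-alpha smoothing gives unseen tokens a neutral score of 0.5.
    Only tokenization is required; no lexicon or grammar is used.
    """
    scores = []
    for tok in tokens:
        s = counts["subj"][tok] + alpha
        o = counts["obj"][tok] + alpha
        scores.append(s / (s + o))
    return scores


train = [
    (["i", "love", "it"], "subj"),
    (["it", "is", "red"], "obj"),
]
counts = word_class_counts(train)
features = subjectivity_scores(["love", "red", "it", "unknown"], counts)
```

Because the scores depend only on token counts, the same code applies unchanged to any language for which a tokenizer is available.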