ISBN: 9781467373494 (print)
Text chunking divides a text into syntactically correlated groups of words. Given the words and their morphosyntactic classes, a chunker decides which words can be grouped into chunks. Malayalam is a free word order language with relatively unrestricted phrase structures, which makes chunking quite challenging. This paper develops a text chunker for Malayalam using a memory-based learning (MBL) approach. Memory-based learning is a machine learning methodology based on the idea that directly reusing examples through analogical reasoning is better suited to language processing problems than applying rules extracted from those examples. The chunker was trained with the Memory-Based Tagger (MBT) tool, using words and their POS tags as features. The chunker achieved an accuracy of 97.14%.
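The memory-based idea described in the abstract can be illustrated with a minimal sketch. This is not the MBT tool and not the paper's setup: it is a plain k-nearest-neighbour classifier with a simple overlap distance, and it assumes IOB-style chunk labels, a one-token POS context window, and a tiny hypothetical English training set. All function names (`features`, `build_memory`, `overlap_distance`, `classify`, `chunk`) are invented for illustration.

```python
from collections import Counter

# Hypothetical toy training data: (word, POS tag, chunk label) triples.
# The IOB labels and the English examples are assumptions for illustration.
TRAIN = [
    [("the", "DT", "B-NP"), ("dog", "NN", "I-NP"), ("barked", "VBD", "B-VP")],
    [("a", "DT", "B-NP"), ("cat", "NN", "I-NP"), ("slept", "VBD", "B-VP")],
    [("the", "DT", "B-NP"), ("bird", "NN", "I-NP"), ("sang", "VBD", "B-VP")],
]


def features(sent, i):
    """Focus word and POS tag, plus the POS tags of the neighbouring tokens."""
    prev_pos = sent[i - 1][1] if i > 0 else "<s>"
    next_pos = sent[i + 1][1] if i < len(sent) - 1 else "</s>"
    return (sent[i][0], sent[i][1], prev_pos, next_pos)


def build_memory(train):
    """Store every training token as a (features, label) pair; no rules are extracted."""
    memory = []
    for sent in train:
        for i in range(len(sent)):
            memory.append((features(sent, i), sent[i][2]))
    return memory


def overlap_distance(a, b):
    """Count the feature positions at which two instances differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(memory, feats, k=3):
    """Label a new instance by majority vote among its k nearest stored examples."""
    nearest = sorted(memory, key=lambda ex: overlap_distance(ex[0], feats))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def chunk(memory, tagged_sentence, k=3):
    """Assign a chunk label to each (word, POS) pair of an unseen sentence."""
    return [
        classify(memory, features(tagged_sentence, i), k)
        for i in range(len(tagged_sentence))
    ]


if __name__ == "__main__":
    memory = build_memory(TRAIN)
    print(chunk(memory, [("a", "DT"), ("dog", "NN"), ("sang", "VBD")]))
```

Classification here reuses stored examples directly instead of applying induced rules, which is the property the abstract attributes to memory-based learning. A real system would typically weight features rather than treat every mismatch equally.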