ISBN: 9788026300779 (print)
chared is a system that detects the character encoding of a text document, provided the language of the document is known. The system supports a wide range of languages and the most commonly used character encodings. We explain the details of the algorithm, describe the process of creating models for various languages, and present the results of an evaluation on a collection of Web pages.