ISBN: 076952415X (print)
A method for comparing webpages in terms of visual similarity is proposed. Conventional web information retrieval and gathering systems, such as search engines, extract keywords from HTML source files and calculate the similarity between pages from those keywords. The extracted keywords are treated as semantic features representing the contents of webpages. However, the visual features of a webpage are as important as its semantic features, because HTML is designed to present a webpage in a manner that humans can understand. The proposed method compares the layouts of webpages using image processing and graph matching. Experimental results show that the accuracy of layout analysis is 91.6% on average, and that the visual similarity calculated by the proposed method is closer to the visual judgments of test subjects than that of a color-based comparison method.
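To make the keyword-based baseline concrete, the following is a minimal sketch of how a conventional system might score two pages from their HTML source. It is not the method proposed in the paper. The abstract does not say which similarity measure such systems use, so cosine similarity over raw term counts is an assumption here, and the names `TextExtractor`, `keyword_counts`, and `cosine_similarity` are hypothetical.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def keyword_counts(html):
    """Return term counts for a page (assumed: lowercase alphanumeric tokens)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower())
    return Counter(words)


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Same words, different markup and arrangement.
    page_a = "<h1>News</h1><p>Weather today</p>"
    page_b = "<table><tr><td>Weather</td><td>today</td><td>News</td></tr></table>"
    print(cosine_similarity(keyword_counts(page_a), keyword_counts(page_b)))
```

In this sketch the two sample pages contain the same words in different markup, so their score is approximately 1.0 even though they would render with different layouts. That gap is what a layout-based comparison is meant to address.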