There is an intuition that the text surrounding a link, or the link context, on the HTML page is a good summary of the target *** paper presents a focused Web crawling technique based on link contexts, guided by an SVM classifier with uneven *** work utilizes the beneficial link-context information about the seed URLs before actual crawling and collects domain-specific resources beforehand to steer the focused Web *** results show that this approach outperforms the Best-First and Breadth-First algorithms in both harvest rate and efficiency.
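
To make the link-context idea concrete, here is a minimal sketch, not the paper's implementation. It assumes the link context is the anchor text plus a fixed window of words on either side, and it uses simple keyword overlap as a stand-in for the SVM classifier. The names `LinkContextParser`, `score`, `rank_links`, `window`, and `topic_terms` are all hypothetical.

```python
import heapq
from html.parser import HTMLParser


class LinkContextParser(HTMLParser):
    """Collect each link's href together with its position in the word stream."""

    def __init__(self, window=5):
        super().__init__()
        self.window = window  # assumed context size, in words, on each side
        self.words = []       # all text words seen so far, lowercased
        self.links = []       # (href, start_index, end_index) into self.words
        self._open = None     # (href, start_index) of the anchor being read

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._open = (href, len(self.words))

    def handle_endtag(self, tag):
        if tag == "a" and self._open is not None:
            href, start = self._open
            self.links.append((href, start, len(self.words)))
            self._open = None

    def handle_data(self, data):
        self.words.extend(data.lower().split())

    def contexts(self):
        """Return (href, context_words): anchor text plus surrounding window."""
        result = []
        for href, start, end in self.links:
            lo = max(0, start - self.window)
            hi = end + self.window
            result.append((href, self.words[lo:hi]))
        return result


def score(context_words, topic_terms):
    """Stand-in for the classifier: fraction of context words that are on topic."""
    if not context_words:
        return 0.0
    hits = sum(1 for word in context_words if word in topic_terms)
    return hits / len(context_words)


def rank_links(html, topic_terms, window=5):
    """Order a page's outgoing links by link-context score, best first."""
    parser = LinkContextParser(window)
    parser.feed(html)
    parser.close()
    frontier = []  # min-heap, so scores are negated to pop the best link first
    for href, context_words in parser.contexts():
        heapq.heappush(frontier, (-score(context_words, topic_terms), href))
    return [heapq.heappop(frontier)[1] for _ in range(len(frontier))]
```

In a focused crawler, a ranking like this would decide which URL to fetch next from the frontier; a trained classifier would replace the `score` function.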