咨询与建议

看过本文的还看了

相关文献

该作者的其他文献

文献详情 >Scalable thread based index co... 收藏

Scalable thread based index construction using wavelet tree

作     者:Yadav, Arun Kumar Yadav, Divakar Verma, Akhilesh Akbar, Mohd Tewari, Kartikey 

作者机构:NIT Hamirpur Dept Comp Sci & Engn Hamirpur India Ajay Kumar Garg Engn Coll Dept Comp Sci & Engn Ghaziabad UP India 

出 版 物:《MULTIMEDIA TOOLS AND APPLICATIONS》 (多媒体工具和应用)

年 卷 期:2023年第82卷第9期

页      面:14037-14053页

核心收录:

学科分类:0808[工学-电气工程] 08[工学] 0835[工学-软件工程] 0812[工学-计算机科学与技术(可授工学、理学学位)] 

基  金:UK Research and Innovation  UKRI  (105141) 

主  题:Wavelet tree Inverted index Map-reduce Search time Parallel algorithms Dictionary searching 

摘      要:Indexing is one of the key components of any search tool to be optimized for searching documents. Among the existing indexing techniques, inverted indexing is one of the best methods used at a larger scale for various applications. Under this method, the index is designed using a signature file, hash tree and B-tree to retrieve the required document in efficient time. B-tree is popular due to its searching efficiency, but its performance degrades with increasing data set size. The wavelet tree has become a popular and versatile data structure in the last decade, used in various domains such as sequences, indexing, compression, and grid-point with surprising results. This study proposes a parallel wavelet tree algorithm with hybridization of the Map-Reduce concept to construct an index for textual search. The proposed algorithm reduces the index construction time considerably. Experiments show that the proposed algorithm takes a reasonable trade-off with existing indexing approaches. For large data sets, index construction time has been reduced with respect to other existing state-of-art schemes. Also, results show that the algorithm performs well when the data-set scales up to up-to-the full utilization of available cores. It is possible due to the use of multiple threads working in parallel. Our experiment demonstrated consistent performance with 2-core, 4-core, 8-core, 12-core and results of 16-core show increase in index construction time due to parallel overhead when the data-set in not sufficiently large.

读者评论 与其他读者分享你的观点

用户名:未登录
我的评分