Author affiliations: IDRBT, Hyderabad, India; NIT Warangal, Hanamkonda, India; NIT AP, Tadepalligudem, India
Publication: ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING (Arab. J. Sci. Eng.)
Year, volume, issue: 2023, Vol. 48, No. 2
Pages: 2121-2132
Subjects: Big data; MapReduce; Best region search; Distributed and parallel processing
Abstract: Best region search (BRS) is one of the major research problems in geospatial data processing applications. The objective of BRS is to find the location of a rectangle of specified size that maximizes a user-defined scoring function. Existing solutions for finding the top-k best regions (k-BRS) have focused on algorithms for centralized settings and are not suitable for processing massive datasets. In this paper, we enable Hadoop MapReduce-based parallel and distributed computation to obtain a significant improvement in performance. In addition to the parallel and distributed setting, we incorporate early pruning strategies that avoid processing rectangles that cannot be part of the output, which minimizes the communication cost of computing k-BRS. We then introduce a redistribution strategy on top of the initially proposed methodology to handle skew inherited from the dataset. Our results are obtained from extensive experimentation on both synthetic and real-world datasets.
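To make the problem statement in the abstract concrete, the following is a minimal, centralized brute-force sketch of BRS in Python. It is not the paper's MapReduce method and includes no pruning or redistribution. The function name `best_region`, the `(x, y, weight)` point format, and the choice of "sum of non-negative weights inside the rectangle" as the scoring function are all assumptions made for illustration. The sketch relies on the observation that, for such a score, some optimal rectangle has its left edge on a point's x-coordinate and its bottom edge on a point's y-coordinate, so only those corners need to be tried.

```python
from itertools import product


def best_region(points, width, height):
    """Brute-force BRS sketch (hypothetical helper, not the paper's algorithm).

    points: iterable of (x, y, weight) tuples with non-negative weights.
    Returns (best_score, lower_left_corner) for an axis-aligned
    width x height rectangle; the corner is None if there are no points.
    """
    points = list(points)
    # Candidate lower-left corners: every pairing of a point's x with a point's y.
    xs = sorted({x for x, _, _ in points})
    ys = sorted({y for _, y, _ in points})

    best_score, best_corner = 0, None
    for x0, y0 in product(xs, ys):
        # Assumed scoring function: total weight of points inside the closed rectangle.
        score = sum(
            w
            for x, y, w in points
            if x0 <= x <= x0 + width and y0 <= y <= y0 + height
        )
        if score > best_score:
            best_score, best_corner = score, (x0, y0)
    return best_score, best_corner


if __name__ == "__main__":
    sample = [(0, 0, 1), (1, 1, 1), (1.5, 0.5, 1), (5, 5, 1)]
    print(best_region(sample, width=2, height=2))
```

This sketch evaluates a number of candidate corners quadratic in the number of points and scans all points for each one, which illustrates why a centralized approach of this kind does not scale to massive datasets.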