Similarity Joins: Their implementation and interactions with other database operators

Authors: Silva, Yasin N.; Pearson, Spencer S.; Chon, Jaime; Roberts, Ryan

Affiliation: Arizona State Univ., Glendale, AZ 85306, USA

Publication: Information Systems

Year / Volume: 2015, Vol. 52

Pages: 149-162

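Subject classification: 08 [Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degree may be conferred)]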

Keywords: Similarity Join; Database operator; Similarity queries; PostgreSQL; Query processing and optimization

Abstract: Similarity Joins are extensively used in multiple application domains and are recognized among the most useful data processing and analysis operations. They retrieve all data pairs whose distances are smaller than a predefined threshold epsilon. While several standalone implementations have been proposed, very little work has addressed the implementation of Similarity Joins as physical database operators. In this paper, we focus on the study, design, implementation, and optimization of a Similarity Join database operator for metric spaces. We present DBSimJoin, a physical database operator that integrates techniques to: enable a non-blocking behavior, prioritize the early generation of results, and fully support the database iterator interface. The proposed operator can be used with multiple distance functions and data types. We describe the changes in each query engine module to implement DBSimJoin and provide details of our implementation in PostgreSQL. We also study ways in which DBSimJoin can be combined with other similarity and non-similarity operators to answer more complex queries, and how DBSimJoin can be used in query transformation rules to improve query performance. The extensive performance evaluation shows that DBSimJoin significantly outperforms alternative approaches and scales very well when important parameters like epsilon, data size, and number of dimensions increase. © 2015 Elsevier Ltd. All rights reserved.
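
The definition in the abstract (return every pair whose distance is smaller than epsilon) can be illustrated with a naive nested-loop generator. This is only a sketch of the operation's semantics and is not the DBSimJoin algorithm, whose internals are not described in this record. The names `similarity_join` and `euclidean` are hypothetical, and Euclidean distance is assumed purely as an example of a metric; the generator form loosely mirrors an iterator interface that hands back one result at a time.

```python
import math
from typing import Callable, Iterable, Iterator, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Example metric (an assumption); any metric distance could be passed instead."""
    return math.dist(a, b)


def similarity_join(
    left: Iterable[Point],
    right: Iterable[Point],
    epsilon: float,
    distance: Callable[[Point, Point], float] = euclidean,
) -> Iterator[Tuple[Point, Point]]:
    """Yield every pair (r, s) with distance(r, s) < epsilon.

    Naive O(|left| * |right|) nested loop, shown only to make the
    definition concrete. Results are produced lazily, one pair per
    request, instead of being collected into a list first.
    """
    right_rows = list(right)  # materialize the inner input so it can be rescanned
    for r in left:
        for s in right_rows:
            if distance(r, s) < epsilon:  # strict comparison, as in the abstract
                yield (r, s)


if __name__ == "__main__":
    R = [(0.0, 0.0), (5.0, 5.0)]
    S = [(0.5, 0.0), (5.0, 5.4), (9.0, 9.0)]
    for pair in similarity_join(R, S, epsilon=1.0):
        print(pair)
```

Because the function is a generator, a caller can stop after the first few pairs without paying for the full join, which is the kind of early result delivery the abstract highlights, although the paper's operator presumably achieves it with a far more efficient strategy than this quadratic scan.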
