文献详情 >Practical compressed string di... 收藏

Practical compressed string dictionaries

实际的压缩的绳字典 <sup></sup>

作者：Martinez-Prieto, Miguel A. Brisaboa, Nieves Canovas, Rodrigo Claude, Francisco Navarro, Gonzalo

作者机构：Univ Valladolid Dept Comp Sci DataWeb Res E-47002 Valladolid Spain Univ A Coruna Database Lab La Coruna Spain Univ Melbourne Dept Comp & Informat Syst CIS NICTA Victoria Res Lab Melbourne Vic 3010 Australia Univ Diego Portales Escuela Informat & Telecomunicac Santiago Chile Univ Chile Dept Comp Sci CeBiB Ctr Biotechnol & Bioengn Santiago Chile

出版物：《INFORMATION SYSTEMS》 (信息系统)

年卷期：2016年第56卷第Mar.期

页面：73-108页

核心收录：

学科分类：08[工学] 0812[工学-计算机科学与技术（可授工学、理学学位）]

基　　金：Spanish Ministry of Economy and Competitiveness [TIN2013-46238-C4-3-R] ICT COST Action KEYSTONE [IC1302] Conicyt, Chile [FB0001] Fondecyt Iniciacion

主　　题：Compressed string dictionaries Text processing Text databases Compressed data structures

摘要：The need to store and query a set of strings - a string dictionary - arises in many kinds of applications. While classically these string dictionaries have accounted for a small share of the total space budget (e.g., in Natural Language Processing or when indexing text collections), recent applications in Web engines, Semantic Web (RDF) graphs, Bioinformatics, and many others handle very large string dictionaries, whose size is a significant fraction of the whole data. In these cases, string dictionary management is a scalability issue by itself. This paper focuses on the problem of managing large static string dictionaries in compressed main memory space. We revisit classical solutions for string dictionaries like hashing, tries, and front-coding, and improve them by using compression techniques. We also introduce some novel string dictionary representations built on top of recent advances in succinct data structures and full-text indexes. All these structures are empirically compared on a heterogeneous testbed formed by real-world string dictionaries. We show that the compressed representations may use as little as 5% of the original dictionary size, while supporting lookup operations within a few microseconds. These numbers outperform the state-of-the-art space/time tradeoffs in many cases. Furthermore, we enhance some representations to provide prefix- and substring-based searches, which also perform competitively. The results show that compressed string dictionaries are a useful building block for various data-intensive applications in different domains. (C) 2015 Elsevier Ltd. All rights reserved.

本地馆藏 | 借阅须知 | 我要预约

已订购，未入库

sda

目录详情 | 试阅读 |

读者评论与其他读者分享你的观点

学校读者

用户名:未登录

我的评分

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

看过本文的还看了

相关文献

该作者的其他文献

CADAL相关文献

Practical compressed string dictionaries

读者评论与其他读者分享你的观点

请选择收藏分类：

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

看过本文的还看了

相关文献

该作者的其他文献

CADAL相关文献

Practical compressed string dictionaries

读者评论 与其他读者分享你的观点

请选择收藏分类： 新增自定义分类 确定 取消

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

读者评论与其他读者分享你的观点

请选择收藏分类：