Closed frequent itemset (CFI) mining uses less memory to store the complete information about frequent itemsets and is therefore well suited to mining streams. In this paper, we discuss recent CFI mining methods over streams and p...
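To illustrate why closed itemsets are a compact representation, here is a minimal sketch using the standard definition: a frequent itemset is closed if no proper superset has the same support. The function names, the toy transactions, and the brute-force enumeration are assumptions made for illustration; this is not the stream algorithm the paper discusses.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of itemsets whose support count >= min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_support:
                result[candidate] = count
    return result

def closed_itemsets(frequent):
    """Keep only itemsets with no proper superset of equal support."""
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < t and c == frequent[t] for t in frequent)
    }

# Hypothetical toy data: four transactions over items a, b, c.
transactions = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"a"}]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
print(len(frequent), len(closed))
```

In this toy case the closed itemsets are fewer than the frequent ones, yet the support of every frequent itemset can still be recovered as the support of its smallest closed superset.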
Sequential pattern mining is an important problem in mining continuous, fast, dynamic, and unbounded streams. Recently proposed approximate mining algorithms consume too many system resources and can only obtain...
Database-as-a-Service is a promising data management paradigm in which data is encrypted before being sent to the untrusted server. Efficient querying of encrypted data is a performance-critical problem which has vari...
Collaborative filtering is an important personalized recommendation technique widely applied in e-commerce. It is poorly suited to multi-interest or title recommendation because of the 'general neighbourhood' problem w...
Detecting and exploiting correlations among columns in relational databases is of great value to query optimizers, helping them generate better query execution plans (QEPs). We propose a more robust and informative metric, nam...
Extracting multi-records from web pages is useful because it allows us to integrate information from multiple sources to provide value-added services. Existing techniques still have some limitations because of their several ...
In update-intensive applications, main memory database systems produce large volumes of log records, so writing the log records out efficiently is critical to speeding up transaction processing. We propose a parallel recovery scheme based on XOR differential logging for main memory database systems in such environments. Some NVRAM is used to temporarily hold log records and decouple transaction commit from disk writes, and the inherent parallelism of differential logging is exploited to accelerate log flushing across multiple log disks. During recovery, log records are loaded from the multiple log disks and applied to data partitions as they arrive, without reordering them into serialization order, which cuts total recovery time. The scheme employs a data-partition-based consistent checkpointing method, and log records are classified by the IDs of the data partitions they access. Data partitions are recovered according to loading priorities computed from update frequencies and transaction waiting times, so the data access demands of new transactions arriving after a failure are given immediate attention. The scheme thus keeps the system available during recovery, which is important for large-scale main memory database systems.
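The reason XOR differential log records need no reordering can be sketched in a few lines: each record stores the bitwise XOR of the before and after images, and because XOR is commutative and associative, applying the records in any order yields the same final image. The function names and byte values below are hypothetical, and the sketch assumes each record is applied exactly once to a consistent checkpoint image; it does not model NVRAM, multiple log disks, or partition priorities.

```python
from functools import reduce

def make_diff_log(before: bytes, after: bytes) -> bytes:
    """Differential log record: bytewise XOR of before and after images."""
    return bytes(b ^ a for b, a in zip(before, after))

def apply_log(image: bytes, diff: bytes) -> bytes:
    """Redo (or undo) a record by XORing it into the current image."""
    return bytes(x ^ d for x, d in zip(image, diff))

def recover(checkpoint: bytes, logs) -> bytes:
    """Apply every log record once, in whatever order they were loaded."""
    return reduce(apply_log, logs, checkpoint)

# Hypothetical history of one 4-byte data slot: v0 -> v1 -> v2 -> v3.
v0 = b"\x00\x00\x00\x00"
v1 = b"\x01\x02\x00\x00"
v2 = b"\x01\x07\x09\x00"
v3 = b"\xff\x07\x09\x10"
logs = [make_diff_log(v0, v1), make_diff_log(v1, v2), make_diff_log(v2, v3)]

# Records loaded from different disks may arrive in any order.
in_order = recover(v0, logs)
shuffled = recover(v0, [logs[2], logs[0], logs[1]])
assert in_order == shuffled == v3
```

Because the result does not depend on arrival order, records read from separate log disks can be applied independently, which is the property the abstract relies on for parallel flushing and recovery.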
With the development of relational databases, people require better databases, not only in terms of performance but also in terms of interactive ability. So that the database is much m...
With the rapid development of information retrieval technology and the daily growth of information on the Internet, ordinary users can search many text-based databases and obtain part of the information through search engines such as Google and Baidu. However, a great amount of data resides in the back-end relational databases behind web pages, and much research has therefore focused on keyword search over these relational databases. Compared with that work, our algorithms are mainly based on bags, use a greedy strategy, and support phrase recognition by utilizing multiple dictionaries. We compare our algorithm with existing ones. The experimental results show that our algorithm is both effective and efficient.