Search Results - Inner Mongolia University Library

3rd International Conference on Innovative Computing Information and Control, ICICIC'08

Authors: Li, Haifeng; Chen, Hong. Affiliation: Key Laboratory of Data Engineering and Knowledge Engineering, School of Information, Renmin University of China, Beijing 100872, China

ISBN: (print) 9780769531618

Sequential pattern mining is an important problem in mining continuous, fast, dynamic, and unbounded streams. Recently proposed approximate mining algorithms consume too many system resources and capture only partial features of the stream. This paper presents ESPMM, a multi-level evolving sequential pattern mining model, to address this problem so that nearly all of the stream's features are obtained. Furthermore, because sequential patterns have smaller support at each level, a mining method, BMLA, based on Levenshtein automata is proposed; it builds a state-conversion model to compute the similarity of sequences in linear time. Experimental results show that the model is effective and efficient. © 2008 IEEE.

Keywords: Automata theory
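The abstract above names Levenshtein automata as the basis for computing sequence similarity. As an illustration only, the following Python sketch simulates such an automaton on the fly to decide whether two sequences lie within edit distance `k`. It is not the paper's BMLA method: the function name `within_edit_distance` and the representation of a sequence as a flat string or list of symbols are assumptions, and because the automaton is simulated rather than precompiled, each input symbol costs time proportional to the pattern length.

```python
def within_edit_distance(pattern, candidate, k):
    """Return True if the edit distance between pattern and candidate is <= k.

    The automaton state is one dynamic-programming row: state[i] is the
    cheapest way to align the input read so far with pattern[:i], capped
    at k + 1 because larger values can never lead to acceptance.
    """
    cap = k + 1
    # Start state: nothing read yet, so matching pattern[:i] costs i deletions.
    state = [min(i, cap) for i in range(len(pattern) + 1)]
    for symbol in candidate:
        nxt = [min(state[0] + 1, cap)]
        for i, p in enumerate(pattern, start=1):
            cost = min(
                state[i - 1] + (p != symbol),  # match or substitution
                state[i] + 1,                  # extra symbol in candidate
                nxt[i - 1] + 1,                # pattern symbol missing
            )
            nxt.append(min(cost, cap))
        state = nxt
        if min(state) > k:
            return False  # dead state: no continuation can be accepted
    return state[-1] <= k
```

For a fixed pattern and a fixed `k`, the set of reachable capped rows is finite, so the same transitions could be tabulated once into a deterministic automaton that processes each candidate in time linear in its length.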

Research on the Implementation of Web Services-Based Information Systems

15th Cross-Strait Academic Conference on Information Management Development and Strategy (2009)

Authors: Zuo Meiyun (左美云); Wang Shijuan (王世娟). Affiliation: School of Information and Key Laboratory of Data Engineering and Knowledge Engineering (MOE), Renmin University of China, Beijing

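Information systems based on Web Services will be an important model for building information systems in the future. Drawing on an analysis of secondary data on 23 implementations of Web Services-based information systems, this paper summarizes the types, implementation strategies, and implementation processes of such systems.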

Keywords: Enterprise management; Information systems; Control strategy; Web services

S-SimRank: Combining Content and Link Information to Cluster Papers Effectively and Efficiently

4th International Conference on Advanced Data Mining and Applications

Authors: Cai, Yuanzhe; Li, Pei; Liu, Hongyan; He, Jun; Du, Xiaoyong. Affiliations: Key Labs of Data Engineering and Knowledge Engineering, Ministry of Education, China; Department of Computer Science, Renmin University of China, China; Department of Management Science and Engineering, Tsinghua University, China

ISBN: (print) 9783540881919

Both content analysis and link analysis have their advantages in measuring relationships among documents. In this paper, we propose a new method that combines the two to compute the similarity of research papers so that these papers can be clustered more accurately. To improve the efficiency of similarity calculation, we develop a strategy that processes the relationship graph separately without affecting accuracy. We also design an approach that assigns different weights to different links to the papers, which can enhance the accuracy of similarity calculation. Experimental results on the ACM data set show that our new algorithm, S-SimRank, outperforms other algorithms.

Keywords: Calculations
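S-SimRank, as described in the abstract above, builds on link-based similarity. The sketch below shows only the standard SimRank iteration that the name refers to, not the S-SimRank algorithm itself: it uses no content information and no link weights. The function name `simrank`, the `decay` value, the iteration count, and the dictionary-based citation graph are all assumptions made for illustration.

```python
def simrank(in_neighbors, decay=0.8, iterations=10):
    """Iteratively compute plain SimRank scores.

    in_neighbors maps every node to the list of nodes that link to it
    (for papers: the papers that cite it). Every node that appears in a
    list must also be a key of the mapping.
    """
    nodes = list(in_neighbors)
    # Iteration 0: a node is similar only to itself.
    sim = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(iterations):
        new = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[a][b] = 1.0
                    continue
                ia, ib = in_neighbors[a], in_neighbors[b]
                if not ia or not ib:
                    new[a][b] = 0.0  # no in-links means no evidence of similarity
                    continue
                # Average similarity over all pairs of in-neighbors, damped by decay.
                total = sum(sim[i][j] for i in ia for j in ib)
                new[a][b] = decay * total / (len(ia) * len(ib))
        sim = new
    return sim


# Hypothetical citation graph: review r cites both p1 and p2.
scores = simrank({"p1": ["r"], "p2": ["r"], "r": []})
```

In this toy graph `p1` and `p2` share their only citing paper, so their score equals the decay factor, while `r` has no in-links and scores 0 against both.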

LOB: Bucket based index for range queries

9th International Conference on Web-Age Information Management, WAIM 2008

Authors: Wang, Jieping; Du, Xiaoyong. Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering, MOE, Beijing 100872, China; School of Information, Renmin University of China, Beijing 100872, China

ISBN: (print) 9780769531854

Database-as-a-Service is a promising data management paradigm in which data is encrypted before being sent to the untrusted server. Efficient querying on encrypted data is a performance-critical problem with various solutions, among which the bucket-based index is an effective and flexible one. Previous research has proposed some metrics to measure security and efficiency. In this paper, we illustrate by example the limitations of these metrics and introduce a new security metric based on probability distribution variance and an efficiency metric based on overlapping ratio. Based on these metrics, we propose a local overlapping bucket algorithm (LOB) with time complexity O(n log n), where n is the cardinality of the table. Experiments on synthetic and real datasets show that our algorithm can achieve higher security by trading off efficiency. © 2008 IEEE.

Keywords: Information management
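To make the bucket-based index mentioned in the abstract above concrete, here is a minimal Python sketch of the general idea: the client tags each encrypted row with a coarse bucket id, the server filters on bucket ids only, and the client discards false positives after decryption. It is not the LOB algorithm and uses fixed, non-overlapping buckets. The class `BucketIndex`, the function `range_query`, and the boundary values are hypothetical, and plaintext integers stand in for ciphertexts to keep the example short.

```python
import bisect


class BucketIndex:
    """Maps plaintext values to bucket ids using sorted split points."""

    def __init__(self, boundaries):
        # Split points [10, 20, 30] define four buckets:
        # (-inf, 10), [10, 20), [20, 30), [30, +inf).
        self.boundaries = sorted(boundaries)

    def bucket_of(self, value):
        return bisect.bisect_right(self.boundaries, value)

    def buckets_for_range(self, low, high):
        # Every bucket that may contain a value in [low, high].
        return set(range(self.bucket_of(low), self.bucket_of(high) + 1))


def range_query(index, server_rows, low, high):
    """Answer low <= value <= high over (bucket_id, payload) rows."""
    wanted = index.buckets_for_range(low, high)
    # Server side: sees bucket ids only and returns a superset of the answer.
    candidates = [payload for bucket, payload in server_rows if bucket in wanted]
    # Client side: after decryption (omitted here), drop false positives.
    return sorted(v for v in candidates if low <= v <= high)


index = BucketIndex([10, 20, 30])
rows = [(index.bucket_of(v), v) for v in [3, 12, 18, 25, 31]]
result = range_query(index, rows, 15, 22)
```

Wider buckets reveal less about the underlying values but return more false positives for the client to filter, which is the security and efficiency trade-off the abstract refers to.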

Clustering XML retrieval results based on hybrid similarity

Journal of Computational Information Systems, 2008, Vol. 4, No. 3, pp. 1323-1330

Authors: Wan, Changxuan; Yu, Hong. Affiliations: School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013, China; Jiangxi Key Laboratory of Data and Knowledge Engineering, Jiangxi University of Finance and Economics, Nanchang 330013, China

With the unceasing growth of XML data on the World Wide Web, XML document retrieval and the clustering of retrieval results face both challenges and opportunities. One of the challenges is how to improve the quality of XML retrieval results. First, according to the features of XML documents, a method of modeling XML retrieval result documents is put forward, which integrates both the structural semantic features and the content information of XML documents. Then, a method for computing the similarity between retrieval result documents, covering both structural semantic similarity and keyword similarity, is suggested; and a strategy named Item Frequency in Cluster-Inverse Cluster Frequency is presented to extract labels from result clusters. Experiments indicate that the clustering quality for XML retrieval results based on hybrid similarity is clearly better than that based on content similarity alone.

Keywords: XML

Using ontology to enhance collaborative recommendation based on community

9th International Conference on Web-Age Information Management, WAIM 2008

Authors: Yu, Li. Affiliations: School of Information, Renmin University of China, Beijing 100872, China; Key Laboratory of Data Engineering and Knowledge Engineering, Renmin University of China, MOE, Beijing 100872, China

ISBN: (print) 9780769531854

Collaborative filtering is an important personalized recommendation technique widely applied in e-commerce. It is poorly suited to multi-interest or title recommendation because of the 'general neighbourhood' problem, which is analyzed in this paper. On that basis, community-based collaborative filtering recommendation is presented by introducing the concept of 'community neighbourhood'. Unfortunately, this leads to a more severe sparsity problem, which heavily degrades its performance. To overcome this, an ontological a priori score is used to infer user preferences and to pre-fill null ratings. After pre-filling with the ontology method, community-based collaborative filtering is executed on a dense rating matrix. The experiment shows that community-based collaborative filtering generally performs better than the traditional method when the data is not very sparse, and that the ontology method genuinely enhances community-based collaborative filtering because the sparsity is overcome. © 2008 IEEE.

Keywords: Collaborative filtering

COCA: More accurate multi-dimensional histograms out of more accurate correlations detection

9th International Conference on Web-Age Information Management, WAIM 2008

Authors: Wei, Cao; Xiongpai, Qin; Wang, Shan. Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering, Renmin University of China, MOE, Beijing 100872, China; School of Information, Renmin University of China, Beijing 100872, China

ISBN: (print) 9780769531854

Detecting and exploiting correlations among columns in relational databases is of great value in helping query optimizers generate better query execution plans (QEPs). We propose a more robust and informative metric, the entropy correlation coefficient, in place of the chi-square test to detect correlations among columns in large datasets. We introduce a novel yet simple kind of multi-dimensional synopsis named COCA-Hist to cope with different correlations in databases. With the aid of the precise metric of entropy correlation coefficients, correlations of various degrees can be detected effectively; when the correlation coefficients indicate mutual independence among columns, the AVI (attribute value independence) assumption can be adopted with confidence. COCA can also serve as a data-mining tool, as CORDS does, with superior qualities. We demonstrate the effectiveness and accuracy of our approach in several experiments. © 2008 IEEE.

Keywords: Entropy
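The abstract above does not give the formula for its entropy correlation coefficient. As a hedged illustration, the Python sketch below uses one common definition, ECC(X, Y) = sqrt(2 I(X; Y) / (H(X) + H(Y))), where H is Shannon entropy and I is mutual information; whether COCA uses exactly this form is an assumption. The function names `entropy` and `entropy_correlation` and the sample columns are hypothetical.

```python
from collections import Counter
from math import log2, sqrt


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())


def entropy_correlation(xs, ys):
    """Entropy correlation coefficient of two equally long columns.

    Returns 0.0 for independent columns and 1.0 when each column fully
    determines the other (assumed definition, see the lead-in).
    """
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0  # both columns are constant; treat as uncorrelated
    hxy = entropy(list(zip(xs, ys)))          # joint entropy H(X, Y)
    mutual = max(hx + hy - hxy, 0.0)          # I(X; Y), clamped against rounding
    return sqrt(min(2 * mutual / (hx + hy), 1.0))


# Hypothetical columns: b is independent of a, c is a copy of a.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
c = [0, 0, 1, 1]
independent_score = entropy_correlation(a, b)
dependent_score = entropy_correlation(a, c)
```

A score near 0 is the case in which the attribute value independence assumption is reasonable; a score near 1 suggests the columns should be summarized jointly.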

Extracting multi-records from web pages

4th International Conference on Semantics, Knowledge, and Grid, SKG 2008

Authors: Tian, Xia. Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering, Renmin University of China, MOE, Beijing 100872, China; School of Information Resource Management, Renmin University of China, Beijing 100872, China

ISBN: (print) 9780769534015

Extracting multi-records from web pages is useful because it allows us to integrate information from multiple sources to provide value-added services. Existing techniques are still limited by their restrictions and their accuracy. This paper proposes a new method to perform the multi-record extraction task automatically. First, the HTML tag tree is built on an embedded browser interface to solve the AJAX problem. Second, data regions are found by data chunk comparison, and a simple tree matching method is proposed to compute chunk similarity. Finally, the main data region is determined and the multi-records are extracted. Experimental results show that our method dramatically outperforms other existing methods and can extract multi-records from pages very accurately. © 2008 IEEE.

Keywords: Websites
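The abstract above compares data chunks with a simple tree matching method. The sketch below implements the classic Simple Tree Matching recurrence on tag trees as a rough illustration; the paper's own variant may differ. The `(tag, children)` tuple representation, the function names, and the normalization used in `chunk_similarity` are assumptions.

```python
def simple_tree_matching(a, b):
    """Size of a maximum top-down, order-preserving matching of two trees.

    A tree is a (tag, children) tuple, where children is a list of trees.
    """
    tag_a, children_a = a
    tag_b, children_b = b
    if tag_a != tag_b:
        return 0  # roots differ, so nothing below them can be matched
    m, n = len(children_a), len(children_b)
    # best[i][j]: best matching between the first i and first j subtrees.
    best = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            paired = best[i - 1][j - 1] + simple_tree_matching(
                children_a[i - 1], children_b[j - 1]
            )
            best[i][j] = max(best[i][j - 1], best[i - 1][j], paired)
    return best[m][n] + 1  # +1 for the matched roots


def tree_size(tree):
    return 1 + sum(tree_size(child) for child in tree[1])


def chunk_similarity(a, b):
    """Matched nodes normalized to [0, 1] (assumed normalization)."""
    return 2 * simple_tree_matching(a, b) / (tree_size(a) + tree_size(b))


# Hypothetical table rows: the second has an extra link cell.
row_a = ("tr", [("td", []), ("td", [])])
row_b = ("tr", [("td", []), ("a", []), ("td", [])])
similarity = chunk_similarity(row_a, row_b)
```

Here the two rows match on the root and both `td` cells, so three of the seven nodes in total are paired on each side and the similarity is 6/7.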

A Parallel Recovery Scheme for Update Intensive Main Memory Database Systems

IEEE International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)

Authors: Xiongpai Qin; Yanqin Xiao; Wei Cao; Shan Wang. Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering, MOE, Beijing, China; Key Laboratory of Data Engineering and Knowledge Engineering, Renmin University of China, Beijing, P.R. China

In update-intensive applications, main memory database systems produce a large volume of log records, so it is critical to write out the log records efficiently to speed up transaction processing. We propose a parallel recovery scheme based on XOR differential logging for main memory database systems in such environments. Some NVRAM is used to temporarily hold log records and decouple transaction committing from disk writes, and the inherent parallelism of differential logging is exploited to accelerate log flushing by using multiple log disks. During recovery, log records are loaded from multiple log disks and applied to data partitions promptly, without the need for reordering according to serialization order, so total recovery time is cut down. The scheme employs a data-partition-based consistent checkpointing method. The log records are classified according to the IDs of the data partitions accessed. Data partitions are recovered according to loading priorities computed from update frequencies and transaction waiting times, and the data access demands of new transactions arriving after failure recovery are given attention immediately. The scheme thus provides system availability during recovery, which is important for large-scale main memory database systems.

Keywords: Database systems; Data engineering; Large-scale systems; Random access memory; Checkpointing; Communication industry; Defense industry; Distributed computing; Laboratories; Knowledge engineering
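The claim in the abstract above that log records can be applied without reordering follows from a property of XOR differential logging: each log record stores the bitwise XOR of the before and after images, and XOR is commutative and associative. The Python sketch below demonstrates that property on a single fixed-length slot. It is not the paper's recovery scheme; the function names and the byte values are hypothetical.

```python
def make_diff_log(before: bytes, after: bytes) -> bytes:
    """Differential log record: bitwise XOR of before and after images."""
    if len(before) != len(after):
        raise ValueError("images must have the same length")
    return bytes(b ^ a for b, a in zip(before, after))


def apply_diff_log(image: bytes, diff: bytes) -> bytes:
    """Redo (or undo) one update by XOR-ing the record into the image."""
    return bytes(i ^ d for i, d in zip(image, diff))


def recover(checkpoint: bytes, logs) -> bytes:
    """Replay log records in whatever order they arrive."""
    image = checkpoint
    for diff in logs:
        image = apply_diff_log(image, diff)
    return image


# Hypothetical history of one 2-byte slot: checkpoint, then three updates.
versions = [b"\x00\x00", b"\x01\x02", b"\x0f\x02", b"\xaa\x55"]
logs = [make_diff_log(versions[i], versions[i + 1]) for i in range(3)]

in_order = recover(versions[0], logs)
reversed_order = recover(versions[0], logs[::-1])
```

Both replays end at the last version of the slot, which is why records read from several log disks can be applied as they arrive instead of being merged into serialization order first.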

Greedy-Algorithm of KSORD Supporting Multi-language Phrase Recognition

International Conference on Advanced Computer Theory and Engineering, ICACTE

Authors: Peng Li; Qing Zhu; Chao Tian; Shan Wang. Affiliation: Key Laboratory of Data Engineering and Knowledge Engineering, School of Information, Renmin University of China, Beijing, China

With the rapid development of information retrieval technology and the daily growth of information on the Internet, ordinary users can search many text-based databases and obtain part of the information through search engines such as Google and Baidu. However, a great amount of data is contained in the relational databases behind web pages, so much research focuses on keyword search over these relational databases. Compared with that work, our algorithms are mainly based on bags, use greedy algorithms, and support phrase recognition by utilizing multiple dictionaries. We compare our algorithm with the existing ones. The experimental results show that our algorithm is both effective and efficient.

Keywords: Relational databases; Internet; Information retrieval; Data engineering; Knowledge engineering; Search engines; Dictionaries; Data structures; Chaos; Laboratories
