Search Results - Inner Mongolia University Library

IEEE WIC ACM International Conference on Web Intelligence (WI)

Authors: Peerapol Moemeng, Longbing Cao, Chengqi Zhang. Affiliations: Faculty of Engineering and Information Technology, Data Sciences & Knowledge Discovery Research Laboratory, University of Technology Sydney, Australia

Data mining research focuses on algorithms that mine valuable patterns from a particular domain. Apart from the theoretical research, experiments take a vast amount of effort to build. In this paper, we propose an integrated framework that utilises a multi-agent system to help researchers develop experiments rapidly. Moreover, the proposed framework allows extension and integration for future research that combines agents and data mining. The paper describes the details of the framework and also presents a sample implementation.

Keywords: data mining; Intelligent agent; Programming; Multiagent systems; Software testing; data analysis; data engineering; knowledge engineering; information technology; Australia

Top-K keyword search for supporting semantics in relational databases

Ruan Jian Xue Bao/Journal of Software, 2008, Vol. 19, No. 9, pp. 2362-2375

Authors: Wang, Bin; Yang, Xiao-Chun; Wang, Guo-Ren. Affiliations: College of Information Science and Engineering, Northeastern University, Shenyang 110004, China; Key Laboratory of Data Engineering and Knowledge Engineering, Renmin University of China, Beijing 100872, China

To improve the results of keyword search in relational databases, semantic relationships among relations and tuples are employed and a semantic ranking function is proposed. In addition to considering current ranking principles, the proposed semantic ranking function provides new metrics to measure query relevance. Based on it, two Top-k search algorithms, BA (blocking algorithm) and EBA (early-stopping blocking algorithm), are presented. EBA improves BA by providing a filtering threshold to terminate iterations as early as possible. Finally, experimental results show that the semantic ranking function guarantees search results with high precision and recall, and that the proposed BA and EBA algorithms improve on the query performance of existing approaches.

Keywords: information retrieval
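
The abstract above says that EBA uses a filtering threshold to stop iterating early, but it does not give the algorithm itself. The following is only a generic sketch of threshold-based early termination for Top-k search, not the paper's BA or EBA. It assumes candidates arrive in blocks, each with a known upper bound on the score of any item it contains, and that blocks are visited in non-increasing order of that bound. The names `top_k_early_stop`, `blocks`, and `score` are hypothetical.

```python
import heapq
from typing import Callable, Iterable, List, Sequence, Tuple


def top_k_early_stop(
    blocks: Iterable[Tuple[float, Sequence[str]]],
    score: Callable[[str], float],
    k: int,
) -> Tuple[List[Tuple[float, str]], int]:
    """Return the k best (score, item) pairs and the number of items scored.

    Assumption: `blocks` yields (upper_bound, items) in non-increasing
    upper_bound order, and no item scores above its block's upper bound.
    """
    heap: List[Tuple[float, str]] = []  # min-heap holding the current top k
    scanned = 0
    for upper_bound, items in blocks:
        # Filtering threshold: once the k-th best score is at least the best
        # score any remaining block could offer, later blocks cannot matter.
        if len(heap) == k and heap[0][0] >= upper_bound:
            break
        for item in items:
            scanned += 1
            s = score(item)
            if len(heap) < k:
                heapq.heappush(heap, (s, item))
            elif s > heap[0][0]:
                heapq.heapreplace(heap, (s, item))
    return sorted(heap, reverse=True), scanned


# Toy data: three blocks with decreasing upper bounds.
scores = {"a": 0.9, "b": 0.7, "c": 0.5, "d": 0.6, "e": 0.2, "f": 0.1}
blocks = [(1.0, ["a", "b"]), (0.6, ["c", "d"]), (0.3, ["e", "f"])]
results, scanned = top_k_early_stop(blocks, scores.get, k=2)
print(results, scanned)  # [(0.9, 'a'), (0.7, 'b')] 2
```

With k=2 the loop stops after the first block, because the second-best score found so far (0.7) already exceeds the upper bound of the next block (0.6).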

OLTP workloads on modern processor: Characterization and analysis

Journal of Computational Information Systems, 2008, Vol. 4, No. 1, pp. 389-394

Authors: Liu, Dawei; Qin, Biao; Wang, Shan; Gong, Weiwei. Affiliations: School of Information, Renmin University of China, Beijing 100872, China; Key Laboratory of Data Engineering and Knowledge Engineering, Renmin University of China, Beijing 100872, China

This paper analyzes how OLTP workloads interact with modern processors and caches. First, we extend TPC-C, the OLTP-oriented benchmark, into the ETPC-C benchmark to measure the performance of main-memory database systems (MMDBMS) more precisely. Because the performance of an MMDBMS is not affected by disk I/O, it is more sensitive to cache usage. Then, using the ETPC-C benchmark, we investigate the behavior of caches and processors extensively. We find that miss stall time is mostly spent on on-chip caches; that is, first- and second-level cache misses are dominant. Furthermore, we find that on-chip instruction cache (I-cache) stall time is a major component of memory stall time. The fewer the emulated users, the larger the share of memory stall time contributed by on-chip I-cache stall time. However, when an index is used, the system under test (SUT) has more total I-cache stall time than the SUT without an index at the same number of emulated users and data population. Another observation is that the SUT with an index has a slightly higher branch misprediction rate on average than the SUT without one. Finally, we find that only the third-level (L3) D-cache stall time rate increases with the number of users, because the L3 D-cache miss rate increases the most. Under TPC-C and ETPC-C evaluation, we find that for optimized database performance on modern computers, reducing instruction miss penalty is as important as reducing data miss penalty; since these are conflicting efforts, the best approach is to balance them.

Keywords: Program processors

A New Dynamic Hash Index for Flash-Based Storage

International Conference on Web-Age Information Management Workshops (WAIM)

Authors: Xiang Li, Zhou Da, Xiaofeng Meng. Affiliations: Renmin University of China, Beijing, CN; Key Laboratory of Data Engineering and Knowledge Engineering, MOE, School of Information, Renmin University of China, China

Compared with traditional magnetic disks, flash memory has many advantages and has been used as external storage media for a wide spectrum of electronic devices (such as PDAs, MP3 players, digital cameras and mobile phones). As capacity increases and prices drop, it looks like a perfect alternative to magnetic disks. However, due to the hardware limitations of flash memory, techniques originally designed for magnetic disks, including storage subsystems and indexing, cannot run smoothly on flash memory without modification. In this paper we explore the problems of indexing flash-resident data and present a new dynamic hash index for flash memory in two schemas. The analysis and experimental results validate the efficiency of our design.

Keywords: Indexes; Flash memory; Algorithm design and analysis; Writing; Tuning; Indexing; Arrays

Shingles-based structural clustering of web documents

Journal of Computational Information Systems, 2008, Vol. 4, No. 6, pp. 2777-2785

Author: Xia, Tian. Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering, Renmin University of China, Beijing 100872, China; School of Information Resource Management, Renmin University of China, Beijing 100872, China

Structural clustering of web documents is a useful task for many web intelligence applications; however, processing based on the structure of web documents has not yet received much attention. In this paper, we propose a shingles-based approach to clustering web documents by structure. First, semi-structured web documents are converted into a structured tree composed of a limited set of nodes, and structural features are extracted by shingles. Second, we define the document distance and the structural distance matrix, and structural similarity is then calculated from this matrix. Finally, we cluster the document structures with a modified k-means algorithm. Unlike existing methods, we construct shingles that include not only real vertical paths but also virtual horizontal paths. Weight factors are also considered to optimize the algorithm. Experimental results show the effectiveness of the new shingles-based similarity measurement and of the structural clustering. The proposed document similarity, as well as the structural shingles analysis, could be applied to other web-based research issues. © 2008 Binary Information Press.

Keywords: information systems
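
To illustrate the idea of structural shingles, the sketch below extracts fixed-length tag sequences along parent-to-child paths of a small document tree and compares two trees with Jaccard similarity. It is a simplified illustration only. It covers vertical paths but not the virtual horizontal paths, weight factors, distance matrix, or modified k-means step that the abstract mentions. The tree encoding as `(tag, children)` tuples and the names `vertical_shingles` and `jaccard` are assumptions made for this example.

```python
from typing import List, Set, Tuple

# A node is (tag, children); this encoding is an assumption for the sketch.
Node = Tuple[str, List["Node"]]


def vertical_shingles(tree: Node, w: int = 2) -> Set[Tuple[str, ...]]:
    """Collect every run of w consecutive tags along parent-to-child paths."""
    shingles: Set[Tuple[str, ...]] = set()

    def walk(node: Node, path: Tuple[str, ...]) -> None:
        tag, children = node
        path = path + (tag,)
        if len(path) >= w:
            shingles.add(path[-w:])  # the last w tags ending at this node
        for child in children:
            walk(child, path)

    walk(tree, ())
    return shingles


def jaccard(a: Set[Tuple[str, ...]], b: Set[Tuple[str, ...]]) -> float:
    """Jaccard similarity of two shingle sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


doc1: Node = ("html", [("body", [("div", [("p", [])]), ("div", [("a", [])])])])
doc2: Node = ("html", [("body", [("div", [("p", [])]), ("table", [("tr", [])])])])

s1, s2 = vertical_shingles(doc1), vertical_shingles(doc2)
print(len(s1), len(s2), jaccard(s1, s2))  # 4 5 0.5
```

The two toy documents share three of six distinct shingles, so their structural similarity under this simplified measure is 0.5.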

Extracting Feature and Opinion Words Effectively from Chinese Product Reviews

International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

Authors: Wei Wei, Hongyan Liu, Jun He, Hui Yang, Xiaoyong Du. Affiliations: Information School, Renmin University of China, China; School of Economics and Management, Tsinghua University, China; Key Laboratory of Data Engineering and Knowledge Engineering, MOE, China

Nowadays more and more people publish comments about products on the Web. Mining such unstructured data (product reviews) is an exciting, active, and challenging research and application topic. In this paper, we focus on mining product reviews written in Chinese. We aim to extract structural information from Chinese product reviews. By structural information, we mean the product features and corresponding opinion words expressed in each review text. Some work has already been done on reviews written in English, but less on Chinese. In this paper, we propose an effective method to extract candidate features and some effective pruning rules to prune the features. We also introduce a pattern extraction and matching step to improve our results. The experimental results show that our approach is very effective and achieves good recall and precision.

Keywords: Feature extraction; data mining; Negative feedback; information retrieval; Fuzzy systems; Conference management; knowledge management; engineering management; Helium; Laboratories

The Design of an Open Hybrid Recommendation System for Mobile Commerce

International Conference on Computational Intelligence for Modelling, Control and Automation, and International Conference on Intelligent Agents, Web Technologies and Internet Commerce

Authors: Chengzhi Liu, Caihong Sun, Jia Yu. Affiliations: Information School and Key Laboratory of Data Engineering and Knowledge Engineering, Renmin University of China, Beijing, China; School of Electrical Engineering and Telecommunications, University of New South Wales, Sydney, Australia

To meet increasingly complex recommendation needs, it is important to implement hybrid recommendation for mobile commerce. In this paper, we propose a design for open hybrid recommendation systems in mobile commerce that can integrate multiple recommendation algorithms to improve recommendation performance. First, three solutions for an open hybrid recommendation approach are discussed in detail: a generic customer profile, a weighted hybrid recommendation algorithm, and mobile device profile creation. We then present a multi-agent architecture design that makes the three solutions work together. Finally, a prototype system based on the proposed architecture is implemented to demonstrate the feasibility of our design and to evaluate the performance of the proposed open hybrid recommendation system.

Keywords: Business; Customer profiles; Context awareness; Laboratories; data engineering; knowledge engineering; Prototypes; XML; Sun; Australia
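
The abstract names a weighted hybrid recommendation algorithm without giving details. One common way to realise a weighted hybrid is a linear combination of the scores produced by several recommenders. The sketch below shows that generic technique only and is not the paper's design. The recommender names, item names, weights, and the function `weighted_hybrid` are all hypothetical.

```python
from typing import Dict, List, Tuple


def weighted_hybrid(
    scores_by_recommender: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
    top_n: int,
) -> List[Tuple[str, float]]:
    """Combine per-recommender item scores as a weighted sum and rank items.

    A recommender missing from `weights` contributes nothing; an item missing
    from a recommender's output is treated as scoring 0 there.
    """
    combined: Dict[str, float] = {}
    for name, item_scores in scores_by_recommender.items():
        w = weights.get(name, 0.0)
        for item, s in item_scores.items():
            combined[item] = combined.get(item, 0.0) + w * s
    # Highest combined score first; ties broken alphabetically for stability.
    ranked = sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


scores_by_recommender = {
    "collaborative": {"phone_case": 0.8, "charger": 0.4},
    "content_based": {"charger": 0.9, "headset": 0.5},
}
weights = {"collaborative": 0.5, "content_based": 0.5}

top = weighted_hybrid(scores_by_recommender, weights, top_n=2)
print([item for item, _ in top])  # ['charger', 'phone_case']
```

Because the weights are plain data, a new recommender can be plugged in by adding one entry to each dictionary, which is one simple reading of what an "open" hybrid design might allow.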

Clustering XML retrieval results based on hybrid similarity

Journal of Computational Information Systems, 2008, Vol. 4, No. 3, pp. 1323-1330

Authors: Wan, Changxuan; Yu, Hong. Affiliations: School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013, China; Jiangxi Key Laboratory of Data and Knowledge Engineering, Jiangxi University of Finance and Economics, Nanchang 330013, China

With the continued growth of XML data on the World Wide Web, XML document retrieval and the clustering of retrieval results face both challenges and opportunities. One of the challenges is how to improve the quality of XML retrieval results. First, based on the features of XML documents, a method for modeling XML retrieval result documents is put forward that integrates both the structural semantic features and the content information of XML documents. Then, a method for computing similarity between retrieval result documents, covering both structural semantic similarity and keyword similarity, is proposed; and a strategy named Item Frequency in Cluster-Inverse Cluster Frequency is presented for extracting labels from result clusters. Experiments indicate that the clustering quality for XML retrieval results based on hybrid similarity is clearly better than that based on content similarity alone.

Keywords: XML
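
The abstract names the labeling strategy "Item Frequency in Cluster-Inverse Cluster Frequency" but does not state its formula. By analogy with TF-IDF, one plausible reading scores a term in a cluster by its frequency there, multiplied by the logarithm of the number of clusters divided by the number of clusters containing the term. The sketch below implements that assumed formula only; the paper's actual definition may differ. The function name `cluster_labels` and the toy clusters are hypothetical.

```python
import math
from collections import Counter
from typing import List


def cluster_labels(clusters: List[List[str]], top_n: int = 2) -> List[List[str]]:
    """Pick top_n label terms per cluster with a TF-IDF-style score.

    Assumed score: freq(term, cluster) * log(n_clusters / clusters_with_term).
    """
    n_clusters = len(clusters)
    counts = [Counter(terms) for terms in clusters]

    # In how many clusters does each term occur at least once?
    cluster_freq: Counter = Counter()
    for c in counts:
        cluster_freq.update(c.keys())

    labels: List[List[str]] = []
    for c in counts:
        scored = {
            term: freq * math.log(n_clusters / cluster_freq[term])
            for term, freq in c.items()
        }
        # Highest score first; ties broken alphabetically.
        ranked = sorted(scored, key=lambda t: (-scored[t], t))
        labels.append(ranked[:top_n])
    return labels


clusters = [
    ["xml", "query", "index", "query"],
    ["xml", "cluster", "label", "cluster"],
]
print(cluster_labels(clusters))  # [['query', 'index'], ['cluster', 'label']]
```

The term "xml" occurs in every cluster, so its score is zero and it is never chosen as a label, which is the behaviour an inverse-cluster-frequency factor is meant to produce.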

Energy Efficient Multicast Routing for Discrete Power Levels in Ad Hoc Sensor Networks

The 4th IEEE International Conference on Wireless Communications, Networking and Mobile Computing

Authors: Deying Li, Qinghua Zhu, Zheng Li. Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), MOE, School of Information, Renmin University of China, Beijing 100872, P.R. China

In this paper, we discuss the energy-efficient multicast problem for discrete power levels in ad hoc wireless sensor networks. The problem we address is: given n nodes, where each node v has l(v) transmission power levels, and a multicast request (s, D), how to find a multicast tree rooted at s and spanning all destinations in D such that the total energy cost of the multicast tree is minimized. This problem is NP-hard. We propose an algorithm, NWM_DST, with a theoretically guaranteed approximation ratio, and two efficient heuristics, MNJT and g-D-MIP, for the multicast tree problem. Simulation results show the efficiency of the proposed algorithms.

Keywords: wireless sensor network; energy efficiency; approximation algorithm; multicast routing

An Edit Distance Algorithm with Block Swap

The 9th International Conference for Young Computer Scientists

Author: Tian Xia. Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering, Renmin University of China, MOE, Beijing 100872, China; School of Information Resource Management, Renmin University of China, Beijing 100872, China

The edit distance between two given strings X and Y is the minimum number of edit operations that transform X into Y. Ordinarily, string editing is based on character insert, delete, and substitute operations. It has been suggested that extending this model with block edits would be useful in applications such as DNA sequence comparison and sentence similarity computation. However, existing algorithms have generally focused on the normalized edit distance, and few of them consider block swap operations at a higher level. In this paper, we introduce an extended edit distance algorithm that permits insertions, deletions, and substitutions at the character level and also permits block swap operations. Experimental results on randomly generated strings verify the algorithm's rationality and efficiency. The main contributions of this paper are an algorithm that computes the lowest edit cost for string transformation with block swap in polynomial time, and a breaking-point selection algorithm that improves computation speed.

Keywords: Edit distance; edit operation; block swap; string matching
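
For reference, the character-level baseline that the abstract describes (insert, delete, and substitute, each at unit cost) is the classic dynamic-programming edit distance. The sketch below implements only that baseline. It does not include the block swap operation or the breaking-point selection that the paper proposes. The function name `edit_distance` is chosen for this example.

```python
def edit_distance(x: str, y: str) -> int:
    """Minimum number of character inserts, deletes, and substitutions
    needed to turn x into y (unit cost per operation)."""
    # prev[j] holds the distance between the processed prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        curr = [i]  # turning x[:i] into the empty string takes i deletions
        for j, cy in enumerate(y, start=1):
            substitution = prev[j - 1] + (0 if cx == cy else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # 3
print(edit_distance("abcd", "cdab"))       # 4
```

The second call shows why block edits matter: swapping the blocks "ab" and "cd" is conceptually a single move, yet the character-level model needs four operations to express it.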
