Search Results - Inner Mongolia University Library

ChinaGrid Annual Conference (ChinaGrid)

Authors: Jingjing Kang; Tao Liu; He Hu; Xiaoyong Du. Key Laboratories of Data Engineering and Knowledge Engineering, Ministry of Education, China; School of Information, Renmin University of China, Beijing, China

Domain terms play a crucial role in many research areas, which has led to rising demand for automatic domain term extraction. In this paper, we present a two-level evaluation approach based on termhood and unithood to extract Chinese domain compound terms automatically, taking both character-level and word-level information into account. To achieve this, we incorporate semantic features by using word segmentation to recognize single-word terms, then leverage an improved C-value and heuristic methods such as word formation pattern and word formation power to evaluate candidates at both levels. Validation against several existing dictionaries shows a significant improvement in compound term detection. Experiments on a legal corpus show that our method outperforms the other compared methods.

Keywords: Compounds; Semantics; Arrays; Feature extraction; Filtering; Dictionaries; Pragmatics
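
As a rough illustration of the termhood side of the abstract above, the following sketch computes the standard C-value for multi-word candidates. It is not the authors' improved C-value, and it ignores the unithood heuristics (word formation pattern and power). The function names `contains` and `c_value`, the English example terms, and the frequencies are all hypothetical and chosen only for illustration.

```python
import math


def contains(longer, shorter):
    """Return True if `shorter` is a contiguous, strictly shorter part of `longer`."""
    n, m = len(longer), len(shorter)
    return m < n and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_value(freq):
    """Compute the standard C-value for each candidate term.

    `freq` maps a candidate (a tuple of words) to its corpus frequency.
    A candidate nested in longer candidates has the mean frequency of
    those longer candidates subtracted before weighting by log2(length).
    Single-word candidates score 0 because log2(1) == 0.
    """
    scores = {}
    for term, f in freq.items():
        # Longer candidates in which this term appears as a nested part.
        parents = [t for t in freq if contains(t, term)]
        if parents:
            f = f - sum(freq[p] for p in parents) / len(parents)
        scores[term] = math.log2(len(term)) * f
    return scores


# Hypothetical frequencies, for illustration only.
freq = {
    ("intellectual", "property"): 10,
    ("intellectual", "property", "law"): 4,
    ("intellectual", "property", "rights"): 2,
}
scores = c_value(freq)
```

In this toy input the two-word candidate is nested in both three-word candidates, so its frequency is discounted by their mean frequency before scoring.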

Large scale report generation in data consolidation environments of banks

Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2012, vol. 40, suppl. 1, pp. 5-8

Authors: Qin, Xiongpai; Zhou, Xiaoyun; Wu, Zhongxin; Yang, Hongzhi; Wang, Wei. Ministry of Education Key Lab of Data Engineering and Knowledge Engineering, Renmin University of China, Beijing 100872, China; Information School, Renmin University of China, Beijing 100872, China; Computer Science Department, Jiangsu Normal University, Xuzhou 221116, Jiangsu, China; Beijing Nantian Software Co. Ltd., Beijing 100085, China

To generate a large number of reports within a limited time window, four techniques were proposed: ROLAP&SQL, Shared Scanning, a Hadoop-based solution, and MOLAP&Cube Sharding. An algorithm that performs in-memory aggregation was designed for the second solution. The experimental results show that all techniques except ROLAP&SQL can meet the time window constraint, and that the Hadoop-based solution is a promising technique owing to its high scalability. Considering the maturity of the techniques and their performance, we put MOLAP&Cube Sharding into practice while keeping an eye on Hadoop for future adoption.

Keywords: Custom report; data consolidation; Hadoop; Large bank; MOLAP; ROLAP
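
One plausible reading of the Shared Scanning technique in the abstract above is that many reports are answered from a single pass over the data while partial results are aggregated in memory. The sketch below illustrates that reading only; it is not the paper's algorithm. The row layout, the report definitions, and the name `shared_scan` are assumptions made for this example.

```python
from collections import defaultdict


def shared_scan(rows, reports):
    """Aggregate several reports in a single pass over `rows`.

    `reports` maps a report name to (group_by_columns, measure_column).
    Each report keeps its own in-memory hash table of running sums.
    """
    tables = {name: defaultdict(float) for name in reports}
    for row in rows:  # the data is scanned exactly once
        for name, (group_cols, measure) in reports.items():
            key = tuple(row[c] for c in group_cols)
            tables[name][key] += row[measure]
    return {name: dict(table) for name, table in tables.items()}


# Hypothetical fact rows and report definitions, for illustration only.
rows = [
    {"branch": "Beijing", "product": "loan", "amount": 100.0},
    {"branch": "Beijing", "product": "deposit", "amount": 50.0},
    {"branch": "Xuzhou", "product": "loan", "amount": 70.0},
]
reports = {
    "by_branch": (("branch",), "amount"),
    "by_product": (("product",), "amount"),
}
result = shared_scan(rows, reports)
```

The cost of reading the rows is paid once regardless of the number of reports, which is the property that matters when the time window is fixed.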

Probabilistic range queries for Uncertain Trajectories on road networks

14th International Conference on Extending Database Technology: Advances in Database Technology, EDBT 2011

Authors: Zheng, Kai; Trajcevski, Goce; Zhou, Xiaofang; Scheuermann, Peter. School of ITEE, University of Queensland, Australia; Department of EECS, Northwestern University, United States; School of Information, Renmin University of China; Key Lab. of Data Engineering and Knowledge Engineering, Ministry of Education, China

ISBN: (print) 9781450305280

Trajectories representing the motion of moving objects are typically obtained via location sampling, e.g. using GPS or road-side sensors, at discrete time instants. Between consecutive samples, nothing is known about the whereabouts of a given moving object. Various models have been proposed (e.g., sheared cylinders; space-time prisms) to represent the uncertainty of moving objects both in unconstrained Euclidean space and on road networks. In this paper, we focus on representing the uncertainty of objects moving along road networks as time-dependent probability distribution functions, assuming the availability of a maximal speed on each road segment. For these settings, we introduce a novel indexing mechanism, UTH (Uncertain Trajectories Hierarchy), based upon which efficient algorithms for processing spatio-temporal range queries are proposed. We also present experimental results that demonstrate the benefits of our proposed methodologies.

Keywords: Trajectories
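
To make the uncertainty between two samples concrete, the sketch below computes where an object could be at a time between two consecutive samples when a maximal speed is known. It is a simplified, one-dimensional space-time prism on a single road segment, with positions measured as distance along the road. It does not model a road network, the probability distribution functions, or the UTH index from the abstract above. The names `possible_interval` and `may_be_in_range` and the sample values are hypothetical.

```python
def possible_interval(sample_a, sample_b, t, v_max):
    """Return the (lo, hi) positions reachable at time `t`.

    Each sample is (time, position along the road). The object must be
    reachable from the first sample and still able to reach the second
    one without exceeding `v_max`.
    """
    (t1, x1), (t2, x2) = sample_a, sample_b
    if not t1 <= t <= t2:
        raise ValueError("t must lie between the two sample times")
    if abs(x2 - x1) > v_max * (t2 - t1):
        raise ValueError("samples are inconsistent with v_max")
    lo = max(x1 - v_max * (t - t1), x2 - v_max * (t2 - t))
    hi = min(x1 + v_max * (t - t1), x2 + v_max * (t2 - t))
    return lo, hi


def may_be_in_range(interval, q_lo, q_hi):
    """Return True if the possible interval overlaps the query range."""
    lo, hi = interval
    return lo <= q_hi and q_lo <= hi


# Hypothetical samples: at t=0 the object is at 0.0, at t=10 it is at 60.0.
interval = possible_interval((0, 0.0), (10, 60.0), t=5, v_max=10.0)
```

A range query can discard an object outright when `may_be_in_range` is false; when it is true, a probabilistic model such as the one the paper describes is needed to say how likely the object is to be inside the range.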

Clustering XML search results based on content and structure similarity

2011 International Conference on Management of e-Commerce and e-Government, ICMeCG 2011

Authors: Min-Juan, Zhong; Chang-Xuan, Wan; De-Xi, Liu; Xian-Pei, Jiao. School of Information and Technology, Jiangxi University of Finance and Economics, Nanchang, China; Jiangxi Key Laboratory of Data and Knowledge Engineering, Jiangxi University of Finance and Economics, Nanchang, China

ISBN: (print) 9780769545448

Clustering XML search results is an effective way to improve performance. However, the key problem is how to measure similarity between XML documents. In this paper, we propose a semantic similarity measure combining content with structure, in which a variety of XML document features, including term element frequency, term inverse element frequency, semantic weight of the tag label, and level information of the term, are analyzed and applied to compute the similarity between XML documents. In addition, two new performance evaluation methodologies for clustering quality, namely Cluster Ratio Relevant and Docu Ratio Relevant, are introduced, motivated by observations of the distribution of relevant documents and by the fact that the collection has no classification information. Experimental results show that the proposed similarity method (CAS measure) outperforms traditional document clustering (CO measure) in Cluster Ratio Relevant and Docu Ratio Relevant and produces better clustering quality. © 2011 IEEE.

Keywords: XML

Effect factors on secondary structure of protein sequence pattern

International Workshop on Intelligent Systems and Applications

Authors: Liu, Tao; Li, Minghui. Key Laboratory of Data Engineering and Knowledge Engineering, MOE, Beijing 100872, China; School of Information, Renmin University of China, Beijing 100872, China; School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China

ISBN: (print) 9781424498574

Discovering the relationship between protein sequence patterns and protein secondary structure is important for accurately predicting the secondary structure of a protein sequence. In this paper, a protein secondary structure pattern dictionary is constructed from protein sequence patterns and their corresponding secondary structures. Based on the constructed dictionary, we propose four effect factors on the secondary structure of a protein sequence pattern: 1) the core pattern itself; 2) patterns or amino acid residues that neighbor the core pattern; 3) patterns or amino acid residues that are far away from the core pattern; and 4) amino acid sequence segments that match the core pattern. Statistical measures are adopted to analyze these factors. The experimental results show the reliability of these factors. The recognition of these effect factors presents new directions for predicting protein secondary structure based on a protein pattern dictionary. © 2011 IEEE.

Keywords: Amino acids

A query-oriented summarization system for XML elements

2011 International Conference on Management of e-Commerce and e-Government, ICMeCG 2011

Authors: Liu, Dexi; Wu, Shihan. Jiangxi Key Laboratory of Data and Knowledge Engineering, Jiangxi University of Finance and Economics, Nanchang 330013, China; Songjiang Branch, Shanghai Rural Commercial Bank, Shanghai 201600, China

ISBN: (print) 9780769545448

In document-centric XML datasets, an element may contain so much text that users have to spend considerable time judging whether the elements returned by an XML search engine are valuable. A query-oriented XML summarization system aims to provide users with a brief and readable substitute for the originally retrieved elements according to the user's query, which can effectively relieve the user's reading burden. In this work, we extract sentences from the results of an XML search engine and combine them into a summary. Experiments on the IEEE-CS datasets used in INEX show that the query-oriented XML summary generated by our method is reasonable. © 2011 IEEE.

Keywords: XML

Multi-view random walk framework for search task discovery from click-through log

20th ACM Conference on Information and Knowledge Management, CIKM'11

Authors: Cui, Jianwei; Liu, Hongyan; Yan, Jun; Ji, Lei; Jin, Ruoming; He, Jun; Gu, Yingqin; Chen, Zheng; Du, Xiaoyong. Key Labs. of Data Engineering and Knowledge Engineering, School of Information, Renmin University of China, China; Department of Management Science and Engineering, Tsinghua University, China; Microsoft Research Asia, Beijing, China; Computer Science Department, Kent State University, United States

ISBN: (print) 9781450307178

Search engine users often have clear search tasks hidden behind their queries. Inspired by this, modern search engines are providing an increasing number of services to help users simplify their key tasks. However, the question of which major, high-traffic user search tasks search engines should design special services for is still underexplored. In this paper, we propose a novel Multi-view Random Walk (MRW) algorithm to measure the search-task-oriented similarity between queries, and then group search queries with similar tasks so that the major search tasks of users can be identified from a search engine click-through log. The proposed MRW, which is a general framework for combining knowledge from different views in a random walk process, allows the random surfer to walk across different views to integrate information for search task discovery. Experimental results on the click-through log of a commonly used commercial search engine show that our proposed MRW algorithm can effectively discover user search tasks. © 2011 ACM.

Keywords: Search engines

Discovering Chinese compound term using termhood and unithood measures

Proceedings - 2011 6th Annual ChinaGrid Conference, ChinaGrid 2011, 2011, pp. 60-67

Authors: Kang, Jingjing; Liu, Tao; Hu, He; Du, Xiaoyong. Key Labs. of Data Engineering and Knowledge Engineering, Ministry of Education, China; School of Information, Renmin University of China, Beijing, China

ISBN: (print) 9780769544724

Keywords: Heuristic methods

K-nearest neighbors in uncertain graph

Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2011, vol. 48, no. 10, pp. 1850-1858

Authors: Zhang, Yinglong; Li, Cuiping; Chen, Hong; Du, Lingxia. Key Laboratory of Data Engineering and Knowledge Engineering, Renmin University of China, Ministry of Education, Beijing 100872, China; School of Computer and Information Engineer, Jiangxi Agriculture University, Nanchang 330045, China

In many areas, large amounts of data are modeled as graphs that are subject to uncertainty, such as molecular compounds and protein interaction networks. Many real applications, for example collaborative filtering, fraud detection, and link prediction in social networks, rely on efficiently answering k-nearest neighbor (kNN) queries, that is, computing the k nodes most "similar" to a given query node. To solve this problem, this paper proposes a novel method based on the SimRank measure. However, because graphs evolve over time and are uncertain, the cost of solving the problem with existing SimRank algorithms can be very high in practice, so the paper presents an optimized algorithm. By introducing a path threshold, which is applicable to both deterministic and uncertain graphs, the algorithm considers only the local neighborhood of a given query node rather than the whole graph, thereby pruning the search space. To further improve efficiency, the algorithm adopts a sampling technique for uncertain graphs. Theoretical analysis and experiments verify that the optimized algorithm is efficient and effective.

Keywords: Collaborative filtering
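
For orientation, the sketch below shows plain iterative SimRank on a small deterministic graph, followed by a kNN lookup over the resulting scores. It is the textbook baseline the abstract above builds on, not the paper's optimized algorithm: there is no edge uncertainty, no path-threshold pruning, and no sampling. The function names `simrank` and `knn`, the decay factor `c=0.8`, and the toy graph are assumptions made for this example.

```python
from itertools import product


def simrank(in_neighbors, c=0.8, iterations=10):
    """Compute SimRank scores for all node pairs by fixed-point iteration.

    `in_neighbors` maps every node to the list of nodes pointing to it.
    Two nodes are similar if their in-neighbors are similar; `c` is the
    decay factor.
    """
    nodes = list(in_neighbors)
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    for _ in range(iterations):
        new = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # A node without in-neighbors is similar only to itself.
                new[(a, b)] = 0.0
                continue
            total = sum(sim[(x, y)] for x in ia for y in ib)
            new[(a, b)] = c * total / (len(ia) * len(ib))
        sim = new
    return sim


def knn(sim, query, k):
    """Return the k nodes most similar to `query`, ties broken by name."""
    others = {a for a, _ in sim if a != query}
    return sorted(others, key=lambda n: (-sim[(query, n)], n))[:k]


# Hypothetical graph: u points to a and b, a points to c, d is isolated.
in_neighbors = {"u": [], "a": ["u"], "b": ["u"], "c": ["a"], "d": []}
sim = simrank(in_neighbors)
nearest = knn(sim, "a", 1)
```

This all-pairs version touches every node pair in every iteration, which is the cost that a local-neighborhood restriction such as the path threshold described in the abstract is meant to avoid.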

A GPU-accelerated online analytical processing system

29th National Database Conference of China

Authors: Fang Yixuan (方艺璇); Liu Hong (刘虹); Chen Hong (陈红); Li Cuiping (李翠平); Zhang Yansong (张延松); Zhao Suyun (赵素云); Chen Jie (陈杰); Xin Xin (辛鑫); Zhang Ji (张吉). Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), Ministry of Education, Beijing 100872; School of Information, Renmin University of China, Beijing 100872; China Survey and Data Center, Renmin University of China, Beijing 100872

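To address the shortcomings of existing online analytical processing (OLAP) systems and to exploit advances in graphics processing units (GPUs), the GOOLAP (GPU oriented OLAP) system was developed. GOOLAP uses the GPU's high parallelism and high memory bandwidth, moving compute-intensive operations to the GPU to accelerate OLAP processing. The system consists of three parts: 1) the presentation layer, which uses Excel as the front end for user input and result display, improving ease of use; 2) the OLAP server layer, which parses multi-dimensional expression (MDX) statements into SQL statements, passes the SQL statements suited to GPU computation to the GPU according to custom heuristic rules, and finally merges the results returned by the GPU with the original CPU results; 3) the storage layer, which provides two storage structures: GPU device memory and a relational database.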

Keywords: graphics processing unit; online analytical processing system; optimization design; functional modules
