Search results - Inner Mongolia University Library

14th International Conference on Database Systems for Advanced Applications

Authors: Cai, Yuanzhe; Liu, Hongyan; He, Jun; Du, Xiaoyong; Jia, Xu
Affiliations: Key Labs of Data Engineering and Knowledge Engineering, MOE, China; School of Information, Renmin University of China, China; Department of Management Science and Engineering, Tsinghua University, China

ISBN (print): 9783642008863

SimRank is a well-known algorithm for similarity calculation based on object-to-object relationships. However, it suffers from high computation cost. In this paper, we find that different object pairs converge at different rates when SimRank is used to compute the similarity of objects. Many similarity scores converge fast, while others need more iterations before convergence. Based on this observation, we propose an adaptive method called Adaptive-SimRank to speed up similarity calculation. With this method, the similarity of pairs that have already converged does not need to be recalculated. Experiments conducted on web datasets and a synthetic dataset show that our new method can reduce the running time by nearly 35%.

Keywords: Similarity Calculation; Linkage Mining
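
The idea in the abstract above, skipping pairs whose score has stopped changing, can be sketched as follows. This is a minimal illustration and not the authors' Adaptive-SimRank: the function name `adaptive_simrank`, the decay factor `c`, the tolerance `tol`, and the rule "freeze a pair once its positive score changes by less than `tol`" are all assumptions made for the example.

```python
from itertools import product


def adaptive_simrank(in_neighbors, c=0.8, max_iter=20, tol=1e-4):
    """Plain iterative SimRank that stops updating pairs once they settle.

    in_neighbors maps each node to the list of nodes pointing to it.
    The freezing rule below is an assumption for illustration only.
    """
    nodes = sorted(in_neighbors)
    # Standard SimRank initialisation: s(a, a) = 1, everything else 0.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}
    frozen = set()  # pairs treated as converged and no longer recomputed

    for _ in range(max_iter):
        new_sim = dict(sim)
        changed = False
        for a, b in product(nodes, repeat=2):
            if a == b or (a, b) in frozen:
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                continue  # score stays 0 when either node has no in-neighbors
            total = sum(sim[(x, y)] for x in ia for y in ib)
            score = c * total / (len(ia) * len(ib))
            delta = abs(score - sim[(a, b)])
            # Only freeze positive scores: a pair can sit at 0 for a few
            # iterations before similarity propagates to it.
            if score > 0 and delta < tol:
                frozen.add((a, b))
            elif delta >= tol:
                changed = True
            new_sim[(a, b)] = score
        sim = new_sim
        if not changed:
            break
    return sim


# Tiny example: node "a" points to both "b" and "c".
graph = {"a": [], "b": ["a"], "c": ["a"]}
scores = adaptive_simrank(graph)
```

Freezing a pair trades a small amount of accuracy for fewer updates per iteration; how the paper decides that a pair has converged is not stated in the abstract.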

Reflection on the popularity of MapReduce and observation of its position in a unified big data platform

14th International Conference on Web-Age Information Management, WAIM 2013

Authors: Qin, Xiongpai; Qin, Biao; Du, Xiaoyong; Wang, Shan
Affiliations: MOE Key Lab. of Data Engineering and Knowledge Engineering, Beijing 100872, China; School of Information, Renmin University of China, Beijing 100872, China

ISBN (print): 9783642395260

In recent years, MapReduce has risen to be the de facto tool for big data processing. MapReduce is a disruptive innovation. It has changed the landscape of the database market, the landscape of technologies, as well as the landscape of saying power. The article reflects on the popularity of the technique and offers some observations on its position in a unified big data platform. © 2013 Springer-Verlag.

Keywords: MapReduce

Clustering moving objects in spatial networks

12th International Conference on Database Systems for Advanced Applications

Authors: Chen, Jidong; Lai, Caifeng; Meng, Xiaofeng; Xu, Jianliang; Hu, Haibo
Affiliations: School of Information, Renmin University of China; Key Laboratory of Data Engineering and Knowledge Engineering, MOE; Department of Computer Science, Hong Kong Baptist University

ISBN (print): 9783540717027

Advances in wireless networks and positioning technologies (e.g., GPS) have enabled new data management applications that monitor moving objects. In such applications, real-time data analysis such as clustering analysis is becoming one of the most important requirements. In this paper, we present the problem of clustering moving objects in spatial networks and propose a unified framework to address this problem. Because the positions of moving objects change continuously, the clustering results change dynamically. By exploiting the unique features of road networks, our framework first introduces the notion of a cluster block (CB) as the underlying clustering unit. We then divide the clustering process into the continuous maintenance of CBs and the periodic construction of clusters with different criteria based on CBs. Algorithms for efficiently maintaining and organizing the CBs to construct clusters are proposed. Extensive experimental results show that our clustering framework achieves high efficiency for clustering moving objects in real road networks.

Keywords: spatial-temporal databases; moving objects; clustering; spatial networks

Distributed join algorithms on multi-GPU clusters with GPUDirect RDMA

48th International Conference on Parallel Processing, ICPP 2019

Authors: Guo, Chengxin; Chen, Hong; Zhang, Feng; Li, Cuiping
Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), MOE; School of Information, Renmin University of China, China

ISBN (print): 9781450362955

In data management systems, query processing on GPUs or distributed clusters has proven to be an effective method for achieving high efficiency. However, the high PCIe data transfer overhead between CPUs and GPUs and the communication cost between nodes in distributed systems are usually bottlenecks for improving system performance. Recently, GPUDirect RDMA has been developed and has received a lot of attention. It combines the features of the RDMA and GPUDirect technologies, which provides new opportunities for optimizing query processing. In this paper, we revisit the join algorithm, one of the most important operators in query processing, with GPUDirect RDMA. Specifically, we explore the performance of the hash join and the sort-merge join with GPUDirect RDMA. We present a new design using GPUDirect RDMA to improve the data communication in distributed join algorithms on multi-GPU clusters. We propose a series of techniques, including multi-layer data partitioning and adaptive data communication path selection for various transmission channels. Experiments show that the proposed distributed join algorithms using GPUDirect RDMA achieve up to a 1.83x speedup over state-of-the-art distributed join algorithms. To the best of our knowledge, this is the first work on distributed GPU join algorithms. We believe that the insights and implications of this study will shed light on future research using GPUDirect RDMA. © 2019 ACM.

Keywords: Graphics processing unit

Virtualizing system and ordinary services in Windows-based OS-level virtual machines

Proceedings of the 2011 ACM Symposium on Applied Computing

Authors: Shan, Zhiyong; Chiueh, Tzi-Cker; Wang, Xin
Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering, MOE, Renmin University of China, China; Stony Brook University, United States

ISBN (print): 9781450301138

OS-level virtualization incurs smaller start-up and run-time overhead than HAL-based virtualization and thus forms an important building block for developing fault-tolerant and intrusion-tolerant applications. A complete implementation of OS-level virtualization on the Windows platform requires virtualization of Windows services, such as system services like the Remote Procedure Call Server Service (RPCSS), because they are essentially extensions of the kernel. As Windows system services work very differently from their counterparts on UNIX-style OS, i.e., daemons, and many of their implementation details are proprietary, virtualizing Windows system services turned out to be the most challenging technical barrier for OS-level virtualization for the Windows platform. In this paper, we describe a general technique to virtualize Windows services, and demonstrate its effectiveness by applying it to successfully virtualize a set of important Windows system services and ordinary services on different versions of Windows OS, including RPCSS, DcomLaunch, IIS service group, Tlntsvr, MySQL, Apache2.2, CiSvc, ImapiService, etc. © 2011 ACM.

Keywords: Virtual machine

COP: Privacy-preserving multidimensional partition in DAS paradigm

2009 International Conference on Extending Database Technology/International Conference on Database Theory Workshops, EDBT/ICDT '09

Authors: Wang, Jieping; Du, Xiaoyong; Wang, Haocong; Yang, Pingping
Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering, MOE, China; School of Information, Renmin University of China, Beijing 100872, China

ISBN (print): 9781605586502

Database-as-a-Service (DAS) is an emerging database management paradigm in which a partition-based index is an effective way to query encrypted data. However, previous research either focuses on one-dimensional partitioning or ignores the distribution characteristics of multidimensional data, especially sparsity and locality. In this paper, we propose Cluster based Onion Partition (COP), which is designed to decrease both false positives and dead space at the same time. COP is composed of two steps. First, it partitions the covered space level by level, which is like peeling an onion; second, at each level, a clustering algorithm based on local density is applied to achieve a locally optimal secure partition. Extensive experiments on real and synthetic datasets show that COP is a secure multidimensional partition with much less efficiency loss than previous top-down or bottom-up counterparts. Copyright 2009 ACM.

Keywords: Database systems

Maintaining materialized relations incrementally to improve performance of ontology query

7th International Conference on Web-Age Information Management Workshops, WAIM 2006

Authors: Li, Man; Du, Xiaoyong; Wang, Shan
Affiliations: School of Information, Renmin University of China, China; Key Laboratory of Data Engineering and Knowledge Engineering, MOE, 100872 Beijing, China

ISBN (print): 0769527051

For ontology-based applications, the efficiency of ontology query is vital. Different from existing approaches, the paper improves performance of ontology query by materializing some derived relations. Experimental results show that the integrated performance of ontology query can be improved greatly by maintaining materialized relations and the materialized relations technique has good scalability. Here the challenge is how to maintain the materialized relations incrementally with the update of ontologies. Because transitive relations are in common use in ontology, the paper proposes a novel algorithm for maintaining transitive materialized relations incrementally based on a special weighted materialized relation transitive graph, which can solve the coexistence problem of multiple derived paths better and proves the correctness of the algorithm. © 2006 IEEE.

Keywords: Ontology
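
The problem of multiple derived paths mentioned in the abstract above can be illustrated with a generic counting technique: store, for every derived pair, how many distinct paths support it, so that deleting one base fact only removes the pair when its count reaches zero. This is a sketch under stated assumptions, not the paper's algorithm. It assumes the relation is acyclic, and the class `TransitiveClosure` with its methods is a hypothetical name introduced for the example; whether the paper's graph weights are path counts is not stated in the abstract.

```python
class TransitiveClosure:
    """Incrementally maintained closure of an acyclic transitive relation."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()
        self.paths = {}  # (x, y) -> number of distinct paths from x to y

    def _count(self, x, y):
        # Every node reaches itself by exactly one (empty) path.
        return 1 if x == y else self.paths.get((x, y), 0)

    def _apply(self, u, v, sign):
        # Paths through edge (u, v) are: (paths x -> u) * (paths v -> y).
        # In an acyclic graph these factors never use (u, v) themselves.
        sources = [x for x in self.nodes if self._count(x, u)]
        targets = [y for y in self.nodes if self._count(v, y)]
        for x in sources:
            for y in targets:
                delta = sign * self._count(x, u) * self._count(v, y)
                updated = self.paths.get((x, y), 0) + delta
                if updated:
                    self.paths[(x, y)] = updated
                else:
                    self.paths.pop((x, y), None)

    def add_edge(self, u, v):
        if (u, v) in self.edges:
            return
        if self._count(v, u):
            raise ValueError("edge would create a cycle")
        self.nodes.update((u, v))
        self.edges.add((u, v))
        self._apply(u, v, +1)

    def remove_edge(self, u, v):
        if (u, v) not in self.edges:
            return
        self.edges.remove((u, v))
        self._apply(u, v, -1)

    def reachable(self, x, y):
        return self.paths.get((x, y), 0) > 0


# subClassOf-style example: A -> B -> C plus a direct A -> C.
tc = TransitiveClosure()
tc.add_edge("A", "B")
tc.add_edge("B", "C")
tc.add_edge("A", "C")
two_paths = tc.paths[("A", "C")]
tc.remove_edge("A", "C")
still_derived = tc.reachable("A", "C")
tc.remove_edge("B", "C")
gone = not tc.reachable("A", "C")
```

Both updates touch only the pairs whose paths pass through the changed edge, which is what makes the maintenance incremental instead of a full recomputation.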

Approximation algorithm for constructing data aggregation trees for wireless sensor networks

Frontiers of Computer Science in China (中国高等学校学术文摘·计算机科学), 2009, Vol. 3, No. 4, pp. 524-534

Authors: Deying LI; Jiannong CAO; Qinghua ZHU
Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering, School of Information, Renmin University of China, Beijing 100872, China; Internet and Mobile Computing Lab, Department of Computing, Hong Kong Polytechnic University, Hong Kong, China

This paper considers the problem of constructing data aggregation trees in wireless sensor networks (WSNs) for a group of sensor nodes to send collected information to a single sink node. The data aggregation tree contains the sink node, all the source nodes, and some other non-source nodes. The goal of constructing such a data aggregation tree is to minimize the number of non-source nodes to be included in the tree so as to save energy. We prove that the data aggregation tree problem is NP-hard and then propose an approximation algorithm with a performance ratio of four and a greedy algorithm. We also give a distributed version of the approximation algorithm. Extensive simulations are performed to study the performance of the proposed algorithms. The results show that the proposed algorithms can find a tree that is a good approximation to the optimal tree and have a high degree of scalability.

Keywords: WSNs; data aggregation tree; approximation algorithm; distributed algorithm

Managing a large shared bank of unstructured data by using free-table

12th International Asia Pacific Web Conference, APWeb 2010

Authors: Zhang, Xiao; Du, Xiaoyong; Chen, Jinchuan; Wang, Shan
Affiliations: Renmin University of China; Key Lab. of Data Engineering and Knowledge Engineering, MOE, China; 59 Zhongguancun Street, Beijing, China

ISBN (print): 9780769540122

This paper presents a reference framework, called BUD, to manage a large shared bank of unstructured data, and lists several important issues in managing or maintaining the unstructured data in BUD. BUD stores and manages the ever-growing unstructured data by introducing a novel technique called free-table, which is a conceptual view for end users and a physical entity maintained by the transactional storage manager of BUD. Unlike a relational table, which is column-oriented, a free-table is cell-oriented. It can store various types of unstructured data in cells with different versions. Additionally, we study two cases, VMP and PXRDB, to show that our proposal is feasible and tractable. © 2010 IEEE.

Keywords: Digital storage

Efficient Duplicate Detection on Cloud Using a New Signature Scheme

12th International Conference on Web-Age Information Management

Authors: Rong, Chuitian; Lu, Wei; Du, Xiaoyong; Zhang, Xiao
Affiliations: Key Labs. of Data Engineering and Knowledge Engineering, MOE, China; School of Information, Renmin University of China, China; Shanghai Key Laboratory of Intelligent Information Processing, China

ISBN (print): 9783642235344; 9783642235351

Duplicate detection has been well recognized as a crucial task for improving the quality of data. Related work on this problem mainly aims to propose efficient approaches on a single machine. However, with the increasing volume of data, the performance of identifying duplicates is still far from satisfactory. Hence, we try to handle the problem of duplicate detection over MapReduce, a shared-nothing paradigm. We argue that the performance of using MapReduce to detect duplicates mainly depends on the number of candidate record pairs. In this paper, we propose a new signature scheme with a new pruning strategy over MapReduce to minimize the number of candidate record pairs. Our experimental results over both real and synthetic datasets demonstrate that our proposed signature-based method is efficient and scalable.

Keywords: duplicate detection; MapReduce; Cloud
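
How a signature scheme limits the number of candidate record pairs in a map/reduce setting can be sketched in a single process. The abstract does not describe the paper's new signature scheme, so this example swaps in classic prefix filtering over word tokens with a Jaccard threshold. The function names, the tokenization, and the threshold are assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    return len(a & b) / len(a | b)


def find_duplicates(records, threshold=0.6):
    """Return sorted id pairs whose token sets have Jaccard >= threshold."""
    token_sets = {rid: set(text.lower().split()) for rid, text in records.items()}

    # Global token order: rare tokens first, so signatures are selective.
    freq = defaultdict(int)
    for tokens in token_sets.values():
        for tok in tokens:
            freq[tok] += 1

    # "Map": emit (signature token, record id) for each prefix token.
    # Records with Jaccard >= threshold must share at least one prefix token.
    groups = defaultdict(list)
    for rid, tokens in token_sets.items():
        ordered = sorted(tokens, key=lambda t: (freq[t], t))
        n = len(ordered)
        # The small epsilon guards against float round-up in threshold * n.
        prefix_len = n - math.ceil(threshold * n - 1e-9) + 1
        for tok in ordered[:prefix_len]:
            groups[tok].append(rid)

    # "Reduce": records sharing a signature become candidate pairs,
    # and only candidates are verified with the full similarity function.
    candidates = set()
    for rids in groups.values():
        candidates.update(combinations(sorted(rids), 2))
    return sorted(
        pair for pair in candidates
        if jaccard(token_sets[pair[0]], token_sets[pair[1]]) >= threshold
    )


records = {
    1: "efficient duplicate detection on cloud",
    2: "efficient duplicate detection on the cloud",
    3: "clustering moving objects in spatial networks",
}
pairs = find_duplicates(records, threshold=0.6)
```

In a real MapReduce job the grouping by signature token would be done by the shuffle phase, and the same pair can be produced under several tokens, so deduplicating candidates before verification matters for cost.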
