Sequential pattern mining is an important problem in mining continuous, fast, dynamic and unbounded data streams. Recently, approximate mining algorithms have been proposed, but they consume too many system resources and can only obtain...
The volume of RDF data has increased dramatically in recent years, and cloud computing platforms such as Hadoop are considered a good choice for processing queries over huge data sets because of their scalability. Previous work on evaluating SPARQL queries with Hadoop mainly focuses on reducing the number of joins through careful splitting of HDFS files and algorithms for generating Map/Reduce jobs. However, the way RDF data is partitioned can also affect system performance. Specifically, a good partitioning solution would greatly reduce or even totally avoid cross-node joins, and significantly cut the cost of query evaluation. Based on HadoopDB, this work processes SPARQL queries in a hybrid architecture, where Map/Reduce takes charge of the computing tasks, and RDF query engines like RDF-3X store the data and execute join operations. Based on an analysis of query workloads, this work proposes a novel algorithm for automatically partitioning RDF data and an approximate solution for physically placing the partitions in order to reduce data redundancy. It also discusses how to make a good trade-off between query evaluation efficiency and data redundancy. All of the proposed approaches have been evaluated by extensive experiments over large RDF data sets.
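To illustrate why partitioning affects cross-node joins, here is a minimal sketch. It does not reproduce the workload-driven algorithm the abstract refers to; it swaps in plain subject-hash partitioning, a much simpler technique, to show that a star-shaped query (several predicates on the same subject) can be answered on each node without any cross-node join. The function names, the `ex:` triples and the node count are all hypothetical.

```python
import zlib
from collections import defaultdict


def partition_by_subject(triples, num_nodes):
    """Place each (subject, predicate, object) triple on a node chosen by
    hashing the subject, so all triples of one subject are co-located.
    crc32 is used because Python's built-in str hash varies between runs."""
    partitions = defaultdict(list)
    for s, p, o in triples:
        node = zlib.crc32(s.encode("utf-8")) % num_nodes
        partitions[node].append((s, p, o))
    return partitions


def star_query(partitions, predicates):
    """Return subjects that have every predicate in `predicates`.
    Each node is scanned independently; no data moves between nodes."""
    wanted = set(predicates)
    results = set()
    for node_triples in partitions.values():
        preds_by_subject = defaultdict(set)
        for s, p, _ in node_triples:
            preds_by_subject[s].add(p)
        results.update(s for s, preds in preds_by_subject.items()
                       if wanted <= preds)
    return results


# Hypothetical sample data, for illustration only.
triples = [
    ("ex:alice", "ex:worksAt", "ex:acme"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:worksAt", "ex:acme"),
    ("ex:carol", "ex:knows", "ex:alice"),
    ("ex:carol", "ex:worksAt", "ex:initech"),
]
partitions = partition_by_subject(triples, num_nodes=3)
matches = star_query(partitions, ["ex:worksAt", "ex:knows"])
```

Queries that join on an object or follow paths between subjects would still cross nodes under this scheme, which is the kind of case a workload-aware partitioning is meant to handle.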
Multi-Constrained Graph Pattern Matching (MC-GPM) aims to match a pattern graph with multiple attribute constraints on its nodes and edges, and has garnered significant interest in various fields, including social-based e-commerce and trust-based group discovery. However, the existing MC-GPM methods do not consider situations where the number of each node in the pattern graph needs to be fixed, such as finding an expert group in which the number of experts and their relations are specified. In this paper, a Multi-Constrained Strong Simulation with the Fixed Number of Nodes (MCSS-FNN) matching model is proposed, and a Trust-oriented Optimal Multi-constrained Path (TOMP) matching algorithm is designed to solve it. Additionally, two heuristic optimization strategies are designed, one for combinatorial testing and the other for edge matching, to enhance the efficiency of the TOMP algorithm. Empirical experiments are conducted on four real social network datasets, and the results demonstrate the effectiveness and efficiency of the proposed algorithm and optimization strategies.
Cell association is a significant research issue in future mobile communication systems due to the unacceptably large computational time of traditional ***. This article proposes a polynomial-time cell association scheme which not only completes the association in polynomial time but also fits a generic optimization objective ***. On the one hand, traditional cell association, a non-deterministic polynomial (NP)-hard problem with a generic utility function, is heuristically transformed into a 2-dimensional assignment optimization and solved by a certain polynomial-time algorithm, which significantly saves computational ***. On the other hand, the scheme jointly considers utility maximization and load balancing among multiple base stations (BSs) by maintaining an experience pool storing a set of weighting factor values and their corresponding ***. When an association optimization is required, a suitable weighting factor value is taken from the pool to calculate a long square utility matrix and a certain polynomial-time algorithm will be applied for the ***. Compared with several representative schemes, the proposed scheme achieves large system capacity and high fairness within a relatively short computational time.
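The idea of casting cell association as a 2-dimensional assignment problem can be sketched as follows. The abstract does not name the polynomial-time algorithm it uses, so this sketch assumes the classic Hungarian (Kuhn-Munkres) method. It also assumes that each base station is expanded into as many columns as it has user slots, which is one common way to build such a matrix and not necessarily the article's. The weighting-factor experience pool is not modelled. All names and numbers are hypothetical.

```python
def solve_assignment(cost):
    """Hungarian method, O(n^2 * m): assign each of n rows to a distinct
    column (n <= m) with minimum total cost. Returns a column per row."""
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0] * (n + 1)    # row potentials (1-based)
    v = [0] * (m + 1)    # column potentials (1-based)
    p = [0] * (m + 1)    # p[j] = row currently matched to column j
    way = [0] * (m + 1)  # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    return assignment


def associate(utility, capacity):
    """utility[k][b]: utility of user k at base station b;
    capacity[b]: number of users base station b can serve (assumed)."""
    slot_owner = [b for b, cap in enumerate(capacity) for _ in range(cap)]
    if len(slot_owner) < len(utility):
        raise ValueError("not enough base-station slots for all users")
    # Negate utilities so that minimising cost maximises total utility.
    cost = [[-row[b] for b in slot_owner] for row in utility]
    return [slot_owner[s] for s in solve_assignment(cost)]


# Hypothetical example: 3 users, 2 base stations with 2 slots each.
utility = [[5, 3], [4, 1], [6, 2]]
association = associate(utility, capacity=[2, 2])
total = sum(utility[k][b] for k, b in enumerate(association))
```

Every user prefers base station 0 here, but it has only two slots, so the solver moves the user who loses the least utility by switching.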
Fuel cells are made from fuel and ***. Because of its low pollution, high energy conversion efficiency and high reliability, the fuel cell has become the future direction of new energy application; the technological development path in the field of fuel cell research has great significance to the development of technological and energy *** the patent analysis method, this paper quantitatively analyses the patent data from Derwent Innovation Index to study the state of patent applications, core technologies, highly cited patents and the main *** shows that auxiliary devices and related methods were a research hotspot in recent years; as the biggest patent holder of fuel cell technologies, Toyota, Honda motor *** Nissan motor *** an ***. This paper has discovered some potential problems behind the phenomena, and some suggestions are put forward finally.
On the Internet, comprehensive lawyer information is scattered across separate information sources, which prevents web users from acquiring information effectively. In order to build a unified view of separated, heterogeneous, ...
All-pairs SimRank calculation is a classic SimRank problem. However, all-pairs algorithms suffer from efficiency and accuracy issues. In this paper, we convert the non-linear SimRank calculation into a new simp...
As systems become more complex and workloads fluctuate more, it is very hard for a DBA to quickly analyze performance data and optimize the system; self-optimization is a promising technique. A data mi...
The privacy-preserving data publication problem has attracted more and more attention in recent years. Much related research has addressed datasets with a single sensitive attribute. However, usually, ori...
The paper describes the details of using J-SIM in main memory database parallel recovery simulation. In update intensive main memory database systems, I/O is still the dominant performance bottleneck. A proposal of parallel recovery scheme for large-scale update intensive main memory database systems is presented. Simulation provides a faster way of evaluating the new idea compared to actual system implementation. J-SIM is an open source discrete time simulation software package. The simulation implementation using J-SIM is elaborated in terms of resource modeling, transaction processing system modeling and workload modeling. Finally, with simulation results analyzed, the effectiveness of the parallel recovery scheme is verified and the feasibility of J-SIM's application in main memory database system simulation is demonstrated.