Data uncertainty is inherent in many real-world applications such as environmental surveillance and mobile tracking. Mining sequential patterns from inaccurate data, such as those data arising from sensor readings and GPS trajectories, is important for discovering hidden knowledge in such applications. In this paper, we propose to measure pattern frequentness based on the possible world semantics. We establish two uncertain sequence data models abstracted from many real-life applications involving uncertain sequence data, and formulate the problem of mining probabilistically frequent sequential patterns (or p-FSPs) from data that conform to our models. However, the number of possible worlds is extremely large, which makes the mining prohibitively expensive. Inspired by the famous PrefixSpan algorithm, we develop two new algorithms, collectively called U-PrefixSpan, for p-FSP mining. U-PrefixSpan effectively avoids the problem of "possible worlds explosion", and when combined with our four pruning and validating methods, achieves even better performance. We also propose a fast validating method to further speed up our U-PrefixSpan algorithm. The efficiency and effectiveness of U-PrefixSpan are verified through extensive experiments on both real and synthetic datasets.
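To make the notion of probabilistic frequentness concrete, here is a minimal sketch. It is not U-PrefixSpan. It assumes that sequences are mutually independent and that the probability of the pattern occurring in each sequence has already been computed; the names `prob_frequent`, `occurrence_probs`, `min_sup`, and `tau` are illustrative and do not come from the abstract. Under those assumptions the support follows a Poisson-binomial distribution, so a dynamic program over the sequences gives the probability that the support reaches the threshold without enumerating possible worlds.

```python
def prob_frequent(occurrence_probs, min_sup):
    """Return P(support >= min_sup) for independent per-sequence occurrence probabilities."""
    # dist[j] = probability that exactly j of the sequences seen so far contain
    # the pattern; all counts >= min_sup are merged into the last cell.
    dist = [1.0] + [0.0] * min_sup
    for p in occurrence_probs:
        nxt = [0.0] * (min_sup + 1)
        for j, mass in enumerate(dist):
            if mass == 0.0:
                continue
            nxt[j] += mass * (1.0 - p)            # pattern absent from this sequence
            nxt[min(j + 1, min_sup)] += mass * p  # pattern present (capped at min_sup)
        dist = nxt
    return dist[min_sup]


def is_probabilistically_frequent(occurrence_probs, min_sup, tau):
    """A pattern is treated as a p-FSP here if P(support >= min_sup) >= tau (assumed definition)."""
    return prob_frequent(occurrence_probs, min_sup) >= tau
```

The dynamic program runs in O(n * min_sup) time for n sequences, whereas the number of possible worlds grows exponentially in n.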
Multicast greatly benefits many emerging applications such as federated learning, the metaverse, and data warehousing. Recently, due to frequent cyber-attacks, multicast services have tended to request rigorous security agreements, which likely differ among the destinations. To meet such agreements, one can employ security-aware service functions (SFs) to construct the security-aware SF tree (S-SFT) for multicast services. A security-aware SF can be provided by various vendors with diverse configurations and implementation costs. The multi-configured SFs and the various security agreements add significant complexity to the deployment of a security-aware multicast request. In this work, for the first time, we study how to effectively compose and embed an S-SFT over a network with multiple vendors. We formulate the problem of security-aware SFT composing and embedding. We develop a new technique called cost-security-centrality (CSC) based on the pigeonhole principle and propose a heuristic algorithm called CSC-based S-SFT deployment (CSC-SD). Via thorough mathematical proofs, we show that CSC-SD achieves a logarithmic approximation ratio. Extensive simulations show that CSC-SD significantly outperforms the benchmarks and reveal that more function sharing helps save implementation cost, whereas more routing sharing does not necessarily save routing cost.
Network function virtualization (NFV) is introduced to effectively deliver end-to-end network services for the emerging Internet of Things (IoT), multiaccess edge computing, and 5G communication techniques. In NFV, the network service request can be accommodated in the form of a service function chain (SFC). The SFC will have to reserve abundant resources, such as link bandwidth, service functions, and computation in the physical network to meet the demands of customers. Minimizing the cost from the resource reservation in NFV remains challenging, even though a few works in the literature proposed cost-optimization methodologies with assumptions to guarantee their correctness. In this article, we comprehensively investigate how to minimize the cost when delivering network services as SFCs with provable bounds and fewer assumptions. We formally define the problem of minimum cost service function chaining and embedding (MC-SFCE) and propose an algorithm, namely, cost factor-based SFCE optimization with shortcut (COFO-SC), for MC-SFCE. Novel mathematical analysis is provided to demonstrate the correctness of our approaches and related bounds. Our extensive simulations and analysis also show that the proposed COFO-SC outperforms the schemes directly extended from the existing work.
The Eclat algorithm is one of the most widely used frequent itemset mining methods. However, computing the intersection of itemsets is inefficient, which makes Eclat time-consuming, especially when the dataset has a large number of transactions. In this work, to improve efficiency, we propose an approximate Eclat algorithm named HashEclat based on MinHash, which quickly estimates the size of the intersection set and adjusts the parameters k, E and minSup to trade off the accuracy of the mining results against execution time. The parameter k is the top-k parameter of the one-permutation MinHash algorithm; the parameter E is the estimation error of one intersection size; the parameter minSup is the minimum support threshold. In many real situations, a faster approximate result may be more useful than an 'exact' result. Our theoretical analysis and experimental results demonstrate that the proposed algorithm outputs almost all of the frequent itemsets with higher speed and less memory.
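The idea of estimating an intersection size from a MinHash sketch can be illustrated as follows. This is a hedged sketch of a generic bottom-k (single hash function, keep the k smallest values) estimator, not the HashEclat implementation; the function names and the choice of BLAKE2b as the hash are assumptions made for the example. Each itemset is represented by its set of transaction ids (tid-set), and the estimate uses the identity |A ∩ B| = J(|A| + |B|) / (1 + J), where J is the Jaccard similarity.

```python
import hashlib


def _hash64(tid):
    """Deterministic 64-bit hash of a transaction id (hash choice is an assumption)."""
    digest = hashlib.blake2b(str(tid).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def bottom_k_sketch(tids, k):
    """Keep the k smallest hash values of a tid-set."""
    return sorted(_hash64(t) for t in set(tids))[:k]


def estimate_intersection(sketch_a, size_a, sketch_b, size_b, k):
    """Estimate |A ∩ B| from two bottom-k sketches and the exact tid-set sizes."""
    set_a, set_b = set(sketch_a), set(sketch_b)
    # The k smallest values of the merged sketches form a bottom-k sketch of A ∪ B.
    union_sketch = sorted(set_a | set_b)[:k]
    if not union_sketch:
        return 0.0
    shared = sum(1 for h in union_sketch if h in set_a and h in set_b)
    jaccard = shared / len(union_sketch)
    # |A ∪ B| = (|A| + |B|) / (1 + J), hence |A ∩ B| = J * (|A| + |B|) / (1 + J).
    return jaccard * (size_a + size_b) / (1.0 + jaccard)
```

A larger k lowers the estimation error at the cost of larger sketches; when both tid-sets hold fewer than k elements the sketches contain every hash value and the estimate is exact up to hash collisions.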
As a major type of continuous spatial queries, the moving k nearest neighbor (kNN) query has been studied extensively. However, most existing studies have focused on only the query efficiency. In this paper, we consider further the usability of the query results, in particular the diversification of the returned data points. We thereby formulate a new type of query named the moving k diversified nearest neighbor query (MkDNN). This type of query continuously reports the k diversified nearest neighbors while the query object is moving. Here, the degree of diversity of the kNN set is defined on the distance between the objects in the kNN set. Computing the k diversified nearest neighbors is an NP-hard problem. We propose an algorithm to maintain incrementally the k diversified nearest neighbors to reduce the query processing costs. We further propose two approximate algorithms to obtain even higher query efficiency with precision bounds. We verify the effectiveness and efficiency of the proposed algorithms both theoretically and empirically. The results confirm the superiority of the proposed algorithms over the baseline algorithm.
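As an illustration only, the sketch below shows one simple way to trade closeness against diversity. It assumes, as one possible reading of "diversity defined on the distance between the objects", that any two returned points must be at least `min_sep` apart; that threshold, the greedy strategy, and the function name are all assumptions for the example and are not the MkDNN algorithms of the paper. Because the exact problem is NP-hard, the greedy pass is a heuristic and may return fewer than k points.

```python
import math


def greedy_diversified_knn(query, points, k, min_sep):
    """Pick up to k points, nearest to the query first, keeping pairwise distance >= min_sep."""
    chosen = []
    # Visit candidates in order of increasing distance to the query point.
    for candidate in sorted(points, key=lambda p: math.dist(query, p)):
        # Accept the candidate only if it is far enough from every chosen point.
        if all(math.dist(candidate, c) >= min_sep for c in chosen):
            chosen.append(candidate)
            if len(chosen) == k:
                break
    return chosen
```

For a moving query, recomputing this from scratch at every position would be wasteful, which is the cost that an incremental maintenance scheme is meant to avoid.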
Gaussian kernel support vector machine recursive feature elimination (GKSVM-RFE) is a method for nonlinear feature ranking. However, GKSVM-RFE suffers from high computational complexity, which hinders its applications. This paper investigates the issue of computational complexity in GKSVM-RFE and proposes two fast versions of GKSVM-RFE, called fast GKSVM-RFE (FGKSVM-RFE), to speed up the procedure of recursive feature elimination. For this purpose, we design two kinds of ranking scores based on first-order and second-order approximate schemes by introducing approximate Gaussian kernels. In each iteration, FGKSVM-RFE quickly calculates approximate ranking scores according to the approximate schemes and ranks features based on these scores. Experimental results reveal that our proposed methods perform feature ranking faster than GKSVM-RFE and achieve performance comparable to GKSVM-RFE.
This paper considers wireless networks where communication links are unstable and link interference makes it challenging to design high-performance scheduling algorithms. Wireless links are time-varying and are modeled by Markov stochastic processes. The problem of designing an optimal link scheduling algorithm to maximize the expected reliability of the network is first formulated as a Markov Decision Process. The optimal solution can be obtained by the finite backward induction algorithm. However, its time complexity is very high. Thus, we develop an approximate link scheduling algorithm with an approximation ratio of $2(N-1)(r_M\Delta - r_m\delta)$, where $N$ is the number of decision epochs, $r_M$ is the maximum link reliability, $r_m$ is the minimum link reliability, $\Delta$ is the number of links in the largest maximal independent set, and $\delta$ is the number of links in the smallest maximal independent set. Simulations are conducted in different scenarios under different network topologies.
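Finite backward induction for a finite-horizon Markov Decision Process can be sketched as below. This is a generic textbook version on a toy single-link model, not the paper's scheduling formulation: the two link states, the two actions, and every probability and reward value are made-up assumptions for illustration, and interference between links is not modeled.

```python
def backward_induction(states, actions, transition, reward, horizon):
    """Return (value at the first epoch, per-epoch policy) for a finite-horizon MDP."""
    value = {s: 0.0 for s in states}  # terminal value after the last epoch
    policy = []
    for _ in range(horizon):  # sweep from the last decision epoch back to the first
        new_value, decision = {}, {}
        for s in states:
            best_action, best_q = None, float("-inf")
            for a in actions:
                # Immediate reward plus expected value of the successor state.
                q = reward[(s, a)] + sum(
                    p * value[s2] for s2, p in transition[(s, a)].items()
                )
                if q > best_q:
                    best_action, best_q = a, q
            new_value[s], decision[s] = best_q, best_action
        value = new_value
        policy.insert(0, decision)  # policy[t] is the decision rule at epoch t
    return value, policy


# Toy model (all numbers are assumptions): one link that is "good" or "bad".
STATES = ["good", "bad"]
ACTIONS = ["send", "idle"]
TRANSITION = {
    ("good", "send"): {"good": 0.8, "bad": 0.2},
    ("good", "idle"): {"good": 0.8, "bad": 0.2},
    ("bad", "send"): {"good": 0.3, "bad": 0.7},
    ("bad", "idle"): {"good": 0.3, "bad": 0.7},
}
REWARD = {
    ("good", "send"): 0.9,  # reliability when sending on a good link
    ("bad", "send"): 0.2,   # reliability when sending on a bad link
    ("good", "idle"): 0.0,
    ("bad", "idle"): 0.0,
}

value, policy = backward_induction(STATES, ACTIONS, TRANSITION, REWARD, horizon=3)
```

Each sweep costs O(|S|^2 |A|) operations, and in a network the state space is the product of the per-link states, which is why the exact solution becomes expensive as the number of links grows.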
Scheduling parallel tasks in a multi-cluster grid can be seen as two interdependent problems: cluster allocation and scheduling parallel tasks on the allocated cluster. In this paper both rigid and moldable parallel tasks are considered. We propose a theoretical model of utility-oriented parallel task scheduling in a multi-cluster grid with advance reservations. On the basis of the model we present an approximation algorithm, a repair-strategy-based genetic algorithm, and the greedy heuristics MaxMax, T-Sufferage and R-Sufferage to solve the two interdependent problems. We compare the performance of these algorithms in terms of utility optimality and running time. Simulation results show that, on average, the (1+alpha)-approximation algorithm achieves the best trade-off between utility optimality and running time. The genetic algorithm can achieve better utility than the greedy heuristics and the approximation algorithm, but at a high time cost. The greedy heuristics do not perform equally well when adapted to different utility functions, while the approximation algorithm shows intrinsically stable performance.
The behavior of the welding pool plays an important role in determining the quality of the weld, and the surface behavior of the welding pool contains important information that can be used as feedback to adjust welding parameters. In order to study the dynamic characteristics of the molten pool surface in the TIG welding process with filler wire, a grid-structure laser measurement platform, based on the principle of surface reflection, was designed to observe the molten pool surface in this work. A CCD was used to record the image on the projection screen. A new three-dimensional reconstruction algorithm was proposed for calculating the welding pool surface. This algorithm analyzes the image captured by the CCD to restore the three-dimensional topography of the fixed-point wire-filled TIG welding pool, so as to obtain the evolution of the three-dimensional topography during the welding process. The difference between the obtained weld pool height and the experimental results is very small.
Chemical Reaction Optimization (CRO) is a recently established metaheuristic for optimization, inspired by the nature of chemical reactions. A chemical reaction is a natural process of transforming unstable substances into stable ones. From a microscopic view, a chemical reaction starts with some unstable molecules with excessive energy. The molecules interact with each other through a sequence of elementary reactions. At the end, they are converted to molecules with the minimum energy needed to support their existence. This property is embedded in CRO to solve optimization problems. CRO can be applied to tackle problems in both the discrete and continuous domains. We have successfully exploited CRO to solve a broad range of engineering problems, including the quadratic assignment problem, neural network training, and multimodal continuous problems. The simulation results demonstrate that CRO has superior performance when compared with other existing optimization algorithms. This tutorial aims to assist readers in implementing CRO to solve their problems. It also serves as a technical overview of the current development of CRO and provides potential future research directions.