Clustering by fast search and find of density peaks (CFSFDP) is a state-of-the-art density-based clustering algorithm that can effectively find clusters with arbitrary shapes. However, it requires calculating the distances between all pairs of points in a data set to determine the density and separation of each point. Consequently, its computational cost is extremely high for large-scale data sets. In this study, we investigate the application of the k-means algorithm, which is a fast clustering technique, to enhance the scalability of the CFSFDP algorithm while preserving its clustering results as far as possible. Toward this end, we propose two strategies. First, based on concept approximation, an acceleration algorithm (CFSFDP+A) involving fewer distance calculations is proposed to obtain the same clustering results as those of the original algorithm. Second, to further expand the scalability of the original algorithm, an approximate algorithm (CFSFDP+DE) based on exemplar clustering is proposed to rapidly approximate the clustering results of the original algorithm. Finally, experiments are conducted to illustrate the effectiveness and scalability of the proposed algorithms on several synthetic and real data sets. (C) 2017 Elsevier Ltd. All rights reserved.
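To make the cost mentioned in this abstract concrete, the following is a minimal sketch of the two per-point quantities used by the original density-peaks method: a local density (here, the count of neighbours closer than a cutoff) and a separation (the distance to the nearest point of higher density). The function name `density_peaks`, the cutoff parameter `d_c`, and the toy data are assumptions made for illustration; this is the quadratic baseline, not the CFSFDP+A or CFSFDP+DE algorithms proposed in the paper.

```python
import math


def density_peaks(points, d_c):
    """Return (rho, delta) per point: local density and separation.

    Hypothetical helper for illustration only; d_c is an assumed cutoff distance.
    """
    n = len(points)
    # Full pairwise distance matrix: the O(n^2) step that limits scalability.
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        # Distances to every point with strictly higher density.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point of maximal density gets its largest distance to any point.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Toy data: a three-point group near the origin and a two-point group far away.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
rho, delta = density_peaks(points, d_c=0.5)
```

Points that combine a high density with a large separation are the candidates for cluster centres; every other point would then be assigned to the cluster of its nearest higher-density neighbour.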
In wireless powered communication networks (WPCNs), the harvested energy varies greatly among user nodes (UNs), resulting in throughput unfairness. Since the harvested energy is limited, each UN must strategically allocate the energy used for forwarding the other nodes' information and for transmitting its own information, which further aggravates the global unfairness in terms of throughput. In this paper, we leverage user cooperation in multi-hop transmission to improve the throughput fairness. We formulate the fairness problem as the max-min throughput with resource allocation, which is NP-hard. We design an approximate algorithm to address this problem. The theoretical proof and the simulation results both show that the proposed algorithm provides tight upper and lower bounds for the optimal solution. Compared with the benchmark methods, our proposed method significantly enhances the throughput fairness for WPCNs.
The paper analyzes the influence exerted by the mutual relations of deadline intervals on the behavior of the optimal solution values for random Sequencing Jobs with Deadlines (SJD) problems. An asymptotically sub-optimal algorithm is proposed. It is assumed that the problem coefficients are realizations of independent uniformly distributed random variables and that deadlines are deterministic. The results presented in the paper significantly extend knowledge of the behavior of the optimal solutions to the SJD problem in the asymptotic case. (C) 2017 Elsevier B.V. All rights reserved.
A new formulation of the reverse bin-packing problem is suggested. One distinctive feature of the new formulation is that it takes into account a decision-maker's preferences for a set of objects that are evaluated by multiple quality criteria. Aspects of this problem that relate to the theory of multiple criteria decision making are discussed. The known methods for solving the classic and the reverse bin-packing problems (the multiple knapsack problem) are reviewed.
Machine learning techniques for Markov random fields are fundamental in various fields involving pattern recognition, image processing, sparse modeling, and earth science, and the Boltzmann machine is one of the most important Markov random field models. However, the inference and learning problems in the Boltzmann machine are NP-hard. The investigation of an effective learning algorithm for the Boltzmann machine is one of the most important challenges in the field of statistical machine learning. In this paper, we study Boltzmann machine learning based on the (first-order) spatial Monte Carlo integration method, referred to as the 1-SMCI learning method, which was proposed in the author's previous paper. In the first part of this paper, we compare the method with the maximum pseudo-likelihood estimation (MPLE) method using both theoretical and numerical approaches, and show that the 1-SMCI learning method is more effective than MPLE. In the latter part, we compare the 1-SMCI learning method with other effective methods, ratio matching and minimum probability flow, in a numerical experiment, and show that the 1-SMCI learning method outperforms them.
ISBN (print): 9781509053360
An increasing number of high-performance networks are built over the existing IP network infrastructure to provision dedicated channels for big data transfer. The links in these overlay networks correspond to underlying paths and may share lower-level link segments. We consider a model of overlay networks that incorporates correlated link capacities and linear capacity constraints (LCCs) to formulate such shared bottleneck components. The overlay links are typically shared by multiple users through advance reservations, resulting in bandwidth availability that varies over future time. Therefore, efficient bandwidth scheduling algorithms are needed to improve network resource utilization while meeting the users' transport requirements. We investigate two advance scheduling problems in overlay networks with LCCs, Fixed-Bandwidth Path and Varying-Bandwidth Path, with the objective of minimizing the data transfer end time for a given data size. We prove that both problems are NP-complete and non-approximable, and propose heuristic algorithms using a gradual relaxation procedure on the maximum number of links from each LCC allowed for path computation. The performance superiority of these heuristics is verified by extensive simulation results in comparison with optimal and greedy strategies.
ISBN (print): 9781509064151; 9781509064144
Foreground detection in complex scenarios is a challenging task. In this work, we propose to detect foreground by incrementally learning a cross-covariance based subspace. In our method, we first introduce the cross-covariance based two-dimensional principal component analysis (2DPCA) algorithm into the foreground detection field for better background ***, we extend the conventional cross-covariance based 2DPCA algorithm into an incremental one, which helps to model the background in an adaptive way. Moreover, we consider the sparse and continuous characteristics of the foreground, and formulate them as a fused lasso problem. By adding the fused lasso regularization to the proposed subspace learning process, we integrate background recovery and foreground estimation into a single optimization framework. Finally, we design an efficient approximate algorithm that solves the optimization problem effectively. We compare our method with state-of-the-art methods on multiple challenging video sequences. The experimental results demonstrate the effectiveness and the advantages of the proposed method.
As a major type of continuous spatial query, the moving k nearest neighbor (kNN) query has been studied extensively. However, most existing studies have focused only on query efficiency. In this paper, we further consider the usability of the query results, in particular the diversification of the returned data points. We thereby formulate a new type of query named the moving k diversified nearest neighbor query (MkDNN). This type of query continuously reports the k diversified nearest neighbors while the query object is moving. Here, the degree of diversity of the kNN set is defined on the distance between the objects in the kNN set. Computing the k diversified nearest neighbors is an NP-hard problem. We propose an algorithm that incrementally maintains the k diversified nearest neighbors to reduce the query processing costs. We further propose two approximate algorithms to obtain even higher query efficiency with precision bounds. We verify the effectiveness and efficiency of the proposed algorithms both theoretically and empirically. The results confirm the superiority of the proposed algorithms over the baseline algorithm.
This paper considers wireless networks where communication links are unstable and link interference makes it challenging to design high-performance scheduling algorithms. Wireless links are time varying and are modeled by Markov stochastic processes. The problem of designing an optimal link scheduling algorithm to maximize the expected reliability of the network is first formulated as a Markov decision process. The optimal solution can be obtained by the finite backward induction algorithm. However, the time complexity is very high. Thus, we develop an approximate link scheduling algorithm with an approximation ratio of $2(N - 1)(r_M \Delta - r_m \delta)$, where $N$ is the number of decision epochs, $r_M$ is the maximum link reliability, $r_m$ is the minimum link reliability, $\Delta$ is the number of links in the largest maximal independent set, and $\delta$ is the number of links in the smallest maximal independent set. Simulations are conducted in different scenarios under different network topologies.
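The finite backward induction mentioned in this abstract can be sketched generically as follows. The abstract does not give the paper's state space, actions, or reward model, so the two-state link ("good"/"bad"), the "send"/"idle" actions, and all probabilities and rewards below are hypothetical values chosen only to show the recursion; the function name `backward_induction` is likewise an assumption.

```python
def backward_induction(n_epochs, states, actions, transition, reward):
    """Finite-horizon backward induction for a Markov decision process.

    transition(s, a) returns {next_state: probability}; reward(s, a) returns a float.
    Returns (value, policy) with value[t][s] and policy[t][s].
    """
    # Terminal values are zero; value[t] holds the optimal expected reward-to-go.
    value = [{s: 0.0 for s in states} for _ in range(n_epochs + 1)]
    policy = [{} for _ in range(n_epochs)]
    for t in range(n_epochs - 1, -1, -1):  # sweep from the last epoch to the first
        for s in states:
            best_a, best_q = None, float("-inf")
            for a in actions:
                expected = sum(p * value[t + 1][s2] for s2, p in transition(s, a).items())
                q = reward(s, a) + expected
                if q > best_q:
                    best_a, best_q = a, q
            value[t][s] = best_q
            policy[t][s] = best_a
    return value, policy


# Hypothetical toy model: one link whose state evolves independently of the action.
STATES = ("good", "bad")
ACTIONS = ("send", "idle")


def transition(state, action):
    return {"good": 0.9, "bad": 0.1} if state == "good" else {"good": 0.5, "bad": 0.5}


def reward(state, action):
    if action == "idle":
        return 0.0
    return 1.0 if state == "good" else -0.1  # sending on a bad link is penalised


value, policy = backward_induction(2, STATES, ACTIONS, transition, reward)
```

Each sweep touches every state and action at every epoch, and in a network the state space grows with the number of links, which is consistent with the abstract's remark that the exact method has very high time complexity.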
ISBN (print): 9781450335317
SimRank is a similarity measure for graph nodes that has numerous applications in practice. Scalable SimRank computation has been the subject of extensive research for more than a decade, and yet none of the existing solutions can efficiently derive SimRank scores on large graphs with provable accuracy guarantees. In particular, the state-of-the-art solution requires up to a few seconds to compute a SimRank score in million-node graphs, and does not offer any worst-case assurance in terms of the query error. This paper presents SLING, an efficient index structure for SimRank computation. SLING guarantees that each SimRank score returned has at most $\epsilon$ additive error, and it answers any single-pair and single-source SimRank queries in $O(1/\epsilon)$ and $O(n/\epsilon)$ time, respectively. These time complexities are near-optimal, and are significantly better than the asymptotic bounds of the most recent approach. Furthermore, SLING requires only $O(n/\epsilon)$ space (which is also near-optimal in an asymptotic sense) and $O(m/\epsilon + n \log(n/\delta)/\epsilon^2)$ pre-computation time, where $\delta$ is the failure probability of the preprocessing algorithm. We experimentally evaluate SLING with a variety of real-world graphs with up to several million nodes. Our results demonstrate that SLING is up to 10000 times (resp. 110 times) faster than competing methods for single-pair (resp. single-source) SimRank queries, at the cost of higher space overheads.
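For readers unfamiliar with the measure itself, the sketch below computes SimRank by the naive fixed-point iteration over all node pairs, using the standard recursive definition (two distinct nodes are similar to the extent that their in-neighbours are similar, scaled by a decay factor). This is not SLING and carries none of its guarantees; the function name `simrank`, the decay value `c=0.6`, the iteration count, and the toy graph are assumptions for illustration.

```python
def simrank(in_neighbors, c=0.6, iterations=10):
    """Naive all-pairs SimRank by repeated application of the recursive definition.

    in_neighbors maps every node to the list of nodes that point to it.
    """
    nodes = list(in_neighbors)
    # Start from the identity: each node is fully similar to itself only.
    sim = {u: {v: 1.0 if u == v else 0.0 for v in nodes} for u in nodes}
    for _ in range(iterations):
        new = {u: {} for u in nodes}
        for u in nodes:
            for v in nodes:
                if u == v:
                    new[u][v] = 1.0
                    continue
                iu, iv = in_neighbors[u], in_neighbors[v]
                if not iu or not iv:
                    # A node without in-neighbours is similar to no other node.
                    new[u][v] = 0.0
                    continue
                total = sum(sim[a][b] for a in iu for b in iv)
                new[u][v] = c * total / (len(iu) * len(iv))
        sim = new
    return sim


# Toy graph with edges a -> b and a -> c, so b and c share their only in-neighbour.
graph = {"a": [], "b": ["a"], "c": ["a"]}
scores = simrank(graph)
```

Every iteration visits all node pairs and all pairs of their in-neighbours, which is why this direct approach does not scale to the million-node graphs discussed in the abstract.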