ISBN (Print): 9781728150956
To address the shortcomings of the traditional K-means clustering algorithm in big data processing, namely its performance and the determination of initial cluster centers, this paper proposes an improved K-means clustering algorithm based on the Hadoop platform. The algorithm uses the canopy algorithm together with cosine similarity to optimize the selection of the initial cluster centers for K-means, and parallelizes the computation with a parallel computing framework so that it scales to big data. Experimental results show that the improved Hadoop-based K-means algorithm achieves a better clustering effect and offers good speedup and scalability when processing large volumes of data.
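The abstract names the ingredients (canopy pre-clustering plus cosine similarity) without spelling them out. Below is a minimal serial sketch of canopy-based seeding under cosine similarity; the threshold value and function names are hypothetical, and the Hadoop/MapReduce parallelization the paper relies on is deliberately omitted.

```python
import numpy as np

def cosine_sim(center, batch):
    # Cosine similarity between one vector and each row of a batch.
    num = batch @ center
    den = np.linalg.norm(batch, axis=1) * np.linalg.norm(center) + 1e-12
    return num / den

def canopy_centers(X, s_tight=0.8, seed=0):
    """Pick canopy centers under cosine similarity.

    s_tight is the similarity above which a point is considered
    redundant and dropped from the candidate pool (with a similarity,
    'closer' means a HIGHER value, the reverse of a distance). The full
    canopy algorithm also keeps a looser threshold to form overlapping
    canopies; that is omitted here for brevity.
    """
    rng = np.random.default_rng(seed)
    pool = list(range(len(X)))
    centers = []
    while pool:
        i = pool[rng.integers(len(pool))]   # random candidate center
        centers.append(X[i])
        sims = cosine_sim(X[i], X[pool])
        # Remove everything tightly similar to the new center,
        # including the center itself (self-similarity is 1.0).
        pool = [p for p, s in zip(pool, sims) if s < s_tight]
    return np.asarray(centers)

# k is then taken as len(centers), and the centers replace the random
# initial centroids of a standard (here: Hadoop-parallelized) K-means.
```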
The flow method, together with an algorithm for self-gravity computations, is applied to the three-dimensional modeling of astrophysical flows. This method is based on the difference approximations of conservation la...
ISBN (Print): 9781538680605
As the exact computation of the k-terminal reliability is an NP-complete problem, runtime and memory requirements grow exponentially with the input size. Shared-memory parallel algorithms have been developed to reduce runtime; however, even a relatively large amount of memory can be exhausted within a short period of time. A message-passing based algorithm is proposed to circumvent the memory limitation of shared-memory implementations; it is the first message-passing algorithm for the k-terminal problem. The new algorithm is designed for the currently most efficient BDD-based method. New data structures, such as a distributed BDD and a distributed hash table, lead to good speedup and load-balanced task distribution. The size of computable inputs is now limited only by the aggregate memory of the available cores. The two-terminal reliability of a 17-node complete network was computed on 1024 cores of the SuperMUC within 7 minutes, using 1.28 terabytes of memory and producing more than 6 billion BDD nodes.
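For scale: even the definition of the quantity being computed is exponential. The sketch below is not the paper's BDD/message-passing algorithm, only a brute-force serial baseline for the two-terminal special case (names and structure are my own); it makes plain why enumerating 2^m edge states is hopeless and a compact BDD representation is needed.

```python
from itertools import product

def two_terminal_reliability(n, edges, p, s, t):
    """Exact two-terminal reliability by exhaustive state enumeration:
    sum, over all 2^m up/down edge configurations, the probability of
    those configurations in which s can still reach t. Exponential in
    the number of edges m -- a correctness baseline only.

    edges: list of (u, v) pairs; p: per-edge operational probability.
    """
    total = 0.0
    for state in product((0, 1), repeat=len(edges)):
        # Probability of this exact configuration.
        prob = 1.0
        for up in state:
            prob *= p if up else (1.0 - p)
        # Union-find connectivity over the operational edges.
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        for (u, v), up in zip(edges, state):
            if up:
                parent[find(u)] = find(v)
        if find(s) == find(t):
            total += prob
    return total

# Triangle with edge reliability 0.9: 0.9 + 0.1 * 0.81 = 0.981
print(two_terminal_reliability(3, [(0, 1), (1, 2), (0, 2)], 0.9, 0, 2))
```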
This study analyzes and implements efficient algorithms for simulating micro- and nanoparticle transport in porous media, coupled with a Darcy-Forchheimer fluid model modified to include electromagnetic effects. The schemes developed were implemented on a parallel infrastructure for benchmark problems, using a flexible algorithm that is efficient, robust, and stable. These improvements in the reliability and efficiency of simulating nanoparticle transport in porous media contribute to an effective method for counteracting contaminants in groundwater and, ultimately, to increasing the availability of clean drinking water.
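The abstract does not write the flow model out. For orientation only, the standard Darcy-Forchheimer momentum relation (before the electromagnetic modification the paper makes, which is not specified here) is commonly stated as

\nabla p = -\frac{\mu}{K}\,\mathbf{u} - \frac{c_F\,\rho}{\sqrt{K}}\,\lvert\mathbf{u}\rvert\,\mathbf{u},

where \mathbf{u} is the Darcy velocity, p the pressure, \mu the dynamic viscosity, \rho the fluid density, K the permeability, and c_F the Forchheimer (inertial) coefficient.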
ISBN (Digital): 9781728126166
ISBN (Print): 9781728126173
Sequence alignment is a fundamental task in analyzing large biological datasets. For the problem of aligning massive numbers of short reads, a parallel short-read alignment method is developed on a multi-core cluster, based on dynamic programming, the divide-and-conquer principle, and a FUSE kernel module; the method allows the optimal number of inserted gaps to depend on the species and the sequence length. Experimental results on real and synthetic data show that the proposed parallel method achieves good speedup with the same alignment accuracy as the sequential method. Compared with an existing parallel alignment method, it markedly reduces the time spent partitioning the reference genome and the read files, accelerating large-scale short-read alignment.
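The core of any such aligner is the dynamic-programming recurrence; the sketch below is plain Needleman-Wunsch global alignment with a linear gap penalty and illustrative scoring parameters. The paper's divide-and-conquer partitioning, FUSE-based file splitting, species-dependent gap bound, and multi-core parallelization are all layered on top of (and not shown in) this core.

```python
def needleman_wunsch(a, b, match=2, mismatch=-1, gap=-2):
    """Global alignment score of strings a and b by dynamic programming.
    dp[i][j] = best score aligning the prefixes a[:i] and b[:j]."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap            # a[:i] against an all-gap prefix
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + sub,  # (mis)match
                           dp[i - 1][j] + gap,      # gap in b
                           dp[i][j - 1] + gap)      # gap in a
    return dp[n][m]

print(needleman_wunsch("GATTACA", "GCATGCU"))
```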
ISBN (Print): 9783319780542; 9783319780535
The row-wise and column-wise prefix-sum computation of a matrix has many applications in image processing, such as computing the summed-area table and the Euclidean distance map. It is known that the prefix-sums of a 1-dimensional array can be computed efficiently on the GPU; hence, the row-wise prefix-sums of a matrix can also be computed efficiently by executing this prefix-sum algorithm for every row in parallel. However, the same approach does not work well for the column-wise prefix-sums, because it performs inefficient strided access to global memory. The main contribution of this paper is an almost optimal column-wise prefix-sum algorithm on the GPU. Since all elements of the input matrix must be read and the resulting prefix-sums must be written, computing the column-wise prefix-sums cannot be faster than simple matrix duplication in the global memory of the GPU. Quite surprisingly, experimental results on an NVIDIA TITAN X show that the column-wise prefix-sum algorithm runs only 2-6% slower than matrix duplication, and is therefore almost optimal.
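The operation itself is one line on the CPU; the paper's difficulty is purely about GPU memory layout, as the comments below note. This is a CPU-side NumPy illustration of the operation, not the GPU kernel.

```python
import numpy as np

# Column-wise prefix sums: out[i][j] = a[0][j] + a[1][j] + ... + a[i][j].
# Naively on a GPU, one thread per column walking down row-major storage
# makes consecutive threads touch addresses a full row apart -- the
# strided, uncoalesced global-memory access the paper identifies. The
# usual remedy (and the spirit of the paper's algorithm) is to stage
# tiles through fast on-chip memory so global reads/writes stay coalesced.

a = np.arange(12, dtype=np.int64).reshape(3, 4)
col_psum = np.cumsum(a, axis=0)   # down each column (the hard case on GPU)
row_psum = np.cumsum(a, axis=1)   # along each row (the easy case)
print(col_psum)
```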
ISBN (Print): 9781538643686
The future of main memory appears to lie in new technologies that provide strong capacity-to-performance ratios but whose write operations are much more expensive than reads in terms of latency, bandwidth, and energy. Motivated by this trend, we propose sequential and parallel algorithms that solve graph connectivity problems using significantly fewer writes than conventional algorithms. Our primary algorithmic tool is the construction of an o(n)-sized implicit decomposition of a bounded-degree graph G on n nodes which, combined with read-only access to G, enables fast answers to connectivity and biconnectivity queries on G. The construction breaks the linear-write "barrier", resulting in costs that are asymptotically lower than those of conventional algorithms while adding only a modest cost to query time. For general non-sparse graphs on m edges, we also provide the first parallel algorithms for connectivity and biconnectivity that require o(m) writes and O(m) operations. These algorithms provide insight into how applications can efficiently process large graphs on systems with read-write asymmetry.
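To see where the writes the paper avoids actually come from, here is a conventional union-find connectivity structure instrumented to count its memory writes; this is an illustrative baseline of my own, not the paper's o(n)-write decomposition.

```python
class WriteCountingUnionFind:
    """Standard union-find for connectivity queries, counting writes.
    Building it performs Theta(n + m) writes (parent initialization,
    unions, path compression); the paper's implicit decomposition
    answers the same queries against a read-only graph with o(n) writes.
    """
    def __init__(self, n):
        self.parent = list(range(n))
        self.writes = n            # initializing parent[] costs n writes

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:      # path compression: writes
            self.parent[x], x = root, self.parent[x]
            self.writes += 1
        return root

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru != rv:
            self.parent[ru] = rv           # one more write per merge
            self.writes += 1

uf = WriteCountingUnionFind(5)
for u, v in [(0, 1), (1, 2), (3, 4)]:
    uf.union(u, v)
print(uf.find(0) == uf.find(2), uf.writes)   # True, with writes counted
```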
The volume of data is clearly increasing nowadays because of the requirement to store more information. Extracting specific data from such a large database is a very difficult task, including text proces...
In this paper, we consider the unordered pseudo-tree matching problem: given two unordered labeled trees P and T, find all occurrences of P in T via many-to-one matchings that preserve n...
ISBN (Digital): 9781728143286
ISBN (Print): 9781728143293
In this paper, we consider top-k route search with user preferences. Specifically, given a set of POIs, the problem is to find k different routes from a source POI to a target POI such that each route satisfies a cost constraint and the POIs it covers optimally satisfy a user-defined weighted feature preference. The problem has been shown to be NP-hard; the challenge is how to select from a large number of POIs and construct an optimal route, especially when the set of candidate POIs is large. To support top-k route search on a large dataset or under a looser budget constraint, we propose a parallel method on a single machine to speed up the search, together with effective pruning strategies that reduce the search space. Experimental results on real-world datasets show that the proposed parallel method is highly efficient, up to about 10 times faster than the existing serial algorithm.
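To pin down the problem (and why pruning and parallelism are needed), here is a brute-force serial baseline of my own devising: it enumerates every simple route from source to target within the budget and keeps the k best by weighted feature coverage. It is exponential in the number of POIs, which is exactly the blow-up the paper's method is built to tame.

```python
import heapq
from itertools import count

def top_k_routes(pois, feats, cost, prefs, src, dst, budget, k=3):
    """feats[p]: set of features at POI p; cost[u, v]: travel cost;
    prefs: {feature: weight} given by the user."""
    def gain(route):
        covered = set().union(*(feats[p] for p in route))
        return sum(prefs.get(f, 0.0) for f in covered)

    best, tie = [], count()        # min-heap of (score, tiebreak, route)

    def dfs(route, spent):
        if route[-1] == dst:
            heapq.heappush(best, (gain(route), next(tie), route))
            if len(best) > k:
                heapq.heappop(best)          # evict the current worst
            return
        for p in pois:
            if p in route:
                continue
            step = cost[route[-1], p]
            if spent + step <= budget:       # budget constraint prunes
                dfs(route + (p,), spent + step)

    dfs((src,), 0.0)
    return [(s, r) for s, _, r in sorted(best, reverse=True)]

# Tiny example: 4 POIs on a line, unit distances, user only wants "food".
pois = [0, 1, 2, 3]
feats = {0: {"start"}, 1: {"food"}, 2: {"museum"}, 3: {"end"}}
cost = {(u, v): abs(u - v) for u in pois for v in pois}
print(top_k_routes(pois, feats, cost, {"food": 1.0}, 0, 3, budget=5, k=2))
```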