This paper decomposes the algorithmic parameters that affect the accuracy and parallel run times of mean shift segmentation. Following Comaniciu and Meer, rather than performing calculations in the feature space of the image, the joint spatial-range domain is represented by the image space, with feature space information associated with each point. We report parallel speedup and segmentation accuracy using a standardised segmentation dataset and the Probabilistic Rand index (PRI) accuracy measure. Changes to the algorithmic parameters are analysed and a sweet spot between PRI and run time is found. Using a range window radius of 20, spatial window radius of 10 and threshold of 50, the PRI is improved by 0.17, an increase of 34%, which is comparable to the state of the art. Mean shift clustering run time is reduced by 97% with parallelism, a speedup of 32 on a 64-core CPU.
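The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes the usual definitions (speedup = serial time / parallel time, efficiency = speedup / cores). The function names are illustrative and do not come from the paper, and the baseline PRI is an inference from the rounded figures, not a value the abstract states.

```python
def runtime_reduction(speedup: float) -> float:
    """Fraction of the serial run time eliminated at a given speedup."""
    return 1.0 - 1.0 / speedup


def parallel_efficiency(speedup: float, cores: int) -> float:
    """Speedup per core; 1.0 would be ideal linear scaling."""
    return speedup / cores


def baseline_from_gain(absolute_gain: float, relative_gain: float) -> float:
    """Baseline value implied by an absolute gain and its relative size."""
    return absolute_gain / relative_gain


# Figures taken from the abstract: speedup 32, 64 cores, PRI +0.17 (+34%).
reduction = runtime_reduction(32)               # 0.96875, i.e. about 97%
efficiency = parallel_efficiency(32, 64)        # 0.5
baseline_pri = baseline_from_gain(0.17, 0.34)   # about 0.5 (inferred)

print(f"run time reduction: {reduction:.1%}")
print(f"parallel efficiency: {efficiency:.0%}")
print(f"implied baseline PRI: {baseline_pri:.2f}")
```

A speedup of 32 therefore corresponds to the reported 97% reduction in run time, at a parallel efficiency of 50% on 64 cores.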
ISBN (digital): 9783319654829
ISBN (print): 9783319654829; 9783319654812
In this manuscript, we present an optimized and parallel version of our previous work IMSAME, an exhaustive gapped aligner for the pairwise and accurate comparison of metagenomes. Parallelization strategies are applied to take advantage of modern multiprocessor architectures. In addition, sequential optimizations in CPU time and memory consumption are provided. These algorithmic and computational enhancements enable IMSAME to calculate near-optimal alignments, which are used to directly assess similarity between metagenomes without requiring reference databases. We show that the overall efficiency of the parallel implementation exceeds 80% while retaining scalability as the number of parallel cores increases. Moreover, we show that the sequential optimizations yield up to 8x speedup for scenarios with larger data.
The High Efficiency Video Coding (HEVC) standard, as the newest generation video coding standard issued in 2013, significantly improves compression performance relative to existing standards in about 50% bit-rate redu...
ISBN (print): 9781509060580
This paper describes a demonstrative software simulator, "E14", which helps in studying the essentials of parallel computation on an ordinary PC. It contains five virtual processors with an identical instruction set, one of which controls the other four. The simulator has several mechanisms for data exchange between processors, so it can be used to study both shared-memory and distributed-memory architectures. The parallel architecture of "E14" naturally extends the classical single-processor one, so the proposed approach makes students' knowledge more systematic. "E14" may also be employed as a platform for estimating the performance of parallel algorithms. The example considered in the paper clearly demonstrates how data exchange between processors may substantially degrade the speedup predicted by Amdahl's law.
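The effect of data exchange on Amdahl's law can be illustrated numerically. The sketch below is not the paper's example: it assumes a simple model in which exchange cost is an extra term, expressed as a fraction of the serial run time, added to the denominator of Amdahl's formula. The parameter values (90% parallel work, four workers as in "E14", 10% exchange overhead) are chosen only for illustration.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Classical Amdahl's law: S = 1 / ((1 - p) + p / n)."""
    serial_part = 1.0 - parallel_fraction
    return 1.0 / (serial_part + parallel_fraction / workers)


def speedup_with_exchange(parallel_fraction: float, workers: int,
                          exchange_fraction: float) -> float:
    """Amdahl's law with an added data-exchange term.

    exchange_fraction is the time spent exchanging data, expressed as a
    fraction of the original serial run time (an assumed, simplified model).
    """
    serial_part = 1.0 - parallel_fraction
    return 1.0 / (serial_part + parallel_fraction / workers + exchange_fraction)


ideal = amdahl_speedup(0.9, 4)                  # 1 / 0.325
degraded = speedup_with_exchange(0.9, 4, 0.1)   # 1 / 0.425

print(f"Amdahl prediction: {ideal:.2f}x")
print(f"with exchange overhead: {degraded:.2f}x")
```

Under these assumed numbers the predicted speedup falls from about 3.08 to about 2.35, which shows how even a modest exchange cost erodes the ideal figure.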
ISBN (print): 9781538637906
The characteristics and behavior of attacks and intruders on computer networks are very complex, and analyzing them requires an expert. In addition, as computer networks advance, the number of attacks and intrusions is also increasing. Expert knowledge loses its value over time and must be updated and made available to the system, so an expert is always needed. In machine learning techniques, knowledge is extracted from the data itself, which diminishes the role of the expert. The various methods used to detect intrusions, such as statistical models, the safe system approach, neural networks, etc., are all weakened by the fact that they use all the features of an information packet travelling through the network for intrusion detection. The huge volume of information and the very large state space are also important issues in intrusion detection. Therefore, automatic identification of new and suspicious patterns in intrusion attempts, using more efficient methods (lower cost and higher performance), is needed more than before. The purpose of this paper is to provide a new method based on intrusion detection systems and their various architectures, aimed at increasing the accuracy of intrusion detection in cloud computing.
ISBN (digital): 9783319654829
ISBN (print): 9783319654829; 9783319654812
In this paper, we propose a method for parallel top-k query processing on GPU(s). We employ a novel partitioning strategy which splits the posting lists according to document ID numbers. Individual GPU threads simultaneously perform top-k query processing within their allocated subsets of posting lists, and the results are then merged to give the final top-k results. We further design a CPU-GPU cooperative query processing method, where a majority of queries involving shorter posting lists are processed on the GPU side. We experiment with AND, OR, WAND, and Block-Max WAND (BMW) queries, with experimental results showing a promising improvement in query throughput, particularly in the case of BMW queries.
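The partition-then-merge idea can be sketched on the CPU. The code below is a sequential illustration, not the authors' GPU implementation: it assumes an OR query scored by summing per-term scores, equal-width document ID ranges, and made-up posting lists. All function and variable names are hypothetical. Because each document falls into exactly one range, its score is complete within its partition, so merging the local top-k lists yields the global top-k.

```python
import heapq
from collections import defaultdict


def partition_by_doc_id(posting_lists, num_partitions, max_doc_id):
    """Split every posting list into contiguous, equal-width doc-ID ranges."""
    # Ceiling of (max_doc_id + 1) / num_partitions.
    width = (max_doc_id + num_partitions) // num_partitions
    parts = [defaultdict(list) for _ in range(num_partitions)]
    for term, postings in posting_lists.items():
        for doc_id, score in postings:
            parts[doc_id // width][term].append((doc_id, score))
    return parts


def local_top_k(partition, k):
    """Top-k (score, doc_id) pairs for an OR query within one partition."""
    totals = defaultdict(float)
    for postings in partition.values():
        for doc_id, score in postings:
            totals[doc_id] += score
    return heapq.nlargest(k, ((s, d) for d, s in totals.items()))


def top_k_or_query(posting_lists, k, num_partitions, max_doc_id):
    """Run local top-k per partition, then merge into the final top-k."""
    candidates = []
    # On a GPU each partition would be processed by its own thread;
    # here the partitions are simply visited one after another.
    for part in partition_by_doc_id(posting_lists, num_partitions, max_doc_id):
        candidates.extend(local_top_k(part, k))
    return [(d, s) for s, d in heapq.nlargest(k, candidates)]


# Made-up posting lists: term -> [(doc_id, score), ...]
example_lists = {
    "gpu": [(0, 1.0), (3, 2.0), (7, 0.5)],
    "query": [(3, 1.5), (5, 3.0), (7, 2.0)],
}
print(top_k_or_query(example_lists, k=2, num_partitions=2, max_doc_id=7))
```

The merge step only has to examine at most k candidates per partition, which keeps its cost small relative to the per-partition scoring.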
In the field of approximate nearest neighbor (ANN) search, few of the existing approaches are tailored for video applications. The Ring Intersection Approximate Nearest Neighbor (RIANN) is the first ANN search algorithm for videos. It achieves real-time performance by performing the ANN search on a sparse grid and interpolating the rest. For some applications, dense ANN search is needed to ensure search accuracy. To achieve dense ANN search in real time, we consider parallel computing as a solution. However, the RIANN algorithm is not suitable for parallel computing because the algorithm itself suffers from poor thread coherency. In this paper, we propose the Sphere Ring Intersection Approximate Nearest Neighbor (SRIANN), which solves the problem of poor thread coherency and improves the accuracy of ANN search compared to the original RIANN method. The experimental results show that the proposed method is the only one able to perform dense ANN search for CIF videos in real time.
ISBN (digital): 9783319654829
ISBN (print): 9783319654829; 9783319654812
Heterogeneous CPU-GPU platforms include resources to benefit from the different kinds of parallelism present in many data mining applications based on evolutionary algorithms that evolve solutions with time-demanding fitness evaluation. This paper describes an evolutionary parallel multi-objective feature selection procedure with subpopulations, using two scheduling alternatives for the evaluation of individuals according to the number of subpopulations. Evolving subpopulations usually provides good diversity properties and avoids premature convergence in evolutionary algorithms. The proposed procedure has been implemented in OpenMP, to dynamically distribute either subpopulations or individuals among devices, and in OpenCL, to evaluate the individuals taking into account the devices' characteristics, providing two parallelism levels in the CPU and up to three levels in GPUs. Different configurations of the proposed procedure have been evaluated and compared with a master-worker approach, considering not only the run time and achieved speedups but also the energy consumption of both scheduling models.
ISBN (digital): 9783319654829
ISBN (print): 9783319654829; 9783319654812
Multicore clusters are widely used to solve combinatorial optimization problems, which require high computing power and a large amount of memory. In this sense, Hash Distributed A* (HDA*) parallelizes A*, a combinatorial optimization algorithm, using the MPI library. HDA* scales well on multicore clusters and on multicore machines. Additionally, there exist several versions of HDA* that were adapted for multicore machines, using the Pthreads library. In this paper, we present Hybrid HDA* (HHDA*), a hybrid parallel search algorithm based on HDA* that combines message-passing (MPI) with shared-memory programming (Pthreads) to better exploit the computing power and memory of multicore clusters. We evaluate the performance and memory consumption of HHDA* on a multicore cluster, using the 15-puzzle as a case study. The results reveal that HHDA* achieves a slightly higher average performance and uses considerably less memory than HDA*. These improvements allowed HHDA* to solve one of the hardest 15-puzzle instances.
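The core idea of HDA*, assigning each search state to a single owning process through a hash function, can be sketched as follows. This is only an illustration of hash-based ownership, not the HHDA* implementation: the choice of CRC32, the tuple encoding of a 15-puzzle board, and the function names are all assumptions made for the example, and no MPI or Pthreads calls are involved.

```python
import zlib


def owner_of(state, num_processes):
    """Map a 15-puzzle state (tuple of 16 tile values, 0 = blank) to a process.

    The same state always hashes to the same owner, so duplicate detection
    can be done locally by that owner. CRC32 is an illustrative choice.
    """
    return zlib.crc32(bytes(state)) % num_processes


def route_successors(successors, num_processes):
    """Group generated states into one outbox per owning process."""
    outboxes = [[] for _ in range(num_processes)]
    for state in successors:
        outboxes[owner_of(state, num_processes)].append(state)
    return outboxes


goal = tuple(range(16))
# Swap the blank (index 0) with its right neighbour to get one successor.
neighbour = (1, 0) + goal[2:]

boxes = route_successors([goal, neighbour, goal], num_processes=4)
for rank, box in enumerate(boxes):
    print(rank, len(box))
```

In a message-passing setting each outbox would be sent to the process with the matching rank, which is where communication cost enters.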
ISBN (digital): 9783319590417
ISBN (print): 9783319590400
This book constitutes the refereed proceedings of the 7th International Conference on E-Technologies, MCETECH 2017, held in Ottawa, ON, Canada, in May 2017. This year's conference drew special attention to the ever-increasing role of the Internet of Things (IoT), and the contributions span a variety of application domains such as e-Commerce, e-Health, e-Learning, and e-Justice, comprising research from models and architectures, methodology proposals, prototype implementations, and empirical validation of theoretical models. The 19 papers presented were carefully reviewed and selected from 48 submissions. They were organized in topical sections named: pervasive computing and smart applications; security, privacy and trust; process modeling and adaptation; data analytics and machine learning; and e-health and e-commerce.