When reserve networks are established over time, there is a risk that sites will be developed in areas planned for future reservation, reducing the effectiveness of reserves. We developed a dynamic reserve design mode...
详细信息
When reserve networks are established over time, there is a risk that sites will be developed in areas planned for future reservation, reducing the effectiveness of reserves. We developed a dynamic reserve design model that maximizes the expected number of species conserved, taking account of the risk of future habitat loss and fragmentation. The model makes use of the union-find algorithm, which is an efficient method for maintaining a list of connected regions in a graph as nodes and edges are inserted. A simple extension of the algorithm allows us to efficiently determine, for each species, when a sequence of site selections results in a reserve in which the species can persist. The extension also allows us to determine when a sequence of deforestation events results in the species becoming non-viable. The dynamic reserve design model is much more effective than commonly used heuristics, particularly when multiple connected sites are required for species persistence. The model also is able to solve much larger problems with greater effectiveness than the only previous dynamic reserve design model that considered site connectivity relationships. The union-find algorithm has much scope for addressing ecological management problems in which dynamic connectivity needs to be considered. (C) 2008 Elsevier B.V All rights reserved.
DBSCAN is one of clustering algorithms which can report arbitrarily-shaped clusters and noises without requiring the number of clusters as a parameter (unlike the other clustering algorithms, k-means, for example). Be...
详细信息
DBSCAN is one of clustering algorithms which can report arbitrarily-shaped clusters and noises without requiring the number of clusters as a parameter (unlike the other clustering algorithms, k-means, for example). Because the running time of DBSCAN has quadratic order of growth, i.e. O(n(2)), research studies on improving its performance have been received a considerable amount of attention for decades. Grid-based DBSCAN is a well-developed algorithm whose complexity is improved to O(nlog n) in 2D space, while requiring Omega(n(4/3)) to solve when dimension >= 3. However, we find that Grid-based DBSCAN suffers from two problems: neighbour explosion and redundancies in merging, which make the algorithms infeasible in high dimensional space. In this paper we first propose a novel algorithm called GDCF which utilizes bitmap indexing to support efficient neighbour grid queries. Second, based on the concept of union-find algorithm we devise a forest-like structure, called cluster forest, to alleviate the redundancies in the merging. Moreover, we find that running the cluster forest in different orders can lead to a different number of merging operations needed to perform in the merging step. We propose to perform the merging step in a uniform random order to optimize the number of merging operations. However, for high-density database, a bottleneck could be occurred, we further propose a low-density-first order to alleviate this bottleneck. The experiments resulted on both real-world and synthetic datasets demonstrate that the proposed algorithm outperforms the state-of-the-art exact/approximate DBSCAN and suggests a good scalability. (C) 2019 Elsevier Ltd. All rights reserved.
Numerical Manifold Method (NMM) has gained widespread application in engineering practice due to its capacity to effectively address both continuous and discontinuous problems in a unified framework. With the advancem...
详细信息
Numerical Manifold Method (NMM) has gained widespread application in engineering practice due to its capacity to effectively address both continuous and discontinuous problems in a unified framework. With the advancement of 3D-NMM, there remains a deficiency in ready-made preprocessing tools. Hence, the generation of 3D manifold elements (MEs) is the primary prerequisite for 3D-NMM. In this study, a parallel algorithm to generate the MEs is proposed. Firstly, a novel 3D block identification algorithm is developed to improve the modeling efficiency. Then, 3D joint networks are simulated using the geological statistical results. Thirdly, the mathematical cover system is constructed using uniformly regular hexahedral meshes. Subsequently, the physical domain is cut using the Boolean algorithm to generate the related manifold blocks (MBs) and the union -findalgorithm is used to determine the relationship of MBs with mathematical patches (MPs) and physical patches (PPs). The proposed program incorporates the OpenMP parallel library for CPU parallelization to enhance its efficiency and the OpenGL library is used to build a real-time visualization program. Finally, the proposed program is validated using several representative examples, and the obtained results demonstrate its efficient capability in generating MEs.
Constraint Handling Rules (CHR) is a committed-choice rule-based language that was originally intended for writing constraint solvers. In this paper we show that it is also possible to write the classic union-find alg...
详细信息
Constraint Handling Rules (CHR) is a committed-choice rule-based language that was originally intended for writing constraint solvers. In this paper we show that it is also possible to write the classic union-find algorithm and variants in CHR. The programs neither compromise in declarativeness nor efficiency. We study the time complexity of our programs: they match the almost-linear complexity of the best known imperative implementations. This fact is illustrated with experimental results.
We present two optimization strategies to improve connected-component labeling algorithms. Taking together, they form an efficient two-pass labeling algorithm that is fast and theoretically optimal. The first optimiza...
详细信息
We present two optimization strategies to improve connected-component labeling algorithms. Taking together, they form an efficient two-pass labeling algorithm that is fast and theoretically optimal. The first optimization strategy reduces the number of neighboring pixels accessed through the use of a decision tree, and the second one streamlines the union-find algorithms used to track equivalent labels. We show that the first strategy reduces the average number of neighbors accessed by a factor of about 2. We prove our streamlined union-find algorithms have the same theoretical optimality as the more sophisticated ones in literature. This result generalizes an earlier one on using union-find in labeling algorithms by Fiorio and Gustedt (Theor Comput Sci 154(2):165-181, 1996). In tests, the new union-find algorithms improve a labeling algorithm by a factor of 4 or more. Through analyses and experiments, we demonstrate that our new two-pass labeling algorithm scales linearly with the number of pixels in the image, which is optimal in computational complexity theory. Furthermore, the new labeling algorithm outperforms the published labeling algorithms irrespective of test platforms. In comparing with the fastest known labeling algorithm for two-dimensional (2D) binary images called contour tracing algorithm, our new labeling algorithm is up to ten times faster than the contour tracing program distributed by the original authors.
Habitat connectivity is required at large spatial scales to facilitate movement of biota in response to climatic changes and to maintain viable populations of wide-ranging species. Nevertheless, it may require decades...
详细信息
Habitat connectivity is required at large spatial scales to facilitate movement of biota in response to climatic changes and to maintain viable populations of wide-ranging species. Nevertheless, it may require decades to acquire habitat linkages at such scales, and areas that could provide linkages are often developed before they can be reserved. Reserve scheduling methods usually consider only current threats, but threats change over time as development spreads and reaches presently secure areas. We investigated the importance of considering future threats when implementing projects to maintain habitat connectivity at a regional scale. To do so, we compared forward-looking scheduling strategies with strategies that consider only current threats. The strategies were applied to a Costa Rican case study, where many reserves face imminent isolation and other reserves will probably become isolated in the more distant future. We evaluated strategies in terms of two landscape-scale connectivity metrics, a pure connectivity metric and a metric of connected habitat diversity. Those strategies that considered only current threats were unreliable because they often failed to complete planned habitat linkage projects. The most reliable and effective strategies considered the future spread of development and its impact on the likelihood of completing planned habitat linkage projects. Our analyses highlight the critical need to consider future threats when building connected reserve networks over time.
This paper presents a new fast and scalable Parallel union-find algorithm for image segmentation and its System-on-Chip (SoC) implementation using 65nm CMOS technology following the Application-Specific Integrated Cir...
详细信息
ISBN:
(纸本)9781479917624
This paper presents a new fast and scalable Parallel union-find algorithm for image segmentation and its System-on-Chip (SoC) implementation using 65nm CMOS technology following the Application-Specific Integrated Circuit (ASIC) design flow. The algorithm is capable of labeling all foreground and background pixels, using the least possible pixels scanning. This contrasts the classical labeling algorithms that label only foreground (or background) pixels in a single run. The new algorithm utilizes only two memory blocks. In one memory block, it labels image segments using their seeds as the label and, simultaneously, the segments sizes are used as the other label in second memory block. By this parallel labeling, monitoring the image segments is very fast and efficient. With 350 MHz operating frequency, the processing rate estimated to be 2100 frames/sec, the total chip area of 15950.5 mu m(2) (off-chip memory) and very low-power of 0.3 mW, the SoC tends to be an excellent candidate for mobile devices and real-time applications.
OPTICS is a hierarchical density-based data clustering algorithm that discovers arbitrary-shaped clusters and eliminates noise using adjustable reachability distance thresholds. Parallelizing OPTICS is considered chal...
详细信息
ISBN:
(纸本)9781450323789
OPTICS is a hierarchical density-based data clustering algorithm that discovers arbitrary-shaped clusters and eliminates noise using adjustable reachability distance thresholds. Parallelizing OPTICS is considered challenging as the algorithm exhibits a strongly sequential data access order. We present a scalable parallel OPTICS algorithm (POPTICS) designed using graph algorithmic concepts. To break the data access sequentiality, POPTICS exploits the similarities between the OPTICS algorithm and PRIM'S Minimum Spanning Tree algorithm. Additionally, we use the disjoint-set data structure to achieve a high parallelism for distributed cluster extraction. Using high dimensional datasets containing up to a billion floating point numbers, we show scalable speedups of up to 27.5 for our OpenMP implementation on a 40-core shared-memory machine, and up to 3,008 for our MPI implementation on a 4,096-core distributed-memory machine. We also show that the quality of the results given by POPTICS is comparable to those given by the classical OPTICS algorithm.
DBSCAN is a widely used isodensity-based clustering algorithm for particle data well-known for its ability to isolate arbitrarily-shaped clusters and to filter noise data. The algorithm is super-linear (O(nlogn)) and ...
详细信息
ISBN:
(纸本)9781479955008
DBSCAN is a widely used isodensity-based clustering algorithm for particle data well-known for its ability to isolate arbitrarily-shaped clusters and to filter noise data. The algorithm is super-linear (O(nlogn)) and computationally expensive for large datasets. Given the need for speed, we propose a fast heuristic algorithm for DBSCAN using density based sampling, which performs equally well in quality compared to exact algorithms, but is more than an order of magnitude faster. Our experiments on astrophysics and synthetic massive datasets (8.5 billion numbers) shows that our approximate algorithm is up to 56x faster than exact algorithms with almost identical quality (Omega-Index >= 0.99). We develop a new parallel DBSCAN algorithm, which uses dynamic partitioning to improve load balancing and locality. We demonstrate near-linear speedup on shared memory (15x using 16 cores, single node Intel (R) Xeon (R) processor) and distributed memory (3917x using 4096 cores, multinode) computers, with 2x additional performance improvement using Intel (R) Xeon Phi (TM) coprocessors. Additionally, existing exact algorithms can achieve up to 3.4 times speedup using dynamic partitioning.
DBSCAN is a well-known density based clustering algorithm capable of discovering arbitrary shaped clusters and eliminating noise data. However, parallelization of DBSCAN is challenging as it exhibits an inherent seque...
详细信息
ISBN:
(纸本)9781467308052;9781467308045
DBSCAN is a well-known density based clustering algorithm capable of discovering arbitrary shaped clusters and eliminating noise data. However, parallelization of DBSCAN is challenging as it exhibits an inherent sequential data access order. Moreover, existing parallel implementations adopt a master-slave strategy which can easily cause an unbalanced workload and hence result in low parallel efficiency. We present a new parallel DBSCAN algorithm (PDSDBSCAN) using graph algorithmic concepts. More specifically, we employ the disjoint-set data structure to break the access sequentiality of DBSCAN. In addition, we use a tree-based bottom-up approach to construct the clusters. This yields a better-balanced workload distribution. We implement the algorithm both for shared and for distributed memory. Using data sets containing up to several hundred million high-dimensional points, we show that PDSDBSCAN significantly outperforms the master-slave approach, achieving speedups up to 25.97 using 40 cores on shared memory architecture, and speedups up to 5,765 using 8,192 cores on distributed memory architecture.
暂无评论