Watershed delineation is one of the fundamental tasks in hydrological studies. Tools for extracting watersheds from digital elevation models and flow direction rasters are commonly implemented in GIS software packages. However, the performance of available techniques and algorithms often turns out to be far from sufficient, especially when working with large datasets. While modern hardware offers high computing performance through massive parallelism, there is still a need for algorithms that can effectively use these capabilities. This paper proposes an algorithm for rapid watershed delineation directly from flow direction rasters, using the possibilities offered by modern GPU devices. Performance measurements show a significant reduction in execution time compared to other parallel solutions proposed for this task in the literature. Moreover, this implementation makes it possible to delineate multiple watersheds from the same dataset simultaneously, each having one or more outlet cells, with virtually no additional computational cost.
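As a point of reference for what delineation from a flow direction raster involves, the following is a minimal sequential sketch in Python. It is not the GPU algorithm proposed in the paper. It assumes the ESRI-style D8 encoding (1 = east, 2 = southeast, 4 = south, and so on clockwise to 128 = northeast); the function name `delineate` and the data layout are hypothetical. Each cell is traced downstream until it reaches an already labeled cell, so several watersheds, each with one or more outlet cells, are labeled in a single pass.

```python
# ESRI-style D8 codes mapped to (row, col) offsets. This encoding is an
# assumption; other tools use different codes.
D8_OFFSETS = {1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
              16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1)}

UNKNOWN = -1  # marker for cells that have not been labeled yet


def delineate(flow_dir, outlets):
    """Label every cell with the id of the outlet it drains to (0 = none).

    flow_dir: 2D list of D8 codes.
    outlets:  dict mapping (row, col) -> watershed id (ids must be > 0).
    """
    rows, cols = len(flow_dir), len(flow_dir[0])
    labels = [[UNKNOWN] * cols for _ in range(rows)]
    for (r, c), watershed_id in outlets.items():
        labels[r][c] = watershed_id

    for r0 in range(rows):
        for c0 in range(cols):
            path, on_path = [], set()
            r, c = r0, c0
            while True:
                if not (0 <= r < rows and 0 <= c < cols):
                    result = 0          # flow leaves the raster
                    break
                if labels[r][c] != UNKNOWN:
                    result = labels[r][c]  # reached an outlet or a known cell
                    break
                if (r, c) in on_path:
                    result = 0          # cycle in the flow directions
                    break
                path.append((r, c))
                on_path.add((r, c))
                step = D8_OFFSETS.get(flow_dir[r][c])
                if step is None:
                    result = 0          # sink or no-data cell
                    break
                r, c = r + step[0], c + step[1]
            # Every cell on the traced path shares the same destination.
            for pr, pc in path:
                labels[pr][pc] = result
    return labels
```

Because each traced path is labeled as a whole, every cell is visited a bounded number of times, so the sketch runs in time linear in the raster size. A GPU version would need to replace the sequential tracing with a parallel propagation scheme.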
The article presents and evaluates a scalable algorithm for validating solutions to linear programming problems on cluster computing systems. The main idea of the method is to generate a regular set of points (validat...
ISBN: 9798331531591 (digital)
ISBN: 9798331531607 (print)
When not all the qubits needed to solve a problem are located in a single quantum computer, qubits from different quantum computers can be used collectively. In this case, quantum communication is needed for the quantum computers to communicate with each other. Several studies address the problem of minimizing the number of quantum communications when evaluating a general quantum circuit. The proposed solutions typically involve solving intractable problems. In this paper, we show that much better solutions can be obtained by focusing on specific problems instead of seeking solutions for generic circuits. Specifically, we consider several fundamental quantum circuits and identify communication protocols that need far fewer communication steps than those offered by generic solutions. Our work is in line with traditional parallel and distributed computing research, where scientists typically focus on solving specific problems (such as sorting, matrix multiplication, and network flow) in a parallel or distributed setting.
ISBN: 9798331531591 (digital)
ISBN: 9798331531607 (print)
N-body simulation is a fundamental problem in molecular dynamics. For example, N-body simulations can be employed to study powder flow. It is worth noting that powders and granular materials are the second most ubiquitous substance in industry after water. Numerous sequential and parallel algorithms have been proposed in the literature for N-body simulations on classical computers. Given a set of $n$ particles in a $d$-dimensional space, a brute-force algorithm takes $\Omega(n^{2} d)$ time to simulate each step of this system of particles. Numerous algorithms in the literature perform better than this under suitable assumptions. Given the potential speedups offered by quantum computing, an interesting open problem is how much speedup can be obtained for N-body simulations using quantum computers. In this paper we present efficient algorithms that run on quantum-classical hybrid models of computing. Specifically, our algorithms solve the problem of finding close neighbors using a quantum computer and perform the other steps on a parallel random access machine (PRAM). Our quantum algorithms outperform other algorithms in the literature both asymptotically and in empirical evaluations.
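The brute-force bound mentioned above can be made concrete with a short classical sketch of the close-neighbor step, written in Python. This is only the quadratic baseline, not the quantum algorithm from the paper; the function name `close_pairs` and the cutoff parameter `radius` are hypothetical. It evaluates all n(n-1)/2 pairwise distances, each costing time proportional to the dimension d.

```python
import math


def close_pairs(points, radius):
    """Return all index pairs (i, j), i < j, at Euclidean distance <= radius.

    points: list of equal-length coordinate tuples (one per particle).
    """
    pairs = []
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            # math.dist computes the Euclidean distance in any dimension.
            if math.dist(points[i], points[j]) <= radius:
                pairs.append((i, j))
    return pairs
```

For instance, `close_pairs([(0, 0), (1, 0), (5, 5)], 1.5)` reports only the pair `(0, 1)`, since the third particle is farther than 1.5 from both others.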
As data volume grows, computational speed becomes a key challenge. Data reduction helps address this by eliminating redundancy in rough sets using a reduct. However, most reduct-generation algorithms rely on software,...
ISBN: 9781728199283 (digital)
ISBN: 9781728199290 (print)
Focusing on the wavelet transform, the paper explores four parallel wavelet transform algorithms and techniques for remote sensing images from the perspectives of data parallelism and algorithmic parallelism. Among them, the algorithm based on the "Working Pool parallel" scheme achieves dynamic load balancing without any limit on the scale of the data or the number of Slaves. This algorithm therefore makes it easier to process the vast data of remote sensing images rapidly in distributed network systems.
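The working-pool idea can be illustrated with a small Python sketch, under several assumptions: a thread pool stands in for the distributed Slaves, a single-level unnormalized Haar transform along rows stands in for the wavelet transform, and the names `haar_row`, `transform_block`, and `pooled_haar` are hypothetical. The image is cut into row blocks that are placed in a shared queue; whichever worker becomes idle takes the next block, which is what gives dynamic load balancing regardless of the number of blocks or workers.

```python
from concurrent.futures import ThreadPoolExecutor


def haar_row(row):
    """One level of the unnormalized Haar transform of an even-length row:
    pairwise averages followed by pairwise half-differences."""
    avg = [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]
    diff = [(row[i] - row[i + 1]) / 2 for i in range(0, len(row), 2)]
    return avg + diff


def transform_block(block):
    """Task executed by one worker: transform every row of a block."""
    return [haar_row(row) for row in block]


def pooled_haar(image, rows_per_task=2, workers=4):
    """Split the image into row blocks and let idle workers pull them."""
    blocks = [image[i:i + rows_per_task]
              for i in range(0, len(image), rows_per_task)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands blocks out from a shared queue and returns the
        # results in submission order, so row order is preserved.
        results = pool.map(transform_block, blocks)
        return [row for block in results for row in block]
```

In CPython, threads mainly illustrate the scheduling pattern rather than deliver a speedup for pure-Python arithmetic; a process pool or a message-passing cluster would follow the same structure.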
We present a new parallel algorithm for k-clique counting/listing that has polylogarithmic span (parallel time) and is work-efficient (matches the work of the best sequential algorithm) for sparse graphs. Our algorith...
ISBN: 9798331524937 (digital)
ISBN: 9798331524944 (print)
Cosmic microwave background (CMB) experiments have reached an era of unprecedented precision and complexity. Aiming to detect the primordial B-mode polarization signal, these experiments will soon be equipped with $10^{4}$ to $10^{5}$ detectors. Consequently, future CMB missions will face the substantial challenge of efficiently processing vast amounts of raw data to produce the initial scientific outputs, the sky maps, within a reasonable time frame and with available computational resources. To address this, we introduce BrahMap, a new map-making framework that will be scalable across both CPU and GPU platforms. Implemented in C++ with a user-friendly Python interface for handling sparse linear systems, BrahMap employs advanced numerical analysis and high-performance computing techniques to maximize the use of supercomputing infrastructure. This work features an overview of BrahMap's capabilities and preliminary performance scaling results, with application to a generic CMB polarization experiment.
ISBN: 9781728189468 (digital)
ISBN: 9781728189475 (print)
Finding the shortest path between nodes in a graph has wide applications in many important areas such as transportation and computer networks. However, the current reference algorithms for this task, Dijkstra's for single-threaded environments and Δ-stepping for multi-threaded ones, leave performance and efficiency on the table by not taking advantage of additional information available about the graph. In this paper we present and experimentally evaluate novel algorithms SP$_1$, SP$_2$ and ParSP$_2$ that leverage this additional information to solve the problem faster and more efficiently in key metrics. In single-threaded execution, we show how SP$_1$ and SP$_2$ outperform Dijkstra's algorithm by up to 46%. In multi-threaded execution, we show how our algorithms compare favorably to the Δ-stepping algorithm in the ability to establish the shortest path between the source and the median node.
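For context, the single-threaded reference algorithm named above, Dijkstra's algorithm, can be sketched in a few lines of Python. This is the baseline only, not SP$_1$, SP$_2$ or ParSP$_2$, whose details the abstract does not give. The adjacency-list layout and the function name `dijkstra` are assumptions made for illustration; edge weights must be non-negative.

```python
import heapq


def dijkstra(graph, source, target):
    """Return the shortest-path distance from source to target, or None.

    graph: dict mapping node -> list of (neighbor, non-negative weight).
    """
    dist = {source: 0}
    heap = [(0, source)]      # (tentative distance, node)
    settled = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue          # stale heap entry, a shorter one was used
        settled.add(u)
        if u == target:
            return d          # stop early once the target is settled
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return None               # target is unreachable from source
```

The early return when the target is settled matches the source-to-target setting the abstract evaluates; without it the loop computes distances to every reachable node.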
ISBN: 9798350363760 (digital)
ISBN: 9798350363777 (print)
This paper implements the Fast Fourier Transform (FFT) algorithm for signal data processing using Open Computing Language (OpenCL). A parallel algorithm model suitable for staged FFT across different GPUs is proposed, including methods for configuring the execution and memory models. The characteristics of the OpenCL model and specific data structures are applied to optimize the logical structure of the parallel algorithm. Finally, the proposed method is applied to the range-Doppler (RD) algorithm for Synthetic Aperture Radar (SAR) imaging. Experimental data confirm that the parallel algorithm in this paper is significantly faster than a serial CPU-based algorithm. It also achieves substantially better performance than FFTW, the fastest FFT implementation on current CPU platforms, and notably better performance than the CUDA-based CUFFT parallel algorithm. In the SAR imaging RD algorithm, with classical airborne SAR imaging parameters, it shows a significant improvement over FFTW.
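To show what "staged" means for an FFT, here is a minimal sequential sketch of the iterative radix-2 Cooley-Tukey algorithm in Python. It is not the paper's OpenCL implementation; the function name `fft_radix2` is hypothetical and the input length is assumed to be a power of two. After a bit-reversal reordering, the transform proceeds in log2(n) stages, and all butterflies within one stage are independent of each other, which is the property a GPU kernel per stage would exploit.

```python
import cmath


def fft_radix2(x):
    """Iterative radix-2 decimation-in-time FFT of a power-of-two-length list."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = [complex(v) for v in x]

    # Bit-reversal permutation so that the stages can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    # log2(n) stages; each stage combines sub-transforms of length size/2.
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                # One butterfly: independent of all others in this stage.
                u = a[start + k]
                t = w * a[start + k + half]
                a[start + k] = u + t
                a[start + k + half] = u - t
                w *= w_step
        size *= 2
    return a
```

Transforming a constant signal such as `[1, 1, 1, 1]` concentrates all energy in the first bin, giving approximately `[4, 0, 0, 0]`.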