ISBN: 9789897585111 (print)
Two-dimensional (2D) convolution is one of the most computationally complex and memory-intensive algorithms used in image processing. In this paper, we present the 2D convolution algorithm used in Gaussian blur, a filter that is widely used for noise reduction and has high computational requirements. Single-threaded solutions cannot keep up with the performance and speed that image processing techniques require, so parallelizing image convolution on parallel systems enhances performance and reduces processing time. This paper gives an overview of the performance gains that parallel systems deliver for image convolution with the Gaussian blur algorithm. We compare the speedup of the algorithm on two parallel systems, a multi-core central processing unit (CPU) and a graphics processing unit (GPU), using Google Colaboratory ("Colab").
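As a rough illustration of the idea in this abstract, the sketch below blurs an image with a normalized Gaussian kernel and splits the rows into bands that are processed concurrently. It is not the paper's implementation: the function names (`gaussian_kernel`, `blur_rows`, `parallel_blur`), the clamped-border policy, and the row-band decomposition are all assumptions made for the example. A thread pool is used only to show the decomposition; in CPython, pure-Python loops would need processes or a GPU library to see a real speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def gaussian_kernel(radius, sigma):
    """Return a normalized (2*radius+1) x (2*radius+1) Gaussian kernel."""
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for x in range(-radius, radius + 1)]
           for y in range(-radius, radius + 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


def blur_rows(image, kernel, row_start, row_end):
    """Convolve rows [row_start, row_end) of the image.

    Border pixels are handled by clamping coordinates (an assumption;
    other border policies are equally valid).
    """
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    out = []
    for y in range(row_start, row_end):
        out_row = []
        for x in range(w):
            acc = 0.0
            for ky in range(-r, r + 1):
                yy = min(max(y + ky, 0), h - 1)
                for kx in range(-r, r + 1):
                    xx = min(max(x + kx, 0), w - 1)
                    acc += image[yy][xx] * kernel[ky + r][kx + r]
            out_row.append(acc)
        out.append(out_row)
    return out


def parallel_blur(image, kernel, workers=4):
    """Split the image into row bands and blur each band concurrently."""
    h = len(image)
    step = max(1, -(-h // workers))  # ceiling division
    bands = [(s, min(s + step, h)) for s in range(0, h, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda b: blur_rows(image, kernel, b[0], b[1]), bands)
        return [row for part in parts for row in part]
```

Each band only reads the shared input image and writes its own output rows, which is why the work can be divided without synchronization between workers.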
Parallel computing has established itself as another standard method for applied research and data analysis. The R system, although internally constrained to mostly single-threaded operations, can nevertheless be used along with different parallel computing approaches. This brief review covers OpenMP and Intel TBB at the CPU and compiler level, moves on to process-parallel approaches, then discusses message-passing parallelism and big data technologies for parallel processing such as Spark, Docker and Kubernetes, and concludes with a focus on the future package, which integrates many of these approaches. This article is categorized under: Algorithms and Computational Methods > Methods for High Performance Computing; Software for Computational Statistics > Software/Statistical Software; Software for Computational Statistics > High Performance Software.
Accounting for variability in generation and load and strategies to tackle variability cost-efficiently are key components of investment models for modern electricity systems. This work presents and evaluates the Hour...
ISBN: 9781728181042 (print)
We propose an optimal scheduling strategy that enables fault-tolerant, reliable computation and protects the integrity of the computation. Specifically, we determine the optimal redundancy-failure rate tradeoff for incorporating redundancy into parallel computing units running multiple-precision arithmetic, which is useful for applications such as asymmetric cryptography and fast integer multiplication. Inspired by network coding, we propose coding matrices that strategically map partial computations to the available computing units, so that the central unit can reliably reconstruct the results of any failed machine without recalculation and yield the correct final output. We propose optimization-based algorithms to efficiently construct the optimal coding matrices subject to fault tolerance specifications. Performance evaluation demonstrates that the optimal scheduling effectively reduces the overall running time of parallel computing while withstanding a wide range of failure rates.
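To make the coding-matrix idea concrete, here is a minimal sketch of the simplest possible case: a large integer product split into two partial products plus one parity product, so that the result of any single failed unit can be reconstructed without recomputation. The 3x2 coding matrix with rows [1, 0], [0, 1], [1, 1] and all function names are assumptions chosen for illustration; the paper's optimized coding matrices are not reproduced here. Non-negative operands are assumed.

```python
def encode_tasks(a, b, bits):
    """Split a into low/high parts and build three coded multiplication tasks.

    Coding matrix rows (an illustrative choice): [1, 0], [0, 1], [1, 1].
    """
    low = a & ((1 << bits) - 1)
    high = a >> bits
    return [(low, b), (high, b), (low + high, b)]


def run_unit(task):
    """Work done by one computing unit: a single big-integer product."""
    x, y = task
    return x * y


def decode(results, bits):
    """Rebuild a * b from three unit results, at most one of which is None."""
    if sum(r is None for r in results) > 1:
        raise ValueError("this simple parity code tolerates only one failure")
    p_low, p_high, p_sum = results
    if p_low is None:
        p_low = p_sum - p_high   # recover the lost partial product
    elif p_high is None:
        p_high = p_sum - p_low
    return (p_high << bits) + p_low
```

The central unit never re-runs a failed task; it subtracts the surviving partial product from the parity product, which is the sense in which redundancy is traded against failure rate.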
ISBN: 9788363578190 (print)
In recent years, the development of personal computer hardware has been aimed at increasing the number of processor cores. At the same time, the efficiency and reliability of computer interconnection networks have increased. This enables the introduction and development of parallel and distributed processing methods in CAD systems as well. The paper presents the problems related to parallelizing the computational process and methods of solving them, using the example of typical computational experiments used in the design and optimization of integrated circuit topography.
One domain of problem solving models its problems using graphs, because graphs are an effective representation of such problems and lead to efficient solutions. The nodes in a graph represent a division of unit work...
The Spatiotemporal Weighted Regression (STWR) model is an extension of the Geographically Weighted Regression (GWR) model for exploring the heterogeneity of spatiotemporal processes. A key feature of STWR is that it uses the data points observed at previous time stages to improve fit and prediction at the latest time stage. Because the temporal bandwidths and a few other parameters need to be optimized in STWR, model calibration is computationally intensive, and it becomes very time-consuming when the data volume is large. For example, with 10,000 points in 10 time stages, a single-core PC takes about 2307 s to calibrate STWR. Both the distance matrix and the weight matrix in STWR are memory intensive, which can easily exhaust memory as the data volume increases. To improve computing efficiency, we developed a parallel computing method for STWR based on the Message Passing Interface (MPI). We propose a cache in the MPI processing approach for the calibration routine, and we designed a matrix splitting strategy to address the problem of insufficient memory. We named the overall design Fast STWR (F-STWR). In the experiment, we tested F-STWR in a High-Performance Computing (HPC) environment with a total of 204,611 observations over 19 years. The results show that F-STWR can significantly improve STWR's capability to process large-scale spatiotemporal data.
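The matrix splitting idea can be sketched in a few lines: instead of materializing the full n x n distance matrix, compute it one block of rows at a time and reduce each block before moving on. The sketch below is a single-process illustration under that assumption; the function names and the Gaussian weighting with a hypothetical `bandwidth` parameter are not taken from F-STWR, and in an MPI setting each rank would handle a subset of the blocks.

```python
import math


def distance_blocks(points, block_rows):
    """Yield (start_row, block) pairs of the pairwise distance matrix.

    Only block_rows x n distances are held in memory at any time.
    """
    n = len(points)
    for start in range(0, n, block_rows):
        chunk = points[start:start + block_rows]
        yield start, [[math.dist(p, q) for q in points] for p in chunk]


def weight_row_sums(points, bandwidth, block_rows):
    """Sum Gaussian kernel weights per row without storing the full matrix."""
    sums = []
    for _, block in distance_blocks(points, block_rows):
        for row in block:
            sums.append(sum(math.exp(-0.5 * (d / bandwidth) ** 2) for d in row))
    return sums
```

Peak memory then scales with `block_rows * n` rather than `n * n`, at the cost of recomputing nothing but iterating over the points once per block.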
ISBN: 9781665427449 (print)
In parallel high-performance computing (HPC) systems, network congestion is one of the main factors that degrade communication performance, because it can increase end-to-end latency and power consumption. In this work, we address this issue by means of a simple routing algorithm on a target fine-grained circuit-switched (FGCS) network. The number of slots allocated to each FGCS switch in the network directly affects the end-to-end latency. Our proposed approach employs a minimal congestion-aware routing (MiniCAR) method to make better routing decisions and alleviate network congestion, so that the minimum necessary number of slots in a target FGCS network can be reduced. Evaluation results show that, compared to the traditional dimension-order routing algorithm, MiniCAR occupies up to 50.8% fewer time slots on a 2-D torus interconnection network.
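For reference, the baseline that MiniCAR is compared against can be sketched as follows: dimension-order routing on a 2-D torus first corrects the X coordinate, then the Y coordinate, taking the shorter way around each ring. This is a generic textbook formulation with hypothetical function names, not code from the paper, and it does not model time slots or congestion.

```python
def ring_step(src, dst, size):
    """Return +1, -1, or 0: the direction of the shorter path on a ring."""
    forward = (dst - src) % size
    if forward == 0:
        return 0
    # Ties are broken in favor of the forward (+1) direction.
    return 1 if forward <= size - forward else -1


def dimension_order_route(src, dst, width, height):
    """List the nodes visited from src to dst, routing X first, then Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x = (x + ring_step(x, dst[0], width)) % width
        path.append((x, y))
    while y != dst[1]:
        y = (y + ring_step(y, dst[1], height)) % height
        path.append((x, y))
    return path
```

Because the route depends only on source and destination, many flows can pile onto the same links, which is the congestion a congestion-aware scheme tries to avoid.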
Fast and accurate wheeling pricing has emerged as an important issue in the recent competitive power market. Embedded cost-based wheeling pricing is well accepted by the power market because it is based on the actual flow of the power wheeled. It also fully recovers the fixed cost of installing and operating the wheeling facility. In this article, metaphor-less Rao-3-based ACOPF, the MVA-mile method and Bialek tracing are employed to compute wheeling prices across various generators and loads. In an actual power market, where load conditions vary continuously, computing wheeling prices is quite time-consuming, because the optimal power flow (OPF) program has to be run again for every loading condition. In this scenario, the artificial neural network (ANN) approach has been found to be very useful for estimating wheeling prices instantly and accurately for any unseen loading scenario. Here, a number of ANNs have been developed in a parallel computing environment. This article presents a metaphor-less Rao-3-based approach to projecting wheeling prices in the competitive power market by developing a new radial basis function neural network (RBFNN). The wheeling pricing approach is demonstrated and examined on the IEEE 30-bus system.
Volunteer computing (VC) is a distributed computing paradigm that exploits idle computing resources provided by a vast number of users on the Internet. In VC, individual nodes are usually unable to communicate with each other directly; therefore, current VC supports only bag-of-tasks computation, and this prevents widespread use of VC. Toward the realization of parallel VC, this paper proposes a parallel VC system based on the concept of server-assisted communication. The proposed method replaces inter-node communication with a pair of request-driven communications, one between sender and server and one between server and receiver. In the proposed parallel VC system, a VC server consists of an Apache web server and a MySQL database server, which eases the implementation of multi-threaded communication and of stable and efficient data-management functions. A software tool is also developed to convert a parallel program written with a common MPI communication library into a program that uses a standard socket library with the HTTP protocol. To demonstrate the feasibility of the proposed system, we implemented the parallel VC system and evaluated the execution time of basic communication functions and of parallel programs in the NAS Parallel Benchmarks. The results show that the execution time of basic communication functions is acceptable for practical use of VC and that the benchmark programs execute successfully on the proposed system, demonstrating the feasibility of parallel computation in VC environments. (c) 2025 Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.
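The server-assisted communication pattern can be illustrated with a small in-memory sketch: a send becomes a request that deposits the message on the server, and a receive becomes a request that polls the server for it. Everything here is hypothetical (the `RelayServer` and `Node` classes, their methods, and the polling limit); the system described in the abstract uses an Apache web server, a MySQL database and HTTP requests rather than in-process calls.

```python
from collections import defaultdict, deque


class RelayServer:
    """Stand-in for the VC server: stores messages until they are fetched."""

    def __init__(self):
        # One FIFO mailbox per (sender, receiver, tag) triple.
        self._mailboxes = defaultdict(deque)

    def post(self, src, dst, tag, payload):
        """Sender-to-server request: deposit a message."""
        self._mailboxes[(src, dst, tag)].append(payload)

    def fetch(self, src, dst, tag):
        """Receiver-to-server request: take the oldest message, or None."""
        box = self._mailboxes[(src, dst, tag)]
        return box.popleft() if box else None


class Node:
    """A volunteer node that can only talk to the server, never to peers."""

    def __init__(self, rank, server):
        self.rank = rank
        self.server = server

    def send(self, dst, tag, payload):
        self.server.post(self.rank, dst, tag, payload)

    def recv(self, src, tag, max_polls=10):
        """Poll the server; returns None if nothing arrives in max_polls tries.

        A payload of None cannot be distinguished from "no message" here.
        """
        for _ in range(max_polls):
            msg = self.server.fetch(src, self.rank, tag)
            if msg is not None:
                return msg
        return None
```

Both halves of the exchange are initiated by the nodes themselves, which is what allows the pattern to work when nodes cannot accept incoming connections.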