The main goal of distribution network (DN) expansion planning is essentially to achieve minimal investment constrained by specified reliability requirements. The reliability-constrained distribution network planning (RcDNP) problem can be cast as an instance of mixed-integer linear programming (MILP), which involves an ultra-heavy computation burden, especially for large-scale networks. In this paper, we propose a parallel computing based solution method for the RcDNP problem. The RcDNP is decomposed into a backbone grid problem and several lateral grid problems with coordination. Then, a parallelizable augmented Lagrangian algorithm with an acceleration method is developed to solve the coordination planning problem. The lateral grid problems are solved in parallel through coordination with the backbone grid planning problem. Gauss-Seidel iteration is adopted on the subset of the convex hull of the feasible region constructed by the decomposition. Under mild conditions, the optimality and convergence of the proposed method are proven. Numerical tests show that the proposed method can significantly reduce the solution time and make the RcDNP applicable to real-world problems.
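The coordination scheme described in this abstract can be illustrated on a toy consensus problem. The sketch below is not the paper's planning model; it is a minimal augmented Lagrangian (ADMM-style) decomposition in which several quadratic "lateral" subproblems are swept in Gauss-Seidel order and coordinated through a shared "backbone" variable. All function names and coefficients are invented for illustration.

```python
# Toy augmented Lagrangian coordination: minimize sum_i c_i*(x_i - a_i)^2
# subject to the consensus constraint x_i = z (the "backbone" variable).
# Each x_i update has a closed form, so the subproblems could run in
# parallel; here they are swept sequentially in Gauss-Seidel fashion.

def coordinate(a, c, rho=1.0, iters=300):
    n = len(a)
    x = [0.0] * n      # lateral decisions
    lam = [0.0] * n    # multipliers for the coupling x_i = z
    z = 0.0            # backbone (coordination) variable
    for _ in range(iters):
        for i in range(n):  # Gauss-Seidel sweep over subproblems
            # argmin_x  c_i*(x - a_i)^2 + lam_i*(x - z) + rho/2*(x - z)^2
            x[i] = (2 * c[i] * a[i] - lam[i] + rho * z) / (2 * c[i] + rho)
        z = sum(x[i] + lam[i] / rho for i in range(n)) / n
        for i in range(n):
            lam[i] += rho * (x[i] - z)  # dual ascent on the coupling
    return z

# The consensus optimum is the weighted mean sum(c_i*a_i)/sum(c_i).
```

With `a = [0.0, 2.0]` and equal weights, the iterates converge to `z = 1.0`, matching the analytic optimum of the coupled problem.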
Parallel computing is a common method to accelerate remote sensing image processing. This article briefly describes six commonly used interpolation functions and studies three commonly used parallel computing methods for the corresponding nine interpolation algorithms in remote sensing image processing. First, two kinds of general parallel interpolation algorithms (for CPU and GPU, respectively) are designed. Then, in two typical application scenarios (data-intensive and computing-intensive), four computing methods (one serial method and three parallel methods) of these interpolation algorithms are tested. Finally, the acceleration effects of all parallel algorithms are compared and analyzed. On the whole, the acceleration effect of the parallel interpolation algorithms is better in the computing-intensive scenario. In CPU-oriented methods, the speedup of all parallel interpolation algorithms mainly depends on the number of physical CPU cores, whereas in GPU-oriented methods, the speedup is greatly affected by the computational complexity of the algorithm and the application scenario. The GPU has a better acceleration effect on interpolation algorithms with higher computational complexity and offers more advantages in computing-intensive scenarios. In most cases, GPU-based interpolation is ideal for efficient interpolation.
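The row-wise data parallelism discussed above can be sketched with a minimal bilinear resampler whose output rows are distributed over a thread pool. This is not the article's implementation; bilinear interpolation stands in for the six interpolation functions, and all names are invented.

```python
from concurrent.futures import ThreadPoolExecutor

def bilinear(img, y, x):
    """Bilinearly interpolate a 2-D list image at fractional (y, x)."""
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, len(img) - 1), min(x0 + 1, len(img[0]) - 1)
    fy, fx = y - y0, x - x0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bot * fy

def resample(img, out_h, out_w, workers=4):
    """Resize an image; each output row is computed independently,
    which is the coarse-grained data parallelism being exploited."""
    sy = (len(img) - 1) / max(out_h - 1, 1)
    sx = (len(img[0]) - 1) / max(out_w - 1, 1)
    def row(r):
        return [bilinear(img, r * sy, c * sx) for c in range(out_w)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(row, range(out_h)))
```

Note that because of CPython's global interpreter lock, real speedups for a pure-Python kernel like this require processes or a GPU; the row-wise decomposition is the same in either case.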
Permanent Magnet Synchronous Motor (PMSM) drives are widely used in motion control industrial applications and electric vehicle powertrains, where they provide a good torque-to-weight ratio and high dynamic performance. With the increasing usage of these machines, the demands on exploiting their abilities are also growing. Usual control techniques, such as field-oriented control (FOC), need some workaround to achieve the requested behavior, e.g., field weakening, while keeping the constraints on the stator currents. Similarly, when applying linear model predictive control, the linearization of the torque function and of the defined constraints leads to a loss of essential information and sub-optimal performance. That is why the application of nonlinear theory is necessary. Nonlinear Model Predictive Control (NMPC) is a promising alternative to linear control methods. However, this approach has a major drawback in its computational demands. This paper presents a novel approach to the implementation of NMPC for PMSMs. The proposed controller utilizes the native parallelism of population-based optimization methods and the high performance of field-programmable gate arrays to solve the nonlinear optimization problem within the time necessary for proper motor control. The paper presents the verification of the algorithm's behavior in both simulation and laboratory experiments. The proposed controller's behavior is compared with the standard control techniques of FOC and linear MPC (LMPC). The achieved results prove the superior quality of control performed by NMPC in comparison with FOC and LMPC. The controller was able to follow the Maximum Torque Per Ampere strategy without any supplementary algorithm, together with constraint handling.
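The key property the paper exploits, that population-based optimization evaluates many candidate inputs independently, can be shown with a deliberately simple one-step predictive control sketch. The scalar model, the grid-based population (used here instead of a genetic or swarm update for determinism), and all parameters are invented for illustration and are not the paper's controller.

```python
def nmpc_step(x, target, a=1.0, b=1.0, u_max=1.0, pop=201):
    """One predictive control step for the toy model x' = a*x + b*u.
    A population of candidate inputs is evaluated; every cost evaluation
    is independent, so an FPGA can score the whole population in parallel.
    The input constraint |u| <= u_max is built into the candidate set."""
    candidates = [-u_max + 2 * u_max * k / (pop - 1) for k in range(pop)]
    def cost(u):
        x_next = a * x + b * u          # predicted next state
        return (x_next - target) ** 2   # tracking objective
    return min(candidates, key=cost)
```

A real population-based solver (PSO, differential evolution) would refine the candidates over several iterations, but the parallel fitness evaluation shown here is where the FPGA acceleration applies.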
Accurate 3-dimensional (3-D) reconstruction technology for nondestructive testing based on digital radiography (DR) is of great importance for alleviating the drawbacks of the existing computed tomography (CT)-based methods. The commonly used Monte Carlo simulation method ensures well-performing imaging results for DR. However, for 3-D reconstruction, it is limited by its high time consumption. To solve this problem, this study proposes a parallel computing method to accelerate the Monte Carlo simulation of projection images with a parallel interface and a specific DR application. The projection images are utilized for 3-D reconstruction of the test model. We verify the accuracy of parallel computing for DR and evaluate the performance of two parallel computing modes, multithreaded applications (G4-MT) and message-passing interfaces (G4-MPI), by assessing parallel speedup and efficiency. This study also explores the scalability of the hybrid G4-MPI and G4-MT mode. The results show that the two parallel computing modes can significantly reduce the Monte Carlo simulation time, because the growth of the parallel speedup of the Monte Carlo simulations can be considered linear and the parallel efficiency is maintained at a high level. The hybrid mode has strong scalability: the overall run time of the 180 simulations using 320 threads is 15.35 h with 10 billion particles emitted, and the parallel speedup can be up to ***. The 3-D reconstruction of the model is achieved with the filtered back projection (FBP) algorithm using 180 projection images obtained with the hybrid G4-MPI and G4-MT mode. The quality of the reconstructed sliced images is satisfactory, as the images reflect the internal structure of the test model. The method is also applied to a complex model, and the quality of the reconstructed images is evaluated.
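The decomposition that G4-MT and G4-MPI exploit is that Monte Carlo particle histories are independent: the particle budget can be split into seeded batches whose tallies are merged afterwards. The sketch below illustrates this pattern together with the speedup and efficiency metrics the study assesses; a pi estimator stands in for particle transport, and all names are invented.

```python
import math
import random

def mc_batch(n, seed):
    """One independent Monte Carlo batch (a stand-in for a particle-transport
    run): count samples landing inside the unit quarter circle."""
    rng = random.Random(seed)  # per-batch seed, as each rank/thread would use
    return sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0 for _ in range(n))

def estimate_pi(total, batches):
    """Split the particle budget into independent batches and merge tallies.
    The batches could run on separate threads (G4-MT-style) or ranks
    (G4-MPI-style); here they run sequentially for clarity."""
    n = total // batches
    hits = sum(mc_batch(n, seed) for seed in range(batches))
    return 4.0 * hits / (n * batches)

def speedup(t_serial, t_parallel):
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, workers):
    """Parallel efficiency: speedup divided by the worker count."""
    return speedup(t_serial, t_parallel) / workers
```

Linear speedup growth, as reported in the abstract, corresponds to `efficiency` staying near 1.0 as `workers` increases.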
Option pricing is one of the most active research fields in financial economics. Black-Scholes-Merton option pricing theory states that the risk-neutral density is lognormal. However, market evidence does not support that assumption. More realistic assumptions impose substantial computational burdens on calculating option pricing functions. The risk-neutral density is a pivotal element for pricing derivative assets and can be estimated through nonparametric kernel methods. A significant computational challenge exists in determining optimal kernel bandwidths, addressed in this study through a parallel computing algorithm executed on graphics processing units. The paper proposes a tailor-made cross-validation criterion function used to define optimal bandwidths. The selection of optimal bandwidths is crucial for nonparametric estimation and is also the most computationally intensive step. We tested the developed algorithms on two data sets of intraday data for the VIX and S&P 500 indexes. (c) 2022 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://***/licenses/by-nc-nd/4.0/).
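The structure of the bandwidth-selection bottleneck can be seen in a minimal sketch: every (bandwidth, data point) pair in the cross-validation score is independent, which is what makes a GPU grid of threads natural. The criterion below is a plain leave-one-out log-likelihood for a Gaussian kernel density, not the paper's tailor-made criterion; the data and bandwidth grid are invented.

```python
import math

def loo_log_likelihood(data, h):
    """Leave-one-out log-likelihood of a Gaussian KDE with bandwidth h."""
    n = len(data)
    total = 0.0
    for i, xi in enumerate(data):
        # Density at xi estimated from all other points (leave-one-out).
        dens = sum(math.exp(-((xi - xj) / h) ** 2 / 2.0)
                   for j, xj in enumerate(data) if j != i)
        dens /= (n - 1) * h * math.sqrt(2.0 * math.pi)
        total += math.log(dens) if dens > 0 else float("-inf")
    return total

def best_bandwidth(data, grid):
    """Pick the bandwidth maximizing the criterion. Scores for different
    bandwidths are independent, so on a GPU each (h, point) pair can be
    handled by its own thread and reduced afterwards."""
    return max(grid, key=lambda h: loo_log_likelihood(data, h))
```

Leave-one-out scoring automatically penalizes degenerate bandwidths: as h goes to 0 the held-out density collapses and the score diverges to minus infinity.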
Scenario simulation analysis of water environmental emergencies is very important for risk prevention and control and for emergency response. To quickly and accurately simulate the transport and diffusion of high-intensity pollutants during sudden water pollution events, this study develops a high-precision pollutant transport and diffusion model for unstructured grids based on the Compute Unified Device Architecture (CUDA). The finite volume method with a total variation diminishing limiter using the r-factor proposed by Kong is used to reduce numerical diffusion and oscillation errors in the simulation of pollutants under sharp concentration conditions, and graphics processing unit (GPU) acceleration technology is used to improve computational efficiency. The advection-diffusion process of the model is verified numerically using two benchmark cases, and the efficiency of the model is evaluated using an engineering application. The results demonstrate that the model performs well in the simulation of material transport in the presence of sharp concentration gradients. Moreover, it has high computational efficiency: the acceleration ratio is 46 times the single-thread performance of the original model. The efficiency of the accelerated model meets the requirements of an engineering application, enabling rapid early warning and assessment of water pollution accidents.
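The two properties a flux-form TVD scheme buys, exact conservation and no spurious new extrema near sharp fronts, can be demonstrated in one dimension. The sketch below uses the classic minmod limiter as a stand-in for the Kong r-factor limiter and an invented periodic test profile; it is not the paper's unstructured-grid CUDA model.

```python
def minmod(a, b):
    """Minmod slope limiter (a simple stand-in for the Kong r-factor)."""
    if a * b <= 0.0:
        return 0.0
    return min(abs(a), abs(b)) * (1.0 if a > 0 else -1.0)

def advect(q, cfl, steps):
    """MUSCL-type TVD advection of q on a periodic 1-D grid, speed > 0.
    cfl is the Courant number a*dt/dx, required to satisfy 0 < cfl <= 1."""
    n = len(q)
    for _ in range(steps):
        # Limited face values q_{i+1/2} feeding the upwind flux.
        face = []
        for i in range(n):
            s = minmod(q[i] - q[i - 1], q[(i + 1) % n] - q[i])
            face.append(q[i] + 0.5 * (1.0 - cfl) * s)
        # Flux-form update: telescoping fluxes make the scheme conservative.
        q = [q[i] - cfl * (face[i] - face[i - 1]) for i in range(n)]
    return q
```

On a GPU, each face value and each cell update depends only on a small neighborhood, so both loops map directly onto one thread per cell, which is the parallel structure the CUDA model exploits.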
In query-by-example spoken term detection (QbE-STD), reference utterances are matched with an audio query. A matching matrix-based approach to QbE-STD needs to compute a matching matrix between the query and each reference utterance using an appropriate similarity metric. Recent approaches use kernel-based matching to compute this matching matrix. The matching matrices are converted to grayscale images and given to a CNN-based classifier. In this work, we propose to speed up QbE-STD by computing the matching matrix in parallel using a coarse-grained data parallelism approach. We explore two approaches to coarse-grained data parallelism: in the first, we compute parts of a matching matrix in parallel and then combine them to form the full matrix, while in the second, we compute multiple matrices in parallel. We also propose to convert the matching matrices into two-colored images using a threshold and to use these images for QbE-STD. The efficacy of the proposed parallel computation approach is evaluated on the TIMIT dataset.
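The first of the two approaches, computing blocks of one matching matrix in parallel and concatenating them, together with the two-color thresholding, can be sketched as follows. Cosine similarity stands in for the kernel-based metric, and the tiny feature vectors are invented; this is not the paper's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def cosine(u, v):
    """Cosine similarity between two frame feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def matching_matrix(query, reference, workers=4):
    """Similarity matrix between query and reference frame sequences.
    Each query frame yields one row, computed independently: this is
    the coarse-grained data parallelism (rows = blocks to be combined)."""
    def row(qf):
        return [cosine(qf, rf) for rf in reference]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(row, query))

def to_two_color(matrix, threshold):
    """Binarize the matching matrix into a two-colored image."""
    return [[1 if v >= threshold else 0 for v in r] for r in matrix]
```

Because the rows are independent, the parallel result is identical to a serial double loop; only the wall-clock time changes.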
Artificial intelligence (AI) technology is developing explosively in application fields such as speech recognition, semantic understanding, and computer vision. In particular, the new generation of knowledge-enhanced large language models has gradually become part of the infrastructure of social productivity. As large models move towards multimodality, the input data and model parameters represented by tensors are becoming increasingly large, which places high demands on computation and storage. Tucker decomposition obtains an optimal low-rank representation of a tensor through factor matrices and a core tensor, reducing storage and computation requirements in big data and artificial intelligence applications. However, existing Tucker decomposition methods show limited computational speed and convergence performance. In this paper, a general heterogeneous computing framework for the Tucker decomposition method is proposed. It analyzes the row independence of the factor matrices and the column independence of the Kruskal matrices, and updates the Kruskal matrices instead of the core tensor in a column-wise manner and the factor matrices in a row-wise manner, respectively, to reduce the storage overhead of the computation. Furthermore, the proposed method employs a heterogeneous computing platform to speed up the computational bottlenecks and takes full advantage of fine-grained parallel optimization technology to improve memory access efficiency. The experimental results show that the computation speed achieves a 3.1 to 75.4 times improvement over the latest methods. Among all the methods, the proposed one exhibits the best convergence performance.
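A core primitive underlying factor-matrix updates in Tucker algorithms is the mode-n matricization (unfolding) of the tensor, and each row of an unfolding can be produced independently, the kind of independence the framework above parallelizes. The sketch below unfolds a 3-way tensor stored as nested lists, using the common column ordering in which earlier remaining indices vary fastest; it is a generic illustration, not the paper's code.

```python
def unfold(tensor, mode):
    """Mode-n unfolding of a 3-way tensor stored as nested lists.
    Rows index the chosen mode; columns enumerate the remaining two
    indices with the earlier one varying fastest."""
    I = len(tensor)
    J = len(tensor[0])
    K = len(tensor[0][0])
    if mode == 0:   # shape (I, J*K), column index j + k*J
        return [[tensor[i][j][k] for k in range(K) for j in range(J)]
                for i in range(I)]
    if mode == 1:   # shape (J, I*K), column index i + k*I
        return [[tensor[i][j][k] for k in range(K) for i in range(I)]
                for j in range(J)]
    if mode == 2:   # shape (K, I*J), column index i + j*I
        return [[tensor[i][j][k] for j in range(J) for i in range(I)]
                for k in range(K)]
    raise ValueError("mode must be 0, 1, or 2")
```

In a higher-order SVD or alternating least-squares sweep, each factor matrix is obtained from the corresponding unfolding; because the rows here share no state, they map naturally onto independent GPU work items.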
This paper presents the development of a dynamics model for long track sections. It is based on an established short track model that utilises the Finite Element Method to describe rails and block models to describe sleepers, ballast and subballast. By implementing a parallel computing method, this innovation enables the construction of a true long track model by segmenting the long track into shorter segments that are easier to compute. The model facilitates simulations being run in parallel, thereby permitting simultaneous calculation of the various numerical track variables. The model employs a Message Passing Interface framework to seamlessly link the track segments, handling the flow of data among the computing cores designated to each subdivided section. This strategic framework gives the long track model the capability to simulate tracks of virtually any length, with the only constraints being the available computational resources and time. The claimed modelling capability is verified using two case studies on a 6 km long track involving different practical and conceptual train operational scenarios: emergency braking, and constant braking force with constant train speed. These case studies show the flexibility and scalability of the method and its capability to handle complex track dynamic systems.
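The segment-linking idea can be demonstrated on a toy 1-D diffusion chain (a deliberately simple stand-in for the track dynamics): the domain is split into two segments that are stepped independently, with boundary (ghost) values exchanged every step in the way an MPI halo exchange links the cores handling adjacent track sections. The model, sizes, and names below are invented for illustration.

```python
def step(u, r=0.25):
    """One explicit diffusion step with fixed end values."""
    return ([u[0]] +
            [u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
             for i in range(1, len(u) - 1)] +
            [u[-1]])

def run_monolithic(u, steps):
    """Reference: step the whole chain on one 'core'."""
    for _ in range(steps):
        u = step(u)
    return u

def run_segmented(u, steps):
    """Two segments, each carrying one ghost cell from its neighbour.
    The ghost swap after every step mirrors the MPI data exchange
    between cores handling adjacent track sections."""
    mid = len(u) // 2
    left, right = u[:mid + 1], u[mid - 1:]
    for _ in range(steps):
        left, right = step(left), step(right)
        left[-1], right[0] = right[1], left[-2]   # halo exchange
    return left[:-1] + right[1:]
```

Because the ghost values are exact copies of the neighbour's freshly updated boundary cells, the assembled segmented result matches the monolithic run, which is the property that lets a segmented long track model reproduce the short track model's behaviour at any length.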
Accurate parameter identification is crucial for constructing high-fidelity battery digital twins. Existing methods often struggle with overfitting, measurement errors, and the neglect of hysteresis effects, leading directly to poor model performance. To address these limitations, this study proposes an improved parameter identification method for LiFePO4 batteries. By categorizing the model parameters into different groups, incorporating hysteresis considerations, employing a non-dominated sorting differential evolution optimization based on reference points, and combining Python battery mathematical modeling with parallel computing, the proposed method achieves excellent efficiency and accuracy. Under constant current conditions, the root mean square error (RMSE) remains consistently below 6 mV, and the mean relative error (MRE) is less than 0.15 % within the 3 % to 97 % state-of-charge range. Even under dynamic conditions, the error remains low, with the RMSE consistently below 8 mV and the MRE consistently less than 0.15 %. Moreover, this method not only prevents the algorithm from becoming stuck, but also significantly accelerates the simulation, achieving a 696.5 % speedup compared to the traditional method.
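The optimization pattern behind such identification can be sketched with plain differential evolution (not the paper's reference-point non-dominated variant) fitting two parameters of a toy equivalent-circuit relation V = OCV - R0*I to synthetic data. Every population member's cost is evaluated independently, which is exactly where the parallel computing pays off; all model details, bounds, and numbers here are invented.

```python
import random

def fit_de(current, voltage, bounds, gens=200, pop=20, f=0.6, cr=0.9, seed=1):
    """DE/rand/1/bin fit of (ocv, r0) in the toy model V = ocv - r0 * I.
    The cost evaluations of the population are independent, so in a real
    identification run they would be dispatched to parallel workers."""
    rng = random.Random(seed)
    lo, hi = zip(*bounds)
    def cost(p):
        ocv, r0 = p
        return sum((v - (ocv - r0 * i)) ** 2
                   for i, v in zip(current, voltage)) / len(voltage)
    pool = [[rng.uniform(lo[d], hi[d]) for d in range(2)] for _ in range(pop)]
    for _ in range(gens):
        for i in range(pop):
            a, b, c = rng.sample([p for j, p in enumerate(pool) if j != i], 3)
            # Mutation + binomial crossover, then clip to the bounds.
            trial = [a[d] + f * (b[d] - c[d]) if rng.random() < cr
                     else pool[i][d] for d in range(2)]
            trial = [min(max(trial[d], lo[d]), hi[d]) for d in range(2)]
            if cost(trial) <= cost(pool[i]):   # greedy selection
                pool[i] = trial
    return min(pool, key=cost)
```

On noiseless synthetic data the fitted pair converges to the generating parameters; grouping parameters and handling hysteresis, as the study does, changes the cost function but not this population-evaluation structure.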