A recently developed sparse representation algorithm has proved useful for multi-object tracking, and this study proposes a parallelisation of it. Online dictionary learning is used for object recognition. After detection, each moving object is represented by a descriptor containing its appearance features and its position feature. Each detected object is classified and indexed according to the sparse solution obtained by an orthogonal matching pursuit (OMP) algorithm. For real-time tracking, the visual information must be processed very quickly without reducing the accuracy of the results. However, both the large size of the descriptor and the growth of the dictionary after each detection slow the system down. In this work, a novel accelerated OMP implementation on a graphics processing unit is proposed. Experimental results demonstrate the efficiency of the parallel implementation, which significantly reduces the computation time.
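For orientation, the serial OMP loop that such an implementation accelerates can be sketched as follows. This is a minimal pure-Python illustration, not the authors' GPU code: the names `omp` and `solve_least_squares` are hypothetical, the atoms are assumed to have unit norm and to be linearly independent, and the least-squares step uses the normal equations for brevity.

```python
def solve_least_squares(columns, y):
    """Least-squares fit of y by the given columns via the normal equations."""
    k = len(columns)
    # Augmented k x (k+1) system (A^T A | A^T y).
    aug = [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(k)]
           + [sum(a * b for a, b in zip(columns[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def omp(atoms, y, n_nonzero, tol=1e-10):
    """Return {atom index: coefficient} for a sparse approximation of y.

    atoms: list of dictionary columns (assumed unit-norm), each a list of floats.
    n_nonzero: maximum number of atoms to select (must not exceed len(atoms)).
    """
    residual = list(y)
    support, coeffs = [], []
    for _ in range(n_nonzero):
        # Greedy step: pick the unused atom most correlated with the residual.
        scores = [abs(sum(a * r for a, r in zip(atom, residual))) for atom in atoms]
        best = max((i for i in range(len(atoms)) if i not in support),
                   key=lambda i: scores[i])
        support.append(best)
        # Orthogonal step: jointly re-fit the coefficients of all selected atoms.
        coeffs = solve_least_squares([atoms[i] for i in support], y)
        approx = [sum(c * atoms[i][t] for c, i in zip(coeffs, support))
                  for t in range(len(y))]
        residual = [yt - at for yt, at in zip(y, approx)]
        if sum(r * r for r in residual) <= tol:
            break
    return dict(zip(support, coeffs))


if __name__ == "__main__":
    dictionary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.6, 0.8, 0.0]]
    print(omp(dictionary, [2.0, 0.0, 3.0], n_nonzero=2))
```

The correlation scores are independent across atoms, which is the kind of step that maps naturally onto a GPU; the abstract does not say which steps the authors actually offload.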
In this paper we present two sets of parallel algorithms for identifying real-time, small-signal dynamic models of power systems using multiple sources of Synchrophasor data. The first problem is posed in terms of identifying the transfer matrix of single-input multiple-output (SIMO) power system models using linear least-squares (LLS), where parallelism can be implemented through parallel execution of matrix multiplications using multiple processors or workers. Given the constraints of sequential communication and limited local memory, which may arise when multiple applications run in the workers at the same time, a novel scheduling algorithm is proposed to enable flexible deadlines that meet these constraints. The scheduling algorithm minimizes the total execution time under these constraints, and can be solved via integer programming. The second problem is posed as a similar parallel algorithm for identifying a linearized state-variable (SV) model of a power system using both linear and nonlinear least-squares (NLS) in the presence of scheduling. The performance of all the algorithms is studied via simulations of an IEEE 145-bus, 50-machine power system model, and compared with that of their centralized, non-parallel implementations.
The Machine-Part Cell Formation Problem (MPCFP) is an NP-hard optimization problem that consists of grouping machines and parts into a set of cells, so that each cell can operate independently and the intercell movements are minimized. This problem has been widely tackled in the literature using techniques ranging from classic methods such as linear programming to more modern nature-inspired metaheuristics. In this paper, we present an efficient parallel version of the Migrating Birds Optimization metaheuristic for solving the MPCFP. Migrating Birds Optimization is a population metaheuristic based on the V-flight formation of migrating birds, which has been shown to be an effective formation for saving energy. This approach is enhanced by incorporating parallel procedures that notably improve the performance of the several sorting processes performed by the metaheuristic. We perform computational experiments on 1080 benchmarks resulting from the combination of 90 well-known MPCFP instances with 12 sorting configurations, with and without threads. We report promising results in which the proposal reaches the global optimum in all instances, while the solving time is notably reduced with respect to a nonparallel approach.
Recently, bio-inspired metaheuristic algorithms have been widely used as powerful optimization tools to estimate crucial parameters of photovoltaic (PV) models. However, the computational cost in terms of time increases as the data size or the complexity of the applied PV electrical model increases. To overcome these limitations, this paper presents a parallel particle swarm optimization (PPSO) algorithm implemented in Open Computing Language (OpenCL) to solve the parameter estimation problem for a wide range of PV models. Experimental and simulation results demonstrate that the PPSO algorithm not only obtains all the parameters with extremely high accuracy but also dramatically improves the computational speed. This work shows that the improvement comes from the inherent capabilities of the parallel processing framework. Copyright (C) 2015 John Wiley & Sons, Ltd.
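As a point of reference, a plain serial particle swarm optimizer can be sketched as below. This is a generic textbook-style sketch, not the paper's OpenCL implementation: the function name `pso`, the inertia and acceleration values, and the toy two-parameter objective are all assumptions chosen for illustration, and no PV model is included.

```python
import random


def pso(objective, bounds, n_particles=20, n_iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over box bounds; returns (best_pos, best_val, history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]  # best value found so far, one entry per iteration
    for _ in range(n_iters):
        # The per-particle work below is what a parallel version would distribute.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp the updated position to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


if __name__ == "__main__":
    # Toy objective with its minimum at (1, -2); a stand-in for a model-fitting error.
    toy = lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2
    best, val, _ = pso(toy, [(-5.0, 5.0), (-5.0, 5.0)])
    print(best, val)
```

In a parameter-estimation setting the objective would typically be the error between measured and modelled current-voltage data, and evaluating it for every particle is the costly step.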
A parallel-computing-based implementation of two recent fast parallel algorithms for the discrete Gabor transform (DGT) is presented in this paper. First, the existing block time-recursive DGT algorithm with parallel lattice structure is analysed, and an improved implementation method under a parallel computing environment is presented. Each parallel channel (i.e. each process in parallel computing) in the improved method is independent, which reduces the interprocess communication by 99.2% on average compared with the original algorithm. Second, the existing fast parallel DGT algorithm based on multirate filtering is analysed. Through the use of parallel computing, the communication overhead of the multirate filtering-based parallel DGT algorithm is optimised, and its time efficiency is raised from 31.26 times to 54.52 times faster than the serial fast DGT algorithm when processing long sequences. Finally, the experimental results are compared and analysed; they indicate that the proposed fast DGT implementation methods are attractive for real-time signal processing.
Efficient GF(2^m) arithmetic clearly affects the performance of compute-intensive applications. A new low-complexity parallel-in/out systolic AB^2 multiplier based on the least significant bit-first scheme is presented. Compared with related works, the scheme yields significantly lower area-time complexity.
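The arithmetic behind a least significant bit-first scheme can be illustrated in software. The sketch below is only a bit-level model of LSB-first multiplication and of the AB^2 operation in a polynomial basis; it does not represent the systolic architecture of the paper, and the function names and the choice of field polynomials in the example are assumptions.

```python
def gf2m_mul_lsb_first(a, b, m, poly):
    """Multiply a*b in GF(2^m), scanning b from its least significant bit.

    Elements are integers whose bits are polynomial coefficients.
    poly is the field polynomial including the x^m term (e.g. 0x11B for m = 8).
    """
    result = 0
    for i in range(m):
        if (b >> i) & 1:
            result ^= a      # accumulate a * x^i (addition in GF(2) is XOR)
        a <<= 1              # a <- a * x
        if (a >> m) & 1:
            a ^= poly        # reduce modulo the field polynomial
    return result


def gf2m_ab2(a, b, m, poly):
    """Compute a * b^2 in GF(2^m) using two LSB-first multiplications."""
    b_squared = gf2m_mul_lsb_first(b, b, m, poly)
    return gf2m_mul_lsb_first(a, b_squared, m, poly)


if __name__ == "__main__":
    # GF(2^8) with x^8 + x^4 + x^3 + x + 1, the polynomial used by AES.
    print(hex(gf2m_mul_lsb_first(0x57, 0x83, 8, 0x11B)))
    # GF(2^3) with x^3 + x + 1: x * (x)^2 = x^3 = x + 1.
    print(gf2m_ab2(0b010, 0b010, 3, 0b1011))
```

Because each iteration needs only the running multiple of `a` and one bit of `b`, the loop body corresponds to the kind of regular cell that a systolic array replicates.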
ISBN: 9781509054022 (print)
The High Efficiency Video Coding (HEVC) standard, the newest-generation video coding standard issued in 2013, significantly improves compression performance relative to existing standards, with about a 50% bit-rate reduction for equal perceptual video quality, at the cost of greatly increased computational complexity in the encoder and decoder. To improve decoding efficiency, we design a set of parallel decoding algorithms for the HEVC decoder based on a CPU+GPU heterogeneous platform. The reconstruction processes with high computational complexity, including inverse quantization (IQ), the inverse discrete cosine transform (IDCT), the intra/inter decoder, the de-blocking filter (DF), and sample adaptive offset (SAO), are processed by the GPU in parallel. Network abstraction layer (NAL) bit stream parsing and CABAC bit stream decoding are processed by the CPU using serial algorithms, because their internal contextual dependencies make them unsuitable for parallel implementation. We implement the parallel algorithms using the compute unified device architecture (CUDA) and test them with various video sequences. The experimental results show that our method significantly improves the computational efficiency of the whole decoding process and achieves real-time decoding at more than 39 frames per second for HD videos.
ISBN: 9781509066223 (print)
Content delivery networks have been providing content delivery services for the last two decades using their own infrastructure. Nowadays, content delivery networks have the better option of using storage cloud sites as edge servers. This work considers the problems of replicating the content required by users on optimal sites in the cloud and of assigning those sites to users. Given a set of current user requests and the cloud sites that can potentially serve each user, the combined problem of finding the optimal sites for content placement and content dissemination is a set-cover problem. Previous works solved this problem using a greedy algorithm. This work proposes a primal-dual parallel algorithm for optimal content delivery in cloud content delivery networks. The proposed algorithm is an efficient parallel algorithm that requires only local information. The primal-dual algorithm takes less time than the greedy algorithm, as the experimental results demonstrate.
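To make the set-cover formulation concrete, the greedy baseline attributed to previous works can be sketched as follows. This illustrates the classic greedy set-cover heuristic only, not the proposed primal-dual algorithm; the function name, the site labels, and the request identifiers are hypothetical.

```python
def greedy_set_cover(requests, site_coverage):
    """Choose sites until every request is covered.

    requests: iterable of request identifiers.
    site_coverage: dict mapping each site to the set of requests it can serve.
    Returns the chosen sites in the order they were picked.
    """
    uncovered = set(requests)
    chosen = []
    while uncovered:
        # Pick the site that serves the most still-uncovered requests.
        best = max(site_coverage, key=lambda s: len(site_coverage[s] & uncovered))
        gained = site_coverage[best] & uncovered
        if not gained:
            raise ValueError("some requests cannot be served by any site")
        chosen.append(best)
        uncovered -= gained
    return chosen


if __name__ == "__main__":
    coverage = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5}, "D": {1}}
    print(greedy_set_cover({1, 2, 3, 4, 5}, coverage))
```

Each greedy step needs a global view of which requests remain uncovered, which contrasts with the abstract's statement that the proposed algorithm requires only local information.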
ISBN: 9781509045839 (print)
We develop a parallel algorithm based on the proximal method to solve the problem of minimizing a sum of convex (not necessarily smooth) functions over a star network. We show that, for convex objective functions, this method converges to an optimal solution for any choice of constant stepsize. Under the further assumptions that the objective functions have Lipschitz gradients and are strongly convex, the method converges linearly.
Suffix trees have recently become very successful data structures in handling large data sequences such as DNA or Protein sequences. Consequently parallel architectures have become ubiquitous. We present a novel alpha...