An improved parallel ray casting algorithm for embedded multi-core DSP systems is proposed in this paper. To speed up intersection testing, the algorithm takes advantage of the improved bounding volume h...
ISBN: 9781665423021 (print)
A Bayesian network is a graphical model that represents causal or correlational relationships among multiple observed phenomena. Structure learning of such a network is generally NP-hard, and the computation time needed to obtain an approximate solution becomes huge. This paper proposes an FPGA accelerator for structure learning of Bayesian networks. The proposed method employs a dataflow architecture and executes in parallel those dynamic-programming processes that have no dependencies on one another. By iteratively reusing processing elements at each processing stage, we can use limited resources efficiently while taking advantage of the parallel performance of FPGAs. We implemented the proposed method on a Xilinx Alveo U200 using high-level synthesis. Evaluation results showed speedups of up to 12.6 times over single-core software execution and up to 2.98 times over multi-core execution. We further applied the proposed method to the Local-to-Global algorithm and achieved an 8.6-fold speedup over software execution in the structure learning of a practical network with 37 nodes.
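The abstract does not spell out its dynamic program, so the following is only a minimal sketch of a generic subset-based exact structure-learning recurrence, written to illustrate why such a computation parallelizes. The function names, the bitmask encoding, and the `toy_score` function are all hypothetical and are not taken from the paper. The point to notice is that every subset of size k depends only on subsets of size k-1, so all subsets within one level can be processed independently.

```python
from itertools import combinations


def best_network(n, local_score):
    """Generic subset DP (an assumption, not the paper's design).

    local_score(v, parent_mask) returns a score for variable v given the
    parents encoded as a bitmask; higher is better.
    Returns (total_score, [(variable, parent_mask), ...]).
    """
    # best_parents[v][S]: best (score, parent_mask) for v with parents drawn from S
    best_parents = [dict() for _ in range(n)]
    for v in range(n):
        others = [u for u in range(n) if u != v]
        for k in range(len(others) + 1):
            for combo in combinations(others, k):
                s = sum(1 << u for u in combo)
                best = (local_score(v, s), s)
                for u in combo:
                    cand = best_parents[v][s & ~(1 << u)]
                    if cand[0] > best[0]:
                        best = cand
                best_parents[v][s] = best

    # best_net[S]: best (score, assignments) for a DAG over the variables in S
    best_net = {0: (0.0, [])}
    for k in range(1, n + 1):
        # Subsets of size k read only results for size k-1, so the
        # iterations of this inner loop have no mutual dependencies.
        for combo in combinations(range(n), k):
            s = sum(1 << u for u in combo)
            best = None
            for sink in combo:
                rest = s & ~(1 << sink)
                score, parents = best_parents[sink][rest]
                total = best_net[rest][0] + score
                if best is None or total > best[0]:
                    best = (total, best_net[rest][1] + [(sink, parents)])
            best_net[s] = best
    return best_net[(1 << n) - 1]


def toy_score(v, parent_mask):
    # Hypothetical scores: rewards the structure 0 -> 1 and {0, 1} -> 2.
    if v == 1 and parent_mask == 0b001:
        return 1.0
    if v == 2 and parent_mask == 0b011:
        return 2.0
    return 0.0


score, assignment = best_network(3, toy_score)
```

In a hardware or multi-threaded setting, the loop over `combo` at a fixed `k` is the natural unit to distribute across processing elements, with a synchronization point between levels.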
ISBN: 9781728166872 (print)
Nowadays, cryptography plays an important role in the field of information security. The most common symmetric cryptographic algorithm is the Advanced Encryption Standard (AES), which is based on the well-known Rijndael algorithm and is used worldwide across domains. In this document, we present an implementation of the AES algorithm in two parallel modes of operation (CTR and XTS) with the OpenCL programming language. We used OpenCL because it is designed for parallel computing on heterogeneous platforms and ensures portability. Furthermore, we applied 128-, 192- and 256-bit cryptographic key sizes and file sizes ranging from 512 B to 8 MB to evaluate kernel runtime and throughput (Gbps). The results show that CTR mode performs better than XTS mode. CTR mode speeds up encryption by over 10.15% with a 128-bit key, over 10.09% with a 192-bit key and over 10.05% with a 256-bit key. Decryption is accelerated by over 10.11% with a 128-bit key, over 10.05% with a 192-bit key and over 10.02% with a 256-bit key. Finally, comparing the results of our implementation with other similar parallel models, we achieved better throughput.
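To illustrate why CTR is considered a parallel mode, here is a minimal sketch of the counter-mode construction. It is not the paper's OpenCL kernel and it does not use real AES: `keystream_block` is a hypothetical stand-in built on SHA-256 from the Python standard library, used only so the example is self-contained. What it shows is that each block's keystream depends only on the key, the nonce and the block index, so blocks can be processed in any order or concurrently.

```python
import hashlib

BLOCK = 16  # AES block size in bytes


def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    # Stand-in for AES_encrypt(key, nonce || counter). NOT real AES and
    # not suitable for actual encryption; illustration only.
    digest = hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
    return digest[:BLOCK]


def ctr_block(key: bytes, nonce: bytes, data: bytes, index: int) -> bytes:
    # XOR one block of data with its keystream block. Nothing here depends
    # on any other block, which is what makes CTR easy to parallelize.
    chunk = data[index * BLOCK:(index + 1) * BLOCK]
    stream = keystream_block(key, nonce, index)
    return bytes(a ^ b for a, b in zip(chunk, stream))


def ctr_apply(key: bytes, nonce: bytes, data: bytes, order=None) -> bytes:
    # Encryption and decryption are the same operation in CTR mode.
    n_blocks = (len(data) + BLOCK - 1) // BLOCK
    out = [b""] * n_blocks
    for i in (order if order is not None else range(n_blocks)):
        out[i] = ctr_block(key, nonce, data, i)
    return b"".join(out)


key, nonce = b"k" * 16, b"n" * 8
message = b"forty bytes of example plaintext here!!!"
ciphertext = ctr_apply(key, nonce, message)
recovered = ctr_apply(key, nonce, ciphertext)
```

On a GPU or other OpenCL device, each work-item would typically take the role of one call to `ctr_block`, with the block index derived from the work-item ID.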
Modern large-scale distributed computing systems, processing large volumes of data, require mature monitoring systems able to control and track resources, networks, computing tasks, queues and other components. In ...
ISBN: 9781728162515 (print)
Smart meters allow energy providers to monitor their customers' power consumption. This fine-grained data stream generates many data points, which hides broader trends in power consumption and makes it difficult for energy providers to make decisions regarding a specific customer or a subset of customers. Since the raw power data has little direct use, various algorithms have been proposed to lower the dimensionality of the data, discover trends, study relationships between different features of the collected data, and summarize the data. These analytical techniques make the data more palatable to the end user. Analyzing smart meter data is computationally intensive because a large number of households are connected to one energy provider, and each household generates years of data at hourly intervals. To speed up the analysis, clusters of commodity computers have been used. Ironically, such clusters consume substantial energy: studies have shown that about 10% of the worldwide supply of electrical power is consumed by computing infrastructure. In this paper, we describe the use of a graphics processing unit (GPU) to analyze smart meter data and compare its performance with that of a conventional multi-core CPU. We discuss the technical challenges in programming a GPU effectively to process smart meter data, and demonstrate experimentally that this choice of implementation enables substantial improvements in both running time and energy efficiency compared to the multi-core CPU.
This paper presents an experimental performance study of a parallel implementation of the Poissonian image restoration algorithm. Hybrid parallelization based on the MPI and OpenMP standards is investigated. The implement...
In the present paper, a novel approach to parallel computations for solving time-consuming multicriteria global optimization problems is presented. This approach includes various methods for the scalarization of vecto...
Voice quality evaluation of communication systems is necessary, for both technical and commercial reasons, for the expansion of digital networks, mobile or VoIP, and of speech synthesis systems. Voice quality can be evaluated using two types of methods: the subjective MOS (Mean Opinion Score) and the objective PESQ (Perceptual Evaluation of Speech Quality). In this work, we propose a study of voice quality evaluation methods in which a platform for subjective and objective evaluation of voice quality is provided. We applied different tests used in the assessment of speech signal quality to synthesized speech signals for the Arabic language (GArabic: Generic Arabic). The listeners recognized the majority of changes of consonants and vowels in words and sentences, with percentages of 98.26% and 92.97%, respectively. The comparison between the MOS and PESQ tests gave a good correlation coefficient of 0.922.
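The abstract does not say which correlation coefficient was used. Assuming it is the Pearson coefficient, a comparison between paired MOS and PESQ scores could be sketched as follows. The score lists are made-up placeholder values, not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired scores for the same set of test utterances.
mos_scores = [4.1, 3.6, 2.9, 4.4, 3.2]
pesq_scores = [3.9, 3.4, 2.5, 4.2, 3.3]
r = pearson(mos_scores, pesq_scores)
```

A value of `r` close to 1 would indicate that the objective scores track the listeners' subjective ratings closely.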
The aim of this paper is to present two versions of a new divide and conquer parallel algorithm for solving tridiagonal Toeplitz systems of linear equations. Our new approach is based on a recently developed algorithm...
The Bartels-Stewart algorithm is a standard approach to solving the dense Sylvester equation. It reduces the problem to the solution of the triangular Sylvester equation. The triangular Sylvester equation is solved wi...