Increasingly complex systems need parallelized simulation engines. In the context of SystemC simulation, existing proposals require predicting communication in the simulated system, which is often unpredictable. To deal with unpredictable systems, this paper presents a parallelization approach that uses asynchronous communication without modifying the SystemC simulation engine. The simulated system model is partitioned and distributed across separate simulation engines, each part being evaluated in parallel with the others. Functional consistency is preserved by the simulated system's write-exclusive memory access policy, while temporal consistency is guaranteed by explicit synchronization. Experimental results show a speed-up of up to 13× on 16 processors.
In this paper, we undertake the development and performance evaluation of a parallel system for the compression of medical images. The system under study consists of a network of workstations running a parallel implementation of wavelet compression software. Different data distribution and synchronization strategies are evaluated with the aim of minimizing the system response time. Numerical results for the strategies under investigation are provided.
Finding pairwise document relatedness plays an important role in a variety of natural language processing problems. The Google Trigram Method (GTM) is a corpus-based unsupervised method that can be used to capture word relatedness and document relatedness. It has been shown that GTM can be applied to construct high-quality document relatedness applications. However, implementing GTM for pairwise document relatedness computation on a large document set is challenging because of its high computational complexity. This paper presents time- and space-efficient methods for computing pairwise document relatedness using GTM. To improve performance, algorithm engineering, data structure enhancement, and parallel computing methods are applied. Two parallel methods are discussed in this paper: a shared-memory multicore implementation and a distributed-memory Hadoop implementation. Both parallel methods provide an order-of-magnitude improvement in accelerating the pairwise document relatedness computation using GTM.
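The shared-memory idea in this abstract can be illustrated with a minimal sketch: enumerate every unordered document pair once, split the pairs into chunks, and score each chunk in a separate worker. This is not the authors' implementation. The `jaccard` score is a stand-in chosen only to keep the example self-contained (it is not GTM), and the names `pairwise_relatedness`, `score`, and `workers` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def jaccard(a, b):
    """Stand-in relatedness score (NOT GTM): word-set overlap in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def pairwise_relatedness(docs, score=jaccard, workers=4):
    """Score every unordered pair (i, j) with i < j, one chunk per worker."""
    pairs = list(combinations(range(len(docs)), 2))
    # Round-robin split so each worker gets a similar number of pairs.
    chunks = [pairs[k::workers] for k in range(workers)]

    def run(chunk):
        return [((i, j), score(docs[i], docs[j])) for i, j in chunk]

    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(run, chunks):
            results.update(part)
    return results


if __name__ == "__main__":
    docs = ["parallel wavelet compression", "parallel image compression", "radiosity"]
    for pair, value in sorted(pairwise_relatedness(docs).items()):
        print(pair, round(value, 3))
```

Because the pairs are independent, the work partitions cleanly. In standard CPython, threads do not run pure-Python scoring in parallel, so a process pool or a native scoring routine would be the likely substitute when real speed-up is the goal; the partitioning logic stays the same.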
Determining a suitable network size is important for neural-network-based digital modulation identification. This paper proposes that the number of nodes in the hidden layer, which is the core of network optimization, can be determined by the K-L transform. The validity and robustness of the approach are verified by simulation. The percentage of correct identification (PCI) is almost the same before and after network optimization.
Converters play an important role in integrating distributed generators and regulating the power flow of microgrids and active distributed networks (ADN). The hybrid active power filter (HAPF) is proposed to improve reactive power control with a low operating voltage amplitude. Its control range and characteristics differ from those of a conventional voltage source converter (VSC). The regulation of parallel-connected HAPF and VSC in active distributed networks is examined in this work. To fully utilize the control capability of each type of converter, the allocation of power output between the two devices is considered in order to reduce the total converter rating. A power reference determination strategy is therefore proposed and embedded in the hierarchical control system of the active distributed network. Results demonstrate that converter capacity is reduced by combining HAPF and VSC. Decoupled regulation is used as the primary regulation strategy for each converter. Simulation results are given to show the validity of the proposed strategy.
In this paper we identify and outline the disadvantages of the traditional data-mapping methods for the numerical solution of PDEs on distributed-memory MIMD machines, and we propose a new approach that eliminates some of these disadvantages. Specifically, we present a data-mapping approach based on parallel structured grid generation. The new approach uses composite block structures to reduce the size of the data-mapping problem. It is ten times faster than the fastest traditional data-mapping method for relatively small problems, and approximately O(P) times faster for very large problems (i.e., millions of grid points) processed on coarse-grain distributed-memory MIMD machines with P processors.
ISBN: 9781457702150 (print)
Image segmentation is one of the most widely used procedures in medical image processing applications. Because medical images have high resolution and the mathematical methods involved carry a large computational load, medical image segmentation has a high computational complexity. Recently, FPGA implementations have been applied in many areas because of their parallel processing capability. In this study, a neighbor-pixel-intensity-based method for feature extraction and a Grow and Learn (GAL) network for segmentation are proposed. The proposed method is examined comparatively on both PC and FPGA platforms.
Achieving an efficient global illumination method is one of the more important current issues in computer graphics. The main problem is that the simulation for large scenes is a highly time-consuming process. In this paper we present a new nonuniform partitioning method in order to achieve a good load balance on a distributed memory system. This method is based on the minimization of a distribution function. Furthermore, we have developed a scheduling based on processor hierarchy which permits the utilization of a visibility mask with crossed frontiers. Finally, in order to evaluate the proposed method, we have used a progressive radiosity algorithm, running on a PC cluster with modern processor and network technology as a testbed. An efficient performance in terms of execution time and speedup has been obtained.
ISBN: 9781479970896 (print)
In this paper, eavesdropping in parallel distributed sequential detection is considered. The privacy risk is evaluated by the minimal achievable Bayesian risk of a greedy and informed eavesdropper who is curious about the hypothesis realization. We propose a novel metric based on Bayesian risk that takes the detection performance and the privacy risk into account with different weights. We formulate and study the privacy-concerned parallel distributed Bayesian sequential detection problem under a finite time-horizon assumption. Solving this problem leads to the optimal distributed sequential detection design, which achieves the minimal privacy-concerned Bayesian risk. The study shows that it is not sufficient to consider a deterministic likelihood-ratio test for a remote decision maker at an active time index in the optimal privacy-concerned system design. However, properties of the optimal design indicate that the standard method can be extended to solve the proposed problem.
Collaborative filtering algorithms are important building blocks in many practical recommendation systems. For example, many large-scale data processing environments include collaborative filtering models for which the Alternating Least Squares (ALS) algorithm is used to compute latent factor matrix decompositions. In this paper, we propose an approach to accelerate the convergence of parallel ALS-based optimization methods for collaborative filtering using a nonlinear conjugate gradient (NCG) wrapper around the ALS iterations. We also provide a parallel implementation of the accelerated ALS-NCG algorithm in the Apache Spark distributed data processing environment, and an efficient line search technique as part of the ALS-NCG implementation that requires only one pass over the data on distributed datasets. In serial numerical experiments on a Linux workstation and parallel numerical experiments on a 16-node cluster with 256 computing cores, we demonstrate that the combined ALS-NCG method requires many fewer iterations and less time than standalone ALS to reach movie rankings with high accuracy on the MovieLens 20M dataset. In parallel, ALS-NCG can achieve an acceleration factor of 4 or greater in clock time when an accurate solution is desired, and the acceleration factor increases as greater numerical precision is required in the solution. In addition, the NCG acceleration mechanism is efficient in parallel and scales linearly with problem size on synthetic datasets with up to nearly 1 billion ratings. The acceleration mechanism is general and may also be applicable to other optimization methods for collaborative filtering.
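For readers unfamiliar with the ALS iteration that the NCG wrapper accelerates, the following is a minimal sketch of plain ALS only, restricted to rank-1 factors so that each update has a scalar closed form. It is not the paper's ALS-NCG method and not the Spark implementation. The objective assumed here is the sum of squared errors over observed ratings plus `reg` times the squared norms of both factors; the names `als_rank1`, `regularized_loss`, and `reg` are hypothetical.

```python
def regularized_loss(ratings, u, v, reg):
    """Squared error on observed ratings plus an L2 penalty on both factors."""
    err = sum((r - u[i] * v[j]) ** 2 for (i, j), r in ratings.items())
    return err + reg * (sum(x * x for x in u) + sum(x * x for x in v))


def als_rank1(ratings, n_users, n_items, reg=0.1, iters=20):
    """Rank-1 ALS. ratings maps (user, item) -> observed value; reg must be > 0."""
    u = [1.0] * n_users
    v = [1.0] * n_items
    for _ in range(iters):
        # Fix v: each user factor is a 1-D ridge regression with a closed form.
        for i in range(n_users):
            seen = [(j, r) for (a, j), r in ratings.items() if a == i]
            num = sum(r * v[j] for j, r in seen)
            den = reg + sum(v[j] ** 2 for j, _ in seen)
            u[i] = num / den
        # Fix u: the same closed form, applied per item.
        for j in range(n_items):
            seen = [(i, r) for (i, b), r in ratings.items() if b == j]
            num = sum(r * u[i] for i, r in seen)
            den = reg + sum(u[i] ** 2 for i, _ in seen)
            v[j] = num / den
    return u, v


if __name__ == "__main__":
    ratings = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0, (2, 1): 1.0}
    u, v = als_rank1(ratings, n_users=3, n_items=2)
    print(regularized_loss(ratings, u, v, reg=0.1))
```

Each half-step minimizes the objective exactly over one factor while the other is held fixed, so the loss never increases from one sweep to the next. The per-user and per-item updates within a half-step are independent of one another, which is what makes ALS straightforward to parallelize.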