ISBN: 0818674601 (print)
Cellular automata (CA) are fully parallel computational models that are widely applied to the numerical modelling of complex or nonlinear systems, such as fluid dynamics. Such systems are often governed by nonlinear partial differential equations that are hard to solve with traditional numerical methods. In this paper, a general CA-based model for a class of evolutionary physical systems is proposed. As an example, a CA-like model for a nonlinear parabolic equation is built using multi-scalar analysis. The model is applied to several typical problems, and satisfactory results are achieved.
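To make the idea of a CA-like scheme for a nonlinear parabolic equation concrete, here is a minimal sketch. It is not the paper's multi-scalar construction: it assumes a plain explicit finite-difference update for u_t = (D(u) u_x)_x, in which every cell is updated synchronously from its two neighbours only. The function name `ca_step`, the choice D(u) = u, and the fixed boundary cells are all assumptions made for illustration.

```python
def ca_step(u, dt, dx, diffusivity):
    """Advance every interior cell one time step using only its two neighbours."""
    new = list(u)  # the two boundary cells keep their values (fixed boundary)
    for i in range(1, len(u) - 1):
        # Diffusivity on each cell face, averaged from the two adjacent cells.
        d_right = 0.5 * (diffusivity(u[i]) + diffusivity(u[i + 1]))
        d_left = 0.5 * (diffusivity(u[i]) + diffusivity(u[i - 1]))
        flux = d_right * (u[i + 1] - u[i]) - d_left * (u[i] - u[i - 1])
        new[i] = u[i] + dt / dx**2 * flux
    return new


# A single unit of "mass" in the middle cell spreads to its neighbours.
cells = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
after = ca_step(cells, dt=0.2, dx=1.0, diffusivity=lambda v: v)
```

Because each cell reads only local state from the previous step, all cells can be updated in parallel, which is the property that makes CA-style models attractive here.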
Heterogeneous clusters call for new models and algorithms. In this paper, a new parallel computational model is presented. The model, based on LogGP, has been extended to deal with heterogeneous parallel systems. To that end, LogGP's scalar parameters are replaced by vector and matrix parameters that account for the differing features of the nodes. The work includes the parametrization of a real cluster, which illustrates the impact of node heterogeneity on the model's parameters. Finally, the paper presents experiments for assessing the method's validity, together with the main conclusions and future work.
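The move from scalar to vector and matrix parameters can be sketched as follows. In standard LogGP, delivering a k-byte message costs o + (k - 1)G + L + o. The version below assumes, purely for illustration, that the overheads become per-node vectors and that latency and gap-per-byte become per-pair matrices; the abstract does not say which parameters take which shape, and the name `send_time` and all numeric values are hypothetical.

```python
def send_time(src, dst, k, o_send, o_recv, L, G):
    """Estimated time for node `src` to deliver a k-byte message to node `dst`.

    o_send, o_recv: per-node send/receive overheads (vectors, assumed layout).
    L, G: per-pair latency and gap-per-byte (matrices, assumed layout).
    """
    return o_send[src] + (k - 1) * G[src][dst] + L[src][dst] + o_recv[dst]


# Hypothetical two-node cluster: node 1 is slower than node 0.
o_send = [2.0, 5.0]
o_recv = [3.0, 7.0]
L = [[0.0, 10.0], [10.0, 0.0]]
G = [[0.0, 0.5], [0.5, 0.0]]

t_fast_to_slow = send_time(0, 1, 101, o_send, o_recv, L, G)
t_slow_to_fast = send_time(1, 0, 101, o_send, o_recv, L, G)
```

With scalar parameters the two directions would cost the same; with per-node overheads they differ, which is the kind of effect node heterogeneity has on the model.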
In this paper, the state-of-the-art parallel computational model research is *** will introduce various models that were developed during the past *** to their target architecture features, especially memory organization, we classify these parallel computational models into three *** models and their characteristics are discussed based on three generations *** believe that with the ever-increasing speed gap between the CPU and memory systems, incorporating non-uniform memory hierarchy into computational models will become *** the emergence of multi-core CPUs, the parallelism hierarchy of current computing platforms becomes more and more *** this complicated parallelism hierarchy in future computational models becomes more and more important. A semi-automatic toolkit that can extract model parameters and their values on real computers can reduce the model analysis complexity, thus allowing more complicated models with more parameters to be *** memory and hierarchical parallelism will be two very important features that should be considered in future model design and research.
ISBN: 3540593934 (print)
The main goal of this work is to provide a set of tools that give direct support for the best-known parallel processing techniques when developing parallel applications. Our approach starts from a classification of parallel computing paradigms and their associated parallelization techniques, and from the definition of a set of structures and procedural interfaces that can partially solve the problems associated with these paradigms. By defining the concept of Parallel Computational Frames (PCF), we propose a way to combine different parallelization techniques to solve complex problems. Moreover, we provide an interactive graphical development environment, XHive, in which the whole application development takes place.
ISBN: 9781665435772 (print)
Convolutional neural networks (CNNs) have been widely used for image analysis and recognition. For example, LeNet-5 is a 7-layer convolutional neural network that can attain more than 99% test accuracy for classification of handwritten digits. CNNs repeat convolution and pooling operations alternately. However, the computational capability of such operations is not clear, and we would like to characterize the class of problems that CNNs can solve. As a formal approach to this task, we introduce a theoretical parallel computational model of CNNs that we call the convolution-pooling machine. It captures the essence of the convolution and pooling operations, and of the application of non-linear activation functions, performed in CNNs. In this paper, we assume for simplicity that the convolution-pooling machine operates on 1-dimensional arrays, and we focus on the problem of classifying inputs by the distance between two feature points. More specifically, we design a convolution-pooling machine that solves the problem D_k (k >= 1): determine whether the distance between the two 1's is at most k. To design the convolution-pooling machine solving D_k, we generate a mixed-integer linear programming (MILP) problem with constraints and objective functions. We have solved the generated MILP problem for each D_k (1 <= k <= 128) with the Gurobi optimizer, a commercial MILP solver. We succeeded in finding a solution for all D_k (1 <= k <= 128) and in designing convolution-pooling machines that solve them. This indicates that convolution and pooling operations in CNNs may have the computational capability to classify inputs by the distance between feature points.
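The problem D_k can be illustrated with a hand-built convolution, activation, and pooling pipeline. This is only a sketch of the problem and of the three operation types; it is not the machine the authors obtain from their MILP formulation, and the single wide kernel, the bias of -1, and the name `within_distance` are assumptions made for illustration. It relies on the observation that a window of k + 1 consecutive cells contains both 1's exactly when their distance is at most k.

```python
def within_distance(bits, k):
    """Decide D_k: is the distance between the two 1's in `bits` at most k?"""
    window = k + 1
    positions = max(1, len(bits) - window + 1)
    # Convolution with an all-ones kernel of width k + 1 and bias -1,
    # followed by ReLU: the result is positive only where a window holds both 1's.
    feature = [max(0, sum(bits[i:i + window]) - 1) for i in range(positions)]
    # Global max pooling reduces the feature map to a single decision value.
    return max(feature) > 0


# The two 1's sit at positions 1 and 4, so their distance is 3.
example = [0, 1, 0, 0, 1, 0]
answer_k3 = within_distance(example, 3)
answer_k2 = within_distance(example, 2)
```

Here `answer_k3` is true and `answer_k2` is false, since a 4-cell window can cover both 1's but a 3-cell window cannot.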
ISBN: 9781479941162 (print)
We propose a novel computational model for GPU. Known parallel computational models such as the PRAM model are not appropriate for evaluating GPU algorithms. Our model, called AGPU, abstracts the essence of current GPU architectures such as global and shared memory, memory coalescing and bank conflicts. We can therefore evaluate asymptotic behavior of GPU algorithms more accurately than known models and we can develop algorithms that are efficient on many real architectures. As a showcase, we first analyze known comparison-based sorting algorithms using the AGPU model and show that they are not I/O optimal, that is, the number of global memory accesses is more than necessary. Then we propose a new algorithm which uses an asymptotically optimal number of global memory accesses and whose time complexity is also nearly optimal.
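Memory coalescing, one of the architectural features the abstract mentions, can be illustrated by counting how many aligned memory segments a group of threads touches in one access. This is a simplified illustration and not the AGPU cost definition, which the abstract does not give; the segment size of 32 words and the name `global_transactions` are assumptions.

```python
def global_transactions(addresses, segment_words=32):
    """Count the distinct aligned segments touched by one group of threads.

    `addresses` holds the word address each thread reads; accesses that fall
    in the same aligned segment are assumed to be served by one transaction.
    """
    return len({address // segment_words for address in addresses})


# 32 threads reading consecutive words share a single segment.
coalesced = global_transactions(range(32))
# 32 threads reading with a stride of 32 words each hit a different segment.
strided = global_transactions([thread * 32 for thread in range(32)])
```

Under these assumptions the consecutive pattern needs 1 transaction and the strided pattern needs 32, which is why a model that ignores coalescing, such as the PRAM, can misjudge the cost of a GPU algorithm.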
This paper presents a framework of using resource metrics to characterize the various models of parallel computation. Our framework reflects the approach of recent models to abstract architectural details into several generic parameters, which we call resource metrics. We examine the different resource metrics chosen by different parallel models, categorizing the models into four classes: the basic synchronous models, and extensions of the basic models which more accurately reflect practical machines by incorporating notions of asynchrony, communication cost, and memory hierarchy. We then present a new parallel computation model, the LogP-HMM model, as an illustration of design principles based on the framework of resource metrics. The LogP-HMM model extends an existing parameterized network model (LogP) with a sequential hierarchical memory model (HMM) characterizing each processor. The result captures both network communication costs and the effects of multileveled memory such as local cache and I/O. More generally, the LogP-HMM is representative of a class of models formed by combining a network model with any of several existing hierarchical memory models. Along these lines we introduce a variant of the LogP-HMM model, the LogP-UMH, which combines the LogP with the Universal Memory Hierarchy (UMH) model. We examine the potential utility of both our models in the design of several near-optimal FFT and sorting algorithms. We also examine the potential of the LogP-UMH to more accurately reflect parallel machines by matching the model to the CM-5 and IBM SP2.