Traditionally, developers in the automated testing community have had no viable paradigm for handling concurrent systems: test suites that could be processed in one hour using concurrency are instead processed sequentially in 24. The structured distributed programming paradigm (SDPP) is a non-traditional paradigm that lets developers conceive viable distributed/concurrent/parallel systems with the same ease and assurance with which they have been conceiving single-process sequential systems for the past 30 years. Structured distributed programming (SDP) is primitive, recursive, and applicable at all levels, from application systems to operating systems to hardware systems to network systems. SDP is a mindset, just as object-oriented programming (OOP) is a mindset, and as with OOP, its concepts are primitive and supportable by primitive syntax and semantics in programming languages. At compile time, an OOP sequential system is completely visualizable as a tree-structured graph; an SDP system is completely visualizable as dual graphs, one a tree-structured graph of time and the other a cyclic graph of space. Because SDP systems are completely visualizable at compile time, test systems can be guaranteed, at compile time, never to race or deadlock; always to complete, cooperate, and integrate with their environment; and to be maintainable, modifiable, reliable, and scalable to any degree.
The demand for high-density, high-speed programming in flash memories has been increasing because of their expanding applications in portable equipment such as digital still cameras and music players. A multilevel technique is one of the most effective approaches to improving memory density, but long cell-programming times and the need for precise control of the memory cell's threshold voltage (Vth) degrade programming performance. To realize fast cell programming, we have developed a so-called assist-gate (AG)-AND-type flash cell, in which programming is performed by source-side channel hot-electron injection (SSI). In this paper, we develop a constant-charge-injection programming scheme, which realizes fast, precise control of Vth by suppressing characteristic deviation. By utilizing the proposed scheme, we achieved a 10.3-MB/s programming throughput in multilevel AG-AND flash memories.
Symmetric multiprocessor (SMP) systems and clusters of SMPs have become widely available as cost-effective parallel platforms. In this paper, motion estimation (ME) algorithms implemented on an SMP system and on a cluster of SMPs are compared. The same parallel program, written using the MPI (Message-Passing Interface) programming model, was built with different compilation procedures. Results show that the parallel ME algorithms scale well on the SMP system (up to 6 processors) and on the cluster of SMPs (up to 23 processors). In terms of price/performance ratio, dual-processor nodes are preferable to single-processor nodes.
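As a rough illustration only (the abstract does not include the implementation), the sketch below shows the usual MPI pattern such a parallel ME program follows: the frame's macroblock rows are divided evenly among ranks, each rank runs the search kernel on its strip, and the motion vectors are gathered at the root. `full_search_row`, `MB_ROWS`, and `MVS_PER_ROW` are hypothetical names, not the paper's code.

```c
/* Minimal sketch of rank-parallel motion estimation over MPI.
 * All names and sizes are illustrative assumptions. */
#include <mpi.h>

#define MB_ROWS 36                 /* assumed macroblock rows per frame   */
#define MVS_PER_ROW 88             /* assumed: 44 macroblocks x (dx, dy)  */

/* Hypothetical search kernel: fills motion vectors for one MB row. */
void full_search_row(const unsigned char *cur, const unsigned char *ref,
                     int row, int *mv_row);

void me_frame(const unsigned char *cur, const unsigned char *ref, int *mvs)
{
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int rows_per = MB_ROWS / size;      /* assume divisible, for brevity */
    int local[MB_ROWS * MVS_PER_ROW];   /* oversized local scratch strip */

    /* Each rank searches only its contiguous strip of macroblock rows. */
    for (int r = 0; r < rows_per; r++)
        full_search_row(cur, ref, rank * rows_per + r,
                        &local[r * MVS_PER_ROW]);

    /* Root collects every rank's motion vectors in row order. */
    MPI_Gather(local, rows_per * MVS_PER_ROW, MPI_INT,
               mvs, rows_per * MVS_PER_ROW, MPI_INT, 0, MPI_COMM_WORLD);
}
```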
This paper proposes a novel genetic parallel programming (GPP) paradigm for evolving optimal parallel programs that run on a multi-ALU processor, using linear genetic programming. GPP uses a two-phase evolution approach: it evolves completely correct solution programs in the first phase, then optimizes the execution speed of those programs in the second phase. In addition, GPP employs a new genetic operation that swaps the sub-instructions of a solution program. Three experiments (Sextic, Fibonacci, and Factorial) are given as examples to show that GPP can discover novel parallel programs that fully utilize the processor's parallelism.
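The abstract does not specify the swap operation's encoding; the sketch below shows one plausible reading, assuming a linear program is a fixed-length list of VLIW-style instructions with one sub-instruction slot per ALU. `NUM_ALUS`, `PROG_LEN`, and the integer-opcode encoding are all assumptions, not the paper's representation.

```c
/* Illustrative sketch of a sub-instruction swap operator for
 * linear genetic programming on a multi-ALU processor. */
#include <stdlib.h>

#define NUM_ALUS 4               /* assumed multi-ALU issue width */
#define PROG_LEN 16              /* assumed program length        */

typedef struct {
    int sub[NUM_ALUS];           /* one sub-instruction per ALU slot */
} Instruction;

typedef struct {
    Instruction code[PROG_LEN];
} Program;

/* Swap one randomly chosen ALU slot between two random instructions,
 * rearranging parallelism without changing the instruction pool. */
void swap_sub_instructions(Program *p)
{
    int i = rand() % PROG_LEN;
    int j = rand() % PROG_LEN;
    int s = rand() % NUM_ALUS;
    int tmp = p->code[i].sub[s];
    p->code[i].sub[s] = p->code[j].sub[s];
    p->code[j].sub[s] = tmp;
}
```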
ISBN (print): 0769514448
Traditionally, a local area network (LAN) has been used for parallel programming with PVM and MPI. The improved communications of wireless local area networks (WLANs), reaching up to 11 Mbps, make them, according to some authors, candidates for use as a grid computing resource. In this paper we use our LAM/MPI-based library, LAMGAC, to parallelize an algorithm that finds the global minimum of a nonlinear, real-valued continuous function. The algorithm divides the domain into small boxes and locates the extremum by means of a multiple-start algorithm (MRS); local minimization is carried out by steepest descent and the DFP method. The novelty of this approach is that we can vary the parallel virtual machine at runtime (spawning new processes using functions defined in MPI-2), we generate algorithms in which computations and communications are efficiently overlapped, and we include a Web interface that offers our system as a grid resource. We have measured the execution time of some algorithms and of the components of LAMGAC, obtaining interesting results.
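The MPI-2 feature the abstract leans on, growing the virtual machine at runtime, is the standard `MPI_Comm_spawn` call; a minimal sketch follows. The `"worker"` binary name and the process count are placeholders, not LAMGAC's actual interface.

```c
/* Sketch of MPI-2 dynamic process creation: the running job spawns
 * extra workers, enlarging the parallel virtual machine at runtime. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    MPI_Comm workers;
    int errcodes[2];

    /* Add two new worker processes to the running computation. */
    MPI_Comm_spawn("worker", MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                   0, MPI_COMM_SELF, &workers, errcodes);

    /* ... hand each worker a box of the search domain over the
     * `workers` intercommunicator, overlapping the local
     * minimizations with communication ... */

    MPI_Finalize();
    return 0;
}
```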
ISBN (print): 9780769515243
The paper considers modular programming with hierarchically structured multi-processor tasks on top of SPMD tasks for distributed memory machines. Parallel execution requires a corresponding decomposition of the set of processors into a hierarchical group structure onto which the tasks are mapped, resulting in a multi-level group SPMD computation model with varying processor group structures. The advantage of this kind of mixed task and data parallelism is its potential to reduce communication overhead and increase scalability. We present a runtime library to support the coordination of hierarchically structured multi-processor tasks. The library exploits an extended parallel group SPMD programming model and manages the entire task execution, including the dynamic hierarchy of processor groups. It is built on top of MPI, has an easy-to-use interface, and incurs only marginal overhead while allowing static planning and dynamic restructuring.
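The hierarchical group structure can be pictured with plain MPI communicator splitting; the sketch below is illustrative only (the paper's library manages groups automatically, and the two-way splits at each level are assumptions).

```c
/* Minimal sketch of a two-level processor group hierarchy for
 * multi-level group SPMD computation, built from MPI_Comm_split. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Level 1: split all processors into two top-level task groups. */
    MPI_Comm task;
    MPI_Comm_split(MPI_COMM_WORLD, rank < size / 2 ? 0 : 1, rank, &task);

    /* Level 2: split each task group again for nested SPMD tasks. */
    int trank;
    MPI_Comm_rank(task, &trank);
    MPI_Comm subtask;
    MPI_Comm_split(task, trank % 2, trank, &subtask);

    /* ... each subtask communicator now runs its own SPMD code ... */

    MPI_Comm_free(&subtask);
    MPI_Comm_free(&task);
    MPI_Finalize();
    return 0;
}
```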
We investigate remapping multi-dimensional arrays on clusters of SMP architectures under OpenMP, MPI, and hybrid paradigms. The traditional method of array transposition requires an auxiliary array of the same size and a copy-back stage. We recently developed an in-place method based on vacancy tracking cycles, which outperforms the traditional two-array method, as demonstrated by extensive comparisons. The independence of vacancy tracking cycles allows efficient parallelization of the in-place method on SMP architectures at the node level. The performance of multi-threaded parallelism using OpenMP is tested with different scheduling methods and different numbers of threads, and the vacancy tracking method is parallelized using several parallel paradigms. At the node level, pure OpenMP outperforms pure MPI by a factor of 2.76. Across the entire cluster of SMP nodes, the hybrid MPI/OpenMP implementation outperforms pure MPI by a factor of 4.44, demonstrating the validity of mixing MPI with OpenMP.
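The abstract does not reproduce the algorithm, but the core in-place idea can be sketched: transposing a row-major M x N array sends index i to (i*M) mod (MN-1), and this permutation decomposes into independent cycles that can be walked one at a time (their independence is what makes the node-level parallelization possible). The simplified version below uses a visited bitmap for clarity, which the paper's vacancy tracking formulation avoids.

```c
/* Sketch: in-place 2D transpose by following the permutation cycles
 * of dest(i) = (i*m) mod (m*n - 1) for a row-major m x n array. */
#include <stdlib.h>

void transpose_inplace(double *a, int m, int n)
{
    long len = (long)m * n;
    if (len <= 2) return;                /* endpoints are fixed points */
    char *done = calloc(len, 1);

    for (long start = 1; start < len - 1; start++) {
        if (done[start]) continue;
        long i = start;
        double hold = a[i];              /* vacate the starting slot   */
        do {                             /* walk one permutation cycle */
            long next = (i * m) % (len - 1);
            double tmp = a[next];
            a[next] = hold;              /* drop held element at home  */
            hold = tmp;                  /* pick up the displaced one  */
            done[i] = 1;
            i = next;
        } while (i != start);
    }
    free(done);
}
```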
Shared-object Distributed Shared Memory (DSM) minimizes the problem of false sharing by allowing the programmer to control the sharing granularity. This shared-object approach to distributed parallel programming works well for task parallelism but not for data parallelism: when the data of a shared object is being modified, a lock on that object must be enforced to exclude any concurrent access to the same object, and if the shared data within an object is large, internal false sharing becomes a problem. We present a multi-locking mechanism for shared-object DSM that allows multiple locks to be applied to different data sets of a shared object, thereby enhancing its concurrency.
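A minimal sketch of the multi-locking idea follows, expressed with plain pthreads rather than any DSM system (the region count, sizes, and index-to-region mapping are assumptions, not the paper's API): each disjoint data set of the shared object carries its own lock, so writers to different regions no longer serialize on one object-wide lock.

```c
/* Sketch: one lock per data set inside a single shared object. */
#include <pthread.h>

#define NREGIONS 4
#define REGION_SZ 256

typedef struct {
    double data[NREGIONS * REGION_SZ];
    pthread_mutex_t lock[NREGIONS];   /* one lock per data set */
} SharedObject;

void write_element(SharedObject *obj, int idx, double v)
{
    int r = idx / REGION_SZ;          /* which region owns idx */
    pthread_mutex_lock(&obj->lock[r]);
    obj->data[idx] = v;               /* only region r is held */
    pthread_mutex_unlock(&obj->lock[r]);
}
```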
This article studies a static scheduling method based on workload balancing. An equation is presented for the case in which the workload is distributed equally over all the processors. An efficient load-balance scheduling algorithm is developed under the assumption that the workload has certain properties. Finally, some computational results are given for the product of an upper diagonal matrix and a vector.
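The balancing equation itself is not quoted in the abstract; the sketch below derives one plausible version, reading "upper diagonal" as upper triangular. Row i of an n x n upper triangular matrix holds n - i nonzeros, so a balanced static partition gives each of P processors a contiguous row block carrying roughly n(n+1)/(2P) elements, with boundaries found by solving r*n - r(r-1)/2 = p/P * n(n+1)/2 for r.

```c
/* Sketch: closed-form row boundaries that balance the work of an
 * upper triangular matrix-vector product over P processors. */
#include <math.h>
#include <stdio.h>

/* First row owned by processor p (boundary form, p = 0..P). */
long row_boundary(long n, int P, int p)
{
    /* Solve r*b - r*r/2 = target with b = n + 1/2 (smaller root). */
    double target = (double)p * n * (n + 1) / (2.0 * P);
    double b = n + 0.5;
    double r = b - sqrt(b * b - 2.0 * target);
    return (long)(r + 0.5);
}

int main(void)
{
    long n = 1000;
    int P = 4;
    for (int p = 0; p < P; p++)   /* early blocks get more rows,  */
        printf("proc %d: rows [%ld, %ld)\n",   /* since late rows  */
               p, row_boundary(n, P, p),       /* are shorter      */
               row_boundary(n, P, p + 1));
    return 0;
}
```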