In this paper we study the development of parallel algorithms for solving advection-diffusion equations. Both synchronous and asynchronous algorithmic contexts are considered. The solver we present is based on the multisplitting Newton method, which provides a coarse-grained scheme. Experiments are carried out in a heterogeneous grid environment in which both parallel algorithms are analyzed. These experiments allow us to draw some conclusions about the use of parallel iterative algorithms in a grid computing environment.
Summary form only given. We present a parallel algorithm for the construction of minimum-redundancy length-restricted codes that is based on the package-merge algorithm of Larmore and Hirschberg (1990). Our algorithm constructs a length-restricted code in O(L) time with n processors on a CREW PRAM. Thus our algorithm has the same time-processor product as the sequential algorithm of Larmore and Hirschberg (1990). We also consider the problem of constructing almost-optimal length-restricted codes.
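For orientation, the sequential package-merge idea that the abstract above builds on can be sketched as follows. This is a minimal, unoptimized illustration and not the parallel CREW PRAM algorithm of the paper. The function name `package_merge` and the parameter `max_len` (assumed here to correspond to the length limit L) are hypothetical, and the sketch assumes strictly positive weights.

```python
def package_merge(weights, max_len):
    """Return optimal codeword lengths, none longer than max_len.

    Sequential sketch of package-merge; assumes all weights are > 0.
    """
    n = len(weights)
    if n == 0:
        return []
    if n == 1:
        return [1]
    if (1 << max_len) < n:
        raise ValueError("max_len is too small for this many symbols")

    # Each item is (weight, tuple of symbol indices it contains).
    leaves = sorted(((w, (i,)) for i, w in enumerate(weights)),
                    key=lambda item: item[0])
    current = list(leaves)
    for _ in range(max_len - 1):
        # Package: pair up adjacent items; an unpaired last item is dropped.
        packages = [
            (current[j][0] + current[j + 1][0],
             current[j][1] + current[j + 1][1])
            for j in range(0, len(current) - 1, 2)
        ]
        # Merge: combine the packages with a fresh copy of the leaves.
        # The sort is stable, so leaves precede packages of equal weight.
        current = sorted(leaves + packages, key=lambda item: item[0])

    # The 2n - 2 cheapest items determine the code: each occurrence of a
    # symbol among them adds one bit to that symbol's codeword length.
    lengths = [0] * n
    for _, symbols in current[:2 * n - 2]:
        for s in symbols:
            lengths[s] += 1
    return lengths
```

For example, `package_merge([1, 1, 2, 4], 3)` reproduces the unrestricted Huffman lengths `[3, 3, 2, 1]`, while tightening the limit to 2 forces the flat code `[2, 2, 2, 2]`.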
BLAST (basic local alignment search tool), a heuristic algorithm, is one of the most widely used sequence similarity search tools. MegaBlast, an improved version of BLAST, speeds up searches and improves total throughput owing to a greedy algorithm and batch processing. However, MegaBlast consumes a great deal of memory, proportional to the product of the sizes of the query file and the database file. This paper proposes an optimized algorithm based on MegaBlast. The new algorithm exchanges the query and subject sequences and builds a hash table based on the new subject sequences. The optimized algorithm overlaps I/O with computation and further decreases the overall time and the memory cost, which is proportional only to the size of the database file. The optimized algorithm is well suited to parallelization on cluster systems. As our experiments show, the parallel program, implemented with MPI, achieves high speedup.
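The hash-table step mentioned in the abstract above can be illustrated with a generic word index. This sketch shows only the general idea of indexing fixed-length words of one sequence set and scanning another set for exact seed matches. It is not the paper's optimized scheme or MegaBlast's actual data structure, and the names `build_word_index`, `find_seeds`, and `word_size` are hypothetical.

```python
from collections import defaultdict


def build_word_index(sequences, word_size):
    """Map every word of length word_size to its (sequence id, offset) hits."""
    index = defaultdict(list)
    for seq_id, seq in enumerate(sequences):
        for pos in range(len(seq) - word_size + 1):
            index[seq[pos:pos + word_size]].append((seq_id, pos))
    return index


def find_seeds(index, scanned, word_size):
    """Return (scanned offset, indexed sequence id, indexed offset) matches."""
    hits = []
    for pos in range(len(scanned) - word_size + 1):
        word = scanned[pos:pos + word_size]
        for seq_id, indexed_pos in index.get(word, ()):
            hits.append((pos, seq_id, indexed_pos))
    return hits
```

In a sketch like this, memory grows with the total length of the indexed sequences, so which side gets indexed determines the memory footprint.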
Digital signal processors (DSPs) are widespread in real-time systems. In the last ten years, phase-locked loops have been widely used in DSPs as control devices for correcting clock skew. In this paper, a new type of floating phase-locked loop for DSPs is designed, and new stability conditions are obtained for it.
ISBN: (print) 0780389379
This paper presents a new generalized particle model approach to dynamically modulating price and demand for network bandwidth allocation. The approach transforms the complicated network bandwidth allocation problem into the kinematics and dynamics of numerous link particles in a force field. The approach features powerful processing ability in complex environments, a market mechanism between demand and service, and better adaptation to real-time variation of the network environment. Finally, we demonstrate the corresponding parallel algorithm and its simulations on bandwidth allocation.
In this paper, we present a new conceptual model that captures the benefits of protocol offload in the context of high performance computing systems. In contrast to the LAWS model, the extensible message-oriented offload model (EMO) emphasizes communication in terms of messages rather than flows. In contrast to the LogP model, EMO emphasizes the performance of the network protocol rather than the parallel algorithm. EMO allows protocol developers to consider the tradeoffs and specifics associated with offloading protocol processing, including the reduction in message latency along with the benefits of reduced overhead and improved throughput. We give an overview of the EMO model and show how it can be mapped to the LAWS and LogP models. We also show how it can be used to analyze individual messages within TCP flows by contrasting full offload (TCP offload engines) with other approaches, e.g., interrupt coalescing and splintered TCP.
ISBN: (print) 0769523803
Scalability is a key factor in the design of distributed systems and of parallel algorithms and machines. However, conventional scalability metrics are designed for homogeneous parallel processing. There is no suitable and commonly accepted definition of a scalability metric for heterogeneous systems. Isospeed scalability is a well-defined metric for homogeneous computing. This study extends the isospeed scalability metric to general heterogeneous computing systems. The proposed isospeed-efficiency metric is suitable for both homogeneous and heterogeneous computing. Through theoretical analysis, we derive methodologies for scalability measurement and prediction on heterogeneous systems. Experimental results verify the analytical results and confirm that the proposed isospeed-efficiency scalability works well in both homogeneous and heterogeneous environments.
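The abstract above does not state the metric's formula, so the following is only a sketch under stated assumptions. It assumes that speed-efficiency is the achieved speed (work divided by time) relative to the system's aggregate marked speed C, and that scalability from a system with marked speed C to one with marked speed C' is (C' · W) / (C · W'), where W' is the work needed on the larger system to keep speed-efficiency constant. The function and parameter names are hypothetical.

```python
def speed_efficiency(work, time, marked_speed):
    """Achieved speed (work / time) as a fraction of the marked speed.

    Assumed definition; the abstract itself gives no formula.
    """
    return work / (time * marked_speed)


def isospeed_efficiency_scalability(c, w, c_scaled, w_scaled):
    """Assumed form psi(C, C') = (C' * W) / (C * W').

    w_scaled is the work at which the scaled system (marked speed
    c_scaled) matches the speed-efficiency of the original system.
    A value of 1.0 means work grew only in proportion to marked speed.
    """
    return (c_scaled * w) / (c * w_scaled)
```

Under these assumptions, doubling the marked speed from 100 to 200 while needing 2.5 times the work to hold efficiency constant gives a scalability of 0.8.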
We present PRFX, an API dedicated to the programming of irregular parallel algorithms with static properties, and its runtime support for clusters of SMP nodes. The programming model is based on a task paradigm with implicit synchronizations. These tasks operate on data that are dynamically allocated in an isomemory. The synchronization between tasks is statically computed by an inspector. The inspector builds data dependencies through a partial pre-execution of the code and produces a task DAG with all the information needed by the static scheduler and the parallel executor. The parallel executor uses POSIX threads with one-sided communications and works on shared and distributed memory machines. Its performance was validated by experimental results for a sparse Cholesky factorization algorithm on an IBM® SP cluster with nodes of 32 Power4 processors.
Automatic pattern search in event traces is a powerful method to identify performance problems in parallel applications. We demonstrate that knowledge about the virtual topology, which defines logical adjacency relationships between processes, can be exploited to explain the occurrence of inefficiency patterns in terms of the parallelization strategy used in an application. We show correlations between higher-level events related to a parallel wavefront scheme and wait states identified by our pattern analysis. In addition, we visually expose relationships between pattern occurrences and the topological characteristics of the affected processes.
This paper presents a strategy for parallelizing the Gauss-Seidel method on heterogeneous clusters for the solution of sparse systems of equations. For coefficient matrices with a low density of non-null elements, the standard approach is followed: only non-null elements are stored, and iterative solution methods are used. Two basic guidelines are defined for the parallel algorithm: one-dimensional data distribution and broadcast messages for all data communications. One-dimensional data distribution eases the balancing of the processing workload on heterogeneous clusters. The use of broadcast messages for every data communication is aimed directly at optimizing performance on the most common cluster interconnect: Ethernet. Experimental results obtained on a local network of heterogeneous computers are presented.
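The two ingredients named in the abstract above, sparse storage with an iterative solver and a one-dimensional distribution weighted for heterogeneous nodes, can be sketched as follows. This is a sequential illustration only and not the paper's parallel algorithm: the broadcast communication is omitted, and the names `partition_rows`, `gauss_seidel`, and the `speeds` list of relative node speeds are hypothetical.

```python
def partition_rows(n_rows, speeds):
    """Split rows 0..n_rows-1 into contiguous blocks sized by relative speed."""
    total = sum(speeds)
    bounds = [0]
    acc = 0
    for s in speeds:
        acc += s
        bounds.append(round(n_rows * acc / total))
    return [(bounds[k], bounds[k + 1]) for k in range(len(speeds))]


def gauss_seidel(rows, b, iterations=50):
    """Solve A x = b with Gauss-Seidel sweeps.

    rows[i] lists only the non-null entries of row i as (column, value)
    pairs; each row must contain a non-zero diagonal entry.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            diag = 0.0
            acc = b[i]
            for j, a in rows[i]:
                if j == i:
                    diag = a
                else:
                    # Uses the newest x[j] available, as Gauss-Seidel does.
                    acc -= a * x[j]
            x[i] = acc / diag
    return x
```

With relative speeds `[1, 1, 3]`, `partition_rows(10, [1, 1, 3])` assigns the fastest node the block `(4, 10)`, that is, six of the ten rows.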