ISBN (digital): 9781624105784
ISBN (print): 9781624105784
Additional upgrades to a software tool developed in the Aerospace Systems Directorate to perform domain decomposition of structured overset-grid systems for use in a Message-Passing Interface (MPI) environment on massively parallel high-performance computing (HPC) platforms are presented. This code, referred to as BUNGE, recursively sweeps through all potential partitions of a given topology, but does so in such a way as to quickly rule out invalid partitions, thus limiting the overall time spent performing the search. In this work, modifications to BUNGE, including the implementation of a partitioning approach based on a user-specified target block size and the development of a new approach to quickly find high-quality partitions with a specified number of processors, are presented and examined for robustness and partition quality. The results from the BUNGE partitions are compared to those produced by the NASA structured-overset CFD solver OVERFLOW using its default partitioning algorithm over a wide range of test grid systems and processor counts. OVERFLOW was also modified to accept user-specified partitions, and the timings obtained from multiple runs of OVERFLOW using partitions obtained from BUNGE are compared to those obtained with its default algorithm. Relative speedups of between 7% and 37% for both steady-state and unsteady computations are demonstrated when using the BUNGE partition with OVERFLOW compared to its default splitting algorithm over numerous sample grid systems and processor counts.
ISBN (print): 9780769528335
Recently we proposed algorithms for concurrent execution on multiple clusters [11]. In this case, data partitioning is done at two levels: first, the data is distributed to a collection of heterogeneous parallel systems with different resources and startup times; then, on each system, the data is evenly partitioned among the available nodes. In this paper we report on a simulation study of the algorithms.
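The two-level scheme described in this abstract can be illustrated with a minimal sketch. It is not the algorithm from [11]: the first level here simply assumes each system's share is proportional to a hypothetical relative speed and ignores startup time, and the names `split_evenly` and `two_level_partition` are made up for illustration.

```python
def split_evenly(total, parts):
    """Split `total` items into `parts` counts that differ by at most one."""
    base, extra = divmod(total, parts)
    return [base + 1 if i < extra else base for i in range(parts)]


def two_level_partition(total, systems):
    """Return per-node item counts for each system.

    `systems` is a list of (relative_speed, node_count) pairs. Both values
    are illustrative assumptions, not quantities defined in the abstract.
    """
    total_speed = sum(speed for speed, _ in systems)

    # Level 1: give each system a share proportional to its relative speed,
    # rounding with the largest-remainder rule so the shares sum to `total`.
    exact = [total * speed / total_speed for speed, _ in systems]
    shares = [int(x) for x in exact]
    leftover = total - sum(shares)
    by_remainder = sorted(
        range(len(systems)), key=lambda i: exact[i] - shares[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        shares[i] += 1

    # Level 2: within each system, partition its share evenly over its nodes.
    return [split_evenly(share, nodes) for share, (_, nodes) in zip(shares, systems)]


if __name__ == "__main__":
    # One slow 2-node system and one 4-node system that is three times faster.
    print(two_level_partition(100, [(1.0, 2), (3.0, 4)]))
```

A model that also accounts for startup time would change only the first level; the even split inside each system stays the same.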
ISBN (print): 9781467377898
Topic modeling is a very powerful technique in data analysis and data mining but it is generally slow. Many parallelization approaches have been proposed to speed up the learning process. However, they are usually not very efficient because of the many kinds of overhead, especially the load-balancing problem. We address this problem by proposing three partitioning algorithms, which either run more quickly or achieve better load balance than current partitioning algorithms. These algorithms can easily be extended to improve parallelization efficiency on other topic models similar to LDA, e.g., Bag of Timestamps, which is an extension of LDA with time information. We evaluate these algorithms on two popular datasets, NIPS and NYTimes. We also build a dataset containing over 1,000,000 scientific publications in the computer science domain from 1951 to 2010 to experiment with Bag of Timestamps parallelization, which we design to demonstrate the proposed algorithms' extensibility. The results strongly confirm the advantages of these algorithms.
We present two practical algorithms for partitioning circuit components represented by rectilinear polygons so that they can be stored using the L-shaped corner stitching data structure; i.e., our algorithms decompose a simple polygon into non-overlapping L-shapes and rectangles by using horizontal cuts only. The more general of our algorithms computes an optimal configuration for a wide variety of optimization functions, while the other computes a minimum configuration of rectangles and L-shapes. Both run in O(n+h log h) time, where n is the number of vertices in the polygon and h is the number of H-pairs. Experimental results on VLSI data demonstrate the gains in performance for corner stitching obtained by using our algorithms instead of traditional rectangular partitioning algorithms.
Using some little-known but very important results (Evseev's (1983) performance bound and the partitioned decoding technique), new algorithms have been devised, providing useful new performance results and complexity evaluations for practical soft-decision decoders. In addition, a deeper and clearer understanding of the problems and possibilities of this topic has emerged.
ISBN (print): 9781467360890
We consider a generalized Voronoi partitioning problem for a team of vehicles with planar rigid body dynamics. The proximity metric, that is, the generalized metric that determines the proximity relations between the vehicles and arbitrary points in the configuration space, corresponds to the decrease of a generalized energy metric that takes place during the transfer of each vehicle to its goal configuration. In particular, the employed proximity metric is induced by a quasi-Lyapunov function of a corresponding stabilization problem. One of the main motivations for the choice of this proximity metric is to obtain a class of spatial partitions whose computational cost is significantly lower than that of spatial partitions whose proximity metric is the cost-to-go function of a corresponding optimal control problem, which were studied in our previous work. In particular, the structure of the generalized proximity metric utilized in this work allows us to develop simple and easily implementable partitioning algorithms that are applicable to problems involving vehicles with nonlinear dynamics. More importantly, the proposed partitioning algorithms can be implemented, under some mild assumptions, in a decentralized fashion that allows each vehicle to compute its own cell independently of its teammates. Numerical simulations that illustrate the theoretical developments are also presented.
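The idea of a generalized Voronoi cell, where each point belongs to the vehicle that is "closest" under a non-Euclidean proximity metric, can be sketched as follows. The `proximity` function is a hypothetical stand-in (squared distance plus a heading-misalignment penalty), not the quasi-Lyapunov-induced metric of the paper. The sketch also assumes every vehicle knows its teammates' states, so it does not reproduce the decentralized scheme mentioned in the abstract.

```python
import math


def proximity(state, point):
    """Hypothetical stand-in metric for a vehicle state (x, y, heading).

    Squared distance to the point plus the squared angle between the
    vehicle's heading and the bearing to the point.
    """
    x, y, heading = state
    dx, dy = point[0] - x, point[1] - y
    bearing = math.atan2(dy, dx)
    # Wrap the heading error into [-pi, pi].
    err = math.atan2(math.sin(bearing - heading), math.cos(bearing - heading))
    return dx * dx + dy * dy + err * err


def own_cell(index, states, points):
    """Sample points to which vehicle `index` is at least as close as any teammate."""
    return [
        p
        for p in points
        if all(proximity(states[index], p) <= proximity(s, p) for s in states)
    ]


if __name__ == "__main__":
    # Two vehicles facing each other along the x-axis.
    states = [(0.0, 0.0, 0.0), (10.0, 0.0, math.pi)]
    points = [(1.0, 0.0), (9.0, 0.0)]
    for i in range(len(states)):
        print(i, own_cell(i, states, points))
```

Swapping in a different `proximity` function changes the shape of the cells without touching the assignment rule, which is what makes the partition "generalized".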
The area of underwater passive target tracking has received considerable attention in the past decades, due to both its theoretical interest and its practical importance in several applications. Many powerful tools from the fields of signal processing, image processing, and estimation theory have been brought to bear for the solution of the passive target tracking problem. Among the latter, techniques based on Kalman filtering and techniques based on partitioning filters have been successfully used. The approaches based on Kalman filtering do not usually perform adequately when facing a maneuvering target, whereas the techniques based on partitioning filters perform very satisfactorily in the same case. In this paper, four approaches to the problem of underwater passive target tracking, based on partitioning theory, are reviewed and discussed. Their performance is also checked against that of Kalman filtering-based approaches in both maneuvering and non-maneuvering target scenarios.
A sliced-layout architecture is presented to alleviate the problems of the general bit-sliced layouts. Also described are partitioning algorithms that are used to generate the floorplan for this layout architecture. The partitioning algorithms not only select the best suited layout style for each component, but also consider critical paths, I/O pin locations, and connections between logic blocks. This approach improves the overall area utilization and minimizes the total wire length.