ISBN: (print) 9781538668740
We present a new parallel algorithm for probabilistic graphical model optimization. The algorithm relies on data-parallel primitives (DPPs), which provide portable performance across hardware architectures. We evaluate results on CPUs and GPUs for an image segmentation problem. Compared to a serial baseline, we observe runtime speedups of up to 13X (CPU) and 44X (GPU). We also compare our performance to a reference OpenMP-based algorithm and find speedups of up to 7X (CPU).
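To make the idea of data-parallel primitives concrete, here is a minimal serial sketch in Python of three common primitives (map, reduce, exclusive scan) and a stream compaction built from them. It is not the paper's algorithm; the function names (`dpp_map`, `dpp_reduce`, `dpp_exclusive_scan`, `dpp_compact`) and the toy energy values are assumptions chosen for illustration. In a real DPP library each primitive would have a tuned parallel backend per architecture, which is where the portability comes from.

```python
from functools import reduce
from itertools import accumulate


def dpp_map(fn, xs):
    """Apply fn independently to every element (trivially parallel)."""
    return [fn(x) for x in xs]


def dpp_reduce(op, xs, init):
    """Combine all elements with an associative operator."""
    return reduce(op, xs, init)


def dpp_exclusive_scan(xs):
    """Exclusive prefix sum: out[i] = sum(xs[:i])."""
    if not xs:
        return []
    return [0] + list(accumulate(xs))[:-1]


def dpp_compact(xs, flags):
    """Stream compaction expressed as scan + scatter.

    Keeps xs[i] where flags[i] is true, preserving order.
    """
    if not xs:
        return []
    ones = dpp_map(lambda f: 1 if f else 0, flags)
    offsets = dpp_exclusive_scan(ones)      # output slot for each kept item
    total = offsets[-1] + ones[-1]          # number of kept items
    out = [None] * total
    for i, keep in enumerate(flags):        # scatter step
        if keep:
            out[offsets[i]] = xs[i]
    return out


# Hypothetical per-pixel energies; keep the indices whose energy is below 2.0.
energies = [3.0, 1.0, 4.0, 1.5]
low = dpp_compact(list(range(len(energies))), dpp_map(lambda e: e < 2.0, energies))
best = dpp_reduce(min, energies, float("inf"))
```

Writing an algorithm only in terms of such primitives means that porting it to a new device reduces to porting the primitives.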
ISBN: (print) 9781538668740
A key component of most large-scale rendering systems is a parallel image compositing algorithm, and the most commonly used compositing algorithms are binary swap and its variants. Although shown to be very efficient, one of the classic limitations of binary swap is that it only works when the number of processes is a power of 2. Multiple variations of binary swap have been independently introduced to overcome this limitation and handle process counts that have factors other than 2. To date, few of these approaches have been directly compared against each other, making it unclear which approach is best. This paper presents a fresh implementation of each of these methods using a common software framework so that they can be compared directly. The results show that some simple compositing approaches work as well as or better than more complex algorithms that are more difficult to implement.
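The power-of-2 restriction is easiest to see in a small simulation of classic binary swap. The sketch below is an illustration under stated assumptions, not the paper's implementation: processes are simulated as list entries, images are flat lists of pixel values, and the compositing operator is a per-pixel `max` (as in maximum-intensity projection), chosen because it is order-independent. The function name `binary_swap` is hypothetical.

```python
def binary_swap(images):
    """Simulate classic binary swap over len(images) processes.

    Each round, rank r pairs with rank r XOR bit, keeps one half of its
    current pixel range, and composites the partner's copy of that half.
    Returns, per rank, the (lo, hi) range it owns and its final pixels.
    """
    p = len(images)
    if p == 0 or p & (p - 1):
        raise ValueError("classic binary swap needs a power-of-two process count")
    n = len(images[0])
    bufs = [list(img) for img in images]
    ranges = [(0, n)] * p                      # every rank starts with the full image
    bit = 1
    while bit < p:
        snapshot = [buf[:] for buf in bufs]    # state before this round's exchange
        for rank in range(p):
            partner = rank ^ bit
            lo, hi = ranges[rank]
            mid = (lo + hi) // 2
            keep_lo, keep_hi = (lo, mid) if rank & bit == 0 else (mid, hi)
            for i in range(keep_lo, keep_hi):  # composite the half we keep
                bufs[rank][i] = max(snapshot[rank][i], snapshot[partner][i])
            ranges[rank] = (keep_lo, keep_hi)
        bit <<= 1
    return [(ranges[r], bufs[r][ranges[r][0]:ranges[r][1]]) for r in range(p)]


# Four simulated processes, eight pixels each (arbitrary test values).
images = [[(r * 7 + i * 3) % 11 for i in range(8)] for r in range(4)]
pieces = binary_swap(images)
```

With a process count that is not a power of 2, some rank has no partner in at least one round, which is the gap the variants compared in the paper are designed to fill.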
ISBN: (print) 9781538606179
Ghost cells are important for distributed memory parallel operations that require neighborhood information, and are required for correctness on the boundaries of local data partitions. Ghost cells are one or more layers of grid cells surrounding the external boundary of the local partition, which are owned by other data partitions. They are used by the local partition when neighbor information is required. Finding ghost cells in structured data is trivial and can normally be done by calculating on cell indices. Obtaining ghost cells for unstructured grid data, however, is a nontrivial task. It requires an analysis of the connectivity of the grid in order to find neighbor cells. When the grid is distributed, the operation is further complicated by the need to determine which processes own the neighbor cells and to coordinate communication with them. This is a problem when operating on unstructured grid data sets that do not already have ghost cells. Parallel visualization algorithms will usually assume that a cell does not exist if it is not in the local data partition. Without ghost cells, this leads to operations that need neighborhood information, such as point data interpolated from cell data, being calculated incorrectly at partition boundaries. Production visualization tools generally support the existence of ghost cells, but not their generation, especially for unstructured grids. In the literature, there is no documented algorithm for generating in parallel one or more layers of ghost cells for unstructured grid data which has already been partitioned. We present a new algorithm to compute ghost cells in parallel on distributed unstructured data sets with no global cell or point IDs. Given a partitioned data set without ghost cells, this algorithm is capable of producing any number of layers of ghost cells necessary to support parallel operations. Performance results and timing comparisons to ParaView's D3 filter are presented. A number of optimizations to th
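The remark that ghost cells in structured data follow from index arithmetic can be sketched as below. This covers only the easy structured case the abstract contrasts against, not the unstructured algorithm the paper presents. It assumes a 2D grid, axis-aligned block partitions given as half-open index ranges, and a hypothetical function name `ghost_cells`.

```python
def ghost_cells(owned, grid_shape, layers=1):
    """Return the (i, j) indices of ghost cells for one structured partition.

    owned      -- ((i0, i1), (j0, j1)), half-open ranges of owned cells
    grid_shape -- (ni, nj), size of the global grid
    layers     -- number of ghost layers to add around the partition
    """
    (i0, i1), (j0, j1) = owned
    ni, nj = grid_shape
    # Grow the owned box by `layers`, clamped to the global grid.
    gi0, gi1 = max(i0 - layers, 0), min(i1 + layers, ni)
    gj0, gj1 = max(j0 - layers, 0), min(j1 + layers, nj)
    # Ghost cells are everything in the grown box that is not owned.
    return [
        (i, j)
        for i in range(gi0, gi1)
        for j in range(gj0, gj1)
        if not (i0 <= i < i1 and j0 <= j < j1)
    ]


# Corner partition of a 4x4 grid with one ghost layer.
corner = ghost_cells(((0, 2), (0, 2)), (4, 4), layers=1)
```

No connectivity analysis or ownership lookup is needed here because neighbors are implied by the indices; on an unstructured grid neither shortcut is available.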
Data sampling has been extensively studied for large-scale graph mining. Many analyses and tasks become more efficient when performed on graph samples of much smaller size. The use of proxy objects is common in software engineering for analysis and interaction with heavy objects or systems. In this paper, we coin the term 'proxy graph' and empirically investigate how well a proxy graph visualization can represent a big graph. Our investigation focuses on proxy graphs obtained by sampling; this is one of the most common proxy approaches. Despite the plethora of data sampling studies, this is the first evaluation of sampling in the context of graph visualization. For an objective evaluation, we propose a new family of quality metrics for visual quality of proxy graphs. Our experiments cover popular sampling techniques. Our experimental results lead to guidelines for using sampling-based proxy graphs in visualization.
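As a rough illustration of what a sampling-based proxy graph is, the sketch below implements two generic techniques, random node sampling and random edge sampling. The abstract does not list which techniques the paper evaluates, so these two are assumptions, as are the function names and the edge-list representation.

```python
import random


def random_node_sample(edges, fraction, seed=0):
    """Keep a random fraction of the nodes and the edges induced by them."""
    if not edges:
        return []
    rng = random.Random(seed)
    nodes = sorted({v for edge in edges for v in edge})
    k = max(1, round(fraction * len(nodes)))
    kept = set(rng.sample(nodes, k))
    return [(u, v) for u, v in edges if u in kept and v in kept]


def random_edge_sample(edges, fraction, seed=0):
    """Keep a random fraction of the edges (and their endpoints)."""
    if not edges:
        return []
    rng = random.Random(seed)
    k = max(1, round(fraction * len(edges)))
    return rng.sample(list(edges), k)


# A small ring graph with two chords, as (u, v) pairs.
edges = [(i, (i + 1) % 8) for i in range(8)] + [(0, 4), (2, 6)]
proxy_by_edges = random_edge_sample(edges, 0.5)
proxy_by_nodes = random_node_sample(edges, 0.5)
```

The two proxies generally differ in density and connectivity even at the same sampling fraction, which is the kind of difference a visual quality metric would need to capture.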
ISBN: (print) 9781538606179
The enumeration of all maximal cliques in an undirected graph is a fundamental problem arising in several research areas. We consider maximal clique enumeration on shared-memory, multi-core architectures and introduce an approach consisting entirely of data-parallel operations, in an effort to achieve efficient and portable performance across different architectures. We study the performance of the algorithm via experiments varying over benchmark graphs and architectures. Overall, we observe that our algorithm achieves speedups of up to 33X and 9X over state-of-the-art distributed and serial algorithms, respectively, for graphs with higher ratios of maximal cliques to total cliques. Further, we attain additional speedups on a GPU architecture, demonstrating the portable performance of our data-parallel design.
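For readers unfamiliar with the problem itself, the following sketch enumerates maximal cliques with the classic serial Bron-Kerbosch algorithm with pivoting. This is a well-known textbook method swapped in purely to define the task; it is not the data-parallel approach the abstract describes. The adjacency-dict representation and the name `maximal_cliques` are assumptions.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques of an undirected graph.

    adj maps each vertex to the set of its neighbors.
    Uses Bron-Kerbosch with pivoting (serial, recursive).
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-explored vertices
        if not p and not x:
            cliques.append(sorted(r))   # nothing can extend r, so it is maximal
            return
        # Pivot on the vertex covering the most candidates to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Triangle 0-1-2 with a pendant edge 2-3.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
found = maximal_cliques(graph)
```

The recursion is inherently irregular, which hints at why reformulating the enumeration purely in terms of data-parallel operations is a design problem in its own right.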
ISBN: (print) 9781538668740
Figure *** rendered by Galaxy using 64-ray cross-node ambient occlusion shadow sampling: (left) volumetric asteroid impact simulation; (center) geometric limestone karst core sample scan; (right) n-body Cosmic Web dark matter simulation. The long-range ambient-occlusion effects in Asteroid and Cosmic Web cannot be performed by conventional sort-last distributed ray tracers, where rays must stop at local data *** present Galaxy, a fully asynchronous distributed parallel rendering engine geared towards using full global illumination for large-scale visualization. Galaxy provides performant distributed rendering of complex lighting and material models, particularly those that require ray traversal across nodes. Our design is favorable for tightly-coupled in situ scenarios, where data remains on simulation nodes. By employing asynchronous framebuffer updates and a novel subtractive lighting model, we achieve acceptable image quality from the first ray generation, and improve quality throughout the render epoch. On simulated in situ rendering tasks, Galaxy outperforms the current best-of-breed scientific ray tracer by over 3× for distributed geometric and particle data, while providing expanded rendering capability for global illumination and complex materials.
ISBN: (print) 9781538668740
Scientific visualization tools are rapidly embracing the necessary challenge of simultaneously visualizing multiple parameterized simulation data sets [8]. In the new paradigm, scientists hope to understand parameter relationships and stochastic trends that exist in a parameter space [6], [7]. At the same time, virtual reality (VR) environments have enabled exciting opportunities for exploring and comparing time-varying spatial data sets [3]. Although VR offers a unique perspective to view 3D and 4D data, it requires high framerates for interactivity and optimized use of precious GPU memory. Accurate simulations, on the other hand, are often very large due to dynamic unstructured mesh resolutions and small timesteps, making it difficult to simply render even one data set. To solve this, large data visualization frameworks often use data sampling and efficient rendering techniques to engage the GPU [1], [8]. Even then, VR is mostly used to add a stereoscopic view, and is rarely an integral part of interactive data instance comparison [3].
Oftentimes multivariate data are not available as sets of equally multivariate tuples, but only as sets of projections into subspaces spanned by subsets of these attributes. For example, one may find data with five attributes stored in six tables of two attributes each, instead of a single table of five attributes. This prohibits the visualization of these data with standard high-dimensional methods, such as parallel coordinates or MDS, and there is hence the need to reconstruct the full multivariate (joint) distribution from these marginal ones. Most of the existing methods designed for this purpose use an iterative procedure to estimate the joint distribution. With insufficient marginal distributions and domain knowledge, they lead to results whose joint errors can be large. Moreover, enforcing smoothness for regularizations in the joint space is not applicable if the attributes are not numerical but categorical. We propose a visual analytics approach that integrates both anecdotal data and human experts to iteratively narrow down a large set of plausible solutions. The solution space is populated using a Monte Carlo procedure which uniformly samples the solution space. A level-of-detail high dimensional visualization system helps the user understand the patterns and the uncertainties. Constraints that narrow the solution space can then be added by the user interactively during the iterative exploration, and eventually a subset of solutions with narrow uncertainty intervals emerges.
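To illustrate why marginals alone leave many plausible joint distributions, the sketch below uses iterative proportional fitting (IPF), one well-known iterative procedure of the kind the abstract alludes to. It is not the paper's visual analytics method. The two-attribute setting, the function name `ipf`, and the marginal values are assumptions for illustration.

```python
def ipf(seed, row_marg, col_marg, iters=50):
    """Iterative proportional fitting on a 2D table.

    Alternately rescales rows and columns of `seed` until the table's
    row and column sums match the target marginals.
    """
    t = [row[:] for row in seed]
    for _ in range(iters):
        for i, target in enumerate(row_marg):       # match row sums
            s = sum(t[i])
            if s > 0:
                t[i] = [v * target / s for v in t[i]]
        for j, target in enumerate(col_marg):       # match column sums
            s = sum(row[j] for row in t)
            if s > 0:
                for row in t:
                    row[j] *= target / s
    return t


rows, cols = [0.6, 0.4], [0.7, 0.3]
# Two different starting guesses, same target marginals.
joint_a = ipf([[1.0, 1.0], [1.0, 1.0]], rows, cols)
joint_b = ipf([[2.0, 1.0], [1.0, 2.0]], rows, cols)
```

Both results reproduce the marginals, yet the tables differ because each inherits the dependence structure of its starting guess. That ambiguity is what additional constraints, such as the anecdotal data and expert input described above, are meant to narrow down.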
ISBN: (print) 9781538668740
In this project we explore several different techniques for visualizing data created by Hardware/Hybrid Accelerated Cosmology Code (HACC) cosmology simulations. We present four methods that have thus far been explored. Mainly, we discuss visualizing data through ParaView, vl3, SPH interpolation, and virtual reality environments. We display our preliminary results, as well as our plans to begin applying our techniques to large scale data sets.
ISBN: (print) 9781728152103
In this paper, we propose a practical and efficient algorithm based on the conventional semi-direct monocular visual odometry (SVO) algorithm, aimed mainly at future applications of Simultaneous Localization and Mapping (SLAM) on embedded or mobile platforms such as robots and wearable devices. By applying velocity momentum during the initial pose estimation, we present a novel algorithm for obtaining an initial pose that is closer to the true value and more effective at overcoming the non-convergence limitation of most existing approaches. A sparse image alignment module is also proposed to rectify the pose offset that occurs at corners, by resetting the relative pose at locations with large photometric error. The proposed lifted semi-direct monocular visual odometry has been extensively evaluated on a benchmark dataset. The experimental results demonstrate that our method can generate accurate initial poses without reducing speed.