ISBN:
(Print) 9781509057382
Datasets obtained through recently advanced measurement techniques tend to possess a large number of dimensions. This leads to explosive increases in the computational cost of analyzing such datasets, making the formulation and verification of scientific hypotheses very difficult. An efficient approach to identifying feature subspaces of target datasets, that is, subspaces of the dimension variables or subsets of the data samples, is therefore required to describe the essence hidden in the original dataset. This paper proposes a visual data mining framework for supporting semiautomatic data analysis that builds upon asymmetric biclustering to explore highly correlated feature subspaces. For this purpose, a variant of parallel coordinate plots, many-to-many parallel coordinate plots, is extended to visually assist appropriate selection of feature subspaces and to avoid intrinsic visual clutter. In this framework, biclustering is applied to the dimension variables and data samples of the dataset simultaneously and asymmetrically. A set of variable axes is projected to a single composite axis, while data samples between two consecutive variable axes are bundled using polygonal strips. This makes the visualization method scalable and enables it to play a key role in the framework. The effectiveness of the proposed framework is demonstrated empirically, showing that many-to-many parallel coordinate plots are remarkably useful in practice.
ISBN:
(Print) 9781538606179
Pathfinder network scaling is a graph sparsification technique that has been widely used due to its efficacy in extracting the "important" structure of a graph. However, existing algorithms to compute the pathfinder network (PFNET) of a graph have prohibitively expensive time complexity for large graphs: O(n³) for the general case and O(n² log n) for a specific parameter setting, PFNET(r = ∞, q = n − 1), which is considered in many applications. In this paper, we introduce the first distributed technique to compute the pathfinder network with the specific parameters (r = ∞ and q = n − 1) of a large graph with millions of edges. The results of our experiments show that our technique is scalable; it efficiently utilizes a parallel distributed computing environment, reducing the running times as more processing units are added.
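For the PFNET(r = ∞, q = n − 1) setting, an edge survives exactly when its weight does not exceed the minimax (bottleneck) path weight between its endpoints. A minimal sequential sketch of this criterion, using a (min, max) variant of Floyd–Warshall, is shown below; the function and variable names are illustrative, and this is not the paper's distributed algorithm:

```python
import math

def pfnet_inf(n, weights):
    """Sequential sketch of PFNET(r=inf, q=n-1) sparsification.

    weights: dict {(u, v): w} over vertex pairs u < v (dissimilarities).
    An edge is kept iff its weight does not exceed the minimax
    (bottleneck) path weight between its endpoints.
    """
    # Minimax distance matrix, initialized from the direct edges.
    d = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0.0
    for (u, v), w in weights.items():
        d[u][v] = d[v][u] = w
    # Floyd-Warshall over the (min, max) semiring: r = inf path metric.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], max(d[i][k], d[k][j]))
    return {e for e, w in weights.items() if w <= d[e[0]][e[1]]}
```

On a triangle with edge weights 1, 1, and 3, the weight-3 edge is pruned because a bottleneck path of weight 1 connects its endpoints.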
ISBN:
(Print) 9781538639146
Plans for exascale computing have identified power and energy as looming problems for simulations running at that scale. In particular, writing to disk all the data generated by these simulations is becoming prohibitively expensive due to the energy the supercomputer consumes while idling as it waits for data to be written to permanent storage. In addition, the power cost of data movement is also steadily increasing. A solution to this problem is to write only a small fraction of the data generated while still maintaining the cognitive fidelity of the visualization. With domain scientists increasingly amenable to adopting an in-situ framework that can identify and extract valuable data from extremely large simulation results and write them to permanent storage as compact images, a large-scale simulation can commit to disk a reduced set of data extracts that is much smaller than the raw results, saving both power and energy. The goal of this paper is two-fold: (i) to understand the role of in-situ techniques in combating the power and energy issues of extreme-scale visualization and (ii) to create a model for performance, power, energy, and storage to facilitate what-if analysis. Our experiments on a specially instrumented, dedicated 150-node cluster show that while it is difficult to achieve power savings in practice using in-situ techniques, applications can achieve significant energy savings due to shorter write times for in-situ visualization. We present a characterization of power and energy for in-situ visualization; an application-aware, architecture-specific methodology for modeling and analysis of such in-situ workflows; and results that uncover indirect power savings in visualization workflows for high-performance computing (HPC).
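As a toy illustration of the kind of what-if analysis such a model supports (all numbers and parameter names below are hypothetical, not the paper's calibrated model), total energy scales with node power, node count, and wall time, so cutting write time with compact in-situ extracts can reduce energy even when power draw is unchanged:

```python
def run_energy_joules(node_power_w, nodes, compute_s, write_s):
    """Energy for one run: node power x node count x total wall time.
    Constant per-node power is an assumption made for illustration."""
    return node_power_w * nodes * (compute_s + write_s)

# Hypothetical 150-node runs: writing raw data vs. compact in-situ extracts.
raw = run_energy_joules(300.0, 150, 3600.0, 1800.0)       # long raw-data writes
insitu = run_energy_joules(300.0, 150, 3600.0 + 300.0,    # extra in-situ render time
                           60.0)                          # small image writes
```

Even though the in-situ run spends extra time rendering, the much shorter write phase dominates, so its total energy comes out lower at the same power draw.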
ISBN:
(Print) 9781509057382
The size of large-scale scientific datasets created from simulations and computed on modern supercomputers continues to grow at a fast pace. A daunting challenge is to analyze and visualize these intractable datasets on commodity hardware. A recent and promising line of research is to replace the dataset with a distribution-based proxy representation that summarizes scalar information into a much reduced memory footprint. Proposed representations subdivide the dataset into local blocks, where each block holds important statistical information, such as a histogram. A key drawback is that a distribution representing the scalar values in a block lacks spatial information, which manifests itself as large errors in visualization algorithms. We present a novel statistical representation, called a value-based spatial distribution, that augments the block-wise distribution-based representation with location information. Information from the spatial and scalar spaces is combined using Bayes' rule to accurately estimate the data value at a given spatial location. The representation is kept compact using a Gaussian mixture model. We show that our approach preserves important features in the data and alleviates uncertainty.
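The Bayesian combination can be sketched in one dimension as follows; the per-bin tuple layout and the function name are illustrative assumptions, not the paper's actual data structure:

```python
import math

def estimate_value(x, bins):
    """Sketch of a value-based spatial distribution query.

    Each value bin stores (representative value, prior P(v),
    spatial mean, spatial variance). By Bayes' rule,
    P(v | x) is proportional to P(x | v) * P(v), with P(x | v)
    modeled as a 1-D Gaussian over location. Returns the
    posterior-mean value estimate at location x.
    """
    post = []
    for value, prior, mu, var in bins:
        lik = math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
        post.append((value, prior * lik))
    z = sum(p for _, p in post)
    return sum(v * p for v, p in post) / z
```

With two well-separated spatial Gaussians, querying near one bin's spatial mean returns a value close to that bin's representative value, which is exactly the spatial disambiguation a plain per-block histogram cannot provide.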
ISBN:
(Print) 9781509057382
High-resolution simulation data sets provide a plethora of information, which application scientists need to explore to gain a deeper understanding of various phenomena. Visual-analytics techniques operating on raw data sets are often expensive due to the data sets' extreme sizes. Yet interactive analysis and visualization are crucial for big-data analytics, because they let scientists focus on the important data and make critical decisions quickly. To assist efficient exploration and visualization, we propose a new region-based statistical data summarization scheme. Our method is superior in quality to existing statistical summarization techniques and has a more compact representation, reducing the overall storage cost. The quantitative and visual efficacy of our proposed method is demonstrated using several data sets, along with an in situ application study for an extreme-scale flow simulation.
ISBN:
(Print) 9781538606179
This paper presents a new algorithm for the fast, shared-memory multi-core computation of augmented merge trees on triangulations. In contrast to most existing parallel algorithms, our technique computes augmented trees. This augmentation is required to enable the full extent of merge-tree-based applications, including data segmentation. Our approach completely revisits the traditional, sequential merge tree algorithm to reformulate the computation as a set of independent local tasks based on Fibonacci heaps. This results in superior time performance in practice, both sequentially and in parallel thanks to the OpenMP task runtime. In the context of augmented contour tree computation, we show that a direct usage of our merge tree procedure also results in superior time performance overall, both sequentially and in parallel. We report performance numbers that compare our approach to reference sequential and multi-threaded implementations for the computation of augmented merge and contour trees. These experiments demonstrate the runtime efficiency of our approach as well as its scalability on common workstations. We demonstrate the utility of our approach in data segmentation applications and also provide a lightweight VTK-based C++ implementation for reproduction purposes.
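For context, the traditional sequential computation that this work revisits can be illustrated with a minimal union-find sweep; this sketch is a plain join-tree construction with illustrative names, not the paper's task-based Fibonacci-heap algorithm:

```python
def join_tree(values, edges):
    """Sequential join-tree sketch via a union-find sweep.

    Vertices are processed in decreasing scalar order; an edge to an
    already-visited neighbor merges two components and records a
    merge-tree arc from that component's latest node to the current
    vertex (a saddle when two components meet).
    """
    n = len(values)
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    parent, highest, arcs = {}, {}, []

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for v in sorted(range(n), key=lambda i: -values[i]):
        parent[v] = v
        highest[v] = v
        for u in adj[v]:
            if u in parent:          # neighbor already swept
                r = find(u)
                if r != v:           # distinct component: merge at v
                    arcs.append((highest[r], v))
                    parent[r] = v
    return arcs
```

On a path graph with scalar values [0, 3, 1, 2], the two maxima (values 3 and 2) merge at the saddle of value 1, which then connects down to the global minimum, yielding three arcs.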
ISBN:
(Print) 9783038680345
The proceedings contain 8 papers. The topics discussed include: PaViz: a power-adaptive framework for optimizing visualization performance; prediction of distributed volume visualization performance to support render hardware acquisition; progressive CPU volume rendering with sample accumulation; photo-guided exploration of volume data features; a space-efficient method for navigable ensemble analysis and visualization; interactive exploration of dissipation element geometry; a task-based parallel rendering component for large-scale visualization applications; and achieving portable performance for wavelet compression using data-parallel primitives.
ISBN:
(Print) 9781509036202
Information diffusion in a social network occurs when a large number of users are involved in the propagation process. The fluid dynamic model has been proved to be an effective method to visualize such diffusion; however, it is extremely time-consuming for large-scale data. Graphics Processing Units (GPUs), originally designed for rendering graphics, textures, and pixels, now provide computational power for scientific computing. In this paper, we use the CUDA toolkit to accelerate the fluid dynamic model (FDM) for higher performance. With GPU acceleration, the FDM approach provides real-time visualization.
ISBN:
(Print) 9781538606179
In this paper, we develop a method to encapsulate and embed interactive 3D volume rendering into the standard web Document Object Model (DOM). The package we implemented for this work is called Tapestry. Using Tapestry, data-intensive and interactive volume rendering can be easily incorporated into web pages. For example, we can enhance a Wikipedia page on supernova to contain several interactive 3D volume renderings of supernova volume data. There is no noticeable slowdown during the page load by the web browser. A user can choose to interact with any of the volume renderings of supernova at will. We refer to each embedded 3D visualization as a hyperimage. Hyperimages depend on scalable server-side support where volume rendering jobs are performed and managed elastically. We show the minimal code change required to embed hyperimages into previously static web pages. We also demonstrate the supporting Tapestry server's scalability along several dimensions: web page complexity, rendering complexity, frequency of rendering requests, and number of concurrent sessions. Using solely standard open-source components, this work proves that it is now feasible to make volume rendering a scalable web service that supports a diverse audience with varying use cases.
ISBN:
(Print) 9781509057382
Multivariate volumetric datasets are often encountered in results generated by scientific simulations. Compared to univariate datasets, the analysis and visualization of multivariate datasets are much more challenging due to the complex relationships among the variables. Volume rendering has frequently been used as an effective way to visualize and analyze multivariate datasets, although designing good multivariate transfer functions remains non-trivial. In this paper, we present an interactive workflow that allows users to design multivariate transfer functions. To handle large-scale datasets, the preprocessing stage reduces the number of data points through binning and aggregation, generating a new, much smaller set of data points. The relationship between all pairs of variables is presented in a matrix juxtaposition view, where users can navigate through the different subspaces. An entropy-based method helps users choose which subspace to explore. We propose two weights, a scatter weight and a size weight, associated with each projected point in those subspaces. Based on these two weights, data-point filtering and kernel density estimation operations are employed to help users discover interesting features. For each user-selected feature, a Gaussian function is constructed and updated incrementally. Finally, all selected features are visualized through multivariate volume rendering to reveal the structure of the data. With our system, users can interactively explore different subspaces and specify multivariate transfer functions effectively. We demonstrate the effectiveness of our system with several multivariate volumetric datasets.
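An entropy criterion for ranking pairwise subspaces can be sketched as follows; the uniform binning scheme and the function name are illustrative assumptions rather than the paper's exact formulation:

```python
import math

def pairwise_entropy(xs, ys, nbins=8):
    """Sketch of an entropy score for a variable-pair subspace.

    Bins the 2-D scatter of (xs, ys) on a uniform nbins x nbins grid
    and returns the Shannon entropy of the bin occupancy; a higher
    score suggests a richer, more structured subspace to explore.
    """
    def bin_index(v, lo, hi):
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * nbins), nbins - 1)

    lox, hix = min(xs), max(xs)
    loy, hiy = min(ys), max(ys)
    counts = {}
    for x, y in zip(xs, ys):
        key = (bin_index(x, lox, hix), bin_index(y, loy, hiy))
        counts[key] = counts.get(key, 0) + 1
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

A degenerate subspace where all points coincide scores zero, while points spread over distinct bins score higher, which is the signal used to steer users toward interesting subspaces.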