Collaborative exploration of scientific data sets across large high-resolution displays requires both high visual detail and low-latency transfer of image data (oftentimes inducing the need to trade one for the other). In this work, we present a system that dynamically adapts the encoding quality in such systems in a way that reduces the required bandwidth without impacting the details perceived by one or more observers. Humans perceive sharp, colourful details in the small foveal region around the centre of the field of view, while information in the periphery is perceived blurred and colourless. We account for this by tracking the gaze of observers and adapting the quality parameter of each macroblock used by the H.264 encoder accordingly, considering the so-called visual acuity fall-off. This allows us to substantially reduce the required bandwidth with barely noticeable changes in visual quality, which is crucial for collaborative analysis across display walls at different locations. We demonstrate the reduced overall required bandwidth and the high quality inside the foveated regions using particle rendering and parallel coordinates.
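As a rough illustration of the per-macroblock quality adaptation, the following sketch assigns an H.264 quantization parameter (QP) to each 16x16 macroblock based on its angular distance from the tracked gaze point. The linear fall-off, the pixels-per-degree value, and the QP range are illustrative assumptions, not the acuity model used in the paper.

```python
import numpy as np

def macroblock_qp_map(frame_w, frame_h, gaze_px, base_qp=22, max_qp=40,
                      mb_size=16, px_per_degree=40.0, fovea_deg=2.0):
    """One QP per 16x16 macroblock, increasing with eccentricity from the gaze
    point. The linear fall-off below is a placeholder for the paper's acuity
    fall-off function."""
    mbs_x, mbs_y = frame_w // mb_size, frame_h // mb_size
    qp = np.empty((mbs_y, mbs_x), dtype=np.int32)
    for my in range(mbs_y):
        for mx in range(mbs_x):
            cx = (mx + 0.5) * mb_size
            cy = (my + 0.5) * mb_size
            ecc_deg = np.hypot(cx - gaze_px[0], cy - gaze_px[1]) / px_per_degree
            if ecc_deg <= fovea_deg:
                qp[my, mx] = base_qp                     # full quality in the fovea
            else:
                t = min((ecc_deg - fovea_deg) / 30.0, 1.0)
                qp[my, mx] = int(base_qp + t * (max_qp - base_qp))
    return qp

# Example: 1920x1080 frame, observer looking near the centre of the display
qp_map = macroblock_qp_map(1920, 1080, gaze_px=(960, 540))
```

For several observers, one would take the minimum QP over all gaze points per macroblock, so that every foveated region stays at full quality.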
ISBN (print): 9781665432832
Both visual detail and low-latency transfer of image data are required for collaborative exploration of scientific data sets across large high-resolution displays. In this work, we present an approach that reduces the resolution before encoding and uses temporal upscaling to reconstruct the full-resolution image, reducing the overall latency and the required bandwidth without significantly impacting the details perceived by observers. Our approach exploits the fact that humans do not perceive the full details of moving objects: static parts of the image are reconstructed perfectly, while non-static parts are reconstructed at lower quality. This strategy enables a substantial reduction of the encoding latency and the required bandwidth with barely noticeable changes in visual quality, which is crucial for collaborative analysis across display walls at different locations. Additionally, our approach can be combined with other techniques that aim to reduce the required bandwidth while keeping the quality as high as possible, such as foveated encoding. We demonstrate the reduced overall latency, the reduced bandwidth requirements, and the high image quality using different visualisations.
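A minimal sketch of the temporal-upscaling idea, assuming a jittered sub-pixel sampling pattern: the sender transmits a reduced-resolution frame with a different phase each frame, and the receiver scatters the samples into a full-resolution accumulation buffer. Static regions are reconstructed exactly after a few frames; moving regions retain some stale samples, i.e. lower quality. The paper's actual reconstruction and motion handling may differ.

```python
import numpy as np

def downsample(full, frame_idx, scale=2):
    """Sender side: pick a different sub-pixel phase every frame so that
    consecutive reduced-resolution frames together cover all full-res pixels."""
    ox = frame_idx % scale
    oy = (frame_idx // scale) % scale
    return full[oy::scale, ox::scale], (ox, oy)

def reconstruct(accum, low_res, offset, scale=2):
    """Receiver side: scatter the received samples into the full-resolution
    accumulation buffer. Static regions converge to an exact reconstruction
    after scale*scale frames; regions that changed in the meantime are covered
    by a mix of fresh and stale samples, i.e. lower quality."""
    ox, oy = offset
    accum[oy::scale, ox::scale] = low_res
    return accum

# Toy usage: a 2x-reduced stream of a static 8x8 image converges to the original
full = np.random.rand(8, 8)
accum = np.zeros_like(full)
for f in range(4):
    low, off = downsample(full, f)
    accum = reconstruct(accum, low, off)
assert np.allclose(accum, full)
```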
ISBN (print): 9781665483322
In this paper, we propose a Hyper-Dimensional genome analysis platform. Instead of working with the original sequences, our method maps genome sequences into high-dimensional space and performs sequence matching with simple, parallel similarity searches. At the algorithm level, we revisit sequence searching with the brain-like memorization that Hyper-Dimensional computing natively supports. By mapping all data points into high-dimensional space, the main sequence searching operations can be processed in a hardware-friendly way. We accordingly design a density-aware FPGA implementation. Our solution searches for the similarity between an encoded query and a large-scale genome library in chunks. We exploit the holographic representation of patterns to stop search operations on libraries with a lower chance of a match. This turns our computation from dense to highly sparse after just a few chunk-based searches. Our large-scale evaluation shows that our accelerator provides a 46x speedup and a 188x improvement in energy efficiency compared to a state-of-the-art Hyper-Dimensional computing GPU implementation.
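The following sketch illustrates the general Hyper-Dimensional approach under stated assumptions: k-mers are encoded by binding cyclically shifted random bipolar base vectors, sequences are bundled into a single hypervector, and a chunked library is searched with simple dot-product similarity, pruning low-scoring chunks so that the remaining search becomes sparse. The paper's concrete encoder and pruning rule are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 10_000                                              # hypervector dimensionality
BASE = {b: rng.choice([-1, 1], DIM) for b in "ACGT"}      # random bipolar base vectors

def encode(seq, k=8):
    """Encode a sequence as the bundled (summed) hypervectors of its k-mers;
    position within the k-mer is encoded by a cyclic shift."""
    hv = np.zeros(DIM)
    for i in range(len(seq) - k + 1):
        kmer = np.ones(DIM)
        for j, b in enumerate(seq[i:i + k]):
            kmer *= np.roll(BASE[b], j)
        hv += kmer
    return hv

def search(query_hv, library_chunks, keep=0.25):
    """Chunk-wise similarity search: score every chunk, then keep only the
    best-scoring fraction for further search, making the rest of the
    computation highly sparse."""
    qn = query_hv / np.linalg.norm(query_hv)
    sims = np.array([qn @ (c / np.linalg.norm(c)) for c in library_chunks])
    order = np.argsort(sims)[::-1]
    return order[: max(1, int(keep * len(library_chunks)))], sims

library = [encode(s) for s in ("ACGTACGTACGTACGT", "TTTTCCCCGGGGAAAA", "ACGTACGTTTTTACGT")]
top_chunks, scores = search(encode("ACGTACGTACGT"), library)
```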
ISBN (digital): 9798331516925
ISBN (print): 9798331516932
The rise of heterogeneous resources in modern High Performance Computing (HPC) systems has propelled the scientific community beyond the exascale threshold. To maximize simulation performance on HPC systems, applications increasingly rely on device resources, such as GPUs, leading to under-utilization of host resources, particularly CPUs. In situ analysis and visualization techniques minimize data movement by operating on data in memory, but this still involves blocking operations that incur a small penalty on simulation performance. We explore a novel instrumentation approach where GPU-based time step data is copied from device memory to host memory, enabling CPUs to concurrently perform visualization and analysis tasks. This strategy allows simulations to continue uninterrupted by an in situ library's analysis and visualization processes.
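A toy host-side sketch of the producer/consumer pattern described above, using a Python thread as the analysis worker. In the real system the time step data resides in GPU memory and is copied to the host (e.g. with an asynchronous memcpy), and analysis runs on otherwise idle CPU cores; this sketch only mimics that decoupling.

```python
import threading
import queue
import numpy as np

def simulation_step(step, n=1_000_000):
    """Stand-in for one GPU time step; in the real setting the result lives in
    device memory and would be copied to the host asynchronously."""
    return np.sin(np.linspace(0, 1, n) * step)

def analysis_worker(q):
    """Consumes snapshots and performs in situ analysis/visualization without
    blocking the simulation loop."""
    while True:
        step, data = q.get()
        if data is None:
            break
        print(f"step {step}: mean={data.mean():.4f}")   # placeholder analysis

snapshots = queue.Queue(maxsize=2)          # bounded, so host memory stays in check
worker = threading.Thread(target=analysis_worker, args=(snapshots,))
worker.start()

for step in range(10):
    field = simulation_step(step)
    # Copy the time step out (device->host in the real system) and hand it to
    # the CPU-side worker; the simulation immediately continues with step+1.
    snapshots.put((step, field.copy()))
snapshots.put((None, None))
worker.join()
```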
ISBN (digital): 9798331516925
ISBN (print): 9798331516932
Functional approximation as a high-order continuous representation provides more accurate value and gradient queries than the traditional discrete volume representation. Volume visualization rendered directly from a functional approximation generates high-quality results without the artifacts caused by trilinear interpolation. However, querying an encoded functional approximation is computationally expensive, especially when the input dataset is large, making functional approximation impractical for interactive visualization. In this paper, we propose a novel functional approximation multi-resolution representation, Adaptive-FAM, which is lightweight and fast to query. We also design a GPU-accelerated out-of-core multi-resolution volume visualization framework that directly utilizes the Adaptive-FAM representation to generate high-quality renderings with interactive responsiveness. Our method not only dramatically decreases the caching time, one of the main contributors to input latency, but also effectively improves the cache hit rate through prefetching. Our approach significantly outperforms the traditional functional approximation method in terms of input latency while maintaining comparable rendering quality.
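A minimal sketch of the out-of-core caching and prefetching idea, assuming a block-keyed LRU cache; load_block and the (block, level) keying are hypothetical stand-ins for Adaptive-FAM's actual multi-resolution representation.

```python
from collections import OrderedDict

class BlockCache:
    """Multi-resolution blocks are loaded on demand into an LRU cache, and
    blocks predicted to be needed next (here simply supplied neighbour ids)
    are prefetched to raise the hit rate."""
    def __init__(self, load_block, capacity=256):
        self.load_block = load_block            # expensive out-of-core loader
        self.capacity = capacity
        self.cache = OrderedDict()

    def get(self, block_id, level):
        key = (block_id, level)
        if key in self.cache:
            self.cache.move_to_end(key)         # cache hit
            return self.cache[key]
        data = self.load_block(block_id, level) # cache miss: load from storage
        self._insert(key, data)
        return data

    def prefetch(self, neighbour_ids, level):
        for nid in neighbour_ids:
            if (nid, level) not in self.cache:
                self._insert((nid, level), self.load_block(nid, level))

    def _insert(self, key, data):
        self.cache[key] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)      # evict least recently used
```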
While k-d trees are known to be effective for spatial indexing of sparse 3-d volume data, full reconstruction, e.g. due to changes to the alpha transfer function during rendering, is usually a costly operation with this hierarchical data structure. In a recent publication we showed how to port a clever state-of-the-art k-d tree construction algorithm to a multi-core CPU architecture, and by means of thorough optimization we were able to obtain interactive reconstruction rates for moderately sized to large data sets. The construction scheme is based on maintaining partial summed-volume tables that fit in the L1 cache of the multi-core CPU and that allow for fast occupancy queries. In this work we propose a GPU implementation of the parallel k-d tree construction algorithm and compare it with the original multi-core CPU implementation. We conduct a thorough comparative study that outlines the performance and scalability of our implementation.
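A small sketch of the occupancy query the construction relies on: a summed-volume table (the 3-D analogue of a summed-area table) answers the number of non-empty voxels in any axis-aligned box with eight lookups. For simplicity this sketch builds a single table over the whole volume, whereas the paper maintains partial, L1-cache-sized tables.

```python
import numpy as np

def build_svt(occupancy):
    """Summed-volume table over a binary occupancy volume (inclusive prefix
    sums, padded with a zero border so boxes touching the origin work)."""
    nx, ny, nz = occupancy.shape
    svt = np.zeros((nx + 1, ny + 1, nz + 1), dtype=np.int64)
    svt[1:, 1:, 1:] = occupancy.cumsum(0).cumsum(1).cumsum(2)
    return svt

def box_count(svt, x0, x1, y0, y1, z0, z1):
    """Occupied voxels in the half-open box [x0,x1) x [y0,y1) x [z0,z1),
    via 3-D inclusion-exclusion over eight table lookups."""
    return (  svt[x1, y1, z1] - svt[x0, y1, z1] - svt[x1, y0, z1] - svt[x1, y1, z0]
            + svt[x0, y0, z1] + svt[x0, y1, z0] + svt[x1, y0, z0] - svt[x0, y0, z0])

vol = (np.random.rand(64, 64, 64) > 0.9)            # sparse binary volume
svt = build_svt(vol.astype(np.int64))
assert box_count(svt, 8, 24, 0, 64, 16, 48) == vol[8:24, :, 16:48].sum()
```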
We present a parallel, distributed-memory technique that enhances traditional ray-casting volume rendering of large data sets to improve the depth perception of interesting volumetric features. The technique introduces a lighting system that accounts for global shadows across distributed MPI nodes while using shared-memory parallelism within each node to compute shading information efficiently. The first stage of the approach estimates the energy attenuation from a point light source through the global volume, using a reduced-spatial-resolution representation of the volume, with minimal global communication between nodes. This estimate is then used in the second stage, during volume rendering, to shade the sample points captured during ray casting, generating a high-quality image. In this work, we study the technique's performance across varying spatial resolutions of the estimated light attenuation, using synthetic and real-world volumetric data sets on distributed systems.
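A simplified, single-node sketch of the two stages, assuming a point light given in coarse-grid coordinates: the first function estimates per-cell transmittance by marching towards the light through a downsampled opacity volume, and the second darkens ray-casting samples using that coarse grid. The distributed MPI exchange of the coarse estimates is omitted.

```python
import numpy as np

def attenuation_grid(opacity, light_pos, reduce=4, steps=64):
    """First stage: estimate how much light from a point source reaches each
    cell of a reduced-resolution copy of the volume by marching from the cell
    towards the light and accumulating opacity."""
    coarse = opacity[::reduce, ::reduce, ::reduce]
    nx, ny, nz = coarse.shape
    trans = np.ones((nx, ny, nz))
    for idx in np.ndindex(nx, ny, nz):
        p = np.array(idx, dtype=float)
        seg = (np.asarray(light_pos, dtype=float) - p) / steps
        t = 1.0
        for s in range(1, steps):
            q = np.round(p + seg * s).astype(int)
            if np.all((q >= 0) & (q < coarse.shape)):
                t *= 1.0 - coarse[tuple(q)]     # light attenuated by opacity
        trans[idx] = t
    return trans

def shade(sample_rgb, sample_idx, trans, reduce=4):
    """Second stage: during ray casting, darken each sample by the transmittance
    looked up (nearest lookup here) in the coarse attenuation grid."""
    q = tuple(np.minimum(np.array(sample_idx) // reduce, np.array(trans.shape) - 1))
    return sample_rgb * trans[q]
```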
ISBN (print): 9781665481069
Subgraph enumeration is an important problem in the field of Graph Analytics with numerous applications. The problem is provably NP-complete and requires sophisticated heuristics and highly efficient implementations to be feasible on problem sizes of realistic scales. Parallel solutions have shown a lot of promise on CPUs and in distributed environments. Recently, GPU-based parallel solutions have also been proposed to take advantage of the massive execution resources in modern GPUs. Subgraph enumeration involves traversing a search tree for each vertex of the data graph to find matches of a query graph. Most GPU-based solutions traverse the tree in a breadth-first manner, which exploits parallelism at the cost of high memory requirements and presents a formidable challenge for processing large graphs with high-degree vertices, since the memory capacity of GPUs is significantly lower than that of CPUs. In this work, we propose a novel GPU solution based on a hybrid BFS/DFS approach in which the top level(s) of the search trees are traversed in a fully parallel, breadth-first manner, while each subtree is traversed in a more space-efficient, depth-first manner. The depth-first traversal of subtrees requires less memory but presents more challenges for parallel execution. To overcome the less parallel nature of depth-first traversal, we exploit fine-grained parallelism in each step of the depth-first traversal of subtrees. We further identify and implement various optimizations to efficiently utilize the memory and compute resources of the GPUs. We evaluate our performance against state-of-the-art GPU and CPU implementations, outperforming them with geometric mean speedups of 9.47x (up to 92.01x) and 2.37x (up to 12.70x), respectively. We also show that the proposed approach can efficiently process graphs that previously could not be processed by state-of-the-art GPU solutions due to their excessive memory requirements.
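A sequential sketch of the hybrid traversal order, under the assumption of a connected query graph whose matching order gives every vertex (after the first) an already-matched neighbour: the first query vertex is matched breadth-first against every data vertex (the fully parallel top level), and each partial match is then extended depth-first. The GPU kernels, fine-grained parallelism, and further optimizations are not modelled.

```python
def enumerate_matches(data_adj, query_adj, order):
    """Count matches of the query graph in the data graph (adjacency sets)."""
    matches = 0

    def extend(mapping, depth):                 # depth-first subtree traversal
        nonlocal matches
        if depth == len(order):
            matches += 1
            return
        u = order[depth]
        mapped_nbrs = [mapping[w] for w in query_adj[u] if w in mapping]
        cands = set(data_adj[mapped_nbrs[0]])   # candidates must connect to all
        for v in mapped_nbrs[1:]:               # already-mapped neighbours of u
            cands &= data_adj[v]
        for v in cands:
            if v not in mapping.values():       # keep the mapping injective
                mapping[u] = v
                extend(mapping, depth + 1)
                del mapping[u]

    for v in data_adj:                          # breadth-first top level
        extend({order[0]: v}, 1)
    return matches

# Triangle query on a small graph; each triangle is reported once per query
# automorphism (6x), as is usual without symmetry breaking.
data = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
query = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(enumerate_matches(data, query, order=[0, 1, 2]))  # -> 12 (two triangles)
```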
ISBN (print): 9781665440660
Large-scale simulations on nonuniform particle distributions that evolve over time are widely used in cosmology, molecular dynamics, and engineering. Such data are often saved in an unstructured format that neither preserves spatial locality nor provides metadata for accelerating spatial or attribute subset queries, leading to poor performance of visualization tasks. Furthermore, the parallel I/O strategy typically used writes a file per process or a single shared file, neither of which is portable or scalable across different HPC systems. We present a portable technique for scalable, spatially aware adaptive aggregation that preserves spatial locality in the output. We evaluate our approach on two supercomputers, Stampede2 and Summit, and demonstrate that it outperforms prior approaches at scale, achieving up to 2.5x faster writes and reads for nonuniform distributions. Furthermore, the layout written by our method is directly suitable for visual analytics, supporting low-latency reads and attribute-based filtering with little overhead.
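One common way to preserve spatial locality in the written layout is to sort particles by a space-filling-curve key before aggregation; the sketch below uses Morton (Z-order) codes as an illustrative stand-in, while the paper's adaptive aggregation layers further machinery on top of such an ordering.

```python
import numpy as np

def morton_key(ix, iy, iz, bits=10):
    """Interleave the bits of the quantised x/y/z coordinates into one key;
    sorting by this key places spatially close particles close together."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key

def locality_preserving_order(positions, bounds_min, bounds_max, bits=10):
    """Return the permutation that writes particles in Morton order."""
    span = np.asarray(bounds_max, dtype=float) - np.asarray(bounds_min, dtype=float)
    grid = ((positions - bounds_min) / span * ((1 << bits) - 1)).astype(np.int64)
    keys = [morton_key(x, y, z, bits) for x, y, z in grid]
    return np.argsort(keys)

pos = np.random.rand(1000, 3)
order = locality_preserving_order(pos, [0, 0, 0], [1, 1, 1])
sorted_particles = pos[order]        # layout to aggregate and write
```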
ISBN (digital): 9798350387179
ISBN (print): 9798350387186
Homomorphic encryption (HE) algorithms, particularly the Cheon-Kim-Kim-Song (CKKS) scheme, offer significant potential for secure computation on encrypted data, making them valuable for privacy-preserving machine learning. However, the high latency of large-integer operations in the CKKS algorithm hinders the processing of large datasets and complex computations. This paper proposes a novel strategy that combines lossless data compression techniques with the parallel processing power of graphics processing units to address these challenges. Our approach demonstrably reduces data size by 90% and achieves speedups of up to 100 times compared to conventional approaches. This method ensures data confidentiality while mitigating performance bottlenecks in CKKS-based computations, paving the way for more efficient and scalable HE applications.
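A generic sketch of the compress-before-transfer pattern, using zlib and a flat uint64 layout as stand-ins: coefficient data is losslessly compressed on the host before being shipped to the GPU and decompressed prior to computation. The paper's compression scheme is tailored to CKKS data and reaches far better ratios than a generic compressor would on high-entropy ciphertexts.

```python
import zlib
import numpy as np

def pack(coeffs: np.ndarray) -> bytes:
    """Losslessly compress a coefficient array before transfer."""
    return zlib.compress(np.ascontiguousarray(coeffs, dtype=np.uint64).tobytes())

def unpack(blob: bytes, shape) -> np.ndarray:
    """Decompress on the receiving side before computing."""
    return np.frombuffer(zlib.decompress(blob), dtype=np.uint64).reshape(shape)

coeffs = np.arange(1 << 14, dtype=np.uint64)     # toy polynomial coefficients
blob = pack(coeffs)
assert np.array_equal(unpack(blob, coeffs.shape), coeffs)
print(f"{coeffs.nbytes} bytes -> {len(blob)} bytes after compression")
```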