During real-time graphics rendering, objects are processed by the GPU in the order they are submitted by the CPU, and occluded surfaces are often processed even though they will not be part of the final image, wasting time and energy. To help discard occluded surfaces, most current GPUs include an Early-Depth test before the fragment processing stage. However, to be effective, it requires that opaque objects be processed in front-to-back order. Depth sorting and other object-level occlusion culling techniques incur overheads that are only offset for applications with substantial depth and/or fragment shading complexity, which is often not the case in mobile workloads. We propose a novel architectural technique for GPUs, Visibility Rendering Order (VRO), which reorders objects front-to-back entirely in hardware by exploiting the fact that objects in animated graphics applications tend to keep their relative depth order across consecutive frames (temporal coherence). Since order relationships are already tested by the Depth Test, VRO incurs minimal energy overhead: it only requires adding a small hardware unit that captures this information and uses it later to guide the rendering of the following frame. Moreover, unlike other approaches, this unit works in parallel with the graphics pipeline without any performance overhead. We illustrate the benefits of VRO using various unmodified commercial 3D applications, for which VRO achieves a 27 percent speed-up and a 15.8 percent energy reduction on average over a state-of-the-art mobile GPU.
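To make the dependence of the Early-Depth test on submission order concrete, the following is a minimal software sketch. It is not the VRO hardware mechanism described in the abstract; the function name `shaded_fragments`, the one-dimensional four-pixel "screen", and the toy scene are all assumptions made purely for illustration.

```python
def shaded_fragments(objects, width=4):
    """Count fragments that survive an early depth test for a given submission order.

    Each object is (depth, covered_pixels); a smaller depth is closer to the camera.
    Hypothetical toy model: one depth value per object, 1-D screen of `width` pixels.
    """
    depth_buffer = [float("inf")] * width
    shaded = 0
    for depth, pixels in objects:
        for p in pixels:
            # Early depth test: only fragments closer than the stored depth
            # reach the (expensive) fragment shading stage.
            if depth < depth_buffer[p]:
                depth_buffer[p] = depth
                shaded += 1
    return shaded


# Three full-screen opaque layers submitted back-to-front (worst case).
scene = [(3.0, range(4)), (2.0, range(4)), (1.0, range(4))]

back_to_front = shaded_fragments(scene)
front_to_back = shaded_fragments(sorted(scene, key=lambda obj: obj[0]))

print(back_to_front, front_to_back)
```

In this toy scene every fragment is shaded when layers arrive back-to-front, while front-to-back submission lets the depth test reject the two hidden layers before shading. The sketch sorts on the CPU; the abstract's point is that VRO avoids that sorting cost by deriving the order in hardware from the previous frame.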
ISBN: 9781595936288 (print)
Many interactive rendering algorithms require operations on multiple fragments (i.e., ray intersections) at the same pixel location; however, current Graphics Processing Units (GPUs) capture only a single fragment per pixel. Example effects include transparency, translucency, constructive solid geometry, depth-of-field, direct volume rendering, and isosurface visualization. With current GPUs, programmers implement these effects using multiple passes over the scene geometry, often substantially limiting performance. This paper introduces a generalization of the Z-buffer, called the k-buffer, that makes it possible to implement such algorithms efficiently with only a single geometry pass while requiring only a small, fixed amount of additional memory. The k-buffer uses framebuffer memory as a read-modify-write (RMW) pool of k entries whose use is programmatically defined by a small k-buffer program. We present two proposals for adding k-buffer support to future GPUs and demonstrate numerous multiple-fragment, single-pass graphics algorithms running on both a software-simulated k-buffer and a k-buffer implemented with current GPUs. The goal of this work is to demonstrate the large number of graphics algorithms that the k-buffer enables and to show that its efficiency is superior to current multi-pass approaches.
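The read-modify-write pool of k entries can be sketched in software as follows. This is only an illustration under stated assumptions: the abstract says the pool's use is defined by a k-buffer program, and the "keep the k nearest fragments" policy shown here is one hypothetical such program. The names `kbuffer_insert` and `render_single_pass` and the fragment tuples are invented for this example and do not come from the paper.

```python
def kbuffer_insert(entries, fragment, k):
    """Read-modify-write update of one pixel's pool, keeping the k nearest fragments.

    `entries` is the pixel's current pool (a list of (depth, color) tuples).
    The policy (keep the k smallest depths) is an assumed example program.
    """
    entries.append(fragment)            # read the pool and add the new fragment
    entries.sort(key=lambda e: e[0])    # modify: order by depth, nearest first
    del entries[k:]                     # write back at most k entries
    return entries


def render_single_pass(fragments, k):
    """Process every fragment exactly once, in arrival order (one geometry pass)."""
    pixels = {}
    for pixel, depth, color in fragments:
        kbuffer_insert(pixels.setdefault(pixel, []), (depth, color), k)
    return pixels


# Hypothetical fragment stream: (pixel, depth, color), arriving unsorted.
fragments = [
    ((0, 0), 5.0, "red"),
    ((0, 0), 1.0, "green"),
    ((0, 0), 3.0, "blue"),
    ((1, 0), 2.0, "white"),
]

result = render_single_pass(fragments, k=2)
print(result[(0, 0)])
```

With k = 1 and this policy the pool degenerates to an ordinary Z-buffer, which is the sense in which the k-buffer generalizes it; memory per pixel stays fixed at k entries regardless of how many fragments arrive.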
This paper presents various techniques for optimizing queries in distributed databases. Although no attempt is made to cover all proposed algorithms on this topic, quite a few ideas extracted from existing algorithms are outlined. It is hoped that large-scale experiments will be conducted to verify the usefulness of these ideas and that they will be integrated to construct a powerful algorithm for distributed query processing.