ISBN: 9781509026562 (print)
Recently, many approaches apply light-field image processing on smartphones and wearable devices. A Graphics Processing Unit (GPU) is commonly used to exploit parallelism in such image processing. However, the access pattern in light-field applications is sparser than in typical stencil applications and does not use all the data in a cache line. Furthermore, data requests to multiple locations generate an enormous number of short-burst memory transfers in the cache system, incur high latency, and do not fully utilize the high memory bandwidth of the GPU. Therefore, an alternative architecture that exploits long-burst data transmission, which improves memory bandwidth utilization, is essential. We propose a sparse-stencil-oriented Coarse Grain Reconfigurable Accelerator (CGRA) that we call EMAXV. Unlike on-demand multiple data loading on a GPU, EMAXV loads the input data with a long-burst transfer before execution proceeds, to conceal the sparse memory access and multi-threading cache races. It further hides the memory loading latency behind the execution latency of different activations. We evaluated the performance of EMAXV and a mobile GPU (Tegra K1) with identical host CPU frequency and main memory bandwidth. Although EMAXV has much lower computation capability, it achieved four times the performance of the mobile GPU for light-field depth extraction and 89% of its performance for light-field image rendering.
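The cache-line argument in this abstract can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the function name, the 4-byte element size, the 128-byte line size, and the two tap patterns are all assumptions chosen for illustration. It only estimates what fraction of the fetched bytes a single stencil evaluation actually uses.

```python
def cache_line_utilization(offsets, element_bytes=4, line_bytes=128):
    """Fraction of fetched bytes that one stencil evaluation actually uses.

    offsets       -- element indices touched by the stencil (hypothetical)
    element_bytes -- size of one element; 4 bytes is an assumption
    line_bytes    -- cache line size; 128 bytes is an assumption
    """
    taps = set(offsets)
    # Each distinct cache line touched must be fetched in full.
    lines = {(off * element_bytes) // line_bytes for off in taps}
    used = len(taps) * element_bytes
    fetched = len(lines) * line_bytes
    return used / fetched


# Eight contiguous taps share a single cache line.
dense = [0, 1, 2, 3, 4, 5, 6, 7]
# Eight widely separated taps each land in a different cache line.
sparse = [0, 1000, 2000, 3000, 4000, 5000, 6000, 7000]

dense_util = cache_line_utilization(dense)
sparse_util = cache_line_utilization(sparse)
```

Under these assumed sizes the sparse pattern uses a much smaller share of each fetched line than the contiguous one, which is the kind of waste that a single long-burst preload is meant to avoid.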
ISBN: 9781509036837 (print)
Immersive molecular visualization provides the viewer with intuitive perception of complex structures and spatial relationships that are of critical interest to structural biologists. The recent availability of commodity head-mounted displays (HMDs) provides a compelling opportunity for widespread adoption of immersive visualization by molecular scientists, but HMDs pose additional challenges due to the need for low-latency, high-frame-rate rendering. State-of-the-art molecular dynamics simulations produce terabytes of data that can be impractical to transfer from remote supercomputers, necessitating routine use of remote visualization. Hardware-accelerated video encoding has profoundly increased frame rates and image resolution for remote visualization; however, round-trip network latencies would cause simulator sickness when using HMDs. We present a novel two-phase rendering approach that overcomes network latencies with the combination of omnidirectional stereoscopic progressive ray tracing and high-performance rasterization, and its implementation within VMD, a widely used molecular visualization and analysis tool. The new rendering approach enables immersive molecular visualization with rendering techniques such as shadows, ambient occlusion lighting, depth-of-field, and high-quality transparency that are particularly helpful for the study of large biomolecular complexes. We describe ray tracing algorithms that are used to optimize interactivity and quality, and we report key performance metrics of the system. The new techniques can also benefit many other application domains.
Designated among 10 breakthrough technologies by MIT Technology Review [1], Deep Learning (DL) outperforms current approaches in many situations, e.g. image or speech processing. One of the most important deep architectures is the Convolutional Neural Network (CNN). The purpose of this paper is to provide practical recommendations for the development and deployment of CNN-based applications. The recommendations cover available hardware and software solutions and go further by providing guidance in choosing appropriate hyper-parameters (structure, training algorithm, learning rate, regularization techniques, etc.). The experimental results are reported using the CIFAR-10 dataset.
Although head-mounted displays (HMDs) are ideal devices for personal viewing of immersive stereoscopic content, exposure to VR applications on them results in significant discomfort for the majority of people, with symptoms including eye fatigue, headaches, nausea, and sweating. A conflict between accommodation and vergence depth cues on stereoscopic displays is a significant cause of visual discomfort. This article describes the results of an evaluation used to judge the effectiveness of dynamic depth-of-field (DoF) blur in reducing discomfort caused by exposure to stereoscopic content on HMDs. In a study using a commercial game engine implementation, participants reported a reduction of visual discomfort on a simulator sickness questionnaire when DoF blurring was enabled. The participants also reported a decrease in the severity of symptoms caused by HMD exposure, indicating that dynamic DoF can effectively reduce visual discomfort.
ISBN: 9781467371292 (print)
This study proposes the development of a virtual environment, in the form of a computer game, that aids the training and practice of Paralympic sport shooting using a digital image analysis system involving techniques for acquiring and processing images. Images were acquired by a webcam that interacts with the software, which processes and analyzes the images in matrix form and displays the results dynamically on the game screen. We conclude that it was possible to use programming techniques in Matlab, combined with image capture and processing, to create a training tool able to simulate the practice of Paralympic sport shooting, providing appropriate interactivity through graphical interfaces.
This work revisits the Shock Filters of Osher and Rudin [OR90] and shows how the proposed filtering process can be interpreted as the advection of image values along flow-lines. Using this interpretation, we obtain an efficient implementation that only requires tracing flow-lines and re-sampling the image. We show that the approach is stable, allowing the use of arbitrarily large time steps without requiring a linear solve. Furthermore, we demonstrate the robustness of the approach by extending it to the processing of signals on meshes in 3D.
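The advection reading of the shock filter can be sketched in one dimension. The classical filter evolves a signal by u_t = -sign(u_xx)|u_x|, which can be rewritten as transport of u with velocity v = sign(u_xx) sign(u_x), so one step amounts to looking up the value found a distance v·dt upstream. The code below is a minimal illustration under several assumptions that are not from the abstract: a 1D signal, clamped boundaries, central differences, linear interpolation, and a single straight backward trace per sample instead of a full flow-line integration. All function names are hypothetical.

```python
import math


def sign(x, eps=1e-12):
    """Sign of x, treating values within eps of zero as zero."""
    if x > eps:
        return 1.0
    if x < -eps:
        return -1.0
    return 0.0


def sample(u, pos):
    """Linearly interpolate u at a fractional position, clamped to the domain."""
    n = len(u)
    pos = min(max(pos, 0.0), n - 1.0)
    i0 = int(math.floor(pos))
    i1 = min(i0 + 1, n - 1)
    t = pos - i0
    return (1.0 - t) * u[i0] + t * u[i1]


def shock_filter_step(u, dt):
    """One semi-Lagrangian step: trace back along the flow and re-sample."""
    n = len(u)
    out = []
    for i in range(n):
        left, right = u[max(i - 1, 0)], u[min(i + 1, n - 1)]
        ux = 0.5 * (right - left)           # central first difference
        uxx = left - 2.0 * u[i] + right     # central second difference
        velocity = sign(uxx) * sign(ux)     # transport direction at sample i
        out.append(sample(u, i - velocity * dt))
    return out


# A smooth step edge, sharpened with a time step well above one sample.
signal = [1.0 / (1.0 + math.exp(-(i - 10) / 2.0)) for i in range(21)]
sharpened = shock_filter_step(signal, dt=3.0)
```

Because every output value is an interpolation of existing input values, the result stays within the input's range however large dt is, which gives some intuition for the stability claim; the sketch makes no attempt to reproduce the authors' actual flow-line tracing or their extension to meshes.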
ISBN: 9781509008377 (print)
Previous approaches to rendering large point clouds on immersive displays have generally created a trade-off between interactivity and quality. While these approaches have been quite successful for desktop environments when interaction is limited, virtual reality systems are continuously interactive, which forces users to suffer through either low frame rates or low image quality. This paper presents a novel approach to this problem through a progressive feedback-driven rendering algorithm. This algorithm uses reprojections of past views to accelerate the reconstruction of the current view. The presented method is tested against previous methods, showing improvements in both rendering quality and interactivity.
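Reprojecting a past view into the current one can be sketched as an unproject/project round trip. The example below is an illustration only and is not the paper's algorithm: it assumes a pinhole camera with focal length f and principal point (cx, cy), cameras that differ by translation only, per-pixel depth from the previous frame, and a nearest-depth test when several samples land on the same pixel. All names are hypothetical.

```python
def unproject(px, py, depth, f, cx, cy, cam_pos):
    """Lift a pixel with known depth into world space (axis-aligned camera)."""
    x = (px - cx) * depth / f
    y = (py - cy) * depth / f
    return (x + cam_pos[0], y + cam_pos[1], depth + cam_pos[2])


def project(point, f, cx, cy, cam_pos):
    """Project a world-space point into a camera; None if behind the camera."""
    x, y, z = (point[i] - cam_pos[i] for i in range(3))
    if z <= 0.0:
        return None
    return (f * x / z + cx, f * y / z + cy, z)


def reproject(prev_samples, f, cx, cy, prev_pos, cur_pos, width, height):
    """Scatter samples of the previous view into the current view.

    prev_samples -- iterable of (px, py, depth, color) from the past frame
    Returns a dict mapping (u, v) pixels to (depth, color), keeping the
    nearest sample where several land on the same pixel.
    """
    buffer = {}
    for px, py, depth, color in prev_samples:
        world = unproject(px, py, depth, f, cx, cy, prev_pos)
        hit = project(world, f, cx, cy, cur_pos)
        if hit is None:
            continue
        u, v, z = hit
        iu, iv = round(u), round(v)
        if 0 <= iu < width and 0 <= iv < height:
            if (iu, iv) not in buffer or z < buffer[(iu, iv)][0]:
                buffer[(iu, iv)] = (z, color)
    return buffer


# One sample at the image center, two units away; the camera then moves
# 0.1 units to the right, so the sample shifts left in the new view.
samples = [(50, 40, 2.0, "red")]
same_view = reproject(samples, 100.0, 50.0, 40.0, (0, 0, 0), (0, 0, 0), 100, 80)
moved_view = reproject(samples, 100.0, 50.0, 40.0, (0, 0, 0), (0.1, 0, 0), 100, 80)
```

Pixels that receive no reprojected sample are the holes a renderer would still have to fill with fresh work for the current frame.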
ISBN: 9781509045723 (print)
This paper presents a mixed-reality gaming system architecture with a two-phase spatial information processing model. This model assumes that the physical world consists of mainly stable elements such as buildings, and wide-area physical structures are captured in advance by using a 3D laser range scanner. The captured structure data has exactly the same structure and topology as the physical world. Our system recognizes the physical world at runtime by overlaying this virtual structure data on a runtime monocular camera image at exactly the same position. By adopting this model, it is straightforward to implement image-based lighting for photometric integrity and occlusion handling for geometric integrity, because the overlaid virtual structure provides precise distance metrics and depth information. This model also resolves conflicts involving multiple sensing results obtained by multiple users. This paper shows the results of empirical evaluation experiments using the implemented prototype system.
ISBN: 9781479953424 (print)
This paper examines the performance of two power-efficient hardware implementations using deep neural networks to perform a simple image classification task. We provide the first-ever examination of the accuracy-energy trade-offs of deep neural networks running on both an embedded GPU and a neuromorphic processor. IBM's TrueNorth is a brain-inspired, event-driven neuromorphic processor. It was designed to be scalable and to consume extremely low amounts of power. NVIDIA's Tegra K1 SoC is a mobile processor also designed with low power and a small footprint in mind. While these two chips were designed with similar constraints, the resulting architectures and the performance trade-offs achieved are significantly different. On our simple image classification task, Convolutional Neural Networks running on the Tegra K1 SoC achieve up to 89% accuracy with a normalized accuracy per active energy, ||Acc||/EA, score of up to 24.22 on our test dataset, while Tea Networks running on the TrueNorth processor achieve a lower accuracy of 82% but a better accuracy-energy trade-off, with a ||Acc||/EA score of up to 158.49.
New imaging stations aim for high spatial and temporal resolution and are characterized by ever-increasing sampling rates and demanding data processing workflows. Key to successful imaging experiments is opening up high-performance computing resources. This includes carefully selected components for computing hardware and the development of advanced imaging algorithms optimized for efficient use of parallel processor architectures. We present the novel UFO computing platform for online data processing for imaging experiments and image-based feedback. The platform handles the full data life cycle from the X-ray detector to long-term data archives. Core components of this system are an FPGA platform for ultra-fast data acquisition, the GPU-based UFO image processing framework, and the fast control system “Concert”. Reconstruction algorithms implemented in the UFO framework are optimized for the latest GPU architectures and provide a reconstruction throughput in the GB/s range. The control system “Concert” integrates high-speed computing nodes and fast beamline devices and thus enables image-based control loops and advanced workflow automation for efficient beam time usage. Low latencies are ensured by direct communication between FPGA and GPUs using AMD's DirectGMA technology. Time-resolved tomography is supported by cutting-edge regularization methods for high-quality reconstructions with a reduced number of projections. The new infrastructure at ANKA has dramatically accelerated tomography from hours to seconds and resulted in new application fields, such as high-throughput tomography, pump-probe radiography, and stroboscopic tomography. Ultra-fast X-ray cine-tomography allows one, for the first time, to observe the internal dynamics of moving millimeter-sized objects in real time.