ISBN: 9781509061686 (print)
With recent advances in deep convolutional neural networks (CNNs), deep learning has brought significant quality improvements and flexibility to single-image super resolution (SR). In this paper, we describe how CNN-based SR can be accelerated on integrated GPUs. To this end, we adopt a CNN model from an existing single-image SR approach and implement it within a well-known deep learning framework with OpenCL support. We also introduce a multi-tile approach in which a large input is divided into smaller tiles that are super-resolved individually; this makes better use of memory bandwidth and overcomes the size constraints imposed by certain frameworks and devices, thereby improving performance. The approach also extends single-image SR to video SR, where video frames are treated as a group of multiple tiles. We show that our approach resolves memory issues in generating ultra-high-resolution SR and speeds up CNN-based SR to as much as 44 fps across various output sizes without affecting quality.
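The multi-tile idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, images are plain nested lists, and nearest-neighbour upscaling stands in for the CNN model so the example stays self-contained. It only shows the split, per-tile upscale, and stitch steps.

```python
def upscale_nearest(tile, scale):
    """Stand-in for the CNN (assumption): nearest-neighbour upscaling of a 2-D list."""
    out = []
    for row in tile:
        wide = [px for px in row for _ in range(scale)]
        out.extend([list(wide) for _ in range(scale)])
    return out


def split_into_tiles(image, tile_h, tile_w):
    """Yield (top, left, tile) for non-overlapping tiles that cover the image."""
    height, width = len(image), len(image[0])
    for top in range(0, height, tile_h):
        for left in range(0, width, tile_w):
            tile = [row[left:left + tile_w] for row in image[top:top + tile_h]]
            yield top, left, tile


def tiled_super_resolution(image, tile_h, tile_w, scale, upscale=upscale_nearest):
    """Upscale each tile separately and stitch the results into one output image."""
    height, width = len(image), len(image[0])
    output = [[0] * (width * scale) for _ in range(height * scale)]
    for top, left, tile in split_into_tiles(image, tile_h, tile_w):
        sr_tile = upscale(tile, scale)
        for dy, sr_row in enumerate(sr_tile):
            y = top * scale + dy
            x = left * scale
            # Same-length slice assignment: copy the upscaled row into place.
            output[y][x:x + len(sr_row)] = sr_row
    return output
```

Because the stand-in upscaler works pixel by pixel, the tiled result equals the result of upscaling the whole image at once. A real CNN has a receptive field wider than one pixel, so tiles would probably need overlap or padding at their borders, which this sketch omits.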
ISBN: 9781467394871 (print)
Software-based network packet processing on standard high-volume servers promises better flexibility, manageability, and scalability, and has therefore gained tremendous momentum in recent years. Numerous research efforts have focused on boosting packet processing performance by offloading to discrete Graphics Processing Units (GPUs). Integrated GPUs, which reside on the same die as the CPU, offer many advanced features such as CPU-GPU communication over the on-chip interconnect and shared physical/virtual memory, yet their applicability to packet processing workloads has not been fully understood or exploited. In this paper, we conduct in-depth profiling and analysis to understand the integrated GPU's capabilities and performance potential for packet processing workloads. Based on that understanding, we introduce a GPU-accelerated network packet processing framework that fully utilizes the integrated GPU's massively parallel processing capability without requiring large packet batches, which can cause significant processing delay. We implemented the proposed framework and evaluated its performance with several common, lightweight packet processing workloads on the Intel® Xeon® Processor E3-1200 v4 product family (codename Broadwell) with an integrated GT3e GPU. The results show that our GPU-accelerated packet processing framework improves throughput by 2-2.5x compared to optimized packet processing on the CPU alone.
A new method for automatically identifying rare features in fingerprints, based on a combination of level 1 features and minutia-based triangular descriptors, is described. A feature is considered rare if it is statistically uncommon; for example, such a rare feature should be unique among N>1000 randomly sampled prints. A rare fingerprint feature has higher discriminatory power when it is identified in a print (latent or otherwise), and multiple rare features in a single print can increase discriminatory power dramatically. In the case of latent matching, such information can be significant for reaching a decision. The new approach was tested experimentally using the NIST SD-27 database and an FBI database of 11,036 unique fingerprints. The results indicated that every randomly selected fingerprint from the composite database has a small set of highly distinctive, statistically rare features, some of which occur in only 1 in 1000 fingerprints.
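The rarity criterion in this abstract can be sketched as a simple frequency count. This is only an illustration under stated assumptions, not the paper's method: each print is reduced to a set of hashable feature labels (hypothetical strings here, standing in for the level 1 features and triangular descriptors), and a feature counts as rare when it appears in at most one print per `one_in` prints sampled.

```python
from collections import Counter


def rare_features(prints, one_in=1000):
    """Return features that occur in at most 1 of every `one_in` prints.

    `prints` is a list of sets of hashable feature labels (an assumption;
    the paper's actual descriptors are not reproduced here).
    """
    counts = Counter()
    for features in prints:
        # Count each feature at most once per print.
        counts.update(set(features))
    total = len(prints)
    # Integer comparison avoids floating-point rounding at the threshold.
    return {feature for feature, n in counts.items() if n * one_in <= total}
```

With 1000 sampled prints, a feature seen in exactly one print passes the threshold, while a feature shared by every print does not.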
This paper provides an overview of error-resilient techniques for designing robust and energy-efficient nanoscale DSP systems, with a focus on statistical error compensation techniques. We demonstrate that logic-level error resiliency relies on techniques that are independent of the application context, which results in significant complexity overhead, especially with highly unreliable circuit fabrics. System-level error resiliency, such as statistical error compensation, instead employs techniques from statistical signal processing to exploit the hardware error behavior at the application level and to engineer the error compensation mechanism to match the application requirements. The benefits of such a design philosophy are tremendous gains in robustness (>1000×) and energy efficiency (3× to 6×). In addition, the paper paves the way for the deployment of novel statistical error compensation techniques based on principles from pattern recognition and iterative/turbo detection.