ISBN (print): 9789811064425; 9789811064418
Vectorization for SIMD extensions is similar to programming for CUDA/OpenCL on GPU platforms: both are Single Program Multiple Data (SPMD) programming models. However, SIMD extensions and GPU accelerators differ in many respects, such as memory access and divergence, so existing methods leave optimization opportunities unexploited when they are used to vectorize for SIMD extensions. In this paper, we therefore propose a whole-function vectorization optimization algorithm based on SIMD characteristics. First, we analyze the SIMD characteristics that may affect whole-function vectorization: instance versioning, instance regrouping, and SIMD code optimization. We then implement a whole-function vectorization algorithm based on these characteristics. In addition, we introduce a directive-based method that helps fully exploit opportunities for this kind of vectorization. We evaluate our technique on nine benchmarks from multimedia and image-processing applications. Compared with unoptimized code, the proposed technique achieves an average speedup of 1.59x on an E5-2600 processor.
ISBN (print): 9781467376846
Taking full advantage of SIMD instructions in C programs still requires tedious and non-portable programming with intrinsics, despite the considerable effort spent on developing auto-vectorization capabilities in recent decades. Whole-function vectorization (WFV) is a recent technique for extending the use of SIMD across entire functions. So far, WFV has only been used in data-parallel languages such as OpenCL and ISPC. We propose a vector-oriented programming framework that facilitates WFV directly in C. We show that our framework achieves performance competitive with OpenCL and ISPC while maintaining C's original syntax and semantics. This allows C programmers to improve the performance of their applications through better SIMD utilization, without stepping outside C.
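To make the idea of whole-function vectorization concrete, the following is a minimal sketch in plain C, not the framework described in the abstract. It assumes a hypothetical SIMD width `W` and two illustrative functions, `kernel_scalar` and `kernel_wfv`, neither of which comes from the paper. The scalar function handles one instance per call and contains a branch; the vectorized form handles `W` instances per call, evaluates both sides of the branch for every lane, and blends the results with a per-lane mask. The fixed-length loops stand in for SIMD instructions that a compiler might emit.

```c
#include <stddef.h>

#define W 4 /* hypothetical SIMD width: instances processed per call */

/* Scalar kernel: one instance per call, with divergent control flow. */
static float kernel_scalar(float x)
{
    if (x < 0.0f)
        return -x * 2.0f;
    return x + 1.0f;
}

/* Hand-written whole-function vectorized form (illustrative only):
 * the branch is replaced by a per-lane mask, both paths are computed
 * for all lanes, and a select merges the two results. */
static void kernel_wfv(const float x[W], float out[W])
{
    int   mask[W];
    float then_val[W];
    float else_val[W];

    for (size_t i = 0; i < W; i++)   /* vector compare */
        mask[i] = x[i] < 0.0f;
    for (size_t i = 0; i < W; i++)   /* "then" path, all lanes */
        then_val[i] = -x[i] * 2.0f;
    for (size_t i = 0; i < W; i++)   /* "else" path, all lanes */
        else_val[i] = x[i] + 1.0f;
    for (size_t i = 0; i < W; i++)   /* blend by mask */
        out[i] = mask[i] ? then_val[i] : else_val[i];
}
```

Calling `kernel_wfv` on an array of `W` inputs should give the same results as calling `kernel_scalar` on each element in turn. Computing both paths is safe here only because neither path has side effects; a real WFV implementation would have to handle side effects, loops, and function calls as well.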