Tensor network and tensor computation are widely applied in scientific and engineering domains like quantum physics, electronic design automation, and machine learning. As one of the most fundamental operations for te...
详细信息
Tensor network and tensor computation are widely applied in scientific and engineering domains like quantum physics, electronic design automation, and machine learning. As one of the most fundamental operations for tensor networks, a tensor contraction eliminates the sharing orders among tensors and produces a compact sub-network. Different contraction sequence usually yields distinct storage and compute costs, and searching the optimal sequence is known as a hard problem. Prior work have designed heuristic and fast algorithms to solve this problem, however, several issues still remain unsolved. For example, the data format and data structure are not efficient, the constraints during modeling are impractical, the search of the optimal solution might fail, and the search cost is very high. In this paper, we first introduce a log(k) order representation and design an adjacency matrix-based data structure to efficiently accelerate the search of the optimal contraction sequence. Then, we propose an outer product pruning method with acceptable overhead to reduce the search space. Finally, we use a multithread optimization in our implementation to further improve the execution performance. We also present indepth analysis of factors that influence the search time. This work provides a full-stack solution for optimal contraction sequence search from both high-level data structure and search algorithm to low-level execution parallelism, and it will benefit a broad range of tensor-related applications.
Any processor's performance is dependent on three important factor speed, area and power. The better tread-off between factors, an effective once. Multiplier are common used in computation process. In this paper, ...
详细信息
ISBN:
(纸本)9781538604496
Any processor's performance is dependent on three important factor speed, area and power. The better tread-off between factors, an effective once. Multiplier are common used in computation process. In this paper, the proposed multiplier by design based on the sutra "Urdhva Tiryakbhyam and Nikhilam" of Vedic are analyzed and the performance results of multiplier are compare with conventional multipliers and karatsuba once of the most popular. In conclusion of experiment Vedic mathematics can be improvement computation effective. Specifically, in accurate computation, in many digit mathematics or very long digit, that want power of computation. By helping of special opcode instruction that bundle in processor such as SSE, AVX and etc. in modern processor to increment parallel process level in single core by SIMD and vector register.
暂无评论