We propose a concrete ("pointer as integer") memory semantics for C that supports verified compilation to a target environment having simple "public vs. private" data protection based on tagging or ...
Deep Learning (DL) compilers such as TVM enable the efficient deployment of diverse DL models on heterogeneous and resource-constrained devices to meet the needs for low latency, privacy protection, and enhanced relia...
The existence of control dependencies within programs necessitates intricate data reorganization, significantly hindering the vectorization capabilities in automated SIMD compilation processes. The latest iteration of...
This research paper offers a comprehensive performance study of SpecFEM-3D, a well-known software package devised for simulating seismic wave propagation in complex 3D geological structures, on the ARM A64FX compute a...
In this paper we present our early work on improving Smalltalk performance by inlining message sends during compilation. Smalltalk developers typically write small method bodies with one or two statements; this limits...
Graph neural networks (GNNs) have recently empowered various novel computer vision (CV) tasks. GNN-based CV tasks employ either a combination of CNN and GNN layers or GNN layers alone. This paper introduces ...
This paper discusses the ongoing development of the Zag Smalltalk LLVM JIT Compiler project. The project is aimed at enhancing the performance of dynamic languages through JIT compilation using LLVM. We highlight the ...
ISBN: 9798400710773 (print)
Register Transfer Level (RTL) code optimization is crucial for enhancing the efficiency and performance of digital circuits during early synthesis stages. Currently, optimization relies heavily on manual efforts by skilled engineers, often requiring multiple iterations based on synthesis feedback. In contrast, existing compiler-based methods fall short in addressing complex designs. This paper introduces RTLRewriter, an innovative framework that leverages large models to optimize RTL code. A circuit partition pipeline is utilized for fast synthesis and efficient rewriting. A multi-modal program analysis is proposed to incorporate vital visual diagram information as optimization cues. A specialized search engine is designed to identify useful optimization guides, algorithms, and code snippets that enhance the model's ability to generate optimized RTL. Additionally, we introduce a Cost-aware Monte Carlo Tree Search (C-MCTS) algorithm for efficient rewriting, managing diverse retrieved contents and steering the rewriting results. Furthermore, a fast verification pipeline is proposed to reduce verification cost. To cater to the needs of both industry and academia, we propose two benchmarking suites: the long Rewriter benchmark, targeting complex scenarios with extensive circuit partitioning, optimization trade-offs, and verification challenges, and the short Rewriter benchmark, designed for a wider range of scenarios and patterns. Our comparative analysis with established compilers such as Yosys and E-graph demonstrates significant improvements, highlighting the benefits of integrating large models into the early stages of circuit design. We provide our benchmarks at https://***/yaoxufeng/RTLRewriter-Bench.
To achieve peak program performance, compilers employ numerous optimizations. Some of these optimizations, although highly effective, come at a high price in terms of compilation time, and the compile...
With the growth of digital data and rising security concerns, techniques for privacy-preserving computation have become increasingly essential. Big integer multiplication, pivotal for these applications, is compute-in...