There is a trend towards increased specialization of data management software for perfor- mance reasons. The improved performance not only leads to a more efficient usage of the underlying hardware and cuts the operat...
详细信息
There is a trend towards increased specialization of data management software for perfor- mance reasons. The improved performance not only leads to a more efficient usage of the underlying hardware and cuts the operation costs of the system, but also is a game-changing competitive advantage for many emerging application domains such as high-frequency algo- rithmic trading, clickstream analysis, infrastructure monitoring, fraud detection, and online advertising to name a few. In this thesis, we study the automatic specialization and optimization of database appli- cation programs – sequences of queries and updates, augmented with control flow constructs as they appear in database scripts, user-defined functions (UDFs), transactional workloads and triggers in languages such as PL/SQL. We propose to build online transaction processing (OLTP) systems around a modern compiler infrastructure. We show how to build an optimizing compiler for transaction programs using generative programming and state-of-the-art com- piler technology, and present techniques for aggressive code inlining, fusion, deforestation, and data structure specialization in the domain of relational transaction programs. We also identify and explore the key optimizations that can be applied in this domain. In addition, we study the advantage of using program dependency analysis and restructuring to enable the concurrency control algorithms to achieve higher performance. Traditionally, optimistic concurrency control algorithms, such as optimistic Multi-Version Concurrency Control (MVCC), avoid blocking concurrent transactions at the cost of having a validation phase. Upon failure in the validation phase, the transaction is usually aborted and restarted from scratch. The "abort and restart" approach becomes a performance bottleneck for use cases with high contention objects or long running transactions. In addition, restarting from scratch creates a negative feedback loop in the system, because the system
暂无评论