Details
ISBN: (Print) 9783031758713; 9783031758720
This paper investigates how model size affects the ability of a generative AI language model (GLM) to support the text-to-SQL task for databases with the large schemas typical of real-world applications. The paper first introduces a text-to-SQL framework that combines a prompt strategy with a Retrieval-Augmented Generation (RAG) technique, leaving the GLM and the database as configurable components. It then describes a benchmark based on an open-source database whose schema is much larger than the schemas of most databases in familiar text-to-SQL benchmarks. The paper proceeds with experiments that assess the performance of the text-to-SQL framework instantiated with the benchmark database and GLMs of different sizes. It concludes with recommendations to help select the GLM size appropriate for a given text-to-SQL scenario, characterized by the difficulty of the expected natural language (NL) questions and the data privacy requirements, among other characteristics.
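The core idea described above, retrieving only the schema fragments relevant to a question and then prompting a GLM with them, can be illustrated with a minimal sketch. This is an assumption-laden illustration rather than the paper's implementation: the table catalogue, `retrieve_schema`, and `build_prompt` are hypothetical names, and simple keyword overlap stands in for the embedding-based retrieval a real RAG component would use.

```python
"""Minimal sketch of a RAG-style text-to-SQL prompt builder.

All names here (TableDoc, TABLE_DOCS, retrieve_schema, build_prompt)
are illustrative assumptions, not the framework from the paper.
"""

from dataclasses import dataclass


@dataclass
class TableDoc:
    name: str
    ddl: str          # CREATE TABLE statement for this table
    description: str  # short NL description used for retrieval


# Toy catalogue standing in for a large real-world schema.
TABLE_DOCS = [
    TableDoc("orders", "CREATE TABLE orders (id INT, customer_id INT, total DECIMAL)",
             "customer purchase orders and totals"),
    TableDoc("customers", "CREATE TABLE customers (id INT, name TEXT, city TEXT)",
             "registered customers and their locations"),
    TableDoc("products", "CREATE TABLE products (id INT, name TEXT, price DECIMAL)",
             "product catalogue with prices"),
]


def retrieve_schema(question: str, docs: list[TableDoc], k: int = 2) -> list[TableDoc]:
    """Rank tables by naive keyword overlap with the question; a stand-in
    for the embedding-based retrieval a real RAG pipeline would use."""
    q_tokens = set(question.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_tokens & set(d.description.lower().split())),
        reverse=True,
    )
    return scored[:k]


def build_prompt(question: str, relevant: list[TableDoc]) -> str:
    """Assemble a prompt containing only the retrieved schema fragments,
    keeping the context small even when the full schema is large."""
    schema_block = "\n".join(doc.ddl for doc in relevant)
    return (
        "You are a text-to-SQL assistant.\n"
        f"Schema:\n{schema_block}\n\n"
        f"Question: {question}\n"
        "Return a single SQL query."
    )


if __name__ == "__main__":
    question = "Which city do the customers with the largest order totals live in?"
    prompt = build_prompt(question, retrieve_schema(question, TABLE_DOCS))
    print(prompt)  # in a full pipeline this prompt would be sent to the chosen GLM
```

In a complete pipeline the printed prompt would be submitted to the selected GLM; the retrieval step is what keeps the prompt within the model's context limits when the schema is large, which is why the choice of GLM size interacts with question difficulty and deployment constraints.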