Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/128634
Title: | Optimizing customer support using Text2SQL to query natural language databases |
Authors: | Maj, Michał; Pliszczuk, Damian; Marek, Patryk; Wilczewska, Weronika; Przysucha, Bartosz; Rymarczyk, Tomasz |
Keywords: | Natural language processing (Computer science); Customer services -- Data processing; SQL (Computer program language); Databases |
Issue Date: | 2024 |
Publisher: | University of Piraeus. International Strategic Management Association |
Citation: | Maj, M., Pliszczuk, D., Marek, P., Wilczewska, W., Przysucha, B., & Rymarczyk, T. (2024). Optimizing customer support using Text2SQL to query natural language databases. European Research Studies Journal, 27(s3), 426-438. |
Abstract: | PURPOSE: This paper explores the challenges and potential solutions associated with integrating Text2SQL technology into customer support operations. By leveraging large language models (LLMs) and tools like Vanna.AI, the study aims to enhance the efficiency and accuracy of handling customer queries without requiring specialized SQL knowledge.

DESIGN/METHODOLOGY/APPROACH: A comprehensive analysis was conducted comparing the effectiveness of three large language models—Llama3:70b-instruct, Gemma2:27b, and Codegemma—in generating correct SQL queries from natural language questions. The models were trained with identical datasets and evaluated using six benchmark questions over two iterations, with and without detailed database schema information. Performance metrics included correctness of the generated queries and response times.

FINDINGS: The results indicated that while Llama3 and Gemma2 initially demonstrated higher accuracy, the addition of detailed database schema information did not improve model performance. Instead, it led to decreased accuracy and increased response times, particularly for Llama3. Codegemma showed shorter response times but slightly lower accuracy. The study highlights that excessive contextual information can overwhelm LLMs, suggesting the need for optimized context provision.

PRACTICAL IMPLICATIONS: The findings suggest that simplifying database schema information and focusing on essential contextual data can enhance the performance of LLMs in generating SQL queries. Implementing tools like Vanna.AI, which utilize Retrieval Augmented Generation (RAG), can improve customer support processes by enabling quick and accurate data access without specialized SQL expertise.

ORIGINALITY/VALUE: This paper provides valuable insights into the practical challenges of implementing Text2SQL technology in customer support. It offers recommendations for balancing context provision and model capabilities, contributing to the optimization of LLM performance in real-world applications. |
URI: | https://www.um.edu.mt/library/oar/handle/123456789/128634 |
Appears in Collections: | European Research Studies Journal, Volume 27, Special Issue 1 - Part 2 |
Files in This Item:
File | Description | Size | Format
---|---|---|---
ERSJ27(s3)A27.pdf | | 307.2 kB | Adobe PDF
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.