Designing a Multimodal SQL Explanation System for Query Comprehension and SQL Learning
- Type: Master's thesis
- Supervisor:
Introduction
Natural Language Interfaces to Databases (NLIDBs) offer a promising way to lower the barriers non-technical users face when accessing data: they generate SQL queries from natural language input. However, non-technical users often lack the skills to judge whether a generated query actually matches what they intended. We developed MiXQL, a novel multimodal SQL explanation system designed to enhance query comprehension for non-technical users. In a preliminary study (N = 20), we evaluated our design with crowd workers. The results indicate that comprehension benefits are limited for simple queries, but that multimodal explanations substantially improve understanding of complex SQL constructs such as joins, aggregations, and subqueries. We now plan to extend MiXQL and conduct a large-scale study that evaluates its effect on query comprehension and its long-term benefits for learning SQL.
Aim
- Extend the functionality of MiXQL
- Design and conduct an empirical evaluation of MiXQL (e.g., an online experiment)
- Analyze the quantitative results
- Possibility of a joint research publication
Requirements
- Good programming skills in Python and JavaScript
- Interest in Human-Computer Interaction/Human-AI Collaboration
- Interest in quantitative research
- Strong time management and communication skills, and proficiency in English
Literature
Cembalo, M., De Santis, A., & Ferraro Petrillo, U. (2011). SAVI: A new system for advanced SQL visualization. Proceedings of the 2011 Conference on Information Technology Education, 165–170. https://doi.org/10.1145/2047594.2047641
Folland, K. A. T. (2016). viSQLizer: An interactive visualizer for learning SQL [Master's thesis]. https://ntnuopen.ntnu.no/ntnu-xmlui/handle/11250/2402729
Leventidis, A., Zhang, J., Dunne, C., Gatterbauer, W., Jagadish, H. V., & Riedewald, M. (2020). QueryVis: Logic-based Diagrams help Users Understand Complicated SQL Queries Faster. Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, 2303–2318. https://doi.org/10.1145/3318464.3389767
Miedema, D., & Fletcher, G. (2021). SQLVis: Visual Query Representations for Supporting SQL Learners. 2021 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), 1–9. https://doi.org/10.1109/VL/HCC51201.2021.9576431
Narechania, A., Fourney, A., Lee, B., & Ramos, G. (2021). DIY: Assessing the Correctness of Natural Language to SQL Systems. Proceedings of the 26th International Conference on Intelligent User Interfaces, 597–607. https://doi.org/10.1145/3397481.3450667
Tian, Y., Kummerfeld, J. K., Li, T. J.-J., & Zhang, T. (2024). SQLucid: Grounding Natural Language Database Queries with Interactive Explanations. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 1–20. https://doi.org/10.1145/3654777.3676368
Tian, Y., Zhang, Z., Ning, Z., Li, T. J.-J., Kummerfeld, J. K., & Zhang, T. (2024). Interactive Text-to-SQL Generation via Editable Step-by-Step Explanations (No. arXiv:2305.07372). arXiv. https://doi.org/10.48550/arXiv.2305.07372