This repo contains the test suite evaluation metric for 11 text-to-SQL tasks. Compared to other current metrics, test suite accuracy efficiently computes a tighter upper bound on semantic accuracy. It is ...
Developed with the assistance of GitHub Copilot (Claude / GPT models). AI assistance was used for code generation, refactoring, feature implementation, and documentation. All AI-generated code has ...