Can natural language replace SQL? We benchmarked the SQL-writing ability of the top 19 LLMs to find out.

Tinybird

Tinybird's LLM SQL Generation Benchmark evaluates how well 19 popular language models generate SQL queries that filter and aggregate large datasets. Comparing models such as OpenAI's GPT-4 Turbo and Anthropic's Claude, the benchmark measures accuracy, efficiency, and query latency, and highlights how hard it is for LLMs to write SQL that is both semantically correct and efficient. The analysis shows humans leading in efficiency, while LLMs often struggle with contextual understanding and miss optimization opportunities.

Which LLM writes the best analytical SQL?

Explore the benchmark, contribute, and suggest models

Pick the shittiest model from each provider and let’s rate LLMs based on it … cool
NOT A SINGLE THINKING MODEL? Dude… in May you should’ve at least done Claude 3.7 Thinking or o3-mini or R1
HELL, even o1-pro was released, it would’ve CRUSHED this bench

They didn’t test Codestral from Mistral.
In my case it is the best at understanding complex schemas.

Very good article and important guidance.

Nice. Probably outdated by now lol