A research collaboration between UPenn and UCB explores using LLM-powered agents to tackle the classic database problem of join order optimization. Rather than placing an LLM in the hot path of a query optimizer, the prototype agent acts as an offline experimenter: given a budget of 50 iterations, it proposes different join orderings via structured outputs, runs them, and learns from the observed runtimes. On the Join Order Benchmark (JOB) with a scaled-up IMDb dataset, a frontier model achieved a 1.288x geomean latency improvement over the default optimizer, with P90 latency dropping by 41%. The agent also outperformed both perfect cardinality estimates and BayesQO. Key insight: LLMs excel at the iterative, exploratory tuning process that human experts perform manually, especially for queries with difficult predicates, such as LIKE filters, that confound traditional cardinality estimators.
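The offline experiment loop described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: `tune_join_order`, `swap_proposer`, and `toy_measure` are hypothetical names, the proposer is a random-swap stand-in for the LLM call, and the runtime function is a made-up toy cost model rather than a real query execution. The `geomean_speedup` helper shows how a geometric-mean latency ratio of the kind reported is typically computed across queries.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Order = Tuple[str, ...]
History = List[Tuple[Order, float]]


def tune_join_order(
    tables: Sequence[str],
    propose: Callable[[Sequence[str], History], Order],
    measure: Callable[[Order], float],
    budget: int = 50,
) -> Tuple[Order, float, History]:
    """Try up to `budget` candidate join orders offline and keep the fastest."""
    best_order: Order = tuple(tables)
    best_runtime = measure(best_order)  # baseline: the starting order
    history: History = [(best_order, best_runtime)]
    for _ in range(budget):
        candidate = propose(tables, history)  # in the real system: an LLM call
        runtime = measure(candidate)          # in the real system: run the query
        history.append((candidate, runtime))
        if runtime < best_runtime:
            best_order, best_runtime = candidate, runtime
    return best_order, best_runtime, history


def geomean_speedup(baseline: Sequence[float], tuned: Sequence[float]) -> float:
    """Geometric mean of per-query baseline/tuned latency ratios."""
    ratios = [b / t for b, t in zip(baseline, tuned)]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# --- Hypothetical stand-ins, for illustration only ---
TABLES = ("cast_info", "movie_info", "name", "title")
TOY_SIZES = {"cast_info": 40.0, "movie_info": 15.0, "name": 8.0, "title": 5.0}
rng = random.Random(0)


def toy_measure(order: Order) -> float:
    """Made-up cost: large tables joined early are penalized more."""
    n = len(order)
    return sum((n - i) * TOY_SIZES[t] for i, t in enumerate(order))


def swap_proposer(tables: Sequence[str], history: History) -> Order:
    """Swap two positions in the best order seen so far (LLM placeholder)."""
    best = list(min(history, key=lambda h: h[1])[0])
    i, j = rng.sample(range(len(best)), 2)
    best[i], best[j] = best[j], best[i]
    return tuple(best)


best_order, best_runtime, history = tune_join_order(
    TABLES, swap_proposer, toy_measure, budget=50
)
print(best_order, best_runtime)
```

The key design point is that the proposer sees the full history of (order, runtime) pairs, which is what lets an agent refine its guesses between iterations instead of sampling blindly.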