Python is where ideas start, but it isn't where portable, low-latency software ends. In this talk, I'll show how we use LLMs inside a constrained, verifiable compiler pipeline to turn plain Python functions into self-contained native binaries that run anywhere (cloud, desktop, mobile, and web), and how our customers use this technology to run open-source AI models locally and in the cloud with the familiar OpenAI client experience.

Yusuf Olokoba is the founder of Muna, specializing in code generation for AI inference workloads. He previously co-founded a real estate technology startup, backed by First Round Capital and Bessemer, which was later acquired. He holds patents in computer vision and augmented reality that power augmented reality experiences used by millions of users. Yusuf holds a B.A. in computer science from Dartmouth College and is an alumnus of South Park Commons.

---
Socials:
- LinkedIn: https://www.linkedin.com/in/olokobayusuf/
- X (Twitter): https://x.com/olokobayusuf
- GitHub: https://github.com/olokobayusuf
- Company: Muna (https://muna.ai)

AI Engineer

This talk explores building a Python compiler that converts AI inference code into self-contained binaries for cross-platform deployment. It covers symbolic tracing, type propagation from dynamic Python to statically typed C++/Rust, and the use of LLMs to generate native code translations. It demonstrates compiling Google's Gemma embedding model into a binary accessible through an OpenAI-compatible client interface, enabling hybrid inference across cloud and edge devices.
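
For readers unfamiliar with symbolic tracing and type propagation, the toy sketch below shows the general idea in plain Python. It is not Muna's implementation: the `Sym` class, the `trace` helper, the `scale_and_shift` function, and the int/float promotion rule are all hypothetical names and simplifications chosen for illustration. The function is called with placeholder values that record each operation and carry a static type forward, which is the kind of information a compiler would need before emitting statically typed C++ or Rust.

```python
class Sym:
    """Hypothetical symbolic value: records operations instead of computing them."""

    def __init__(self, name, dtype, graph):
        self.name = name
        self.dtype = dtype
        self.graph = graph  # shared list of recorded operations

    def _binop(self, op, other):
        # Constants are recorded by their repr and typed by their Python type name.
        other_name = other.name if isinstance(other, Sym) else repr(other)
        other_type = other.dtype if isinstance(other, Sym) else type(other).__name__
        # Toy promotion rule (an assumption): any float operand makes the result float.
        dtype = "float" if "float" in (self.dtype, other_type) else "int"
        out = Sym(f"t{len(self.graph)}", dtype, self.graph)
        self.graph.append((out.name, op, self.name, other_name, dtype))
        return out

    def __add__(self, other):
        return self._binop("add", other)

    def __mul__(self, other):
        return self._binop("mul", other)


def trace(fn, **arg_types):
    """Run fn on symbolic inputs and return (recorded graph, inferred return type)."""
    graph = []
    args = {name: Sym(name, dtype, graph) for name, dtype in arg_types.items()}
    result = fn(**args)
    return graph, result.dtype


def scale_and_shift(x, w):
    # An ordinary Python function with no type annotations.
    return x * w + 1


graph, out_type = trace(scale_and_shift, x="int", w="float")
for node in graph:
    print(node)
print("return type:", out_type)
```

Each recorded tuple holds the result name, the operation, both operands, and the inferred type, so a later code generation stage could walk the list and emit one typed native statement per entry. A real pipeline would also need to handle control flow, library calls, and tensor shapes, which this sketch ignores.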

Compilers in the Age of LLMs — Yusuf Olokoba, Muna