The AI Eval Flywheel: Scorers, Datasets, Production Usage & Rapid Iteration

Last week I attended the 2025 AI Engineer World’s Fair in San Francisco with a bunch of other founders from Seattle Foundations.

There were over 20 tracks on specific topics, and I went particularly deep on Evals, learning firsthand how companies like Google, Notion, Zapier, and Vercel build and deploy evals for their AI features.

While each talk had meaningful unique details, there was also surprising consistency in the general framework, which I’m representing with this flywheel.

#strategy