OpenAI introduces a framework and 13 evaluations to measure chain-of-thought monitorability in AI reasoning models. The research examines how monitoring internal reasoning chains compares to monitoring outputs alone, finding that frontier models remain fairly monitorable and that longer reasoning improves monitorability. A key

12m read time From openai.com
Post cover image

Sort: