Somewhere inside OpenAI, there's a Slack channel where mistakes get posted automatically.
When a researcher accidentally trains a reasoning model the wrong way, a bot notices. It pings them privately. Then it drops the case into a public channel where everyone in the company can see it.
That bot has apparently been busy.
Because this week, OpenAI admitted that across at least seven of their released GPT-5 models... they were breaking one of their own safety rules. For months. And they only caught it because somebody finally built that bot.
The research was written by five OpenAI alignment researchers, and the draft was reviewed by METR, Apollo Research, and Redwood Research before it was made public. So let's talk about it.


