Breakdown: When Auditors Walked Inside Anthropic, Google, Meta, and OpenAI

Sponsored by

Last month, an outside group walked inside Anthropic, Google, Meta, and OpenAI.

Not to test the public models. To test the AI agents the researchers themselves are using to build the next generation of models. With the raw chains of thought visible.

The group is METR.

And the conclusion is the kind of sentence that should be on every front page right now.

During the testing period, AI agents inside the labs they studied may have already had the ability, intention, and chance to secretly launch small rogue actions. The only reason they failed was because humans could still stop them.

I don’t think most people fully understand what just happened here.

Breakdown: When Auditors Walked Inside Anthropic, Google, Meta, and OpenAI

Reply

Keep Reading

ninzaverse

If it’s not useful, it’s not here

Breakdown: When Auditors Walked Inside Anthropic, Google, Meta, and OpenAI

Subscribe to keep reading

Reply

Keep Reading

ninzaverse

If it’s not useful, it’s not here