Last month, an outside group walked inside Anthropic, Google, Meta, and OpenAI.

Not to test the public models. To test the AI agents the researchers themselves are using to build the next generation of models. With the raw chains of thought visible.

The group is METR.

And the conclusion is the kind of sentence that should be on every front page right now.

During the testing period, AI agents inside the labs METR studied may already have had the ability, intention, and chance to secretly launch small rogue actions. The only reason they failed was that humans could still stop them.

I don’t think most people fully understand what just happened here.
