Everyone has a theory about how AI will take over jobs.

Some people say it'll happen overnight. One day your job is fine, the next day an AI model update drops and suddenly your entire role is obsolete.

Others say it'll never really happen. AI is just a tool. It'll help, not replace.

But very few people actually go and measure it. With real tasks. Evaluated by real workers.

MIT just did.

A research team at MIT FutureTech published a paper that I think is one of the most important pieces of AI research this year.

They tested over 40 AI models on more than 3,000 real labor market tasks. Actual job tasks pulled from the U.S. Department of Labor's O*NET database.. the same system the government uses to define what people do at work.

And they didn't ask AI researchers to judge the outputs. They asked 17,000+ real workers with on-the-job experience in those exact roles.

The paper is called "Crashing Waves vs. Rising Tides."

And those two metaphors? They basically frame the entire future of how AI will affect your career.

Crashing Waves vs. Rising Tides

There have been two competing ideas in AI research about how automation hits the workforce.

The first idea is "crashing waves." AI suddenly becomes amazing at a specific set of tasks. Yesterday it couldn't do them. Today it nails them. Workers in those areas get blindsided. No warning. No transition period. Just.. gone.

A research group called METR published earlier work supporting this. They studied around 170 research and software engineering tasks and found exactly this pattern. Steep jumps. AI goes from nearly always failing to nearly always succeeding on certain tasks within a very short window.

The second idea is "rising tides." AI doesn't suddenly master one thing. It gets a little better at everything, all at once. The water rises equally everywhere. Slowly. Steadily.

Now here's the thing.

If crashing waves is the real pattern, you should be scared. Because it means you won't see it coming. Your specific job could be fine today and completely automatable tomorrow with a single model release.

But if rising tides is the real pattern, it's a totally different situation. You'd see AI slowly creeping into your tasks over months and years. You'd have time to adapt. Time to shift.

MIT tested which one is actually happening.

And the answer is the rising tide.

Here's how they did it. They took thousands of job tasks from the O*NET database. Things like preparing financial reports, writing project status presentations, assisting students with coursework, designing employee training programs.

Real stuff. Real jobs.

They filtered for tasks where AI could realistically save at least 10% of the time a human takes. That narrowed the pool to over 11,000 candidate tasks, from which the 3,000+ evaluated tasks were drawn.

For each task, they created detailed, realistic work scenarios.. the kind of prompt that mirrors what a professional actually encounters on the job. Then they fed those prompts to 41 AI models. Everything from GPT-3.5 to GPT-5, Claude Opus 4.1, Gemini 2.5 Pro, DeepSeek R1, Llama 4.. the full lineup.

The outputs were judged by domain experts. People who actually do these jobs. They rated each response on a 1-9 scale.

A score of 7 or above meant "this is good enough to use without any edits." Like, a manager would look at it and go, "Yeah, ship it."

You'd expect AI to crush 5-minute tasks and completely fall apart on 8-hour tasks, right? That would be the crashing wave pattern. Good at easy stuff, terrible at hard stuff.

But MIT found the curve is almost flat.

AI models performed at roughly similar levels on quick tasks and long tasks. The success rate drops a little as tasks get longer.. obviously. But it's a gentle slope, not a cliff.

The average success rate across all models was about 60%. And when the task got ten times longer, the predicted success dropped by only about 7.6 percentage points.

That's not a crash. That's barely a dip.
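To make "gentle slope" concrete, here's what a log-linear relationship with those numbers implies. The ~60% average and the 7.6-point drop per 10x increase in length are the figures reported above; the 1-hour baseline anchoring the average is my own placeholder, purely for illustration:

```python
import math

def predicted_success(task_hours: float) -> float:
    """Rough log-linear sketch: success falls ~7.6 points per 10x task length."""
    baseline_hours = 1.0    # assumed anchor for the ~60% average (not from the paper)
    avg_success = 0.60      # average success rate across models, as reported
    drop_per_decade = 0.076 # drop per 10x increase in task length, as reported
    decades = math.log10(task_hours / baseline_hours)
    return avg_success - drop_per_decade * decades

for hours in (0.1, 1, 10, 100):
    print(f"{hours:>5}h task: {predicted_success(hours):.0%}")
# 6-minute tasks land around 68%, 100-hour tasks around 45% --
# a slide of roughly 20 points across three orders of magnitude,
# not a drop from "always works" to "always fails."
```

A crashing-wave pattern would look like 90%+ on short tasks collapsing to near zero past some threshold. This curve never does that.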

And this pattern held across different model sizes and different time periods. Whether they looked at big models or small ones, newer models or older ones.. the curve stayed flat. The rising tide pattern showed up everywhere.

Now let me give you the numbers that actually matter here. Because I think the raw data tells a more honest story than any narrative.

In Q2 of 2024, frontier AI models could handle tasks that take humans about 3-4 hours with a 50% success rate.

By Q3 of 2025.. just over a year later.. that 50% success rate applied to tasks that take humans about a week.

The "doubling time".. meaning how many months it takes for AI to handle tasks twice as long at the same quality bar.. is 3.8 months.

Every 3.8 months, the complexity ceiling roughly doubles.

And the failure rate for AI on these tasks halves every 2.4 to 3.2 years depending on the task length. That corresponds to an improvement of about 8-11 percentage points per year.
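A quick back-of-the-envelope check on that doubling-time claim. The 3-4 hour baseline, the 3.8-month doubling time, and the Q2 2024 to Q3 2025 window are the figures quoted above; the rest is my own arithmetic:

```python
baseline_hours = 3.5    # midpoint of the 3-4 hour range, Q2 2024
doubling_months = 3.8   # quoted doubling time
elapsed_months = 15     # Q2 2024 -> Q3 2025, roughly

doublings = elapsed_months / doubling_months
horizon_hours = baseline_hours * 2 ** doublings

print(f"{doublings:.1f} doublings -> ~{horizon_hours:.0f} hours")
# Just under 4 doublings lands around 54 hours -- a bit over a
# 40-hour work week, consistent with the "about a week" claim.
```

The numbers hang together: roughly four doublings in fifteen months takes you from an afternoon's work to a week's.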

If you extrapolate this (and the researchers were very careful to say this is an upper-bound, optimistic scenario), most text-based work tasks could reach 80-95% AI success rates by 2029.

Reaching near-perfect success rates.. like 99%.. would take considerably longer. Years longer. Because of how the math works. The closer you get to perfection, the harder each additional percentage point becomes.
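Here's the math behind that. If the failure rate halves on a fixed clock, every halving costs the same amount of time, so each step toward perfection buys less and less. A sketch, assuming a 40% starting failure rate (the flip side of the ~60% average success above) and the midpoint 2.8-year half-life; both are my illustrative assumptions:

```python
import math

failure = 0.40         # assumed starting failure rate (~60% success today)
half_life_years = 2.8  # midpoint of the 2.4-3.2 year range

def years_to_reach(target_success: float) -> float:
    """Years until the failure rate halves its way down to 1 - target_success."""
    target_failure = 1 - target_success
    return half_life_years * math.log2(failure / target_failure)

for s in (0.80, 0.90, 0.95, 0.99):
    print(f"{s:.0%} success: ~{years_to_reach(s):.0f} years")
# Getting to 80% success takes one halving (~3 years);
# getting to 99% takes more than five halvings (~15 years).
```

That's the shape of the problem: the distance from 90% to 99% costs far more time than the distance from 60% to 90%.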

And for a lot of jobs, "minimally sufficient" isn't good enough. A lawyer can't send out a contract that's "mostly correct." A doctor can't give a diagnosis that's "good enough 80% of the time."

So the tide is rising. Fast. But it's not a flood. Not yet.

There's another finding in the paper that I think is underrated.

They compared what happens when you make models bigger versus when you release newer models. And the difference is fascinating.

Bigger models (over 100 billion parameters) did much better on short, simple tasks compared to smaller models. But on longer, complex tasks? The advantage shrank. Making the model bigger mostly helped with the easy stuff.

But newer models.. regardless of size.. improved almost equally across all task durations. Short tasks, long tasks, everything got better by roughly the same amount.

This is important because it tells you something about the nature of AI progress right now. It's not just about throwing more compute at the problem. The algorithmic improvements, the training techniques, the architectural changes.. those are lifting the entire floor evenly.

That's the rising tide in action.

The researchers were also honest about what their data doesn't tell you.

High success rates on individual tasks don't mean those jobs are getting automated tomorrow. There's a massive gap between "AI can technically do this task in a controlled setting" and "a company actually replaces a human with AI for this task."

Their setup gave AI all the information it needed. In real life, gathering that information is often half the work.

And automating individual tasks is fundamentally different from automating entire jobs. A job is a bundle of tasks. Losing one task to AI might actually make you more valuable, not less, because the remaining tasks.. the ones AI can't do.. become the differentiator.

My Take

This paper doesn't have a dramatic conclusion. And that's exactly why I think it's important.

The "crashing wave" story is scarier. It makes better headlines. AI suddenly gets superhuman at X and everyone in that field is done. That's a viral tweet.

But the data says something quieter. AI is getting better at almost everything, slowly and steadily, across the board. No dramatic collapses in specific job categories. No sudden cliffs. Just a tide that keeps rising, evenly, across the entire economy.

And honestly.. I think the rising tide is actually more concerning than the crashing wave.

Here's why.

A crashing wave is visible. You can see it coming. You can prepare. You can move.

A rising tide? You don't notice it until your feet are wet.

You're not going to wake up one morning and find your job gone. But you might look back three years from now and realize AI quietly absorbed 40% of what you used to do. One task at a time. Without any single moment where everything changed.

The question isn't whether AI will automate your tasks. MIT just showed you the data. It will.

The question is what you do with the time it frees up.

Do you think the "rising tide" pattern makes AI automation more concerning or less concerning than a sudden "crashing wave"?

If you made it this far, you're not a casual reader. You actually think about this stuff.

So here's my ask. If this article made you think, even a little, share it with one person. Just one. Someone who's in the AI space. Someone who reads. Someone who would actually sit with these ideas instead of scrolling past them.

That's how this newsletter grows. Not through ads or algorithms. Through you sending it to someone and saying "read this."

And honestly? That means more to me than any metric.
