When the labs building frontier AI expect it to reach human level — and, just as important, what each step unlocks and what to do about it. Updated every time a new model moves the date.
“When I say AGI, I mean a machine that can do the sorts of cognitive things that people can typically do, possibly more.”
We anchor to Legg's definition because it's the most cited and the most testable: AGI is reached only when expert teams, with full access to the system, can no longer find a cognitive task where it falls below a typical human. Legg has put the odds at 50% by 2028 — an estimate he's held since his 2011 blog post.
Each dot is a model. Left → right is when it shipped; bottom → top is how long a task it finishes unattended at 50% reliability. Color marks the lab. The solid line is the trend; the dashed line continues it at the measured doubling.
GPT-4 handled ~4-minute tasks in 2023. Three years on, Claude Opus 4.6 handles ~12 hr at 50% reliability — though only ~1.2 hr at the stricter 80% bar. Doubling about every 4.2 months.
Log scale: each gridline is a 10× jump, so steady doubling shows up as a straight climb — the cleanest way to read the trend.
Why does the dashed line climb so fast? It isn't drawn by hand — it continues the measured pace. The 50% horizon doubles about every 4.2 months (METR, since 2023); hold that and it compounds — about 5.7 doublings in two years, roughly 51×. Compounding, not optimism, is what lifts the line.
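The compounding arithmetic can be checked in a few lines. This is a minimal sketch that assumes the doubling time stays fixed at the quoted 4.2 months; the function name `growth_factor` is ours, not METR's. The result lands near the multiple quoted above, and the exact value depends on how the 4.2-month figure is rounded.

```python
DOUBLING_MONTHS = 4.2  # doubling time quoted in the text (METR, since 2023)


def growth_factor(months: float, doubling_months: float = DOUBLING_MONTHS) -> tuple[float, float]:
    """Return (number of doublings, total multiple) after `months` at a fixed doubling time."""
    doublings = months / doubling_months
    return doublings, 2.0 ** doublings


# Two years at the measured pace.
doublings, factor = growth_factor(24)
print(f"{doublings:.1f} doublings, a multiple of about {factor:.1f}x")
```

Halving the window to one year does not halve the multiple; it takes its square root, which is why small shifts in the doubling time move the projected dates so much.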
Extrapolated, the curve reaches a work-month of autonomous work around Jun 2027 (range Mar 2027–Jan 2028) at 50% reliability. 50% means “succeeds about half the time”; 80% is the dependable bar and lands later — toggle the reliability above to compare. The 50% path lands on our end-2028 AGI call from the evidence side, not the opinion side.
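The extrapolation itself is a one-line logarithm. The sketch below uses the figures from the text (~12 hr today, a 167-hour work-month, 4.2-month doubling). The anchor date of February 2026 for the Claude Opus 4.6 measurement is an assumption made for illustration, and `months_to_reach` and `add_months` are hypothetical helper names.

```python
import math
from datetime import date


def months_to_reach(current_hours: float, target_hours: float, doubling_months: float) -> float:
    """Months until the horizon grows from current_hours to target_hours at a fixed doubling time."""
    return math.log2(target_hours / current_hours) * doubling_months


def add_months(start: date, months: float) -> date:
    """Shift `start` by a rounded number of months, returning the first day of the resulting month."""
    index = start.year * 12 + (start.month - 1) + round(months)
    return date(index // 12, index % 12 + 1, 1)


# Figures from the text; the anchor date is an assumption, not a sourced value.
needed = months_to_reach(current_hours=12, target_hours=167, doubling_months=4.2)
anchor = date(2026, 2, 1)  # ASSUMED measurement date for the ~12 hr point
print(f"{needed:.1f} months from anchor -> {add_months(anchor, needed):%b %Y}")
```

Moving the anchor or the doubling time by a month or two shifts the result by a similar amount, which is one way to read the quoted range around the central date.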
The curve above shows how fast capability has grown. But when AI can do a work-month (≈167 work-hours) of expert work unattended depends entirely on what the rate does next. Below are three assumptions, all anchored on the same measured point (Claude Opus 4.6 at ~12 hr), and how far apart they land.
Log scale: the trend is a straight line, the stalled path bends down, the RSI path bends up — so the three rate regimes read at a glance.
Same measured data; three guesses at the rate. The date AI could sustain a work-month of expert work swings from ~Jun 2027 (trend) to ~Sep 2032 if compounding stalls — years apart, entirely from how you extrapolate. Most people extrapolate linearly and land late; capability has compounded, and compounding lands early. That gap — between the linear intuition and the exponential reality — is where the edge is. The RSI path is the tail: if models start accelerating their own progress, even the trend date is conservative.
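To see how sensitive the landing date is to the rate assumption, here is a rough sketch of the three regimes. It is not the model behind the chart: the `stretch` parameter and its values (1.5 for a stall, 0.8 for acceleration) are illustrative assumptions and will not reproduce the dates quoted above. Each successive doubling simply takes `stretch` times as long as the one before it.

```python
import math


def months_to_target(current_hours: float, target_hours: float,
                     doubling_months: float, stretch: float = 1.0) -> float:
    """Months until the horizon reaches target_hours.

    stretch == 1.0 keeps the trend, > 1.0 lengthens each doubling (stall),
    < 1.0 shortens each doubling (acceleration). A partial final doubling
    is counted pro rata, which is exact only when stretch == 1.0.
    """
    remaining = math.log2(target_hours / current_hours)  # doublings still needed
    months, step = 0.0, doubling_months
    while remaining > 0:
        fraction = min(1.0, remaining)  # the last doubling may be partial
        months += fraction * step
        remaining -= fraction
        step *= stretch
    return months


# Same anchor (~12 hr) and target (167 hr); only the rate assumption changes.
for label, stretch in [("trend", 1.0), ("stalled (assumed 1.5)", 1.5), ("RSI (assumed 0.8)", 0.8)]:
    print(f"{label:>22}: {months_to_target(12, 167, 4.2, stretch):5.1f} months")
```

Even with these made-up parameters, the stalled path takes roughly twice as long as the trend, while the accelerating path arrives only a few months sooner, which matches the asymmetry the chart describes.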
The curve says when. This says what changes when each bar falls — and where to stand before it does. Every rung is a measured date paired with our read on what it unlocks and how to position. The dates are METR's; the reads are ours.
The unit you delegate jumps from a question to a whole task. But at 50% reliability your job flips from doing the work to checking it, so the value migrates to whatever catches the other half — eval harnesses, diff review, sandboxes, replay.
Bet on the verification layer. Generation stopped being the bottleneck; trusting the output is the new one. Tooling priced on a human in every loop starts to look mispriced.
Now you delegate a project, not a task: an agent can hold a week-long goal, sequence its own sub-tasks, and recover from its own mistakes across days. The work that survives is framing the goal and judging the result — taste and specification, not execution.
The org chart becomes the product. Whoever turns one operator into a manager of agents takes the seat — and headcount-priced software wobbles hardest here, because one person now ships a team’s output.
The question stops being which tasks and becomes which jobs. Whole functions — a research desk, a junior dev team, a paralegal pool — can run as a service rather than a headcount.
When cognition is cheap, the constraint moves off cognition — onto what doesn’t scale with model quality: the compute and energy to run it, the trust and liability when it’s wrong, and proprietary data nobody can reproduce. Owning the scarce complement beats owning the model.
Every major capability event, with how it moved our range — and why.
Anthropic co-founder Jack Clark, after a multi-week internal-data review, publicly assigned a 60% probability to recursive self-improvement occurring before end of 2028. Pere's range (end of 2028 to end of 2030) already absorbs Clark's call: it acts as a forcing function on the low side, not a new data point that moves the curve.
Six weeks after a more conservative "five to 10 years" line at Davos, Hassabis pulled in to "AGI is on the horizon, maybe within the next five years" at the India AI Impact Summit. The shift narrows the public DeepMind position toward Legg's 2028 anchor and toward Pere's 2028–2030 range. We treat this as a meaningful pull-forward from the most cautious frontier-lab CEO.
Each estimate above is a real, public claim. Here's the verbatim quote, the date, and the link.
“I think there's a 50% chance that we have AGI by 2028. Now, it's just a 50% chance.”
“Now in 2026, we're at another threshold moment where AGI is on the horizon, maybe within the next five years.”
“My basic prediction is that powerful AI could come as early as 2026, though there are also ways it could take much longer.”
“I think there's a 60% chance that recursive self-improvement (RSI) will occur before the end of 2028.”
“It is possible that we will have superintelligence in a few thousand days; it may take longer, but I'm confident we'll get there.”
Every forecast presented is anchored on public, verified statements from leading AI research laboratories and industry figures. Verbatim citations, precise publication dates, and primary source links are documented for every entry.
The composite range represents a statistical consensus interval rather than a single target. Updates are applied systematically based on verified capability jumps, hardware milestones, and architectural breakthroughs.