From GPT-4 to AGI

Table of Contents

Article Description

Counting the Efficiency Increase

Follow the OOM to see a glimpse of the future

The data wall and how to smash through it

A counter argument to stalled progress

Unhobbling


Article Description

This article makes the case that a leap to AGI by 2027 is possible. It argues that over the next 3 years we should see another jump equivalent to the one from GPT-2 to GPT-4, and explains why.

AI Trends

Counting the Efficiency Increase

The author argues that progress in AI will come from a combination of three factors:

How much more powerful the computers are (compute).
How much smarter the algorithms are getting (algorithmic efficiencies), which makes the computers feel even more powerful (we call this "effective compute").
How fixing small issues can make AI even better (unhobbling gains).

The author tracks all of this in OOMs (orders of magnitude), where each OOM is a 10x gain.

On that basis, the author expects roughly a 100,000x (5 OOM) scaleup in effective compute over the next 4 years.
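To make the OOM bookkeeping concrete, here is a minimal Python sketch. The helper names `to_ooms` and `from_ooms` are illustrative and do not come from the article; the only figure taken from the text is the 100,000x scaleup.

```python
import math

def to_ooms(multiplier: float) -> float:
    """Convert a raw multiplier (e.g. 100_000x) into orders of magnitude."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return math.log10(multiplier)

def from_ooms(ooms: float) -> float:
    """Convert orders of magnitude back into a raw multiplier."""
    return 10.0 ** ooms

# The article's figure: a ~100,000x effective compute scaleup.
scaleup_ooms = to_ooms(100_000)
print(f"100,000x is about {scaleup_ooms:.1f} OOMs")
```

Working in OOMs is convenient because separate gains that multiply together turn into numbers that simply add.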

AI Speed of Progress

Look at the chart below for the improvement that has been made in a single year.

GPT Improvement

Over and over again, year after year, skeptics have claimed “deep learning won’t be able to do X” and have been quickly proven wrong. If there’s one lesson we’ve learned from the past decade of AI, it’s that you should never bet against deep learning.

Leopold Aschenbrenner

Follow the OOM to see a glimpse of the future

With each OOM of effective compute, models predictably, reliably get better. If we can count the OOMs, we can (roughly, qualitatively) extrapolate capability improvements. That’s how a few prescient individuals saw GPT-4 coming.

Leopold Aschenbrenner

We can break down the progress from GPT-2 to GPT-4 into three big improvements:

Compute: We're using much bigger and more powerful computers to train these AI models.
Algorithmic efficiencies: The algorithms are getting better and smarter, making the computers seem even more powerful. We can measure this as growing "effective compute."
Unhobbling gains: AI models have lots of potential, but they are held back by simple issues. By making small fixes like using human feedback, step-by-step thinking, and adding helpful tools, we can unlock a lot of hidden abilities and make the AI much more useful.
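The relationship between the first two items can be sketched in a few lines of Python. Because the two factors multiply, their OOMs add. The input numbers below are placeholders chosen for illustration, not figures from the article, and `effective_compute_ooms` is a hypothetical helper name.

```python
import math

def effective_compute_ooms(compute_multiplier: float,
                           algorithmic_multiplier: float) -> float:
    """OOMs of effective compute from physical compute and algorithmic gains.

    Effective compute multiplies the two factors, so their OOMs add.
    """
    return math.log10(compute_multiplier) + math.log10(algorithmic_multiplier)

# Placeholder inputs (assumptions for illustration only):
# 1,000x more physical compute and 100x better algorithms.
total = effective_compute_ooms(1_000, 100)
print(f"{total:.1f} OOMs of effective compute")
```

Unhobbling gains are left out of this sum on purpose: the article describes them as unlocking hidden abilities rather than as a compute multiplier.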

Efficiency Gain

The data wall and how to smash through it

There is a potentially important source of variance for all of this: we’re running out of internet data. That could mean that, very soon, the naive approach to pretraining larger language models on more scraped data could start hitting serious bottlenecks.

Leopold Aschenbrenner

But maybe there is a way to be more efficient with the data we have.

What a modern AI model (LLM) does during training is like skimming through a textbook really fast without much thinking.

When we read a math textbook, we do it slowly, think about it, discuss it with friends, and try practice problems until we understand. We wouldn't learn much if we just skimmed through it like the AI models do.

But there are ways to help AI models learn better by making them do what we do: think about the material, discuss it, and keep trying problems until they get it. This is what synthetic data, self-play, and reinforcement learning approaches aim to achieve.
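The "keep trying problems until they get it" idea can be illustrated with a toy generate-and-verify loop in Python. This is a sketch under strong assumptions, not how any lab actually builds synthetic data: `noisy_solver` is a made-up stand-in for a model, and `verify` only works because addition has an exact checker.

```python
import random

def noisy_solver(a: int, b: int, rng: random.Random) -> tuple[str, int]:
    """Stand-in for a model: adds two numbers but is sometimes off by one."""
    answer = a + b + rng.choice([0, 0, 0, 1, -1])
    reasoning = f"{a} + {b} = {answer}"
    return reasoning, answer

def verify(a: int, b: int, answer: int) -> bool:
    """Ground-truth checker; real tasks would need a task-specific verifier."""
    return answer == a + b

def build_synthetic_dataset(num_problems: int, max_attempts: int, seed: int = 0):
    """Retry each problem until the checker accepts, keeping only correct work."""
    rng = random.Random(seed)
    dataset = []
    for _problem in range(num_problems):
        a, b = rng.randint(1, 99), rng.randint(1, 99)
        for _attempt in range(max_attempts):
            reasoning, answer = noisy_solver(a, b, rng)
            if verify(a, b, answer):
                dataset.append({"problem": f"{a} + {b}", "solution": reasoning})
                break
    return dataset

examples = build_synthetic_dataset(num_problems=5, max_attempts=10)
for example in examples:
    print(example)
```

Only verified attempts are kept, so the resulting dataset is cleaner than the solver that produced it. That filtering step is the point of the sketch.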

A common pattern in deep learning is that it takes a lot of effort (and many failed projects) to get the details right, but eventually some version of the obvious and simple thing just works. Given how deep learning has managed to crash through every supposed wall over the last decade, my base case is that it will be similar here.

Leopold Aschenbrenner

A counter argument to stalled progress

Moreover, it actually seems possible that cracking one of these algorithmic bets like synthetic data could dramatically improve models. Here’s an intuition pump. Current frontier models like Llama 3 are trained on the internet—and the internet is mostly crap, like e-commerce or SEO or whatever. Many LLMs spend the vast majority of their training compute on this crap, rather than on really high-quality data (e.g. reasoning chains of people working through difficult science problems). Imagine if you could spend GPT-4-level compute on entirely extremely high-quality data—it could be a much, much more capable model.

Leopold Aschenbrenner

AlphaGo, the first AI to beat world champions at Go, is a great example.

Step 1: AlphaGo learned by watching expert human Go games. This gave it a basic understanding.
Step 2: AlphaGo played millions of games against itself. This made it far stronger, leading to moves like the famous move 37 against Lee Sedol, which was brilliant and unexpected. This self-play method allowed AlphaGo to explore new strategies and refine its skills beyond human capabilities. It shows how AI can advance rapidly by learning from its own experience, potentially leading to breakthroughs in other fields as well.

As an aside, this also means that we should expect more variance between the different labs in coming years compared to today. Up until recently, the state of the art techniques were published, so everyone was basically doing the same thing. (And new upstarts or open source projects could easily compete with the frontier, since the recipe was published.) Now, key algorithmic ideas are becoming increasingly proprietary. I’d expect labs’ approaches to diverge much more, and some to make faster progress than others—even a lab that seems on the frontier now could get stuck on the data wall while others make a breakthrough that lets them race ahead. And open source will have a much harder time competing. It will certainly make things interesting.

Leopold Aschenbrenner

Unhobbling

Finally, let's talk about "unhobbling" - making AI models work better by removing simple limitations.

Imagine if you had to solve a hard math problem instantly, without working it out step-by-step. It would be really hard, right? That’s how we used to make AI solve math problems. But we figured out a better way: letting AI work through problems step-by-step, just like we do. This small change, called "Chain-of-Thought" prompting, made AI much better at solving difficult problems.
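As a rough illustration of the difference, here are two prompt builders in Python. The function names and the exact prompt wording are assumptions made for this sketch; the article does not prescribe any particular phrasing.

```python
def direct_prompt(question: str) -> str:
    """Ask for the answer immediately, with no room to reason."""
    return f"Question: {question}\nAnswer with only the final result.\nAnswer:"

def chain_of_thought_prompt(question: str) -> str:
    """Ask the model to reason step by step before answering."""
    return (
        f"Question: {question}\n"
        "Let's think step by step, then give the final result on its own line "
        "prefixed with 'Answer:'.\n"
    )

question = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"
print(direct_prompt(question))
print(chain_of_thought_prompt(question))
```

The model and the question are identical in both cases. The only change is that the second prompt gives the model space to write out intermediate steps.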

We've made big improvements in "unhobbling" AI models over the past few years:

Reinforcement Learning from Human Feedback (RLHF): This technique helps AI learn from human feedback, making it more useful and practical. It's not just about censoring bad words; it helps the AI understand and answer questions better. For example, a small AI model trained with RLHF can perform as well as a much larger model without it.
Chain of Thought (CoT): This technique lets AI think through problems step-by-step. It’s like giving the AI a scratchpad to work out math and reasoning problems, making it much more effective.
Scaffolding: This involves using multiple AI models together. One model plans how to solve a problem, another proposes solutions, and another critiques them. This teamwork approach can make even smaller models perform better than larger ones working alone.
Tools: Imagine if humans couldn't use calculators or computers. Similarly, giving AI models tools like web browsers or code execution capabilities helps them perform better. ChatGPT can now do things like browse the web and run code.
Context Length: Early models could only remember a small amount of information at once. Now, models can remember much more (from 2k tokens to over 1 million tokens). This helps them understand and work on bigger tasks, like understanding a large codebase or writing a long document.
Posttraining Improvements: Even after training, AI models can continue to improve. For example, the current GPT-4 has gotten much better at reasoning and other tasks compared to when it was first released.

By removing these limitations, we've made AI models much more powerful and useful.
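The scaffolding item above (one model plans, another proposes, another critiques) can be sketched as a small Python loop. Everything here is hypothetical: `scaffolded_solve` and the three stub functions are invented for illustration, and the stubs return canned strings so the example runs without calling any real model.

```python
from typing import Callable

# A "model" here is any function from a prompt string to a reply string.
Model = Callable[[str], str]

def scaffolded_solve(task: str, planner: Model, proposer: Model,
                     critic: Model, max_rounds: int = 3) -> str:
    """Plan once, then alternate propose/critique until the critic approves."""
    plan = planner(f"Write a plan for: {task}")
    feedback = ""
    solution = ""
    for _ in range(max_rounds):
        solution = proposer(f"Task: {task}\nPlan: {plan}\nFeedback: {feedback}")
        feedback = critic(f"Task: {task}\nSolution: {solution}")
        if feedback.strip().upper() == "APPROVE":
            break
    return solution

# Stub models so the sketch runs without any API; real use would call an LLM.
def stub_planner(prompt: str) -> str:
    return "1. add the numbers 2. report the sum"

def stub_proposer(prompt: str) -> str:
    # Gets it wrong first, then corrects itself once it has seen feedback.
    return "4" if "too low" in prompt else "3"

def stub_critic(prompt: str) -> str:
    return "APPROVE" if prompt.endswith("Solution: 4") else "too low"

result = scaffolded_solve("What is 2 + 2?", stub_planner, stub_proposer, stub_critic)
print(result)
```

Passing the three roles in as plain functions keeps the loop independent of any particular model or API, so the same structure works whether the roles are filled by one model with different prompts or by several different models.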

Drivers of AI Progress

The possibilities here are enormous, and we’re rapidly picking low-hanging fruit here. This is critical: it’s completely wrong to just imagine “GPT-6 ChatGPT.” With continued unhobbling progress, the improvements will be step-changes compared to GPT-6 + RLHF. By 2027, rather than a chatbot, you’re going to have something that looks more like an agent, like a coworker.

Leopold Aschenbrenner

oom trends
