Deepmind’s AI Is Learning About the Art of Coding

Gregory Barber

In the field of computer science, there is perhaps no more fundamental task than to sort. Bubble, heap, merge—take your pick. The methods for reordering data inside a computer have been theorized to death, served as practice exercises for millions of novices, and been optimized for decades by expert developers. Type a sort() function in any programming language, and it's code you can rely on. Don't touch it. It already works great.

But last year, an AI system developed by engineers at Google's Deepmind improved on great by just enough to matter. The system, which Deepmind calls AlphaDev, was tasked with coming up with a new way to sort short sequences in numbers in C++, the popular coding language. It meant going under the hood and having the AI build new algorithms in assembly code—the instructions that bridge the gap between programming languages like C++ and computer hardware. When a C++ developer tells the computer to "sort," those commands are converted into machine-readable code that tells a computer's memory and processor exactly what to do: where to move data, and how to change it. It's where bits meet the metal.

The experiment worked. Since April of last year, C++ has been running slightly faster, thanks to a new set of AI-concocted sorting algorithms. But according to AlphaDev's engineers, who described the work today in Nature, that's just the first step. "We want to optimize the entire computing stack," says Daniel Mankowitz, a staff research scientist at Deepmind who led the sorting project. Mankowitz says that AlphaDev has already improved algorithms not just for sorting, but also for other basic tasks like hashing.

"I think this work is incredibly exciting," says Armando Solar-Lezama, an expert in program synthesis at MIT, who wasn't involved in the research. It's useful to have AI come up with a new sorting algorithm; it's a much bigger deal to build an AI that can learn how to write state-of-the-art code across a variety of tasks, he says. That means AlphaDev has started to learn something more fundamental about the art of coding itself.

That comes with significant constraints, of course. "These are tiny, tiny programs," he adds—totaling no more than a few dozen instructions in assembly code. But those tiny programs often represent big bottlenecks for computer performance, having been optimized as far as people can push them. Overall, AlphaDev's new C++ sorting algorithms are 1.7 percent more efficient than the prior methods when sorting long sequences of numbers, and up to 70 percent faster for five-item sequences. At scale, these improvements add up, Mankowitz says. Since the AI-written code was submitted to Libc++, a major open-source library for C++, he estimates the algorithms have been used trillions of times a day.

Those improvements are thanks to a technique called reinforcement learning, which is the same approach used to help Deepmind's AI master games like chess and Go. This type of AI learns by doing. It works by treating a given task—like writing an assembly program—as a game, in which the AI receives rewards for making smart moves that increase the program's efficiency. Over time, the system works to maximize this reward, resulting in a winning Go strategy or a quicker assembly program. This differs from the sort of AI found in large language models like GPT-4, which rely on huge amounts of data to learn how to write words or code. That's great for producing writing that mirrors the tone of the internet or producing common segments of code. But it's not so good at producing novel, state-of-the-art solutions to coding challenges the AI has never seen before.

Jeremy White

Kate Knibbs

Jeremy White

Gideon Lichfield

Like playing chess or Go, writing assembly code is a tricky, open-ended task, with many potential moves and lots of ways to screw up. Modern programming languages like C++ or Python mask the nitty-gritty of moving data around with short commands that mirror human language. In the 1950s, when such "high-level" languages debuted, some believed the problem of programming had basically been solved. Until then, programming was essentially just mucking around in assembly—so much so that Fortran, one of the first high-level languages, was initially marketed as the "Fortran automatic coding system," because its commands always translated into working assembly code. "Fortran was going to write code better than humans could and without bugs," says Solar-Lezama. "Today that sounds laughable. But it was true."

Improving a language like C++ or Fortran still often requires tinkering with the underlying assembly—finding ways to make it work faster, usually by cutting out extraneous steps. Because assembly lacks the formal structures and abstractions of higher-level programming, and because a single mistake can cause the algorithm to break, the game that the AI must play is no fun. "It will fail again and again and again," Solar-Lezama explains.

The innovation of AlphaDev is that it improves how the structure of a working assembly program is represented in the AI code. That allows its reward system to better narrow down the possibilities. The AI gets better, faster.

At a high level, the AI's sorting solution looks familiar. There are only so many ways to put a handful of numbers in ascending order. It manages to shave a few instructions off the assembly sequence using some unconventional instructions—ones that a human coder probably wouldn't think to try. Mankowitz compares these actions to move 37, the infamous hand played by AlphaGo in its 2016 exhibition match against grandmaster Lee Sedol. The move was so odd that observers initially thought the computer had botched the match. But it wound up being pivotal to the computer's victory, and it has since altered how the game is played.

The resulting code thus looks a little odd, in part because it needs to force the computer to move data around in highly specific ways. "It's definitely not an economical way of writing the code," says Nikolas Klauser, a Libc++ contributor who reviewed Deepmind's proposal last spring. This upped the stakes for actually putting the code into production, he says. It's risky to update fundamental algorithms like sorting that have worked just fine for years for what might only be a small improvement in efficiency. But in the end, it all checked out. The code update went through.

Mankowitz acknowledges that the current programs AlphaDev can produce are small, and that it will likely take new breakthroughs in AI development to generate larger, more complex algorithms that beat the best human attempts. But for code generation experts like Solar-Lezama, the research is an important step toward more generalized AI coding—one that has him reflecting back on Fortran and its "automated programming" system. Did it put programmers out of business? Not in the slightest. Did it change what it meant to be a coder? Completely.