Embodiment AI

09 Jun 2022

There are many paths on the road to AGI. Nando de Freitas recently wrote that "scale is all you need" following the release of DeepMind's Gato paper. Ever-larger transformers with language interfaces, trained with supervised learning on large datasets, are one such touted path. While Gato trains and evaluates on data from physical robots, there is a broader paradigm for getting to AGI that bypasses embodiment and robotics entirely. This is the path pursued by OpenAI (which famously disbanded its robotics team) and Anthropic. I recently found myself on a car ride with a friend who is a "member of technical staff" at OpenAI, and he asked: "Do you think solving embodied AI is required for AGI, or is it enough to be digital?" I have thought a lot about that question since.

Language as the end-all of intelligence

In conversations around AGI, language is often treated as the beginning and end of intelligence. From an evolutionary perspective, however, intelligence has never arisen outside of embodied agents, and language itself evolved as a tool for multi-agent communication. Language and the structured coordination of agents help with reward maximization in an environment, and language could be an evolutionary byproduct of intelligence, in line with the Reward is Enough hypothesis. Language certainly encodes a huge wealth of human knowledge, and it is a promising, and the de facto, hammer for solving intelligence. But is it enough for AGI?

What is AGI?

It all comes down to answering the question of what constitutes an AGI. Andrej Karpathy says it is a feeling. Several Anthropic researchers define it, weirdly, as "that which would make the world a weird place". Nick Bostrom defines AGI as "an intellect that is much smarter than the best human brains in practically every field". Wikipedia defines it as "the ability of an intelligent agent to understand or learn any intellectual task that a human being can". Open Philanthropy describes "transformative AI" with an economic definition: that which would 10x Gross World Product, bringing about as much change as agriculture or the industrial revolution. There is also the idea, embraced in OpenAI circles, that if you can hire an AI as a remote worker and it is as good as a human, that is AGI. If an AI can work like a physicist or researcher, making scientific breakthroughs and writing papers without ever moving a literal finger, is that AGI? Steve Wozniak proposed a coffee test for AGI: a machine figures out how to make coffee in an unseen human kitchen.

Ultimately, whether embodiment is required for AGI depends on the definition of AGI you adopt. For this essay, I'll take as a baseline an AI that is as good as or better than a human in every respect. Many useful tasks exist in the physical world, and transforming those would require embodied AI.