Comments on "Machines of Loving Grace" by Dario Amodei
by manuhortet
Dario Amodei (CEO of Anthropic) just published a post, Machines of Loving Grace, discussing AI and the advent of AGI (or powerful AI, as he calls it). In my eyes, it has great value as a succinct and clear piece for understanding where we are now, and it creates a nice space to think about how far we may be from his definition of AGI.
The following text, motivated by Dario's post, is a commentary on the idea of AGI today and some thoughts on how far we seem to be from it:
Defining powerful AI
Back in 2017, while the transformer architecture (the one used in LLMs) was being presented to the world in the famous Attention Is All You Need paper, I was deeply impressed by running into the concepts of AGI and ASI for the first time while reading Superintelligence by Nick Bostrom.
Discussing AI capabilities back then felt niche, nerdy, and had an intense sci-fi flavor to it.
Although those days are long gone and the discussion has reached the mainstream, it has done so mainly as the result of marketing efforts that have prioritized hype narratives and overpromising over analytical discussion, keeping sci-fi vibes and exaggeration at the core of the conversation.
The notion of AGI I was presented with back in 2017 still stands (a technology capable of performing better than humans at all relevant tasks), but the space to discuss it is now muddy, populated mostly by convinced AI detractors and hyper-optimistic followers.
Trying to escape this saturated space, Dario avoids using the AGI term and refers to the idea as "powerful AI". In his own words:
I find AGI to be an imprecise term that has gathered a lot of sci-fi baggage and hype. I prefer "powerful AI" or "Expert-Level Science and Engineering" which get at what I mean without the hype
He then defines some general properties of this powerful AI - a great summary of contemporary expectations on AGI:
- Problem-solving capacities superior to those of the most capable humans
- Complete multi-modality: the capacity to interact with the world not just through a chat, but through all the interfaces we use
- Proactivity
- Absorbs information and generates actions 10x-100x faster than humans
- Millions of instances can run separately and collaborate when needed
How far are we?
As a quick reality check against our current technology, let's go through the presented properties and discuss how far we are. Keep in mind that OpenAI's o1-preview / o1-mini are generally considered the top models today, and that AI agents are becoming mainstream and starting to reach the market:
- Problem-solving capacities: we are not there. Models fail to solve problems outside of their training sets, and appear to only be able to implicitly reason over parametric knowledge through grokking, i.e., extended training far beyond overfitting. To put it simply, we aren't sure the transformer architecture will suffice here, and we don't know what the next dominant approach will be.
- Complete multi-modality: partially feasible with current technology. Except for the interfaces that strictly require high-functioning robotic bodies, AI agents can already interact with interfaces beyond the original ChatGPT-like chat. Imagine ordering products online, testing code, designing and publishing images or video... Products like this are already on the market, and we can expect them to inevitably become mainstream as big players like Apple or Microsoft keep pushing them into their interfaces.
- Proactivity: we are not there. We can already engineer systems that "simulate proactivity" by running agents when certain real-world conditions are met. But even if this may be enough for some "proactivity features", it falls short of what Dario is proposing. A real "always listening" agent would probably require us to run models locally, and to have better solutions for the context problem.
- 10x-100x faster than humans: for longer tasks, like creating software systems, this may be feasible with current technology. But since other factors will limit the models' capacity for action on most tasks, this point may be less relevant than the others.
- Millions of instances and collaboration: feasible with current technology. We know the energy and hardware limits we expect to hit in the coming years, and efforts to overcome them are ongoing. Running millions of the smaller models is already doable (and done). Collaboration between instances also sounds clearly feasible, considering they can already interact with internet interfaces, which allows communication to happen.
How do we imagine the path from powerful AI to ASI?
In Superintelligence, I remember the Singularity being discussed as an inevitable effect of AGI. This notion has also now reached the mainstream, and those who consider AGI probable see ASI as the unavoidable next step.
The rationale is simple: once machines can create better and faster machines, those new machines will create even better and faster machines, and progress will become exponential, as each step forward is taken faster than the last.
That is, of course, simplistic and unrealistic. It is obvious by now that other limits will appear. Dario approaches this topic by listing the key factors that will act as limits along this line of progress. In my opinion, they can be summarised as physical limits: some experiments need a set amount of time, objects can only be moved so fast, computation needs a minimum amount of energy, and so on.
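A toy numerical sketch of this argument (all numbers invented for illustration): suppose each R&D cycle multiplies capability by some factor, and smarter machines shrink the "thinking" part of a cycle toward zero, but the physical-experiment part of each cycle has a hard floor. Growth stays exponential, yet its rate is capped by physics rather than exploding without bound:

```python
def capability_after(years, gain_per_cycle=2.0, ideal_cycle_days=30.0,
                     experiment_floor_days=10.0):
    """Toy model: each R&D cycle multiplies capability by `gain_per_cycle`.

    Smarter machines shrink the thinking part of a cycle (it divides by the
    gain each cycle), but the physical-experiment part can never drop below
    `experiment_floor_days`, so cycles per year are bounded.
    """
    capability = 1.0
    elapsed_days = 0.0
    thinking_days = ideal_cycle_days
    while elapsed_days + thinking_days + experiment_floor_days <= years * 365:
        elapsed_days += thinking_days + experiment_floor_days
        capability *= gain_per_cycle
        thinking_days /= gain_per_cycle  # faster machines think faster
    return capability
```

Under these made-up parameters a year of recursive improvement still yields enormous gains, but the number of improvement cycles per year converges to 365 divided by the experiment floor: the jump from something resembling AGI to ASI takes real calendar time.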
This point is critical for imagining realistic scenarios, and it needs to be integrated into the mainstream understanding. Regardless of how optimistic you are, the move from something resembling AGI to ASI can't be imagined as immediate.
Conclusion
The space to discuss AI's potential has been compromised (commodified), and it may be more difficult to explore now than it was before. But that doesn't mean the materialistic and analytical discussion has dissipated. Now that we seem to be much closer to it, we are starting to understand better what a real AGI could look like, and this text by Dario is a good space to think about it.
When going through his expectations of AGI, some of them sound feasible even with current technology... except for probably the most important one: the pure intelligence level of the models. Although Dario famously mentioned in this interview that we can still expect relevant improvements from scaling up our current models, we can't prove transformers will be able to reach such intelligence levels, meaning new breakthroughs would be needed to start building something resembling AGI.
Everything else Dario has to say in the post is also of high interest. Go read it!