A demo video featuring an AI programmer called Devin went viral recently. The video depicts an AI agent that takes instructions, formulates a plan, writes code, fixes bugs, and iterates its way toward a shippable solution.
Even more recently, evidence came to light that this demo was staged – and that the actual capabilities of Devin are a lot less impressive than depicted.
I’m a lousy programmer who uses AI all the time to hack away on small projects, build prototypes, and write useful scripts. Back in the day, we used to call people like me “script kiddies.”
But I know people who are actually skilled programmers – friends and colleagues – who are also getting a ton of value from LLMs like GPT-4 and Claude for coding.
So even though the internet is having fun with another oversold, overhyped AI product, I’m actually quite bullish on AI agents like Devin.
In fact, I’m working with one such company right now – Checksum. Checksum is much more specialized than Devin: it writes and maintains end-to-end tests for web apps. But even in these early days, Checksum’s orchestra of models does its job extremely well.
More importantly, the pace of progress has been staggering and doesn’t show signs of slowing. I’m lucky enough to see this from the inside at Checksum and other businesses I work with – but I think it’s pretty obvious to any outsiders who are paying attention, too.
Building software is an obviously valuable target for an AI agent. But other core business functions, like customer acquisition and customer support, and general-purpose activities, like browsing the internet and sending emails, are also getting the AI treatment.
So – how far away are we from a fully AI-run business?
It seems like we’re pretty close.
Can an AI agent build and maintain a simple software product? Years away.
Can an AI agent do customer support? Already here.
Can an AI agent browse the internet? Already here.
Can an AI agent pay for stuff? I don’t see why not.
Can an AI agent read and respond to email? Already here.
Can an AI agent build and maintain customer acquisition channels? Years away.
(This one is particularly interesting to me. Digital channels that are text- or image-based will likely be first up.)
Can an AI agent take information from one system and give instructions to another system? Years away.
(The generalized LLM part of this is here, but the generalized action-model part may be years away.)
So what?
Emotionally, I’m not sure how I feel about all this.
But it sure as heck looks like it’s coming either way – and it’s been hard to think about anything else lately.
posted 15 apr 2024