The release of Devin, an autonomous AI agent, has taken the AI world by storm.

Cognition just released:

Autonomous AI Agents — Andrew Best — AI Growth Guys

Devin — The first AI Software Engineer

Check out their blog post for specific examples of what Devin can do.

This thing has been going viral, and for good reason.

Devin is the best illustration the world has ever seen of autonomous AI agents.

The term "autonomous AI agent" is thrown around a lot, but I'm pretty sure most people haven't stopped to think about what it really means, and why it could be a game changer.

What are autonomous AI agents?

My understanding is this: currently, when you use an LLM like ChatGPT, you give one prompt and you get one output in return.

When you use autonomous AI agents like Devin, you give one prompt, and the AI agents keep working in the background until the task is done. The agents have a lengthy back-and-forth conversation with themselves until they give you their final answer.
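To make the difference concrete, here is a minimal sketch in Python. It is only my illustration of the idea, not how Devin actually works. The `toy_model` function is a made-up stand-in for a real LLM call, and the "FINAL:" marker is a convention I invented so the loop knows when to stop.

```python
from typing import Callable, List


def single_shot(model: Callable[[str], str], prompt: str) -> str:
    """One prompt in, one output out (the usual chatbot pattern)."""
    return model(prompt)


def agent_loop(model: Callable[[str], str], prompt: str, max_steps: int = 10) -> str:
    """Keep feeding the model its own earlier replies until it says it is done."""
    history: List[str] = [prompt]
    for _ in range(max_steps):
        reply = model("\n".join(history))
        history.append(reply)
        # Assumed convention: the model marks its last message with "FINAL:".
        if reply.startswith("FINAL:"):
            return reply[len("FINAL:"):].strip()
    return "gave up after max_steps"


def toy_model(context: str) -> str:
    """Hypothetical stand-in for an LLM: 'thinks' three times, then finishes."""
    turns = context.count("THINKING")
    if turns < 3:
        return f"THINKING step {turns + 1}"
    return "FINAL: done after 3 steps"


print(single_shot(toy_model, "Build a site"))
print(agent_loop(toy_model, "Build a site"))
```

The single-shot call returns whatever the model says first. The loop keeps going, with the model reading its own earlier replies, until it declares a final answer or hits the step limit.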

Let me give you a specific example:

If you currently said to ChatGPT:

"Make me a 10-page website, with beautiful photos on each page. Write 6 articles and have them linked together. Make sure the website is mobile friendly. Make the website topic 'How to make money with AI'. Search for a good domain name and use that when you find it."

Obviously, ChatGPT would fail big time at this task, when given as a single prompt.

The idea is that autonomous AI agents could do this with a single prompt.

They would break the task down into small steps. Then they would search the internet for domain names. Then they would start coding the website. Then they would create images and make sure everything looked good.

If the code was buggy, they would search for the error and fix the code by themselves…
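The steps above can be sketched as a simple plan-and-retry loop. Again, this is a toy under my own assumptions: the step names come from the example prompt, `toy_execute` is a hypothetical tool that fails on purpose the first time it writes code, and a real agent would call an LLM and real tools instead.

```python
from typing import Callable, Dict, List


def run_plan(
    steps: List[str],
    execute: Callable[[str, int], str],
    max_retries: int = 3,
) -> Dict[str, str]:
    """Run each step in order, retrying a step when it raises an error."""
    results: Dict[str, str] = {}
    for step in steps:
        for attempt in range(1, max_retries + 1):
            try:
                results[step] = execute(step, attempt)
                break  # step succeeded, move on to the next one
            except RuntimeError as err:
                print(f"{step}: attempt {attempt} failed ({err})")
        else:
            # Every retry failed, so record that and keep going.
            results[step] = "failed"
    return results


def toy_execute(step: str, attempt: int) -> str:
    """Hypothetical tool call: the first try at writing code is buggy."""
    if step == "write code" and attempt == 1:
        raise RuntimeError("syntax error")
    return f"ok on attempt {attempt}"


plan = ["pick domain name", "write code", "create images", "check layout"]
results = run_plan(plan, toy_execute)
print(results)
```

Here the "write code" step fails once, gets retried, and succeeds on the second attempt, with no human stepping in between.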

The point is, all you would need to do is give one single prompt, and then wait for the finished product.

You could give a prompt, go to the bathroom, or make a sandwich, and then come back and see the result.

Autonomous AI agents promise to do everything on their own and take the human completely out of the loop.

Look at the software engineering performance of Devin compared with other LLMs:

Devin from Cognition

Here is some of what Devin was able to do by itself.

These are quotes from the Cognition blog:

  • Devin can learn how to use unfamiliar technologies. After reading a blog post, Devin runs ControlNet on Modal to produce images with concealed messages for Sara.
  • Devin can build and deploy apps end to end. Devin makes an interactive website which simulates the Game of Life! It incrementally adds features requested by the user and then deploys the app to Netlify.
  • Devin can autonomously find and fix bugs in codebases. Devin helps Andrew maintain and debug his open source competitive programming book.
  • Devin can train and fine-tune its own AI models. Devin sets up fine-tuning for a large language model given only a link to a research repository on GitHub.
  • Devin can address bugs and feature requests in open source repositories. Given just a link to a GitHub issue, Devin does all the setup and context gathering that is needed.
  • Devin can contribute to mature production repositories. This example is part of the SWE-bench benchmark. Devin solves a bug with logarithm calculations in the SymPy Python algebra system. Devin sets up the code environment, reproduces the bug, and codes and tests the fix on its own.
  • We even tried giving Devin real jobs on Upwork and it could do those too! Here, Devin writes and debugs code to run a computer vision model. Devin samples the resulting data and compiles a report at the end.

Insane!

We are living in insane times.

Check out my website to learn more about AI and prompt engineering.

Also, sign up for my free AI newsletter to stay in the loop.
