Breaking Down AutoGPT

AutoGPT has taken the world by storm and, by some measures of buzz, has even overtaken ChatGPT itself. So, get ready to dive into the exciting world of AutoGPT.



XXXXX
Image by Author

 

ChatGPT has created quite a buzz in the world of AI, and we have since seen numerous other models with incremental improvements. But none of them focused on improving the interaction between humans and AI: you still need to write an excellent prompt to get the results you want. This is where AutoGPT stands out. It can "self-prompt" and critically review its own work. Are you curious to know more about it? How does it work, and what makes it unique? And perhaps most importantly, what are its limitations? Don't worry, we've got you covered. Let's explore all of these questions together in this article.

 

What is AutoGPT?

 

AutoGPT is an open-source application developed by Toran Bruce Richards (game developer and founder of Significant Gravitas). It uses the GPT-3.5 or GPT-4 APIs to create fully autonomous AI agents. It stands out because you don't need to steer the model step by step based on your own understanding. You just provide the task along with a list of objectives, and it handles the rest. Unlike ChatGPT, it can also access external resources to inform its decisions. Did you know that it obtained more GitHub stars than PyTorch (a famous open-source ML library) within a few weeks of its release? Here is a graph showing its star history.
 

XXXXX
Image Generated by Star-History

 

How Does AutoGPT Work?

 

XXXXX
Image by Author

 

AutoGPT combines the power of GPT-4 with a personal-assistant style of working to generate, prioritize, and execute tasks autonomously. Being an autonomous system, it creates AI agents to perform specific tasks, and these agents communicate with one another. The following steps describe how AutoGPT works:

 

Step 01: Input from the User

 

First, the user enters three inputs: an AI name, an AI role, and up to 5 goals. For example, I can create an AI named MarketResearchGPT whose role is to conduct market analysis of different items. I can then set goals such as: perform market research for different phones; get a list of the top 5 with their pros and cons; arrange them in ascending order of price; summarize their user reviews; and terminate the process when done.
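
To make the three inputs concrete, here is a minimal Python sketch of how they could be represented and validated. The AgentSettings class and its field names are hypothetical and chosen only for illustration; they are not taken from the AutoGPT codebase. The only rule borrowed from the text above is the limit of 5 goals.

```python
from dataclasses import dataclass, field
from typing import List

MAX_GOALS = 5  # the article states that up to 5 goals can be entered


@dataclass
class AgentSettings:
    """Hypothetical container for the three user inputs (not AutoGPT's own class)."""

    ai_name: str
    ai_role: str
    ai_goals: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject an empty name and enforce the 1..5 goal limit.
        if not self.ai_name.strip():
            raise ValueError("ai_name must not be empty")
        if not 1 <= len(self.ai_goals) <= MAX_GOALS:
            raise ValueError(
                f"expected 1 to {MAX_GOALS} goals, got {len(self.ai_goals)}"
            )


settings = AgentSettings(
    ai_name="MarketResearchGPT",
    ai_role="Conduct market analysis of different items",
    ai_goals=[
        "Perform market research for different phones",
        "Get the list of top 5 with their pros and cons",
        "Arrange them in ascending order of their prices",
        "Summarize their user reviews",
        "Terminate the process when done",
    ],
)
print(settings.ai_name, "has", len(settings.ai_goals), "goals")
```

Validating the inputs up front keeps a malformed goal list from reaching the agents at all.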

 

Step 02: Task Creation Agent

 

Once the user has entered the input, the task creation agent interprets the goals, generates a list of tasks, and outlines the steps needed to achieve them. The resulting set of tasks is then passed to the task prioritization agent.

 

Step 03: Task Prioritization Agent

 

The task prioritization agent reviews the sequence of tasks to ensure that it makes logical sense. This matters because we don't want to end up in a deadlock, where the current task depends on the result of a task that has not been executed yet.
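
The ordering problem described above can be illustrated with a topological sort. This is only a sketch of the idea: AutoGPT relies on the language model to order tasks, whereas the snippet below swaps in Python's standard graphlib module (Python 3.9+). The task names and the dependency map are made up for the example.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical tasks, each mapped to the set of tasks it depends on.
dependencies = {
    "research phones": set(),
    "pick top 5": {"research phones"},
    "sort by price": {"pick top 5"},
    "summarize reviews": {"pick top 5"},
}


def prioritize(deps):
    """Return the tasks in an order where every dependency comes first."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"circular dependency between tasks: {exc.args[1]}") from exc


order = prioritize(dependencies)
print(order)
```

If two tasks depend on each other, no valid order exists, and the function raises an error instead of letting execution stall.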

 

Step 04: Task Execution Agent

 

The task execution agent, as the name suggests, makes use of GPT-4, the Internet, and other resources to perform these tasks.

 

Step 05: Communication Between Agents

 

Agents can communicate with each other to reach the user-defined goal. For example, if the results are unsatisfactory, the task execution agent can ask the task creation agent to generate a new list of tasks. Hence, the whole process is iterative.
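
The interplay of the three agents can be sketched as a simple loop. Everything below is a stand-in: create_tasks, prioritize, and execute are hypothetical stub functions written for this example, and a real agent would call the GPT API and external tools where the stubs return canned values.

```python
from collections import deque


def create_tasks(goal, feedback=None):
    """Stub for the task creation agent; a real agent would call an LLM."""
    if feedback is None:
        return [f"search for: {goal}", f"summarize findings on: {goal}"]
    return [f"retry with narrower query: {goal}"]


def prioritize(tasks):
    """Stub for the task prioritization agent; it keeps the given order."""
    return deque(tasks)


def execute(task):
    """Stub for the task execution agent.

    For demonstration, every task starting with 'search' is treated as
    having produced an unsatisfactory result.
    """
    return {"task": task, "ok": not task.startswith("search")}


def run(goal, max_iterations=10):
    queue = prioritize(create_tasks(goal))
    history = []
    for _ in range(max_iterations):  # hard cap so the loop always terminates
        if not queue:
            break
        result = execute(queue.popleft())
        history.append(result)
        if not result["ok"]:
            # Unsatisfactory result: ask the creation agent for new tasks.
            new_tasks = create_tasks(goal, feedback=result)
            queue = prioritize(list(queue) + new_tasks)
    return history


history = run("best budget phones")
for entry in history:
    print(entry)
```

The iteration cap is a deliberate design choice in this sketch: an autonomous loop that keeps generating new tasks needs some stopping condition besides an empty queue.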

 

Step 06: Final Result

 

The actions of these agents are visible on the user end in the following form:

Thoughts: The AI agent shares its thoughts after completing an action

Reasoning: It explains why it chose a particular course of action

Plan: It lists the new set of tasks

Criticism: It critically reviews its choices by identifying limitations or concerns

It also uses external memory to keep track of history and learn from its past experiences to generate more precise results.
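
One way to picture the four fields listed above is as a small structured record parsed from the model's reply. The JSON layout and the AgentThoughts class below are assumptions made for illustration, not AutoGPT's actual response schema.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class AgentThoughts:
    """Hypothetical record mirroring the four fields shown to the user."""

    thoughts: str
    reasoning: str
    plan: List[str]
    criticism: str


def parse_thoughts(raw: str) -> AgentThoughts:
    """Parse a JSON reply; raises KeyError if a field is missing."""
    data = json.loads(raw)
    return AgentThoughts(
        thoughts=data["thoughts"],
        reasoning=data["reasoning"],
        plan=list(data["plan"]),
        criticism=data["criticism"],
    )


# A made-up reply, used only to exercise the parser.
raw_reply = json.dumps(
    {
        "thoughts": "I should start by searching for current phone models.",
        "reasoning": "Up-to-date listings are needed before ranking anything.",
        "plan": ["search the web", "collect prices", "rank the top 5"],
        "criticism": "Search results may be biased toward sponsored listings.",
    }
)

parsed = parse_thoughts(raw_reply)
print("THOUGHTS:", parsed.thoughts)
print("REASONING:", parsed.reasoning)
print("PLAN:", "; ".join(parsed.plan))
print("CRITICISM:", parsed.criticism)
```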

 

How Does It Differ from ChatGPT?

 

Although AutoGPT and ChatGPT are built on top of the same underlying technology, the GPT models, there are some key differences between them:

 

Access to Real-Time Data

 

ChatGPT relies on GPT models such as GPT-4, whose training data ends in September 2021, which means that it cannot provide real-time insights. AutoGPT has access to external resources, so it can incorporate up-to-date information into its responses.

 

Autonomous Functionality

 

Unlike ChatGPT, which requires a new prompt from the user at every step, AutoGPT generates its own follow-up prompts once the goals are set. This is especially helpful for idea generation.

 

Memory Management

 

ChatGPT's memory is limited by the context window of the underlying LLM, such as GPT-4, while AutoGPT uses vector databases, which makes it suitable for both short-term and long-term memory management.
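
The idea behind vector-based memory can be shown with a toy example: store each note as a vector and retrieve the one most similar to a query. The sketch below uses plain word counts as a stand-in for embeddings and an in-memory list as a stand-in for a vector database; both are simplifications, and the VectorMemory class is hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real system would use a learned embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorMemory:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add(self, text):
        self._items.append((text, embed(text)))

    def search(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


memory = VectorMemory()
memory.add("Phone A costs 300 dollars and has a great camera")
memory.add("Phone B has long battery life")
memory.add("The user prefers phones under 400 dollars")

print(memory.search("battery life", k=1))
```

Because only the most relevant notes are pulled back into the prompt, the agent can keep a long history without exceeding the model's context window.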

 

Image and Speech Functionalities

 

ChatGPT is limited to text, whereas AutoGPT can also generate images and convert text to speech.

 

How to Use AutoGPT?

 

You will need an OpenAI API key as AutoGPT is built on top of GPT. If you don’t have one, you can sign up for a free account to get some free credits. Follow the steps below to set up AutoGPT on your local computer.

 

Requirements 

 

 

Setting it Up

 

Clone the GitHub repository in your local directory using the following command:

git clone https://github.com/Significant-Gravitas/Auto-GPT.git

 

Navigate to the project directory using the following command:

cd Auto-GPT

 

Run the following command to download the required dependencies:

pip install -r requirements.txt

 

Locate the ".env.template" file in your Auto-GPT folder. If you cannot find it, check that hidden files are visible. Create a copy of this file using:

cp .env.template .env

 

Open the .env file and set the value of OPENAI_API_KEY to the key that you generated from your OpenAI account. Save and close the .env file.
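
If you want to double-check the file before launching, a short Python sketch like the one below can confirm that a key has been filled in. The helper functions are hypothetical and not part of AutoGPT, and the placeholder string is an assumption about what the template contains; adjust it to match your copy.

```python
from pathlib import Path

# Assumed placeholder text from the template; change it if yours differs.
PLACEHOLDER = "your-openai-api-key"


def read_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_api_key(text):
    """True if OPENAI_API_KEY is present, non-empty, and not the placeholder."""
    key = read_env(text).get("OPENAI_API_KEY", "")
    return bool(key) and key != PLACEHOLDER


env_path = Path(".env")
if env_path.exists():
    print("API key set:", has_api_key(env_path.read_text()))
else:
    print("No .env file found in the current directory")
```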

Run the below command to start AutoGPT:

python -m autogpt

 

If you only have access to GPT-3.5, run this command instead:

python -m autogpt --gpt3only

 

You are good to go now. In case of any issues please refer to the official documentation: Auto-GPT Setup

 

Limitations

 

Although AutoGPT can generate content with minimal human intervention, it has some major downsides, such as high costs, limited functionality, an inadequate understanding of context, data bias, limited creativity, and security risks. It is not yet capable of achieving AGI (Artificial General Intelligence) because of data quality, generalization, and explainability issues. Despite these shortcomings, it has huge potential to revolutionize our daily lives and the way we work. I hope you enjoyed reading the article. Do let me know in the comment section what you think about AutoGPT.
 
 
Kanwal Mehreen is an aspiring software developer with a keen interest in data science and applications of AI in medicine. Kanwal was selected as the Google Generation Scholar 2022 for the APAC region. Kanwal loves to share technical knowledge by writing articles on trending topics, and is passionate about improving the representation of women in the tech industry.