How Autonomous AI Agents Leverage Text Completion Capabilities

This is the fourth post in a series that looks at how autonomous agents use generative AI models.

Introduction

The previous post showed how programmatic interactions with OpenAI’s language models function.

In this post, we’ll look at how two different autonomous agents (BabyAGI and Auto-GPT) leverage these capabilities. Agents like these provide a way to carry out multi-step operations. When running, they write their own prompts and then incorporate the responses back into themselves. This creates a loop that attempts to drive towards a stated outcome or objective.

In theory, this would let you provide an agent with a broader objective (ex. “Identify community development grants available in Indiana and draft a submission proposal for the top five”) and let an agent work on your behalf.

[2025 update] As agentic approaches have gained more traction, a plethora of other techniques and tools has become available. These two early forays into autonomous agents are still in use and remain useful for highlighting approaches to task-based autonomous agents.

BabyAGI

BabyAGI is an “AI-powered task management system”. At a high level, the system works by invoking an AI model to perform a series of tasks. After a task completes, the agent builds a new list of tasks and re-prioritizes the remaining work.

[Source: “Task-driven Autonomous Agent Utilizing GPT-4, Pinecone, and LangChain for Diverse Applications”]

Interactions with BabyAGI start by specifying an objective and an initial task. The example initial task is “Develop a task list”, which is a reasonable starting point. From that point, the agent conceptually performs the following steps in a loop:

Take the first task in the task list
Execute the task based on some context
Develop a new task list
Re-prioritize the new task list
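
The four steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not BabyAGI's actual code: run_agent, execute, create_tasks, and prioritize are hypothetical names, and the three callables stand in for the calls to the AI model. The max_cycles cap is also an assumption, added so the sketch always terminates.

```python
from collections import deque
from typing import Callable, Deque, List


def run_agent(
    objective: str,
    initial_task: str,
    execute: Callable[[str, str, List[str]], str],
    create_tasks: Callable[[str, str, str, List[str]], List[str]],
    prioritize: Callable[[str, List[str]], List[str]],
    max_cycles: int = 5,
) -> List[str]:
    """Run the four-step loop and return the results in execution order."""
    tasks: Deque[str] = deque([initial_task])
    results: List[str] = []

    for _ in range(max_cycles):
        if not tasks:
            break
        # 1. Take the first task in the task list.
        task = tasks.popleft()
        # 2. Execute the task, passing earlier results as context.
        result = execute(objective, task, results)
        results.append(result)
        # 3. Develop new tasks from the latest result.
        tasks.extend(create_tasks(objective, task, result, list(tasks)))
        # 4. Re-prioritize the combined task list.
        tasks = deque(prioritize(objective, list(tasks)))

    return results


# Stub functions (hypothetical) that replace the model calls for a dry run.
def fake_execute(objective: str, task: str, context: List[str]) -> str:
    return f"done: {task}"


def fake_create(objective: str, task: str, result: str, pending: List[str]) -> List[str]:
    if task == "Develop a task list":
        return ["Summarize findings", "Research origins"]
    return []


def fake_prioritize(objective: str, tasks: List[str]) -> List[str]:
    return sorted(tasks)


results = run_agent(
    "Write a short summary history of neural nets",
    "Develop a task list",
    fake_execute,
    fake_create,
    fake_prioritize,
)
print(results)
```

In a real agent, each of the three callables would send a prompt to the model and parse the reply. Keeping them as parameters makes the control flow easy to test without any model calls.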

In each step, the prompt to the AI model includes:

A description of the role of the AI model for the interaction (ex. “You are a task prioritization AI tasked with cleaning the formatting of and re-prioritizing the following tasks…”)
An instruction to stay grounded towards an objective (ex. “Consider the ultimate objective of…”)
Specific instructions for the step (ex. “Do not remove any tasks. Return the result as a numbered list”)
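
As a rough illustration of how those three ingredients might be combined into one prompt, here is a minimal sketch. The function name build_prompt, its parameters, and the wording outside the quoted fragments are assumptions made for this example; BabyAGI's real prompt templates are longer and worded differently.

```python
from typing import List


def build_prompt(role: str, objective: str, step_instructions: str, tasks: List[str]) -> str:
    """Assemble one prompt from the three ingredients plus the current task list."""
    task_block = "\n".join(f"- {task}" for task in tasks)
    return "\n".join(
        [
            role,  # who the model is for this interaction
            task_block,  # the data this step operates on
            f"Consider the ultimate objective of: {objective}",  # grounding
            step_instructions,  # rules specific to this step
        ]
    )


prompt = build_prompt(
    role="You are a task prioritization AI. Re-prioritize the following tasks:",
    objective="Write a short summary history of neural nets",
    step_instructions="Do not remove any tasks. Return the result as a numbered list.",
    tasks=["Summarize the current state", "Research the origins"],
)
print(prompt)
```

Only the role and the step instructions change from step to step. The objective line is repeated in every prompt, which is how the agent tries to stay grounded.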

You can watch the interplay of these steps from the following example (trimmed down to the illustrative parts):

*****OBJECTIVE*****

Write a short summary history of neural nets

Initial task: Develop a task list

Using results storage: Chroma (Default)

*****TASK LIST*****

• Develop a task list

*****NEXT TASK*****

Develop a task list

*****TASK RESULT*****

1. Research the origins of neural nets
2. Identify key figures in the development of neural nets
3. Outline the major milestones in the history of neural nets
4. Describe the different types of neural nets that have been developed over time
5. Analyze the impact of neural nets on various fields, such as computer science, neuroscience, and artificial intelligence
6. Summarize the current state of neural net research and development
7. Edit and revise the summary for clarity and accuracy.

****TASK CREATION AGENT RESPONSE****
1. Research the origins of neural nets
2. Identify key figures in the development of neural nets
3. Outline the major milestones in the history of neural nets
4. Describe the different types of neural nets that have been developed over time
5. Analyze the impact of neural nets on various fields, such as computer science, neuroscience, and artificial intelligence
6. Summarize the current state of neural net research and development
7. Edit and revise the summary for clarity and accuracy.

****TASK PRIORITIZATION AGENT RESPONSE****
1. Research the origins of neural nets
2. Identify key figures in the development of neural nets
3. Outline the major milestones in the history of neural nets
4. Describe the different types of neural nets that have been developed over time
5. Analyze the impact of neural nets on various fields such as computer science, neuroscience, and artificial intelligence
6. Summarize the current state of neural net research and development

Five cycles snipped for brevity…

****TASK CREATION AGENT RESPONSE****
1. Research and summarize the contributions of Warren McCulloch and Walter Pitts to the development of neural nets.
2. Investigate the reasons for the decline in interest in neural nets in the 1960s and 1970s.
3. Analyze the impact of backpropagation algorithm on the development of neural nets.
4. Compare and contrast the architectures of feedforward, recurrent, and convolutional neural networks.
5. Investigate the potential of neural nets in the field of robotics.
6. Research and summarize the current state of research on explainable AI in neural nets.
7. Analyze the potential ethical implications of using neural nets in autonomous weapons.
8. Investigate the potential of neural nets in the field of drug discovery.
9. Research and summarize the current state of research on adversarial attacks on neural nets.
10. Analyze the potential of neural nets in the field of climate modeling.

****TASK PRIORITIZATION AGENT RESPONSE****
1. Research and summarize the contributions of Warren McCulloch and Walter Pitts to the development of neural nets
2. Investigate the reasons for the decline in interest in neural nets in the 1960s and 1970s
3. Analyze the impact of backpropagation algorithm on the development of neural nets
4. Compare and contrast the architectures of feedforward recurrent and convolutional neural networks
5. Summarize the current state of neural net research and development
6. Research and summarize the different types of neural networks such as feedforward recurrent and convolutional neural networks
7. Investigate the limitations of neural nets and potential solutions to these limitations
8. Compare and contrast the performance of neural nets with other machine learning algorithms such as decision trees and support vector machines
9. Investigate the role of neural nets in natural language processing and speech recognition
10. Research and summarize the current applications of neural nets in industry and academia
11. Analyze the potential future developments and advancements in neural net research and technology
12. Explore the ethical implications of using neural nets in various industries such as healthcare and finance
13. Investigate the potential of neural nets in the field of robotics
14. Research and summarize the current state of research on explainable AI in neural nets
15. Analyze the potential ethical implications of using neural nets in autonomous weapons
16. Investigate the potential of neural nets in the field of drug discovery
17. Research and summarize the current state of research on adversarial attacks on neural nets
18. Analyze the potential of neural nets in the field of climate modeling
19. Analyze the impact of neural nets on various fields such as computer science, neuroscience, and artificial intelligence

Note: Task 19 is listed last as it is more of a broad overview and can be completed after the other tasks have been completed.

This sort of example shows how BabyAGI folds task results back into itself and generates new tasks based on the work completed so far.

Auto-GPT

Auto-GPT uses a similar approach of generating its own prompts and feeding results back into itself. However, Auto-GPT goes further by providing additional operation types for the agent to use. In addition to text completion models, the agent can:

Read / write to files on disk
Perform Google searches
Browse websites
Analyze code
Execute code
Transcribe audio to text
…and many more (the list is growing quickly)

You provide a role to the AI agent and a list of tasks. The agent picks what it believes is the most appropriate next action, performs it, evaluates the result, then looks for the next action.

At each step, the agent produces a write-up of:

Thoughts – The current step of what the agent thinks should occur
Reasoning – Explanation of why it wants to do what it wants to do
Plan – The plan (task list) of what it needs to do
Criticism – A list of self-criticism and quality rules (that it doesn’t seem to always take into account)

As an example:

THOUGHTS: I think we should start by searching for recent software product topics to include in our ‘What’s happening?’ segment. This will help us stay up-to-date with the latest trends and developments in the industry.

REASONING: By starting with a ‘What’s happening?’ segment, we can provide our listeners with valuable insights into the current state of the software product industry. This will also help us identify potential interviewees and topics for future episodes.

PLAN:
– Use the ‘Google Search’ command to search for recent software product topics
– Save the most relevant articles to a file for future reference
– Use the information gathered to inform our ‘What’s happening?’ segment

CRITICISM: I need to make sure that the articles I select are relevant and informative. I also need to be careful not to spend too much time on this task, as we have other important tasks to complete.
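
Because the write-up has a regular shape, a program can treat it as structured data. The sketch below assumes the model replies with a JSON object whose keys mirror the four headings above. That flat layout, the StepReport class, and parse_step_report are assumptions made for illustration; Auto-GPT's actual response schema may differ.

```python
import json
from dataclasses import dataclass
from typing import List

REQUIRED_FIELDS = ("thoughts", "reasoning", "plan", "criticism")


@dataclass
class StepReport:
    thoughts: str
    reasoning: str
    plan: List[str]
    criticism: str


def parse_step_report(raw: str) -> StepReport:
    """Parse a JSON reply and fail loudly if any section is missing."""
    data = json.loads(raw)
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"reply is missing fields: {missing}")
    return StepReport(
        thoughts=data["thoughts"],
        reasoning=data["reasoning"],
        plan=list(data["plan"]),
        criticism=data["criticism"],
    )


# A hypothetical model reply, condensed from the example above.
raw_reply = json.dumps(
    {
        "thoughts": "Start by searching for recent software product topics.",
        "reasoning": "A 'What's happening?' segment gives listeners current insights.",
        "plan": ["Search for recent topics", "Save the most relevant articles"],
        "criticism": "Do not spend too much time on this task.",
    }
)
report = parse_step_report(raw_reply)
print(report.plan)
```

Validating the reply before acting on it matters because the model can return malformed or incomplete output, and the agent needs a clear failure rather than a silent gap.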

Unless run in continuous mode (a questionable idea given the types of operations the agent can perform), you are prompted before each action occurs to decide whether you want the agent to perform an action (read a file, browse a website, start a new sub-agent, etc.).
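
The approval step can be pictured as a gate in front of every action. This is a minimal sketch, not Auto-GPT's implementation: run_action, the actions registry, and the confirm callback are hypothetical names chosen for this example.

```python
from typing import Callable, Dict


def run_action(
    name: str,
    args: Dict[str, str],
    actions: Dict[str, Callable[..., str]],
    confirm: Callable[[str], bool],
    continuous: bool = False,
) -> str:
    """Run one proposed action, asking for approval unless in continuous mode."""
    if name not in actions:
        return f"unknown action: {name}"
    # In continuous mode the gate is skipped entirely.
    if not continuous and not confirm(f"Run '{name}' with {args}?"):
        return f"skipped: {name}"
    return actions[name](**args)


# A toy registry with one harmless action, and a reviewer who always says no.
actions = {"read_file": lambda path: f"contents of {path}"}


def deny(question: str) -> bool:
    return False


gated = run_action("read_file", {"path": "notes.txt"}, actions, confirm=deny)
ungated = run_action("read_file", {"path": "notes.txt"}, actions, confirm=deny, continuous=True)
print(gated)
print(ungated)
```

For interactive use, confirm could wrap input() and return True only on an explicit "y". The second call shows why continuous mode deserves caution: the reviewer's refusal is never consulted.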

Results in testing

Both of these agents (and their cousins) are currently best described as experimental prototypes. They have limitations that keep them from fully delivering what you might want from them.

They often struggle to stay focused on an objective. You can get specific results if you provide a very narrow and highly constrained objective. However, the use cases that are more interesting for these types of agents seem to be those that are more ill-defined.

These agents often get stuck in a loop. They may perform an action, incorporate the results, and then later insert the same previously executed task back into the task list.
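
One possible guard against this kind of repetition is to filter newly proposed tasks against the tasks already completed. The sketch below is an assumption about how such a filter could look, not something either agent is known to do; normalize and drop_repeated_tasks are hypothetical names. It only catches near-identical wording, so a task rephrased by the model would still slip through.

```python
from typing import Iterable, List


def normalize(task: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing period."""
    return " ".join(task.lower().split()).rstrip(".")


def drop_repeated_tasks(proposed: Iterable[str], completed: Iterable[str]) -> List[str]:
    """Keep proposed tasks that are neither completed nor duplicated."""
    seen = {normalize(task) for task in completed}
    kept: List[str] = []
    for task in proposed:
        key = normalize(task)
        if key not in seen:
            seen.add(key)
            kept.append(task)
    return kept


completed = ["Research the origins of neural nets"]
proposed = [
    "Research the origins of neural nets.",
    "Identify key figures",
    "identify  key figures",
]
remaining = drop_repeated_tasks(proposed, completed)
print(remaining)
```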

The agents inherit both the strengths and weaknesses of the underlying language models, so areas of weakness in the chat models (ex. logic or math tasks) carry through to the results of the agents’ chat completion requests.

The same kinds of limitations show up in other people’s testing as well.

The future

While these agents are currently limited, it’s important to note that they are still very new (a bit over a month old at the time of this writing). Advancements in this area are occurring rapidly (and speeding up). A brief recent history includes:

August, 2017 – Transformer architecture
June, 2018 – GPT-1
February, 2019 – GPT-2
May, 2020 – GPT-3
October, 2022 – GPT-3.5
November, 2022 – ChatGPT
March, 2023 – GPT-4
March, 2023 – BabyAGI
April, 2023 – Auto-GPT
April, 2023 – ChatGPT plugins
May, 2023 – Auto-GPT plugins

It seems all but certain that we’ll continue to see rapid advancement in both autonomous agents and the underlying building blocks. While these particular agents may or may not be the ones that develop into the more capable agents of the future, they are good examples of the current state of capability.
