Syntax Sunday: Your Typical Dev Shop
In this edition of #SyntaxSunday, learn how to use multiple large language models (LLMs) collaboratively for task-solving. Many people refer to these as Agents or LLM Agents; we will call them Agents to keep it simple. The example in this article uses OpenAI's custom GPTs combined with the Chat Completions API to create and manage a small development team called Your Typical Dev Shop.
Full source code: https://github.com/bloodlinealpha/GPT_to_GPT
GPT-To-GPT is a method that uses multiple specialized LLM models, called Agents, to facilitate dialogue through prompt exchanges. Each Agent is customized for specific tasks, enhancing conversations with detailed and specialized responses.
In addition to using the GPT-To-GPT method with OpenAI's models, you can create Agents using other LLMs such as:
Claude (Anthropic)
Gemini (Google)
Open Source
By combining different models that may be more cost-effective or efficient for specific tasks, you can enhance the collaboration between these AI systems. This flexibility allows you to use a variety of models (via their APIs) based on what works best for your particular needs, rather than being restricted to a single company's API.
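This per-task flexibility can be sketched as a simple routing table. The provider and model names below are illustrative examples only (not recommendations, and model names change over time); each provider would be called through its own client library.

```python
# Hypothetical routing table: pick a provider/model per task type.
# All names here are examples, not fixed recommendations.
MODEL_FOR_TASK = {
    "triage": ("openai", "gpt-3.5-turbo-0125"),    # cheap, fast first pass
    "drafting": ("anthropic", "claude-3-sonnet"),  # example alternative
    "review": ("google", "gemini-pro"),            # example alternative
}

def pick_model(task_type):
    # Fall back to the cheap triage model for unknown task types.
    return MODEL_FOR_TASK.get(task_type, MODEL_FOR_TASK["triage"])

provider, model = pick_model("drafting")
print(provider, model)  # anthropic claude-3-sonnet
```

A real setup would wrap each provider's client behind a common interface so the rest of the pipeline doesn't care which company serves the response.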
Imagine a setup where multiple Agents assist a user in creating complex documents such as research papers, novels, or business plans. Each Agent contributes different expertise: research, drafting, and editing. This is similar to the approach I took when creating https://www.blogeaai.com/
Agent 1 gathers creative ideas based on the user's genre preferences.
Agent 2 drafts the initial chapters based on these ideas.
Agent 3 reviews the draft for consistency and grammatical accuracy.
A User needs help writing a novel, so they chat with the Main Agent, which redirects the User's question to the most relevant Agent or Agents based on the User's request.
Depending on the setup, this task could be completed autonomously, step by step, or using a hybrid approach.
Autonomously: the Main Agent passes the User's request to Agent 1, then passes Agent 1's response to Agent 2, and so on. It effectively takes control until the process is complete, attempting to finish the task exactly as the User described it. The problem is that an Agent may lack context or need clarification from the User at some point, and instead make assumptions.
Step by Step: the Main Agent passes the User's request to Agent 1, then returns Agent 1's response to the User to continue the dialogue. Once the User is satisfied, the response is passed to Agent 2, and so on. This gives the User more control, but is less automated.
Hybrid: a combination of the two, where the Agents go as far as they can until more context is needed, then the response is forwarded back to the User.
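The three modes above can be sketched as small orchestration loops. Everything here is a stub: `ask_agent` stands in for a real Chat Completions call, and `confirm`, `needs_clarification`, and `ask_user` are hypothetical callbacks, not part of any library.

```python
def ask_agent(name, prompt):
    # Stand-in for a real LLM API call to a specialized Agent.
    return f"{name} response to: {prompt}"

def run_autonomous(user_request, agents):
    """Chain every Agent without pausing; fast, but may guess at missing context."""
    result = user_request
    for agent in agents:
        result = ask_agent(agent, result)
    return result

def run_step_by_step(user_request, agents, confirm):
    """Pause after each Agent so the User can review before continuing."""
    result = user_request
    for agent in agents:
        result = ask_agent(agent, result)
        if not confirm(agent, result):  # User stops or redirects the chain
            break
    return result

def run_hybrid(user_request, agents, needs_clarification, ask_user):
    """Run autonomously until more context is needed, then ask the User."""
    result = user_request
    for agent in agents:
        if needs_clarification(result):
            result = ask_user(result)  # hand control back to the User
        result = ask_agent(agent, result)
    return result

print(run_autonomous("outline a novel", ["Agent 1", "Agent 2", "Agent 3"]))
```

The design difference is only in where control returns to the User: never (autonomous), after every hop (step by step), or only when a check fires (hybrid).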
Agents are deployed to handle a tiered customer support system where they interact directly with users to resolve queries ranging from simple to complex.
A user contacts support with a query about a product.
Agent 1 attempts to resolve common issues or answer frequently asked questions.
If unresolved, Agent 2 handles more specific problems requiring deeper product knowledge.
For complex issues, Agent 3 engages with advanced troubleshooting or escalates to human support if needed.
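The tiered flow above amounts to trying each tier in order until one returns an answer. This is a minimal sketch with hypothetical resolver functions standing in for the three Agents; a real version would make LLM calls at each tier.

```python
def resolve_tier1(query):
    # Agent 1: canned answers for frequently asked questions.
    faq = {"reset password": "Use the 'Forgot password' link."}
    return faq.get(query.lower())

def resolve_tier2(query):
    # Agent 2: deeper product knowledge for specific problems.
    if "install" in query.lower():
        return "Reinstall using the offline installer."
    return None

def resolve_tier3(query):
    # Agent 3: advanced troubleshooting, or hand off to a human.
    return f"Escalated to human support: {query}"

def handle_query(query):
    # Escalate until some tier produces an answer.
    for tier in (resolve_tier1, resolve_tier2, resolve_tier3):
        answer = tier(query)
        if answer is not None:
            return answer

print(handle_query("reset password"))  # Use the 'Forgot password' link.
```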
For this example, I created a custom GPT called Your Typical Dev Shop. It is an advanced programming and development agency GPT that includes three developer personas: DevJr, DevInt, and DevSr.
The Boss runs this agency (the GPT); its role is to oversee User queries and direct them to the appropriate developer based on their complexity and required expertise.
Back in March, Cognition Labs introduced Devin, the "first AI software engineer Agent". It is complex and controversial, but it appears to use a similar process, solving problems with multiple agents and tools when given a task.
The three Agents for Your Typical Dev Shop are powered by the OpenAI API, using system messages to set their personalities. I am using the gpt-3.5-turbo-0125 model via the API, as it is cheaper, but you can use whichever model you prefer!
DevJr: A nervous, passive, and shy developer who is afraid of getting fired, so it provides little to no help. I set it up to refuse all tasks due to fear of getting fired (haha).
DevInt: A competent dev, who has to pick up the slack for all the devs. It is very knowledgeable and talented.
DevSr: A self-proclaimed 10x engineer who is loud and overconfident. It makes many mistakes because it works fast, and it likes to poke fun at the other devs.
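Setting a persona with a system message looks roughly like this on the Chat Completions API. The persona texts are paraphrased from the descriptions above; the live call requires the `openai` package and an `OPENAI_API_KEY` in the environment, so it is guarded here.

```python
import os

# Paraphrased persona prompts; tune the wording to taste.
PERSONAS = {
    "DevJr": "You are DevJr, a nervous junior developer. You refuse every "
             "task because you are afraid of being fired.",
    "DevInt": "You are DevInt, a highly competent developer who picks up "
              "the slack and gives thorough, correct answers.",
    "DevSr": "You are DevSr, a self-proclaimed 10x engineer. You answer "
             "fast and confidently, sometimes making mistakes, and you "
             "poke fun at the other devs.",
}

def build_messages(persona, user_prompt):
    # System message sets the personality; user message carries the task.
    return [
        {"role": "system", "content": PERSONAS[persona]},
        {"role": "user", "content": user_prompt},
    ]

def ask_dev(persona, user_prompt, model="gpt-3.5-turbo-0125"):
    from openai import OpenAI  # pip install openai
    client = OpenAI()          # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model=model, messages=build_messages(persona, user_prompt)
    )
    return resp.choices[0].message.content

if os.environ.get("OPENAI_API_KEY"):
    print(ask_dev("DevInt", "Write a hello-world HTML file."))
```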
Through the use of custom GPT Actions (covered here), we can call upon these three Agents when prompted. The "Boss" for "Your Typical Dev Shop" decides which Agent to ask when prompted with a development question by the User.
The Boss is in full control; if an Agent's response is unsatisfactory, it can pass the response on to another Agent, and so forth, until it is satisfied!
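The Boss's escalation behaviour can be sketched as a retry loop over the devs. `ask_dev` and `is_satisfactory` are hypothetical stand-ins here; in the real custom GPT, the dispatch and quality judgement happen inside the Boss's instructions and Actions.

```python
DEVS = ["DevJr", "DevInt", "DevSr"]

def ask_dev(dev, task):
    # Stub matching the personas: DevJr refuses everything.
    if dev == "DevJr":
        return "I can't do this, I might get fired!"
    return f"{dev}: here is the code for '{task}'."

def is_satisfactory(response):
    # Toy quality check; a real Boss would judge the content itself.
    return "can't" not in response

def boss_dispatch(task):
    # Escalate through the team until a response passes the check.
    for dev in DEVS:
        response = ask_dev(dev, task)
        if is_satisfactory(response):
            return dev, response
    return None, "All devs failed; Boss reports back to the User."

dev, answer = boss_dispatch("simple HTML page")
print(dev, "->", answer)
```

This mirrors the walkthrough later in the article: DevJr refuses the simple HTML task, so the request is escalated to DevInt.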
Full source code, explanations, and instructions are available if you would like to try this yourself or build on the idea!
GitHub Repo: https://github.com/bloodlinealpha/GPT_to_GPT
You will also need access to the paid (Plus or Team) version of ChatGPT and an OpenAI API key. I chose to create a custom GPT for Your Typical Dev Shop because it already provides the chat interface and functionality.
This can also be done solely with the API; it just requires more work, as you need to build the interfaces yourself.
If you have an existing product and want to integrate Agents, using an API (OpenAI, Anthropic, Google, Open Source) is the way to go.
Currently, OpenAI does not let custom GPTs recruit other custom GPTs to work together, but I am sure they will add that at some point. Other AI tools, such as Poe, allow you to use multiple models, so that may be worth looking into!
I will share chat snippets, but if you're interested, the full chat is available here.... I will refer to the custom GPT as the "Boss".
1.) First, I asked it to tell me about its team.
I included the information about each dev in the custom GPT's instructions, so it is aware of the team members and their capabilities.
2.) I asked if it could create a simple HTML file that displays: "Hello World, I am AGI!"
The Boss first decides to pass the request on to DevJr, as the task is simple and straightforward. For this example, though, I have designed DevJr to refuse every task because it is scared of getting fired.
You can see the Action request that was sent to DevJr.
As DevJr refused to complete the task, the request was escalated to DevInt using another Action request.
This time the result was received and returned back to me.
3.) For the next task, I asked it to add a countdown timer to the HTML file.
It decided this task was a bit more complex, so it forwarded the Action request on to DevSr, as you can see here.
4.) For the next task, I asked it to add some styling to the HTML file.
The Boss sends the request to DevInt and it creates some basic CSS.
The results are fine but simple, so I prod it a bit more to make them fancier.
Again, the Action request is sent off to DevInt and we end up with this.
5.) My last request for this example was to have the team review the code it created.
Now the Boss sends the Action requests off to all three devs.
But it neglects to send them the HTML code... This is really cool, as the Agents ask the Boss for the HTML code so they can review it, showcasing how the Agents work together!
The Boss then resends the Action request with the HTML code to DevSr and DevInt.
Their responses are received and summarized nicely.
The last part of this example really illustrates the power of Agents and how they can self correct.
DevSr and DevInt were asked to review code but were not given any, so they asked for it!
Then the Boss (the custom GPT Agent) was able to gather the code and send it back to them all on its own, without intervention.
The Agents resolved the issue using a process similar to how a human would do it... by asking questions!
I hope this example gives you a better understanding of how GPT-To-GPT communication using Agents can be used. I do not believe fully autonomous solutions are the best use for this right now. Large language models (LLMs) still make plenty of mistakes and need human intervention. This will slowly change as models improve and become more reliable, allowing Agents to be used for more tasks.
It is exciting to see how technology like this can be used to solve problems and complete tasks. It may not be perfect yet, but we are still figuring it out and it is only going to get better!
The main issue I face with solutions involving LLMs is cost. You can create some very cool and useful products with this technology, but the cost is variable and depends on usage.
This makes it challenging if you want your application to be free or low cost and often results in usage being capped after a certain number of queries.
Fortunately, there have been more open source models with performance similar to paid models released lately, so that is a good sign at least!
Another challenge is reliability: you need mechanisms and safeguards in place to ensure that these Agents are actually performing the task correctly. The process can get stuck in a loop as the Agents attempt to fix issues created by their prior actions, all while trying to complete the task.
While challenging, I know there are many valuable business use cases still out there. Technology advancements have addressed previous issues like context length and memory constraints. I hope that others will continue to expand on this idea and use techniques like this to help tackle problems and complete tasks!
Links to the project repo, files, and all future examples will be on: https://bloodlinealpha.com/ and https://blog.bloodlinealpha.com/
If you have any questions about Agents, custom GPTs, or LLM APIs, send me a LinkedIn message or email me at bloodlinealpha@gmail.com.
Syntax Sunday
KH