Building agents is a self-knowledge practice.
One of the many amazing things about LLMs is their capacity for reasoning. With this, we can automate tasks that were practically impossible several years ago.
The most impressive tasks to automate are expert tasks, which require deep knowledge and contextual understanding. Coding is the flagship example.
LLMs have consumed a lot of information, and they are very good at small reasoning tasks. But they still tend to fail on larger expert tasks. You can get around this by building an agent.
I want to stop here and define “agent.” An agent is a system that automates tasks through a combination of repeated LLM reasoning and pre-determined system logic. It is not just the large language model. Chatbots are not agents in this sense unless they can reason about and perform actions on behalf of humans.
To take a complex task and build an agent for it, a couple of things are required:
- The task needs to be something that can be automated, meaning it can be broken down into a series of steps that consist only of computer-controlled actions.
- The task to be automated needs to be well understood.
It turns out a lot of tasks fit the first condition. But the second condition is a little more interesting, and I want to explore it.
To build an agent, you assemble a system that reasons about the task and performs the necessary actions. This system needs to make many calls to the LLM to perform multiple reasoning steps, interact with other parts of the system, and manage context. It also needs its own expert orchestration for the task.
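Here is a minimal sketch of what that kind of system can look like, assuming a made-up text protocol in which the model replies with either `CALL <tool> <argument>` or `DONE <answer>`. The names `run_agent`, `scripted_llm`, and `count_words` are hypothetical, and the "LLM" is a scripted stand-in rather than a real model call. The point is only to show the pieces: repeated reasoning calls, tool use, context management, and fixed orchestration logic around them.

```python
from typing import Callable

# Hypothetical stand-in type for a model call: it receives the context so far
# and replies with "CALL <tool> <argument>" or "DONE <answer>".
LLM = Callable[[list[str]], str]


def run_agent(task: str, llm: LLM, tools: dict[str, Callable[[str], str]],
              max_steps: int = 5, keep_last: int = 4) -> str:
    context = [f"TASK: {task}"]
    for _ in range(max_steps):
        reply = llm(context)  # one reasoning step
        if reply.startswith("DONE "):
            return reply[len("DONE "):]
        if reply.startswith("CALL "):
            parts = reply.split(" ", 2)
            name = parts[1] if len(parts) > 1 else ""
            arg = parts[2] if len(parts) > 2 else ""
            tool = tools.get(name)
            result = tool(arg) if tool else f"unknown tool {name!r}"
            context.append(f"{reply} -> {result}")
        else:
            # Pre-determined logic: record replies that break the protocol.
            context.append(f"UNPARSEABLE: {reply}")
        # Context management: keep the task plus the most recent observations.
        context = context[:1] + context[1:][-keep_last:]
    return "gave up: step limit reached"


def scripted_llm(context: list[str]) -> str:
    # Fake model for illustration: call a tool once, then report its result.
    if len(context) == 1:
        return "CALL count_words the quick brown fox"
    return "DONE " + context[-1].split(" -> ")[-1]


tools = {"count_words": lambda text: str(len(text.split()))}
answer = run_agent("Count the words in the phrase.", scripted_llm, tools)
```

Everything outside the `llm(context)` call is the system logic. In a real agent, this is where the expert orchestration for the task would live.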
Expert orchestration requires understanding the task.
Only an expert can provide this. But it also requires that understanding to be systematized. That takes an expert who can also systematize the task, which is a separate ability.
The ability to systematize a task requires reflection and articulation of knowledge. You not only need to know how to perform the task. You need to know how to explain it. Because, well, you need to explain it to the agent in prompts and codify it in a programming language.
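As a rough illustration of what "explain it in prompts and codify it in a programming language" can mean, here is a small sketch. It assumes a hypothetical task (reviewing a code change) and invented names (`Step`, `build_prompt`, `failed_checks`). Each step pairs the expert's instruction, which goes into the prompt, with a coded rule for judging the output. The specific rules are placeholders, not real review criteria.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    instruction: str              # what the expert would tell a junior colleague
    check: Callable[[str], bool]  # coded rule for whether the output is acceptable


# Hypothetical steps for reviewing a code change.
STEPS = [
    Step("summarize", "State what the change does in one sentence.",
         lambda out: 0 < len(out.split()) <= 30),
    Step("risks", "List anything that could break, or write 'none'.",
         lambda out: out.strip() != ""),
]


def build_prompt(steps: list[Step], material: str) -> str:
    # The articulated expertise becomes explicit, ordered instructions.
    lines = ["Follow these steps in order:"]
    for i, step in enumerate(steps, 1):
        lines.append(f"{i}. {step.name}: {step.instruction}")
    lines.append("")
    lines.append(material)
    return "\n".join(lines)


def failed_checks(steps: list[Step], outputs: dict[str, str]) -> list[str]:
    # The codified expertise decides which steps need another attempt.
    return [s.name for s in steps if not s.check(outputs.get(s.name, ""))]


prompt = build_prompt(STEPS, "<diff goes here>")
missing = failed_checks(STEPS, {"summarize": "Renames a helper function."})
```

Writing the steps down this way forces the tacit parts into the open: if you can't state the instruction or the check, that part of the expertise hasn't been articulated yet.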
A lot of experts know how to perform tasks but lack the self-understanding to fully explain them. Their knowledge lives mostly as tacit knowledge gained from experience. They can explain parts of it, but that doesn’t mean you can follow their steps and perform the task like an expert.
I’ve seen attempts at building agents that fail at some part of this. Sometimes an expert tries to build the agent but doesn’t quite understand the technical approaches for managing agents, or can’t quite explain the task accurately. Other times a developer tries to automate a task they don’t fully understand. All of these lead to lackluster, faulty systems.
So, building an agent requires:
- The technical knowledge of the abilities and limitations of LLMs and agentic systems.
- Expertise in the task to be automated.
- The ability to articulate and systematize the expertise.
It’s no wonder that coding agents have come so far. AI programmers naturally have the first two. If they have the third, they can build coding agents.
Final thoughts.
You can build a system that learns and evolves its expertise. This isn’t a bad idea, but the system will start off inept. That happened with coding agents even though experts were building them. Progressive improvement is always a part of software development.
On top of that, LLMs have a second advantage at coding because coding is inherently a language task. But LLM improvements are not the only driver of coding agent improvements. A lot of an agent’s quality depends on its agent logic, which is why two agents running the same model can differ so much.
For an agent to start strong, you need an expert in the task who can articulate it clearly. And you need to pair them with a developer who knows how to build agents.
Coding agents are automating the second part. So, we’re going to see more and more expert agents.