2 March 2026 · 5 min read · AI agents · Automation

AI agents vs automation: when to use which

Agents and automations solve different problems. Using an agent where a rule belongs - or a rule where an agent belongs - is how most production AI systems fail quietly.

Ajay Dhillon
Founder

A lot of production AI work comes down to one decision made early, usually badly: should this be an agent, or an automation? The two look similar from outside. A trigger fires, something happens, a result lands somewhere. But the difference in how they succeed and fail matters a lot, and choosing wrong is the most common reason AI work quietly underperforms.

This piece is a practical guide to that decision. Not a taxonomy - a decision rule you can use in the next scoping call.

The short version

Use a workflow automation when the path from input to output is knowable in advance. Invoice received → classify → route → approve under a threshold → notify. The logic is explicit. The model, if used at all, is used for one specific step (extraction, classification) and the rest is deterministic code.
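The invoice flow above can be sketched in a few lines. This is a minimal illustration, not a real implementation: `classify_invoice` stands in for the single model call, the threshold is an assumed policy value, and everything else is deterministic, auditable code.

```python
from dataclasses import dataclass

APPROVAL_THRESHOLD = 500.00  # assumed policy value, purely illustrative

@dataclass
class Invoice:
    vendor: str
    amount: float
    text: str

def classify_invoice(invoice: Invoice) -> str:
    """Stand-in for the one model call in the pipeline."""
    return "supplies" if "paper" in invoice.text.lower() else "other"

def route(category: str) -> str:
    # Explicit routing table -- readable, testable, no model involved.
    return {"supplies": "procurement", "utilities": "facilities"}.get(category, "finance")

def process(invoice: Invoice) -> dict:
    category = classify_invoice(invoice)                  # the one ML step
    queue = route(category)                               # deterministic
    auto_approved = invoice.amount < APPROVAL_THRESHOLD   # deterministic
    return {"queue": queue, "category": category, "auto_approved": auto_approved}

result = process(Invoice("Acme", 120.0, "Printer paper, 10 reams"))
```

The point of the shape: if the classifier misbehaves, you can see exactly where, because it is fenced into one step.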

Use an AI agent when the path from input to output is determined at runtime. "Research the company, assemble a brief, and draft outreach based on what you find" - the exact sequence of tool calls, information gathered, and drafting choices can't be written down ahead of time. The agent decides.
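By contrast, an agent is a loop in which the model chooses the next tool at runtime. The sketch below stubs the model out as `pick_next_action`; the tool names and planner logic are illustrative assumptions, not a real API. What matters is the shape: the sequence of calls is decided per invocation, not written down in advance.

```python
def search_company(name: str) -> str:
    return f"notes about {name}"

def draft_outreach(notes: str) -> str:
    return f"Hi -- I read {notes} and wanted to reach out."

TOOLS = {"search_company": search_company, "draft_outreach": draft_outreach}

def pick_next_action(task: str, history: list) -> tuple:
    """Stand-in for the model's runtime decision. A real agent would send
    the task and history to a model and parse its chosen tool call."""
    if not history:
        return ("search_company", task)
    if history[-1][0] == "search_company":
        return ("draft_outreach", history[-1][1])
    return ("done", None)

def run_agent(task: str, max_steps: int = 5) -> list:
    history = []
    for _ in range(max_steps):  # hard cap so the loop always terminates
        tool, arg = pick_next_action(task, history)
        if tool == "done":
            break
        history.append((tool, TOOLS[tool](arg)))
    return history

trace = run_agent("Acme Corp")
```

Even in a toy version, the governance difference is visible: the control flow lives inside `pick_next_action`, which in production is a prompt plus a model, not code you can read.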

If you can write the workflow as a flowchart on a whiteboard in under ten minutes, it is probably an automation. If you cannot, it is probably an agent.

Why the distinction matters

The cost curves are different. The failure modes are different. The governance is different.

Automations are cheap to run (no reasoning-tier model, minimal tokens), easy to monitor (each step is explicit), and straightforward to audit for governance (the logic is readable code). They break in boring, predictable ways: a schema change, a rate limit, a retry that wasn't idempotent.

Agents are expensive to run (many tool calls per task, often a reasoning-tier model), hard to monitor naively (the path changes by invocation), and harder to govern (the logic is partly in the prompt and partly in the tools the model has access to). They fail in interesting ways: hallucinating a tool that doesn't exist, taking a plausible wrong path, or degrading silently when a tool dependency changes.

Putting an agent on a problem that wanted an automation is expensive and unreliable. Putting an automation on a problem that wanted an agent is brittle and hits its ceiling quickly.

A decision framework

A set of questions we run through in scoping:

1. How deterministic is the happy path?

If more than 80% of invocations follow one or two predictable paths, automation. If there is a long tail of plausible paths, each rare but collectively common, agent.

2. How often does the task need to branch on natural-language judgement?

"Is this email a complaint or a request?" is natural-language judgement, and one model call inside an automation covers it. "Read this email, research the customer, synthesise a response appropriate to their tier, and route to the right specialist" is recursive judgement, and that's agent territory.

3. What is the cost of a wrong output?

Automations with a human-in-the-loop at the decision step can be aggressive. Agents without a strong evaluation harness should be conservative. Any customer-facing agent in a regulated industry needs both.

4. What is the volume?

At scale, inference cost matters. A 100,000-task-per-day automation with a small classifier is affordable; the same volume through an agent with six tool calls per task is not. For high-volume work, lean toward automation with one model call per task; reserve agents for the cases where volume is lower and complexity is higher.
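The volume point is easy to check on the back of an envelope. All prices and token counts below are illustrative assumptions, not vendor figures; plug in your own.

```python
def daily_cost(tasks, model_calls_per_task, tokens_per_call, price_per_1k_tokens):
    """Rough daily inference spend: tasks x calls x tokens x unit price."""
    return tasks * model_calls_per_task * tokens_per_call / 1000 * price_per_1k_tokens

# Automation: one small-classifier call per task (assumed price).
automation = daily_cost(100_000, 1, 500, 0.0005)

# Agent: six tool-call rounds per task on an assumed reasoning-tier price.
agent = daily_cost(100_000, 6, 2_000, 0.01)

print(f"automation: ${automation:,.0f}/day, agent: ${agent:,.0f}/day")
```

With these assumed numbers the gap is two to three orders of magnitude per day, which is why the volume question comes before the architecture question.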

5. How will you evaluate it?

Automations evaluate the model call inside them with standard classification or extraction metrics. Agents evaluate end-to-end task success, which is harder and less standardised. If your team has no plan for evaluating an agent, don't ship one yet.
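A task-level evaluation harness does not need to be elaborate to be useful. The sketch below is a minimal version under obvious assumptions: `fake_agent` stands in for a real agent run, and each case pairs a task with a checker over the final output.

```python
def fake_agent(task: str) -> str:
    """Stand-in for a real agent invocation."""
    return task.upper()

eval_set = [
    # (task, end-to-end success check on the final output)
    ("refund request", lambda out: "REFUND" in out),
    ("billing question", lambda out: "BILLING" in out),
    ("complaint", lambda out: "ESCALATED" in out),  # this case will fail
]

def run_eval(agent, cases):
    # Run every case end-to-end and report the task-level success rate.
    passed = sum(1 for task, check in cases if check(agent(task)))
    return passed / len(cases)

score = run_eval(fake_agent, eval_set)
```

Even this much gives you a number that moves when the agent regresses, which is the minimum bar before shipping one.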

Hybrid patterns that work

The best production systems we see are neither pure automation nor pure agent. They are automations with a small number of delegated sub-tasks where an agent is appropriate.

For example, a claims workflow:

  • Receive claim → automation.
  • Extract fields from documents → single model call inside automation.
  • Validate against policy → rule engine.
  • Detect edge cases that don't fit policy → classifier inside automation.
  • Research and summarise a disputed case for the human reviewer → agent, delegated by the automation.
  • Human reviewer decision → human.
  • Policy update, customer notification → automation.
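The claims flow above reduces to a pipeline that delegates exactly one step to an agent. This is a sketch with made-up field names and rules; the structure, not the logic, is the point.

```python
def extract_fields(doc: str) -> dict:
    """In practice a single model call inside the automation."""
    return {"amount": 1200, "disputed": "dispute" in doc.lower()}

def validate(fields: dict) -> bool:
    """Rule engine -- plain code, no model."""
    return fields["amount"] <= 10_000

def agent_research(fields: dict) -> str:
    """The one delegated agent sub-task: research a disputed case."""
    return "summary of disputed claim for reviewer"

def handle_claim(doc: str) -> dict:
    fields = extract_fields(doc)          # automation step
    if not validate(fields):
        return {"status": "rejected"}
    if fields["disputed"]:
        # The only hand-off to an agent; everything else is deterministic.
        return {"status": "needs_review", "brief": agent_research(fields)}
    return {"status": "approved"}

outcome = handle_claim("claim with dispute noted")
```

The deterministic backbone stays cheap and auditable; the agent's cost and unpredictability are confined to the one branch that needs them.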

This hybrid gets the cost profile of an automation with the flexibility of an agent where it's needed, and it is the pattern most well-designed production systems eventually converge to.

Common mistakes

Agent-ifying a workflow for novelty. The team is excited about agents, so everything becomes an agent. Inference costs balloon, evaluation becomes difficult, and outcomes regress. We see this more often than any other mistake.

Automating a genuinely branchy task. The team is risk-averse about agents and tries to write rules for every edge case. The rule engine grows for a year and a half and eventually collapses under its own weight. The task is taken off the team's roadmap and quietly returned to manual.

Unbudgeted evaluation. Neither automations nor agents work without evaluation. Automations get regression tests on the model calls; agents get end-to-end task-level evaluation. Skipping this is how a working system becomes a broken system, and you find out from a customer rather than a dashboard.

Frequently asked

What is the difference between an AI agent and AI automation? An AI automation follows a predetermined workflow with model calls embedded at specific decision points. An AI agent decides the sequence of actions at runtime, choosing which tools to call in what order based on the task and what it finds.

Are AI agents more expensive to run than automations? Yes, typically 5–20× more expensive per task because agents make multiple model calls per invocation, often at reasoning-tier pricing. Volume choices should reflect this.

When should I use an AI agent instead of a workflow automation? Use an agent when the task requires multi-step reasoning whose sequence changes per invocation - research, drafting, triage of novel cases. Use an automation when the workflow is known in advance and only specific steps need a model.

Can agents and automations work together? Yes, and the best production systems combine them. Automations provide the reliable backbone; agents are delegated to the narrow steps that genuinely require runtime reasoning.


If you want a practical, vendor-neutral read on whether your next build should be an agent or an automation, our AI strategy engagement covers exactly that question.

Written by
Ajay Dhillon · Founder

Let’s build your system next.

Thirty minutes with someone who’d be doing the work. No slide deck, no intake form. We’ll tell you what’s feasible, where you’ll hit friction, and what we’d pick up first.

Response in under 24 hours · First read, no NDA needed · Bangalore / Remote, UTC ±12