КОД</>БЕЗ МЕЖ

What is a multi-agent system

For 80% of businesses, one AI agent is enough. For the other 20%, one agent becomes a 2,000-line prompt that breaks every Friday and nobody can debug. That is when you need a multi-agent system. Here is what it actually is, when you cross the threshold, and the stack to build one.

Definition: what a multi-agent system actually is

A multi-agent system is two or more LLM-powered agents that hand work to one another to complete a job no single agent does well alone. Concretely:

Think of it like an organization: a manager (orchestrator) decides what gets done, and specialists (analyst, salesperson, support) do the focused work.

When one agent is enough

Honest signals that you should NOT build multi-agent:

90% of my clients ship a single-agent version first, even when they will eventually need multi-agent. It is faster, cheaper, and proves the business case.

When you actually need multi-agent

Five signals that one agent is no longer enough:

  1. Tool count crosses 15-20. One agent with 30 tools picks the wrong one ~25% of the time. Splitting into specialists with 5-8 tools each brings that error rate back down to 5%.
  2. Parallel work is required. "While the research agent is gathering competitor data, the writer agent drafts the intro." One agent does these in series — multi-agent does them at once.
  3. Different agents need different models. Reasoning done by Claude Opus, fast tool use by GPT-5 Mini, sensitive data handled by Hermes self-hosted. One process orchestrates all three.
  4. Domains are too different. A sales agent needs a friendly, closer-tone prompt. A compliance agent needs a strict, conservative one. Mix them in one prompt and neither works well.
  5. Long-running workflows. "Monitor inbox, draft reply, get human approval, send." Hours or days. Multi-agent with state persistence is the natural fit.
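
Signal 2 is the easiest one to see in code. Here is a minimal sketch, assuming each agent is an async function; `research_agent` and `writer_agent` are hypothetical stand-ins that sleep instead of calling a model, so only the scheduling is real.

```python
import asyncio
import time


async def research_agent(topic: str) -> str:
    # Hypothetical stand-in for an LLM call that gathers competitor data.
    await asyncio.sleep(0.2)
    return f"competitor notes on {topic}"


async def writer_agent(topic: str) -> str:
    # Hypothetical stand-in for an LLM call that drafts the intro.
    await asyncio.sleep(0.2)
    return f"intro draft for {topic}"


async def run_in_series(topic: str) -> list:
    # One agent doing both jobs: the second call waits for the first.
    return [await research_agent(topic), await writer_agent(topic)]


async def run_in_parallel(topic: str) -> list:
    # Two specialists: both calls are in flight at the same time.
    return list(await asyncio.gather(research_agent(topic), writer_agent(topic)))


def timed(coro):
    # Run a coroutine to completion and report wall-clock seconds.
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


series_result, series_seconds = timed(run_in_series("CRM tools"))
parallel_result, parallel_seconds = timed(run_in_parallel("CRM tools"))
print(f"series: {series_seconds:.2f}s, parallel: {parallel_seconds:.2f}s")
```

Both paths return the same two outputs. The parallel path finishes in roughly the time of the slowest agent rather than the sum of both.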

Real example: B2B sales pipeline

$23,500 / 10 weeks for a Warsaw SaaS client. The system:

Outcome: MQL→SQL conversion 3× higher, +$340,000 quarterly revenue. Could not have shipped this as one agent — the qualifier prompt alone is 900 tokens of ICP-specific rules.

Real example: research bot

Internal tool for a consulting firm. Input: "summarize the European market for X in the last 6 months". Output: a 4-page briefing with sources.

A single-agent version of this exists: it takes 25 minutes per brief and misses 30-40% of sources. The multi-agent version takes 4 minutes with near-complete coverage. Five agents, one orchestrator, one Postgres for shared state.

Real example: operations dashboard

Multi-agent reporting pipeline for a Berlin fintech. Every morning at 8 AM:

Result: −40 hours/week of analyst work, anomalies caught 4 days earlier on average. Payback in 2.5 months on a $36,800 build.
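
A quick sanity check on those numbers, as a back-of-the-envelope sketch. It assumes the payback comes only from saved analyst hours and that a month is 52/12 weeks; both are my simplifications for illustration, and the hourly figure is implied by the arithmetic, not a quoted rate.

```python
build_cost = 36_800            # USD, from the article
payback_months = 2.5           # from the article
hours_saved_per_week = 40      # from the article
weeks_per_month = 52 / 12      # assumption: average month length

# Monthly benefit needed to recover the build cost in 2.5 months.
implied_monthly_benefit = build_cost / payback_months              # 14720.0

# Analyst hours freed per month, about 173.3.
hours_saved_per_month = hours_saved_per_week * weeks_per_month

# Implied value of one saved analyst hour, about 84.9 USD.
implied_value_per_hour = implied_monthly_benefit / hours_saved_per_month

print(round(implied_monthly_benefit), round(implied_value_per_hour, 1))
```

If a fully loaded analyst hour costs your business less than that, the payback takes longer than 2.5 months. If it costs more, or if catching anomalies earlier has its own value, it takes less.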

The stack: how I actually build this

Orchestration layer

Memory layer

Model layer

Almost always heterogeneous. Orchestrator on Claude Sonnet 4.5, latency-critical specialists on GPT-5 Mini, compliance specialist on Hermes self-hosted. See the model decision matrix →
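
One way to keep that heterogeneity manageable is a single routing table, sketched below. The role names, `ModelChoice`, and `pick_model` are hypothetical; the labels are the model names from this article, not API model IDs, and the sensitive-data guard is my assumption about how you might enforce the self-hosted rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    label: str          # human-readable label, not a provider model ID
    self_hosted: bool   # True when data never leaves your infrastructure


# Hypothetical role names mapped to the models named in the article.
MODEL_BY_ROLE = {
    "orchestrator": ModelChoice("Claude Sonnet 4.5", self_hosted=False),
    "latency_critical": ModelChoice("GPT-5 Mini", self_hosted=False),
    "compliance": ModelChoice("Hermes (self-hosted)", self_hosted=True),
}


def pick_model(role: str, handles_sensitive_data: bool = False) -> ModelChoice:
    # An unknown role raises KeyError on purpose: fail loudly, do not guess.
    choice = MODEL_BY_ROLE[role]
    if handles_sensitive_data and not choice.self_hosted:
        raise ValueError(f"role {role!r} would send sensitive data to a hosted model")
    return choice


print(pick_model("compliance", handles_sensitive_data=True).label)
```

Keeping the mapping in one place means swapping a model is a one-line change, and the guard turns a data-handling policy into an error instead of a convention.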

Observability layer

Cost reality

Pitfalls I have learned the hard way

  1. Do not let agents talk to each other freely. Always go through the orchestrator. Free chatter between agents causes infinite loops and runaway token bills.
  2. Specialist agents need narrow tool lists. Give each specialist only the tools it needs. Sharing tools across agents kills the accuracy gain.
  3. State must be explicit. Implicit "the agents will figure it out" never works. Define every handoff payload.
  4. Eval each agent independently. Then eval the whole system. Two pass rates: per-agent and end-to-end.
  5. Start with 2 agents, not 6. Most multi-agent systems I see in the wild have 3-4 too many agents. Each agent adds latency and a failure mode.
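
The first three pitfalls fit in one small sketch. It is not a production framework: `Handoff`, `Specialist`, `Orchestrator`, the tool names, and the two example agents are all hypothetical, and the agents are plain functions standing in for LLM calls. The hop budget and the "no shared tools" check are one strict reading of the rules above.

```python
from dataclasses import dataclass, field
from typing import Callable, Union


@dataclass
class Handoff:
    """Explicit payload: everything the next agent needs travels in here."""
    target: str                                   # specialist that acts next
    task: str                                     # what it is asked to do
    context: dict = field(default_factory=dict)   # data passed along


@dataclass
class Specialist:
    name: str
    tools: frozenset                              # narrow, agent-specific allow-list
    run: Callable[[Handoff], Union[Handoff, str]]  # next handoff or final answer


class Orchestrator:
    def __init__(self, specialists, max_hops=6):
        seen = set()
        for s in specialists:
            overlap = seen & s.tools
            if overlap:  # strict reading of pitfall 2: no tool shared by two agents
                raise ValueError(f"{s.name} shares tools: {sorted(overlap)}")
            seen |= s.tools
        self.specialists = {s.name: s for s in specialists}
        self.max_hops = max_hops
        self.trace = []                           # which agents ran, in order

    def run(self, first: Handoff) -> str:
        current = first
        for _ in range(self.max_hops):
            # Every handoff passes through here; agents never call each other.
            agent = self.specialists[current.target]
            self.trace.append(agent.name)
            result = agent.run(current)
            if isinstance(result, str):
                return result                     # final answer ends the run
            current = result
        raise RuntimeError(f"hop budget of {self.max_hops} exhausted")


def qualifier(h: Handoff) -> Handoff:
    # Hypothetical rule standing in for an LLM-based lead qualifier.
    score = 80 if h.context.get("employees", 0) >= 50 else 20
    return Handoff("writer", "draft follow-up", {**h.context, "score": score})


def writer(h: Handoff) -> str:
    return f"Draft for {h.context['company']} (score {h.context['score']})"


team = Orchestrator([
    Specialist("qualifier", frozenset({"crm_lookup"}), qualifier),
    Specialist("writer", frozenset({"send_draft"}), writer),
])
answer = team.run(Handoff("qualifier", "qualify lead", {"company": "Acme", "employees": 120}))
print(answer, team.trace)
```

Because every hop goes through `Orchestrator.run`, a pair of agents that keep handing work back and forth hits the hop budget and raises instead of burning tokens. `trace` also gives you the per-run record you need for end-to-end evals.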

Should you build multi-agent?

Honest test: if a single agent with a 1,500-token prompt cannot do the job, you might need multi-agent. If you have not tried that yet, build the single agent first and measure where it fails.

I have shipped both. I am quick to recommend the simpler one. Book a call and I will tell you which side of the threshold you are on — usually within 30 minutes.

Message @tribeofdanel →