The Orchestration Mental Model: Why One Agent Isn't Enough
A single agent has a ceiling. Context fills up. Execution serializes. Specialization competes with breadth. The moment you understand why one agent is architecturally insufficient, you start building fleets.
The single-agent assumption is comfortable. You have one context window, one execution thread, one agent you can reason about. It is a clean mental model.
It is also wrong for any non-trivial system.
A 24/7 content pipeline — research, write, generate images, publish, notify — executed by a single agent hits three ceilings in rapid succession: context, serialization, and specialization. These are not incidental limitations. They are structural constraints of the single-agent architecture. Understanding them is the prerequisite to understanding why fleets exist.
The Three Ceilings
Ceiling 1: Context. Every agent has a finite context window. As a workflow grows — more files to read, more tool outputs to process, more history to track — the context fills. What gets dropped first? Usually the early reasoning, the original instructions, the accumulated state. A single agent running a long workflow degrades as the run progresses because the beginning of the run falls out of its window.
A fleet solves this structurally. Each subagent gets a fresh context loaded with exactly what it needs for its slice of work. No degradation. No context bleed from unrelated steps.
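The context-slicing idea can be sketched in a few lines. Everything here is illustrative, not a real framework API: `run_subagent` is a hypothetical stand-in for spawning an agent, and the workflow keys are made up.

```python
def run_subagent(role: str, context: dict) -> dict:
    # Hypothetical stand-in for an agent invocation; a real subagent
    # would receive `context` as its entire prompt -- nothing more.
    return {"role": role, "saw_keys": sorted(context.keys())}

# Full workflow state accumulated so far (illustrative keys).
workflow_state = {
    "research_notes": "...",
    "draft_outline": "...",
    "image_prompts": "...",
    "publish_config": "...",
}

# Each subagent gets a fresh context containing only its slice:
# the writer never sees publish config, the publisher never sees notes.
writer_result = run_subagent("writer", {
    "research_notes": workflow_state["research_notes"],
    "draft_outline": workflow_state["draft_outline"],
})
publisher_result = run_subagent("publisher", {
    "publish_config": workflow_state["publish_config"],
})
```

The point is the shape, not the mechanism: the orchestrator selects the slice, and each subagent's window starts empty except for that slice.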
Ceiling 2: Serialization. A single agent executes sequentially. Even if three tasks are independent — research on Topic A, research on Topic B, research on Topic C — a single agent does them one after another. The total time is the sum of all three.
A trading signal fleet running on Tesseract does not research six markets sequentially. It spawns six parallel researchers, each isolated, each running simultaneously. Total time: the duration of the longest single research job. The architecture converts serial time into parallel time.
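The serial-to-parallel conversion can be sketched with standard-library threading. This is a toy: `research` is a hypothetical placeholder for a long-running research agent, and the market list is invented.

```python
from concurrent.futures import ThreadPoolExecutor

def research(market: str) -> str:
    # Stand-in for a research agent; a real job would make model
    # and tool calls, each taking seconds to minutes.
    return f"signal report for {market}"

markets = ["BTC", "ETH", "SOL", "AVAX", "LINK", "DOT"]

# Serial: total time is the sum of all six jobs.
serial_reports = [research(m) for m in markets]

# Parallel: six isolated workers; total time is roughly the
# duration of the slowest single job.
with ThreadPoolExecutor(max_workers=len(markets)) as pool:
    parallel_reports = list(pool.map(research, markets))
```

Both paths produce the same results; only the wall-clock shape changes, from sum-of-durations to max-of-durations.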
Ceiling 3: Specialization. A single agent given "research the market," "write the code," and "review the code" must context-switch between roles. Each role has different priorities, different tool access patterns, different quality criteria. A researcher optimizes for coverage. A builder optimizes for correctness. A reviewer optimizes for defect detection. Give one agent all three jobs and it brings mediocre, generalist judgment to each.
The Orchestrator Role
The orchestrator is not a super-agent that knows everything. It is a coordinator that knows who should do what.
Its job:
- Receive the top-level intent
- Decompose it into subtasks
- Route each subtask to the appropriate specialist
- Track what has completed and what is pending
- Collect and merge results
- Handle failures by re-routing or escalating
What the orchestrator does not do: execute the work itself. The moment an orchestrator starts doing the research and writing the code and reviewing the output, it is no longer an orchestrator. It is a single agent with extra steps.
The orchestrator's system prompt should be focused on routing and coordination logic, not domain expertise. The domain expertise lives in the specialists.
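The coordinator role can be sketched as a minimal routing loop. `SPECIALISTS`, `run_specialist`, and the task format are assumptions for illustration, not a real library:

```python
def run_specialist(role: str, task: str) -> dict:
    # Hypothetical dispatch to a specialist agent; here it just
    # records who handled what so the flow is visible.
    return {"role": role, "task": task, "status": "done"}

# The orchestrator's only domain knowledge: who handles what.
SPECIALISTS = {"research": "researcher", "write": "writer", "review": "reviewer"}

def orchestrate(intent: str, subtasks: list) -> dict:
    """Decompose -> route -> track -> merge. No work executed here."""
    completed, escalated = [], []
    for kind, payload in subtasks:
        role = SPECIALISTS.get(kind)
        if role is None:
            escalated.append((kind, payload))  # no route: escalate
            continue
        completed.append(run_specialist(role, payload))
    return {"intent": intent, "results": completed, "escalated": escalated}

report = orchestrate("publish market article", [
    ("research", "Topic A"),
    ("write", "draft from research"),
    ("review", "check the draft"),
])
```

Note what is absent: the loop contains no research logic, no writing logic, no review criteria. The moment any of that appears inside `orchestrate`, it has stopped being an orchestrator.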
The Fleet as a System
Fleet design follows the same rule as any systems design: before you spawn agents, design the system. Define roles. Map dependencies. Establish the state layer. Decide how results merge. Only then dispatch.
Fleets that get designed after the fact — "let me add another agent to handle this problem" — accumulate coordination debt. Each unplanned agent creates coupling that was not anticipated. The state layer was not designed for the new agent's output format. The orchestrator's routing logic does not account for the new role. The fleet becomes hard to reason about.
The correct sequence: identify the ceiling your single-agent workflow is hitting, design the multi-agent architecture that resolves that specific ceiling, then build it.
Tools vs. Agents: The Decision
Not every discrete action needs an agent. The test:
Use a tool when: the action is discrete, deterministic, and returns a clear result without its own reasoning loop. A web search call. A file read. A database query. These are tools.
Use an agent when: the work requires its own planning, multiple tool calls, context-aware reasoning, and could plausibly fill a context window on its own. Writing a complete article. Debugging a failing test suite. Auditing a codebase for security issues. These are agent-sized tasks.
A wrong decision in either direction costs you. Over-agentifying simple tasks adds orchestration overhead to things that did not need it. Under-agentifying complex tasks jams everything into a single context and hits the ceilings above.
Lesson 64 Drill
Map your current most complex workflow. For each step, ask:
- Could this step fill its own context window on a long run?
- Could this step run in parallel with any other step without shared state conflicts?
- Does this step require a different quality lens than the step before it?
Mark every step that answers "yes" to any of these. That is your agent decomposition map. Each marked step is a candidate for its own agent. The unmarked steps are tools.
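The drill can be mechanized: the three questions become three booleans per step, and any "yes" marks the step as agent-sized. The step names below are hypothetical, from the content-pipeline example.

```python
# (name, fills_own_context, parallelizable, different_quality_lens)
steps = [
    ("fetch_rss_feed",  False, False, False),
    ("research_topic",  True,  True,  False),
    ("write_article",   True,  False, True),
    ("generate_images", False, True,  True),
    ("publish_post",    False, False, False),
]

# Agent decomposition map: marked steps become agents, the rest stay tools.
agents = [name for name, *answers in steps if any(answers)]
tools = [name for name, *answers in steps if not any(answers)]
```

Run this over your own workflow's honest answers and the fleet boundary falls out of the data rather than out of intuition.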
Bottom Line
One agent is not a fleet. The ceiling is not a model problem. It is an architecture problem.
Context fills. Execution serializes. Specialization competes. The fleet resolves all three by distributing work across agents that each own a clean window, a specific role, and an isolated execution environment.
The orchestration layer is the scaffolding that holds the fleet together — routing intent, tracking state, merging results. Get that right, and the fleet runs.