A single Managed Agent can accomplish a lot. It has web search, bash execution, file operations, and a full Claude model behind it. For many production pipelines, one agent is the right architecture.

Multi-agent orchestration adds coordination overhead, introduces new failure modes, and complicates observability. It earns its cost in exactly two scenarios: tasks that are genuinely parallelizable, and tasks where deep specialization in distinct domains produces measurably better results than a single general agent.

Know which scenario you are in before you build the orchestration layer.

The multiagent Coordinator Pattern

Orchestration in Managed Agents uses the top-level multiagent field configured at agent creation. Declaring {"type": "coordinator"} with an agents roster makes the agent an orchestrator that can dispatch to any agent in the roster. Roster entries are agent ID strings, version-pinned references ({"type": "agent", "id": ..., "version": ...}), or {"type": "self"} — up to 20 per coordinator.

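import anthropic

# Assumes the Anthropic Python SDK, with ANTHROPIC_API_KEY set in the environment
client = anthropic.Anthropic()

# Create specialist agents first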
market_research_agent = client.beta.agents.create(
    name="market-research-agent",
    model="claude-sonnet-4-6",
    system="You are a market research specialist...",
    tools=[
        {
            "type": "agent_toolset_20260401",
            "default_config": {"enabled": False},
            "configs": {"web_search": {"enabled": True}}
        }
    ]
)

financial_data_agent = client.beta.agents.create(
    name="financial-data-agent",
    model="claude-sonnet-4-6",
    system="You are a financial data analyst...",
    tools=[
        {
            "type": "agent_toolset_20260401",
            "default_config": {"enabled": False},
            "configs": {
                "web_search": {"enabled": True},
                "bash": {"enabled": True}
            }
        }
    ]
)

# Create orchestrator with both specialists in its roster
orchestrator = client.beta.agents.create(
    name="intelligence-pipeline-orchestrator",
    model="claude-sonnet-4-6",
    system="""You are a competitive intelligence pipeline orchestrator.
When given a company to research:
1. Dispatch a market research task to market-research-agent
2. Dispatch a financial analysis task to financial-data-agent
3. Wait for both to complete
4. Synthesize their outputs into a unified intelligence report

Dispatch tasks in parallel when possible.""",
    multiagent={
        "type": "coordinator",
        "agents": [
            market_research_agent.id,
            financial_data_agent.id
        ]
    }
)

When the orchestrator's session runs, it can call the specialist agents directly by name. Each call creates a new thread within the parent session. Note the distinction between the two kinds of reference. The multiagent roster stores agent references for API authorization, which is how the platform knows which dispatches are permitted. The orchestrator's system prompt refers to agents by their human-readable names, which is how the model knows where to route a given task.
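Because the roster and the prompt refer to the same agents in two different ways, it can help to derive both from one list so they do not drift apart. The following is a minimal sketch, not part of the platform API: build_coordinator_config and build_dispatch_instructions are hypothetical local helpers, and the agent IDs are placeholders standing in for the values returned at agent creation. The only facts taken from the text are the config shape and the 20-entry roster limit.

```python
from typing import Union

MAX_ROSTER_SIZE = 20  # per-coordinator roster limit stated in the text

# A roster entry is either an agent ID string or a reference dict such as
# {"type": "agent", "id": ..., "version": ...} or {"type": "self"}.
RosterEntry = Union[str, dict]


def build_coordinator_config(entries: list[RosterEntry]) -> dict:
    """Return a multiagent config dict, rejecting oversized rosters locally."""
    if len(entries) > MAX_ROSTER_SIZE:
        raise ValueError(
            f"roster has {len(entries)} entries; the limit is {MAX_ROSTER_SIZE}"
        )
    return {"type": "coordinator", "agents": list(entries)}


def build_dispatch_instructions(names: list[str]) -> str:
    """Render numbered dispatch steps that refer to agents by name."""
    return "\n".join(
        f"{i}. Dispatch the relevant subtask to {name}"
        for i, name in enumerate(names, start=1)
    )


# Hypothetical (name, id) pairs; real IDs come from the created agents' .id
specialists = [
    ("market-research-agent", "agent_placeholder_1"),
    ("financial-data-agent", "agent_placeholder_2"),
]

multiagent = build_coordinator_config([agent_id for _, agent_id in specialists])
prompt_steps = build_dispatch_instructions([name for name, _ in specialists])
```

The multiagent dict would be passed as the multiagent argument at creation, and prompt_steps would be embedded in the orchestrator's system prompt.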

Thread Isolation

Each subagent call runs in its own isolated thread within the orchestrator's session. Thread isolation means:

Each subagent has its own context window — it does not see the orchestrator's conversation history or the other subagents' outputs unless explicitly passed
Tool calls in one thread do not affect the execution environment of other threads
Failures in one thread do not automatically terminate other threads or the parent session

Thread isolation is a feature, not a limitation. It is what enables safe parallel execution. The orchestrator must explicitly pass context to subagents — they do not inherit it.

Session: intelligence-pipeline
  Thread: main (orchestrator)
    → Dispatches task to market-research-agent
    → Dispatches task to financial-data-agent
    
  Thread: market-research-1
    [market-research-agent executes]
    → Returns findings to orchestrator thread
    
  Thread: financial-data-1
    [financial-data-agent executes]
    → Returns findings to orchestrator thread
    
  Thread: main (orchestrator)
    [Receives both results]
    [Synthesizes unified report]
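
Since subagents do not inherit the orchestrator's context, every dispatch has to carry what the subagent needs. In practice the orchestrator model writes this message itself, but the shape of a self-contained task can be sketched as plain string assembly. build_subagent_task and the context keys below are illustrative assumptions, not platform features.

```python
def build_subagent_task(instruction: str, context: dict[str, str]) -> str:
    """Compose a task message that includes all context the subagent needs."""
    lines = [instruction.strip(), "", "Context:"]
    for key, value in context.items():
        lines.append(f"- {key}: {value}")
    return "\n".join(lines)


# Hypothetical task: the company name and time frame are made up for illustration
task = build_subagent_task(
    "Summarize the company's market position.",
    {"company": "Example Corp", "time frame": "last 12 months"},
)
```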

Events from all threads flow through the parent session's event stream, tagged with session_thread_id. Your event consumer needs to track thread IDs to attribute output to the correct agent.
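
A minimal sketch of that attribution step, assuming each event has already been decoded into a dict. Only the session_thread_id field name comes from the text above. The "text" key, the thread ID values, and the group_by_thread helper are assumptions made for illustration, and the real event payloads may be shaped differently.

```python
from collections import defaultdict


def group_by_thread(events: list[dict]) -> dict[str, list[str]]:
    """Bucket event text by thread ID, preserving arrival order within a thread."""
    by_thread = defaultdict(list)
    for event in events:
        by_thread[event["session_thread_id"]].append(event["text"])
    return dict(by_thread)


# Hypothetical interleaved events from two subagent threads
events = [
    {"session_thread_id": "thread_market", "text": "Found 3 competitors."},
    {"session_thread_id": "thread_finance", "text": "Revenue data loaded."},
    {"session_thread_id": "thread_market", "text": "Market summary ready."},
]

grouped = group_by_thread(events)
```

A consumer would typically also keep a mapping from thread ID to agent name, populated when each thread first appears, so grouped output can be labeled by agent.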

Parallelism in Practice

Parallel subagent dispatch is the clearest case for multi-agent architecture. If two tasks are independent and each takes 10 minutes, running them in parallel takes 10 minutes total instead of 20.

For the orchestrator to dispatch in parallel, its system prompt must instruct it to do so, and the tasks must be genuinely independent. If task B requires the output of task A, they cannot run in parallel — the orchestrator must sequence them.
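
The arithmetic behind this can be sketched as a critical-path calculation: a task starts once all of its dependencies have finished, and the pipeline's wall-clock time is the latest finish time. This is a simplified model under stated assumptions (unlimited parallel capacity, no dispatch overhead, no dependency cycles), and wall_clock_minutes is a hypothetical helper.

```python
def wall_clock_minutes(
    durations: dict[str, float], depends_on: dict[str, list[str]]
) -> float:
    """Return total wall-clock time when independent tasks run in parallel."""
    finish: dict[str, float] = {}

    def finish_time(task: str) -> float:
        # A task starts when its slowest dependency has finished.
        if task not in finish:
            start = max(
                (finish_time(dep) for dep in depends_on.get(task, [])),
                default=0.0,
            )
            finish[task] = start + durations[task]
        return finish[task]

    return max(finish_time(task) for task in durations)


durations = {"market": 10, "financial": 10}

# Independent tasks: both start at time 0
parallel_total = wall_clock_minutes(durations, {})

# financial needs market's output: the orchestrator must sequence them
sequential_total = wall_clock_minutes(durations, {"financial": ["market"]})
```

With the two 10-minute tasks from the example above, parallel_total is 10 and sequential_total is 20.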

The parallel dispatch pattern in the system prompt:

When you have multiple independent research tasks, dispatch them simultaneously:
1. Call market-research-agent with the market research subtask
2. Without waiting for a response, call financial-data-agent with the financial analysis subtask
3. After both calls are initiated, wait for both to return
4. Synthesize the combined outputs

Specialization vs Prompt-Level Roles

Before building a multi-agent system, validate that specialization is actually necessary. Prompt-level role separation — giving a single agent a detailed system prompt that covers multiple roles — can achieve significant quality improvement without orchestration overhead.

Multi-agent specialization earns its cost when:

The specialists need fundamentally different tools (one needs bash, one needs web search only)
The task domains require significantly different system prompts that would conflict in a single agent
Each specialist's task is long enough that separate context windows improve quality
You need to run specialists in parallel and the wall-clock time reduction matters

Prompt-level role separation is sufficient when:

The tasks are sequential and the context carries naturally between them
The tools needed overlap significantly
The total task complexity fits comfortably in a single context window
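
The checklist above can be written down as a small pre-build check. This is a rough sketch, not a rule from the platform: the PipelineProfile fields simply mirror the four multi-agent criteria, and treating any single criterion as sufficient is an assumption you may want to tighten.

```python
from dataclasses import dataclass


@dataclass
class PipelineProfile:
    tools_differ_fundamentally: bool
    prompts_would_conflict: bool
    long_specialist_tasks: bool
    parallel_speedup_matters: bool


def suggest_architecture(profile: PipelineProfile) -> tuple[str, list[str]]:
    """Return a suggested architecture and the criteria that triggered it."""
    reasons = [name for name, met in vars(profile).items() if met]
    # Assumption: one satisfied criterion is enough to justify orchestration.
    if reasons:
        return "multi-agent", reasons
    return "single agent with prompt-level roles", reasons


suggestion, reasons = suggest_architecture(
    PipelineProfile(
        tools_differ_fundamentally=True,
        prompts_would_conflict=False,
        long_specialist_tasks=False,
        parallel_speedup_matters=True,
    )
)
```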

The Orchestrator's Model Selection

The orchestrator model choice matters. The orchestrator must:

Understand the full task and decompose it correctly
Pass precise context to each subagent
Evaluate subagent outputs and identify when they are insufficient
Synthesize outputs coherently

This is reasoning-intensive work. Using a weaker model for the orchestrator to save cost while using stronger models for workers typically degrades the overall pipeline quality — the orchestrator makes the high-level decisions that determine whether the workers' effort is well-directed.

Starting point: orchestrators at Sonnet or above. Workers can be tiered based on their specific task complexity.

No Nested Orchestration

Delegation is one level deep, and the platform enforces it: a coordinator dispatches to its workers, and any delegation deeper than that is ignored. A subagent cannot act as a mid-level orchestrator dispatching to its own workers. Even if that subagent has a multiagent roster of its own, dispatches issued from inside a delegated thread are not executed.
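
Because deeper delegation is ignored rather than reported, a misconfigured hierarchy can go unnoticed. One option is a local check before deployment. The sketch below assumes you keep your own records of agent definitions as plain dicts that mirror the creation arguments. find_nested_coordinators, the record layout, and the IDs are all hypothetical, and nothing here calls the platform.

```python
def find_nested_coordinators(
    coordinator: dict, agents_by_id: dict[str, dict]
) -> list[str]:
    """Return names of roster agents that declare their own multiagent roster."""
    nested = []
    for entry in coordinator["multiagent"]["agents"]:
        if isinstance(entry, dict):
            if entry.get("type") == "self":
                continue  # a self reference is not a separate worker
            agent_id = entry.get("id")
        else:
            agent_id = entry  # plain agent ID string
        worker = agents_by_id.get(agent_id)
        if worker and worker.get("multiagent"):
            nested.append(worker["name"])
    return nested


# Hypothetical local records keyed by placeholder agent IDs
agents_by_id = {
    "agent_a": {"name": "market-research-agent"},
    "agent_b": {
        "name": "financial-data-agent",
        "multiagent": {"type": "coordinator", "agents": ["agent_c"]},
    },
}
coordinator = {
    "name": "intelligence-pipeline-orchestrator",
    "multiagent": {
        "type": "coordinator",
        "agents": ["agent_a", {"type": "agent", "id": "agent_b", "version": 1}],
    },
}

nested = find_nested_coordinators(coordinator, agents_by_id)
```

Here nested contains "financial-data-agent", flagging a worker whose own roster would never be used when it runs as a delegated thread.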