Multiagent Orchestration
Orchestrating multiple Managed Agents requires understanding thread isolation, the callable_agents pattern, and when specialization actually earns its coordination overhead.
A single Managed Agent can accomplish a lot. It has web search, bash execution, file operations, and a full Claude model behind it. For many production pipelines, one agent is the right architecture.
Multi-agent orchestration adds coordination overhead, introduces new failure modes, and increases the complexity of observability. It earns its cost in exactly two scenarios: tasks that are genuinely parallelizable, and tasks where deep specialization in distinct domains produces measurably better results than a single general agent.
Know which scenario you are in before you build the orchestration layer.
The callable_agents Pattern
Orchestration in Managed Agents uses the callable_agents field configured at agent creation. An orchestrator can call any agent listed in its callable_agents list.
import anthropic

# Assumes the Anthropic Python SDK; the client reads ANTHROPIC_API_KEY from the environment
client = anthropic.Anthropic()

# Create specialist agents first
market_research_agent = client.beta.agents.create(
    name="market-research-agent",
    model="claude-sonnet-4-6",
    system="You are a market research specialist...",
    tools=[{"type": "web_search"}],
)
financial_data_agent = client.beta.agents.create(
    name="financial-data-agent",
    model="claude-sonnet-4-6",
    system="You are a financial data analyst...",
    tools=[{"type": "web_search"}, {"type": "bash"}],
)
# Create orchestrator with access to both specialists
orchestrator = client.beta.agents.create(
    name="intelligence-pipeline-orchestrator",
    model="claude-sonnet-4-6",
    system="""You are a competitive intelligence pipeline orchestrator.
When given a company to research:
1. Dispatch a market research task to market-research-agent
2. Dispatch a financial analysis task to financial-data-agent
3. Wait for both to complete
4. Synthesize their outputs into a unified intelligence report
Dispatch tasks in parallel when possible.""",
    tools=[{"type": "bash"}],
    # Only agents listed here can be called by the orchestrator
    callable_agents=[
        market_research_agent.id,
        financial_data_agent.id,
    ],
)
When the orchestrator's session runs, it can call the specialist agents directly by name. Each call creates a new thread within the parent session.
Thread Isolation
Each subagent call runs in its own isolated thread within the orchestrator's session. Thread isolation means:
- Each subagent has its own context window — it does not see the orchestrator's conversation history or the other subagents' outputs unless explicitly passed
- Tool calls in one thread do not affect the execution environment of other threads
- Failures in one thread do not automatically terminate other threads or the parent session
Thread isolation is a feature, not a limitation. It is what enables safe parallel execution. The orchestrator must explicitly pass context to subagents — they do not inherit it.
Session: intelligence-pipeline
  Thread: main (orchestrator)
    → Dispatches task to market-research-agent
    → Dispatches task to financial-data-agent
  Thread: market-research-1
    [market-research-agent executes]
    → Returns findings to orchestrator thread
  Thread: financial-data-1
    [financial-data-agent executes]
    → Returns findings to orchestrator thread
  Thread: main (orchestrator)
    [Receives both results]
    [Synthesizes unified report]
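The isolation rules can be modeled in a few lines of plain Python. This is a toy model, not the Managed Agents API: Thread, dispatch, and the brief format are hypothetical names, and the example company is made up. It only illustrates that a subagent thread starts from what the orchestrator passes and nothing else.

```python
from dataclasses import dataclass, field


@dataclass
class Thread:
    """Toy stand-in for a session thread: each one owns its own message list."""
    thread_id: str
    messages: list[str] = field(default_factory=list)


def dispatch(parent: Thread, thread_id: str, task: str, context: list[str]) -> Thread:
    """Start a child thread that sees only the task and the context passed in."""
    brief = "\n".join([f"Task: {task}", *[f"Context: {c}" for c in context]])
    parent.messages.append(f"dispatched {thread_id}")
    # The child's history starts from the brief alone; nothing is inherited.
    return Thread(thread_id=thread_id, messages=[brief])


main = Thread("main", messages=["user: research Acme Corp", "note: focus on EU market"])
child = dispatch(
    main,
    "market-research-1",
    "Summarize Acme Corp's market position",
    context=["Focus on the EU market"],
)
print(child.messages)
```

The child never sees the orchestrator's "note: focus on EU market" message; it learns about the EU focus only because the orchestrator restated it in the context list.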
Events from all threads flow through the parent session's event stream, tagged with session_thread_id. Your event consumer needs to track thread IDs to attribute output to the correct agent.
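A minimal sketch of thread-aware event handling, assuming each event arrives as a dictionary. Only the session_thread_id tag comes from the description above; the "text" field, the sample events, and the group_by_thread helper are hypothetical and stand in for whatever the real event stream delivers.

```python
from collections import defaultdict

# Made-up events; only the session_thread_id tag is taken from the text above.
events = [
    {"session_thread_id": "main", "text": "Dispatching two tasks"},
    {"session_thread_id": "market-research-1", "text": "Found 3 competitors"},
    {"session_thread_id": "financial-data-1", "text": "Pulled latest filings"},
    {"session_thread_id": "market-research-1", "text": "Market summary ready"},
]


def group_by_thread(stream):
    """Collect event text per thread, preserving arrival order within each thread."""
    by_thread = defaultdict(list)
    for event in stream:
        by_thread[event["session_thread_id"]].append(event["text"])
    return dict(by_thread)


grouped = group_by_thread(events)
for thread_id, texts in grouped.items():
    print(thread_id, texts)
```

Because parallel threads interleave in the stream, grouping by thread ID is what keeps one agent's output from being attributed to another.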
Parallelism in Practice
Parallel subagent dispatch is the clearest case for multi-agent architecture. If two tasks are independent and each takes 10 minutes, running them in parallel takes roughly 10 minutes of wall-clock time instead of 20.
For the orchestrator to dispatch in parallel, its system prompt must instruct it to do so, and the tasks must be genuinely independent. If task B requires the output of task A, they cannot run in parallel; the orchestrator must sequence them.
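The arithmetic behind this can be sketched as a small calculation. The helper below is hypothetical and ignores dispatch overhead; it assumes each task starts as soon as its dependencies finish and that the dependency graph has no cycles.

```python
def wall_clock_minutes(durations, depends_on):
    """Finish time of the last task when each task starts once its dependencies finish."""
    finish = {}

    def finish_time(task):
        if task not in finish:
            # A task with no dependencies starts at minute 0.
            start = max((finish_time(dep) for dep in depends_on.get(task, [])), default=0)
            finish[task] = start + durations[task]
        return finish[task]

    return max(finish_time(task) for task in durations)


durations = {"market": 10, "financial": 10}
independent = wall_clock_minutes(durations, depends_on={})
sequenced = wall_clock_minutes(durations, depends_on={"financial": ["market"]})
print(independent, sequenced)  # 10 20
```

The same two tasks cost 10 minutes when independent and 20 when one depends on the other, which is why the dependency check comes before any decision to parallelize.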
The parallel dispatch pattern in the system prompt:
When you have multiple independent research tasks, dispatch them simultaneously:
1. Call market-research-agent with the market research subtask
2. Without waiting for a response, call financial-data-agent with the financial analysis subtask
3. After both calls are initiated, wait for both to return
4. Synthesize the combined outputs
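As an analogy only, the same four steps look like this in asyncio. In a real pipeline the orchestrator model performs the dispatch itself; call_agent here is a hypothetical stub that sleeps briefly instead of calling a subagent, and the company name is made up.

```python
import asyncio


async def call_agent(name: str, task: str) -> str:
    """Stub for a subagent call; a real call would run far longer."""
    await asyncio.sleep(0.01)
    return f"{name}: findings for {task!r}"


async def run_pipeline(company: str) -> str:
    # Steps 1 and 2: start both calls without awaiting either one.
    market = asyncio.create_task(
        call_agent("market-research-agent", f"market research on {company}")
    )
    financial = asyncio.create_task(
        call_agent("financial-data-agent", f"financial analysis of {company}")
    )
    # Step 3: wait for both. return_exceptions=True keeps one failure
    # from discarding the other result.
    results = await asyncio.gather(market, financial, return_exceptions=True)
    # Step 4: synthesize whatever came back.
    usable = [r for r in results if not isinstance(r, Exception)]
    return "\n".join(usable)


report = asyncio.run(run_pipeline("Acme Corp"))
print(report)
```

The key detail is that both tasks are created before either is awaited; awaiting the first call before starting the second would silently turn the pipeline back into a sequential one.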
Specialization vs Prompt-Level Roles
Before building a multi-agent system, validate that specialization is actually necessary. Prompt-level role separation, meaning a single agent with a detailed system prompt that covers multiple roles, can deliver a significant quality improvement without orchestration overhead.
Multi-agent specialization earns its cost when:
- The specialists need fundamentally different tools (one needs bash, one needs web search only)
- The task domains require significantly different system prompts that would conflict in a single agent
- Each specialist's task is long enough that separate context windows improve quality
- You need to run specialists in parallel and the wall-clock time reduction matters
Prompt-level role separation is sufficient when:
- The tasks are sequential and the context carries naturally between them
- The tools needed overlap significantly
- The total task complexity fits comfortably in a single context window
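One possible way to make the checklist explicit before building anything is to record the answers and list which criteria for specialization actually apply. TaskProfile, its field names, and reasons_for_multi_agent are hypothetical; the sketch only restates the four criteria above and leaves the final call to the reader.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Answers to the specialization checklist; field names are hypothetical."""
    needs_different_tools: bool
    prompts_would_conflict: bool
    long_specialist_tasks: bool
    parallel_speedup_matters: bool


def reasons_for_multi_agent(profile: TaskProfile) -> list[str]:
    """Return the criteria that apply; an empty list points to prompt-level roles."""
    checks = {
        "specialists need fundamentally different tools": profile.needs_different_tools,
        "system prompts would conflict in a single agent": profile.prompts_would_conflict,
        "tasks are long enough to benefit from separate context windows": profile.long_specialist_tasks,
        "parallel wall-clock reduction matters": profile.parallel_speedup_matters,
    }
    return [reason for reason, applies in checks.items() if applies]


sequential_report = TaskProfile(False, False, False, False)
intel_pipeline = TaskProfile(True, False, True, True)
print(reasons_for_multi_agent(sequential_report))
print(reasons_for_multi_agent(intel_pipeline))
```

If the list comes back empty, a single agent with a well-structured system prompt is the simpler starting point.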
The Orchestrator's Model Selection
The orchestrator model choice matters. The orchestrator must:
- Understand the full task and decompose it correctly
- Pass precise context to each subagent
- Evaluate subagent outputs and identify when they are insufficient
- Synthesize outputs coherently
This is reasoning-intensive work. Using a weaker model for the orchestrator to save cost while using stronger models for workers typically degrades the overall pipeline quality — the orchestrator makes the high-level decisions that determine whether the workers' effort is well-directed.
Starting point: orchestrators at Sonnet or above. Workers can be tiered based on their specific task complexity.
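This starting point can be checked mechanically when a pipeline is configured. The sketch below assumes the usual Haiku < Sonnet < Opus capability ordering; TIER_RANK, check_tiers, and the worker names are hypothetical, and the tier labels are not model IDs.

```python
# Claude model tiers, weakest to strongest (labels only, not model IDs).
TIER_RANK = {"haiku": 0, "sonnet": 1, "opus": 2}


def check_tiers(orchestrator: str, workers: dict[str, str], minimum: str = "sonnet") -> list[str]:
    """Return warnings for an under-powered orchestrator; an empty list means no concerns."""
    warnings = []
    if TIER_RANK[orchestrator] < TIER_RANK[minimum]:
        warnings.append(f"orchestrator on {orchestrator} is below the {minimum} starting point")
    for name, tier in workers.items():
        # A worker stronger than its orchestrator is the pattern the text warns about.
        if TIER_RANK[tier] > TIER_RANK[orchestrator]:
            warnings.append(f"{name} on {tier} outranks the orchestrator on {orchestrator}")
    return warnings


ok = check_tiers("sonnet", {"market-research": "haiku", "financial-data": "sonnet"})
risky = check_tiers("haiku", {"financial-data": "sonnet"})
print(ok)
print(risky)
```

Workers at or below the orchestrator's tier pass without comment, which matches the advice to tier workers by task complexity.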
Nested Orchestration
Subagents can themselves be orchestrators. A top-level orchestrator can dispatch to a mid-level orchestrator that dispatches to leaf workers. This enables hierarchical pipelines, but the complexity compounds with each level.
Practical limit: two levels of orchestration (orchestrator → worker) cover nearly all production use cases. Three or more levels are usually a sign that the task decomposition should be redesigned, not that the hierarchy should be extended.
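A quick way to notice a hierarchy growing past that limit is to count levels in the planned callable_agents graph. The sketch below works on a plain dictionary of agent names, not on API objects; orchestration_levels and the agent names are hypothetical.

```python
def orchestration_levels(callable_agents: dict[str, list[str]], root: str) -> int:
    """Count levels from root down to the deepest leaf; a lone agent is 1 level."""

    def levels(agent: str, path: frozenset) -> int:
        if agent in path:
            raise ValueError(f"cycle through {agent}")
        children = callable_agents.get(agent, [])
        # A leaf contributes 1; each layer of callable agents adds another level.
        return 1 + max((levels(child, path | {agent}) for child in children), default=0)

    return levels(root, frozenset())


flat = {"orchestrator": ["market-research-agent", "financial-data-agent"]}
nested = {"top": ["mid"], "mid": ["leaf-a", "leaf-b"]}
print(orchestration_levels(flat, "orchestrator"))  # 2
print(orchestration_levels(nested, "top"))  # 3
```

A result of 3 or more is a prompt to revisit the task decomposition before adding another layer.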