Subagent Configuration Deep Dive

Most developers use the Agent tool with a single configuration: tell it what to do and let it run.

That is using 20% of the system.

Subagents have a complete configuration surface: model selection, thinking budget, tool restrictions, isolation modes, skill inheritance, and spawn control. Getting these right is the difference between agents that work efficiently and agents that burn tokens doing tasks a cheaper model could handle.

Agent Definition Files

The cleanest way to configure reusable subagents is through agent definition files at .claude/agents/<name>.md. These Markdown files use frontmatter to define the agent's capabilities and a body that serves as its system prompt.

.claude/
  agents/
    code-reviewer.md
    test-writer.md
    docs-writer.md
    security-auditor.md

A minimal agent definition:

---
name: test-writer
description: Writes pytest tests for Python modules. Targets 90% coverage floor.
model: claude-sonnet-4-6
tools:
  - Read
  - Write
  - Bash
mcpServers: []
---

You are a test-writing specialist. Your only job is to write comprehensive pytest tests.

Rules:
- Write tests for every public function and method
- Target 90% line coverage minimum
- Use pytest fixtures for shared setup
- Mock all external I/O (network calls, file system, database)
- Never modify source files — only write to test files

The frontmatter controls configuration. The body controls behavior. Keep the body focused — subagents with bloated system prompts suffer the same context overhead as any other large prompt.

The --agent CLI Flag

For subagents you use frequently, skip the overhead of the parent agent delegating. Run the agent directly:

claude --agent test-writer "Write tests for src/services/payment.py"

This spawns the test-writer agent directly from its definition file, bypassing the main agent entirely. Faster startup, lower token overhead, same capabilities. (Confirm the exact flag name with claude --help on your build — CLI surfaces change between versions.)

Useful for:

Scripted pipelines that always delegate to the same agent
Quick one-off tasks where you know exactly which agent to use
Testing an agent definition without wiring it into a larger workflow

Configuration Fields Reference

`model` — Which Claude Powers This Agent

---
model: claude-haiku-4-5
---

Model	Use Case
`claude-haiku-4-5`	Simple tasks: file listing, regex matching, format conversion
`claude-sonnet-4-6`	Most coding tasks: test writing, refactoring, code review
`claude-opus-4-8`	Complex reasoning: architecture decisions, cross-system analysis

Haiku runs at roughly 5x lower cost than Opus ($1/$5 vs $5/$25 per million tokens, input/output). If your test-writing agent runs Opus by default, you are paying around 5x more than necessary for tasks that Haiku can handle.

Effort — Reasoning Depth (Session-Level, Not Frontmatter)

Effort controls how much reasoning the model applies. It is a session-level control — set per request or at spawn time, not a confirmed agent-definition frontmatter field. The scale is low / medium / high / xhigh / max.

Effort	Description	Use For
`low`	Fast, shallow reasoning	Mechanical tasks: format conversion, simple transforms
`medium`	Balanced	Routine, cost-sensitive tasks
`high`	Deep reasoning (the default)	Most coding tasks
`xhigh` / `max`	Maximum reasoning depth	Architecture, security review, complex debugging

High effort on a Haiku model has limited benefit — the model ceiling constrains reasoning quality regardless of budget. Match effort to model: the deep-reasoning levels are meaningful on Sonnet or Opus, not on Haiku.

`tools` — Restrict Available Tools

---
tools:
  - Read
  - Bash
---

By default, subagents inherit the full tool set available to the parent. Restricting tools does two things: reduces the attack surface (an agent that cannot write files cannot corrupt your codebase), and reduces the cognitive load on the agent (fewer tools means more focused behavior).

Common restricted configurations:

Read-only analyst:

---
tools:
  - Read
  - Bash
---

File transformer (no bash execution):

---
tools:
  - Read
  - Write
---

Full access (explicitly stated for clarity):

---
tools:
  - Read
  - Write
  - Edit
  - Bash
  - Agent
---

Note that Agent in the tools list controls whether the subagent can itself spawn further subagents. Omitting it prevents runaway agent chains.

`mcpServers` — Lean MCP Configuration

---
mcpServers:
  - name: memory-service
---

Every MCP server adds tool options and context overhead. A code review agent does not need image generation tools. Be explicit about which MCP servers each agent needs.

For most coding-focused subagents, an empty mcpServers: [] is correct. Reserve full MCP access for agents that genuinely need cross-platform capabilities.

Skills — No Implicit Inheritance

Subagents do not inherit skills from the parent. A subagent starts a fresh session with only what its own definition and spawn prompt provide. If a subagent needs a procedure that lives in one of your skills, give it explicitly — name the skill in the spawn prompt, or fold the relevant steps into the agent definition's body:

Spawn a test-writer agent. Follow the test-runner skill's protocol:
run the suite with coverage, enforce the 90% floor, report uncovered lines.

The design is intentional. Implicit inheritance would mean every subagent inherits every skill, including skills that might conflict with the subagent's restricted scope.

`isolation: "worktree"` — Safe Experimentation (Spawn-Time)

Isolation is a parameter on the Agent tool call — the parent sets it when spawning, it is not agent-file frontmatter:

Agent(agent_type="test-writer", prompt="...", isolation="worktree")

With worktree isolation, the agent works in a temporary copy of the repository. Its changes do not affect your main working tree until you review and merge them.

Behavior:

Agent receives a git worktree add copy of the current branch
Agent works in that copy freely
If the agent makes no changes, the worktree is automatically cleaned up
If the agent makes changes, they exist on an isolated branch ready for review

Use worktree isolation for agents doing broad refactors, experimental changes, or any task where you want a review gate before the changes land in your working tree.

`run_in_background: true` — Non-Blocking Execution (Spawn-Time)

Also a spawn-time Agent tool parameter, not frontmatter. When run_in_background: true, the spawning agent does not wait for this subagent to complete. It fires the subagent and continues. The subagent's results are delivered when it finishes.

Use this for:

Long-running tasks that should not block the main workflow
Parallel agent dispatch (spawn 3 agents with run_in_background: true, then collect all three)
Fire-and-forget documentation generation while coding continues

A related spawn-time control: a model passed on the Agent tool call takes precedence over the agent definition's model frontmatter — useful for a one-off upgrade or downgrade without editing the definition file.

Practical Examples

Example 1: Code Review Agent (Lean and Fast)

---
name: code-reviewer
description: Reviews changed files for bugs, style issues, and security problems.
model: claude-sonnet-4-6
tools:
  - Read
  - Bash
mcpServers: []
---

You are a code reviewer. Your job is to review the diff provided and identify:
1. Bugs and logic errors
2. Security issues (injection, auth bypass, credential exposure)
3. Style violations against project conventions
4. Missing error handling

Output format:
- CRITICAL: [issue] — must fix before merge
- MAJOR: [issue] — should fix before merge
- MINOR: [issue] — optional improvement

Do not rewrite code. Only identify issues.

This agent uses Sonnet (sufficient for code review), read-only tools (it should never write), no MCP servers, and has a focused system prompt. Per token it costs roughly 40% less than the same task on Opus (Sonnet $3/$15 vs Opus $5/$25 per million tokens), and the lean tool surface compounds the savings.

Example 2: Architecture Agent (Full Power)

---
name: architect
description: Cross-system architecture analysis and design decisions.
model: claude-opus-4-8
tools:
  - Read
  - Bash
  - Agent
mcpServers:
  - name: memory-service
---

You are a senior software architect. You analyze systems holistically — across multiple
repos, services, and data flows. You identify architectural issues that are invisible
at the individual-file level.

You have access to the knowledge base for historical context.
You can spawn subagents to parallelize analysis across separate codebases.

When making recommendations, explain tradeoffs explicitly. Do not prescribe a single
solution without acknowledging alternatives.

Opus for genuinely complex reasoning — spawn it at high effort. MCP access to historical context. Agent tool enabled because this agent needs to delegate sub-analysis. Spawn it without run_in_background because the parent needs the results before proceeding.

Subagents vs. Agent Teams

Subagents cannot communicate with each other. Agent A spawned by the parent cannot send a message to Agent B spawned by the same parent. The parent is the only coordination layer.

Agent Teams are different: team members can message each other directly via peer-to-peer messaging. Team members share context within the team's conversation.

Pattern	When to Use
Single subagent	Simple delegation, one clearly bounded task
Multiple subagents (parallel)	Independent tasks with no shared state
Agent Team	Tasks requiring peer communication and shared reasoning
Sequential subagents	Pipeline stages where each depends on the prior

For most cases, multiple independent subagents dispatched with run_in_background: true and collected afterward is simpler than a full team. Use teams when the agents genuinely need to deliberate together.

Token Overhead of Subagents

Every subagent call has overhead:

Spawning context: ~2K tokens to instantiate the subagent
Tool definitions: ~1K tokens per tool in the agent's tool set
MCP server tools: ~500 tokens per server, more for servers with many tools
System prompt: whatever you wrote in the agent definition body

A fully loaded subagent with 10 MCP servers and all tools can cost 15K tokens before it does anything. A lean subagent with 3 tools and no MCP servers might cost 3K tokens.

For agents that run frequently in automated pipelines, this overhead multiplies. 10 lean agents at 3K tokens each is 30K total overhead. 10 loaded agents at 15K tokens each is 150K tokens — a 5x difference in overhead alone.

Lean configuration is not premature optimization. It is responsible resource management.

Lesson Drill

Create .claude/agents/ in your current project
Write a code-reviewer.md definition using the example above as a template
Write a test-writer.md definition scoped to your project's test framework
Run claude --agent code-reviewer "Review the last 3 files I edited" and verify the correct model is used
Spawn your test-writer with isolation: "worktree" on the Agent tool call and run a test-writing task — verify changes appear on a separate branch, not in your working tree

When your agents run on the right models for each task, with only the tools they need, you have moved from default configuration to intentional architecture.