Subagent Configuration Deep Dive
Subagents are more configurable than most people realize. Model selection, effort levels, skill inheritance, isolation, and spawn restrictions — here's the full picture.
Most developers use the Agent tool with a single configuration: tell it what to do and let it run.
That is using 20% of the system.
Subagents have a complete configuration surface: model selection, thinking budget, tool restrictions, isolation modes, skill inheritance, and spawn control. Getting these right is the difference between agents that work efficiently and agents that burn tokens doing tasks a cheaper model could handle.
Agent Definition Files
The cleanest way to configure reusable subagents is through agent definition files at .claude/agents/<name>.md. These Markdown files use frontmatter to define the agent's capabilities and a body that serves as its system prompt.
.claude/
agents/
code-reviewer.md
test-writer.md
docs-writer.md
security-auditor.md
A minimal agent definition:
---
name: test-writer
description: Writes pytest tests for Python modules. Targets 90% coverage floor.
model: claude-sonnet-4-6
tools:
- Read
- Write
- Bash
mcpServers: []
---
You are a test-writing specialist. Your only job is to write comprehensive pytest tests.
Rules:
- Write tests for every public function and method
- Target 90% line coverage minimum
- Use pytest fixtures for shared setup
- Mock all external I/O (network calls, file system, database)
- Never modify source files — only write to test files
The frontmatter controls configuration. The body controls behavior. Keep the body focused — subagents with bloated system prompts suffer the same context overhead as any other large prompt.
The --agent CLI Flag
For subagents you use frequently, skip the overhead of the parent agent delegating. Run the agent directly:
claude --agent test-writer "Write tests for src/services/payment.py"
This spawns the test-writer agent directly from its definition file, bypassing the main agent entirely. Faster startup, lower token overhead, same capabilities.
Useful for:
- Scripted pipelines that always delegate to the same agent
- Quick one-off tasks where you know exactly which agent to use
- Testing an agent definition without wiring it into a larger workflow
Configuration Fields Reference
model — Which Claude Powers This Agent
---
model: claude-haiku-4-5
---
| Model | Use Case |
|---|---|
claude-haiku-4-5 | Simple tasks: file listing, regex matching, format conversion |
claude-sonnet-4-6 | Most coding tasks: test writing, refactoring, code review |
claude-opus-4-6 | Complex reasoning: architecture decisions, cross-system analysis |
Haiku at approximately 10x lower cost than Opus. If your test-writing agent runs Opus by default, you are paying 10x more than necessary for tasks that Haiku can handle.
effort — Thinking Budget
---
effort: low
---
Controls how much reasoning the agent applies. Maps to the thinking token budget internally.
| Effort | Description | Use For |
|---|---|---|
low | Fast, shallow reasoning | Mechanical tasks: format conversion, simple transforms |
medium | Balanced (default) | Most coding tasks |
high | Deep reasoning, extended thinking | Architecture, security review, complex debugging |
effort: high on a Haiku model has limited benefit — the model ceiling constrains reasoning quality regardless of budget. Match effort to model: high effort is meaningful on Sonnet or Opus, not on Haiku.
tools — Restrict Available Tools
---
tools:
- Read
- Bash
---
By default, subagents inherit the full tool set available to the parent. Restricting tools does two things: reduces the attack surface (an agent that cannot write files cannot corrupt your codebase), and reduces the cognitive load on the agent (fewer tools means more focused behavior).
Common restricted configurations:
Read-only analyst:
---
tools:
- Read
- Bash
---
File transformer (no bash execution):
---
tools:
- Read
- Write
---
Full access (explicitly stated for clarity):
---
tools:
- Read
- Write
- Edit
- Bash
- Agent
---
Note that Agent in the tools list controls whether the subagent can itself spawn further subagents. Omitting it prevents runaway agent chains.
mcpServers — Lean MCP Configuration
---
mcpServers:
- name: memory-service
---
Every MCP server adds tool options and context overhead. A code review agent does not need image generation tools. Be explicit about which MCP servers each agent needs.
For most coding-focused subagents, an empty mcpServers: [] is correct. Reserve full MCP access for agents that genuinely need cross-platform capabilities.
skill — Explicit Skill Inheritance
Subagents do not inherit skills from the parent. Skills are skills.json-defined behaviors — test-running hooks, quality gates, specialized toolsets. If a subagent needs a skill, you must pass it explicitly.
---
skill: test-runner
---
Or pass multiple:
---
skills:
- test-runner
- coverage-gate
---
The design is intentional. Implicit inheritance would mean every subagent inherits every skill, including skills that might conflict with the subagent's restricted scope.
isolation: worktree — Safe Experimentation
---
isolation: worktree
---
With worktree isolation, the agent works in a temporary copy of the repository. Its changes do not affect your main working tree until you review and merge them.
Behavior:
- Agent receives a
git worktree addcopy of the current branch - Agent works in that copy freely
- If the agent makes no changes, the worktree is automatically cleaned up
- If the agent makes changes, they exist on an isolated branch ready for review
Use worktree isolation for agents doing broad refactors, experimental changes, or any task where you want a review gate before the changes land in your working tree.
background — Non-Blocking Execution
---
background: true
---
When background: true, the spawning agent does not wait for this subagent to complete. It fires the subagent and continues. The subagent's results are delivered via a callback when it finishes.
Use this for:
- Long-running tasks that should not block the main workflow
- Parallel agent dispatch (spawn 3 agents with
background: true, then wait for all three) - Fire-and-forget documentation generation while coding continues
Practical Examples
Example 1: Code Review Agent (Lean and Fast)
---
name: code-reviewer
description: Reviews changed files for bugs, style issues, and security problems.
model: claude-sonnet-4-6
effort: medium
tools:
- Read
- Bash
mcpServers: []
---
You are a code reviewer. Your job is to review the diff provided and identify:
1. Bugs and logic errors
2. Security issues (injection, auth bypass, credential exposure)
3. Style violations against project conventions
4. Missing error handling
Output format:
- CRITICAL: [issue] — must fix before merge
- MAJOR: [issue] — should fix before merge
- MINOR: [issue] — optional improvement
Do not rewrite code. Only identify issues.
This agent uses Sonnet (sufficient for code review), read-only tools (it should never write), no MCP servers, and has a focused system prompt. It costs roughly 3x less than the same task on Opus with full tool access.
Example 2: Architecture Agent (Full Power)
---
name: architect
description: Cross-system architecture analysis and design decisions.
model: claude-opus-4-6
effort: high
tools:
- Read
- Bash
- Agent
mcpServers:
- name: memory-service
background: false
---
You are a senior software architect. You analyze systems holistically — across multiple
repos, services, and data flows. You identify architectural issues that are invisible
at the individual-file level.
You have access to the knowledge base for historical context.
You can spawn subagents to parallelize analysis across separate codebases.
When making recommendations, explain tradeoffs explicitly. Do not prescribe a single
solution without acknowledging alternatives.
Opus with high effort for genuinely complex reasoning. MCP access to historical context. Agent tool enabled because this agent needs to delegate sub-analysis. Background false because the parent needs the results before proceeding.
Subagents vs. Agent Teams
Subagents cannot communicate with each other. Agent A spawned by the parent cannot send a message to Agent B spawned by the same parent. The parent is the only coordination layer.
Agent Teams are different: team members can message each other directly via peer-to-peer messaging. Team members share context within the team's conversation.
| Pattern | When to Use |
|---|---|
| Single subagent | Simple delegation, one clearly bounded task |
| Multiple subagents (parallel) | Independent tasks with no shared state |
| Agent Team | Tasks requiring peer communication and shared reasoning |
| Sequential subagents | Pipeline stages where each depends on the prior |
For most cases, multiple independent subagents dispatched with background: true and collected afterward is simpler than a full team. Use teams when the agents genuinely need to deliberate together.
Token Overhead of Subagents
Every subagent call has overhead:
- Spawning context: ~2K tokens to instantiate the subagent
- Tool definitions: ~1K tokens per tool in the agent's tool set
- MCP server tools: ~500 tokens per server, more for servers with many tools
- System prompt: whatever you wrote in the agent definition body
A fully loaded subagent with 10 MCP servers and all tools can cost 15K tokens before it does anything. A lean subagent with 3 tools and no MCP servers might cost 3K tokens.
For agents that run frequently in automated pipelines, this overhead multiplies. 10 lean agents at 3K tokens each is 30K total overhead. 10 loaded agents at 15K tokens each is 150K tokens — a 5x difference in overhead alone.
Lean configuration is not premature optimization. It is responsible resource management.
Lesson 184 Drill
- Create
.claude/agents/in your current project - Write a
code-reviewer.mddefinition using the example above as a template - Write a
test-writer.mddefinition scoped to your project's test framework - Run
claude --agent code-reviewer "Review the last 3 files I edited"and verify the correct model is used - Add
isolation: worktreeto your test-writer and run a test-writing task — verify changes appear on a separate branch, not in your working tree
When your agents run on the right models for each task, with only the tools they need, you have moved from default configuration to intentional architecture.