Task Decomposition: Breaking Work Into Agent-Sized Pieces
Decomposition is the orchestrator's primary skill. What makes a task agent-sized? How do you map dependencies? What does a handoff contract look like? The fan-out/fan-in pattern and the dependency graph that keeps your fleet coherent.
Task decomposition is the skill the orchestrator runs on. Without it, a fleet is just a collection of agents with no coordination. With it, a fleet becomes a system that converts complex work into parallel execution with predictable output.
The blog-autopilot pipeline I run every other morning is a decomposition exercise at its core. The top-level intent: "produce a published article on Topic X." That single intent decomposes into: research the topic, write the article, generate a hero image, open the PR. Four discrete subtasks. Some sequential, some parallel. Each with a defined output that the next stage reads. The fleet executes the decomposition — not the original intent as a monolithic task.
What Makes a Task Agent-Sized?
An agent-sized task has four properties:
- Bounded scope. It fills roughly one focused context window. Not so small that spawning an agent adds more overhead than the task itself. Not so large that the agent's context degrades mid-execution.
- Clear inputs. The agent can start immediately given the provided context, without asking clarifying questions. This is the completeness test. If you have to answer a question before the agent can begin, the spec is incomplete.
- Structured output. The task produces a specific artifact in a specific format that the next stage can consume. "Research the topic and return your findings" is not a structured output. "Research the topic and return a JSON object with fields: summary (string, 200 words max), sources (array of URLs), key_claims (array of strings)" is.
- Self-verifiable completion. The agent can determine when the task is done without external input. "Write a 1000-word article with H2 headings and an intro callout" is self-verifiable. "Write a good article" is not.
Tasks that fail any of these four criteria need further decomposition — or better specification — before spawning.
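One way to make the four properties concrete is to encode them in a task spec. This is a hypothetical sketch, not any particular framework's API: the `TaskSpec` class and its field names (`inputs`, `output_schema`, `done_check`, `max_context_tokens`) are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class TaskSpec:
    name: str
    inputs: dict              # clear inputs: everything the agent needs to start
    output_schema: dict       # structured output: fields the next stage consumes
    done_check: str           # self-verifiable completion criterion, stated concretely
    max_context_tokens: int = 50_000  # bounded scope: roughly one focused window

    def is_complete_spec(self) -> bool:
        """Completeness test: can an agent begin without clarifying questions?"""
        return bool(self.inputs) and bool(self.output_schema) and bool(self.done_check)


research = TaskSpec(
    name="research_topic",
    inputs={"topic": "Topic X", "source_count": 5},
    output_schema={
        "summary": "string, 200 words max",
        "sources": "array of URLs",
        "key_claims": "array of strings",
    },
    done_check="research.json exists and validates against output_schema",
)
assert research.is_complete_spec()
```

A spec that fails the check is the signal to decompose further or fill in the missing context before spawning.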
Decomposition Patterns
Fan-out / fan-in — The workhorse pattern. One task decomposes into N parallel subtasks, each runs independently, results merge at a collection point. Used when subtasks share no dependencies. Example: researching five market sectors simultaneously before synthesizing a market report.
Sequential pipeline — Each stage depends on the previous stage's output. Stage 2 cannot start until Stage 1 delivers. Used when outputs chain: research → write (needs research) → review (needs draft) → publish (needs approved draft). The Apollo content pipeline uses this pattern end-to-end.
Hybrid — Some stages parallel, some sequential. The common real-world case. In a content pipeline: research three topics in parallel (fan-out), synthesize into one outline (fan-in), write three sections in parallel (fan-out), final edit pass (fan-in). Most production pipelines are hybrid.
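The fan-out/fan-in pattern can be sketched in a few lines. Here `research_sector` is a stand-in for spawning a research agent; in production each call would be an independent agent run, and the sector names are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def research_sector(sector: str) -> dict:
    # Stand-in for an independent research-agent run.
    return {"sector": sector, "summary": f"findings for {sector}"}


def fan_out_fan_in(sectors: list[str]) -> dict:
    # Fan-out: N independent subtasks run in parallel (no shared dependencies).
    with ThreadPoolExecutor(max_workers=len(sectors)) as pool:
        results = list(pool.map(research_sector, sectors))
    # Fan-in: merge at a single collection point for synthesis.
    return {r["sector"]: r["summary"] for r in results}


report_inputs = fan_out_fan_in(["fintech", "healthcare", "energy", "retail", "logistics"])
```

The merge step is where the synthesis stage (writing the market report) picks up, reading one consolidated structure instead of five loose results.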
The Dependency Graph
Before spawning any agents, map the dependency graph. This is the step most builders skip, and it is the source of most fleet failures at merge time.
A dependency graph has two types of edges:
- Sequential edge (A → B): B cannot start until A completes. B requires A's output as input.
- Independence (A || B): A and B can run simultaneously. They share no inputs or outputs with each other.
Mapping the graph:
- List all subtasks.
- For each pair, ask: "Does Task X need any output from Task Y to begin?" If yes, draw a sequential edge. If no, they are independent.
- Identify the critical path — the longest sequential chain. That chain determines your minimum fleet execution time.
- Parallelize everything that is not on a sequential edge.
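The mapping steps above reduce to a longest-path computation over the dependency graph. A minimal sketch using the standard library's `graphlib`, with an assumed four-task graph and illustrative durations in minutes:

```python
from graphlib import TopologicalSorter

# Sequential edges: task -> set of prerequisites. Durations are illustrative.
deps = {
    "research": set(),
    "write":    {"research"},
    "image":    {"research"},   # independent of "write": can run in parallel
    "pr":       {"write", "image"},
}
duration = {"research": 10, "write": 20, "image": 5, "pr": 2}


def critical_path_length(deps: dict, duration: dict) -> int:
    """Longest sequential chain = minimum fleet execution time."""
    finish = {}
    for task in TopologicalSorter(deps).static_order():
        start = max((finish[p] for p in deps[task]), default=0)
        finish[task] = start + duration[task]
    return max(finish.values())


assert critical_path_length(deps, duration) == 32  # research -> write -> pr
```

Here "image" runs off the critical path: shaving its 5 minutes changes nothing, while shaving the 20-minute write stage shortens the whole run.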
The mistake I see consistently: builders assume independence without checking. "Research Topic A" and "Research Topic B" look independent. They are, unless Topic B's research requires confirming something discovered in Topic A's research. That makes them sequential. Map first, spawn second.
Handoff Contracts
A handoff contract is the interface between two stages. It defines exactly what the producing stage outputs and exactly what the consuming stage expects as input. Both sides of the contract must agree before a single agent is spawned.
The contract has three components:
- Schema: the exact structure of the output (JSON schema, markdown template, file naming convention)
- Validation: how the consuming stage verifies the output is complete and correctly formed
- Failure protocol: what happens if the output is malformed or missing
In the blog-autopilot pipeline, the handoff contract between research and write:
- Schema: `research.json` with fields `summary`, `sources[]`, `key_claims[]`, `angle`
- Validation: the write agent checks for all required fields before proceeding. Missing fields → alert and halt, do not attempt to write with incomplete context
- Failure protocol: if research fails, the write stage does not spawn. The orchestrator logs the failure and notifies via Discord
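The validation side of that contract can be a short gate function. The field types here are assumptions (the source specifies field names but not types), and the alert/halt behavior is modeled as a raised exception the orchestrator would catch and report:

```python
import json

# Assumed types for the contract's required fields.
REQUIRED_FIELDS = {"summary": str, "sources": list, "key_claims": list, "angle": str}


def validate_research_handoff(path: str) -> dict:
    """Consuming-stage check: halt on any missing or mistyped field."""
    with open(path) as f:
        payload = json.load(f)
    bad = [k for k, t in REQUIRED_FIELDS.items()
           if k not in payload or not isinstance(payload[k], t)]
    if bad:
        # Failure protocol: do not spawn the write stage with incomplete context.
        raise ValueError(f"handoff contract violated, bad fields: {bad}")
    return payload
```

The write stage calls this before doing anything else; a `ValueError` here is what triggers the log-and-notify path instead of a write attempt.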
The Apollo Pipeline as a Decomposition Walkthrough
The Apollo daily video pipeline — research → script → voice synthesis → video assembly → Discord notification — is a decomposition exercise with real production constraints.
Top-level intent: "Produce a daily YouTube video for @TheCodeWhispererKnox."
Decomposed:
1. Research (independent): gather trending AI/engineering topics via web search. Output: `topics.json`
2. Script (sequential, depends on 1): write a 5-minute video script for the selected topic. Output: `script.md`
3. Voice (sequential, depends on 2): send script to ElevenLabs API. Output: `audio.mp3`
4. Visuals (parallel with 3 after 2 completes): generate thumbnail image with Gemini/Leonardo. Output: `thumbnail.png`
5. Assembly (sequential, depends on 3+4): HeyGen assembles video with digital twin. Output: video URL
6. Notify (sequential, depends on 5): Discord notification with video link
Steps 3 and 4 are parallel — audio synthesis and image generation share no dependencies after the script exists. The critical path runs research → script → the longer of voice/visuals → assembly → notify; the split at steps 3 and 4 is the pipeline's only opportunity to save wall-clock time.
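The stage ordering can be sketched with `asyncio`. Each stub below stands in for a real agent or API call (ElevenLabs, image generation, HeyGen, Discord are not invoked here); the point is the structure: sequential awaits for the chained stages, one `gather` for the parallel pair.

```python
import asyncio


async def stage(name: str) -> str:
    await asyncio.sleep(0)  # placeholder for real async work
    return f"{name} done"


async def apollo_pipeline() -> list[str]:
    log = []
    log.append(await stage("research"))                            # 1
    log.append(await stage("script"))                              # 2, needs 1
    # 3 and 4 share no dependencies once the script exists: run concurrently.
    log.extend(await asyncio.gather(stage("voice"), stage("visuals")))
    log.append(await stage("assembly"))                            # 5, needs 3 and 4
    log.append(await stage("notify"))                              # 6, needs 5
    return log


stages = asyncio.run(apollo_pipeline())
```

`asyncio.gather` is the fan-in point: assembly cannot start until both the audio and the thumbnail exist, so the await blocks on whichever finishes last.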
The broader lesson of decomposition: map where your workflow has abundance (independent work that can parallelize) and where it is constrained (sequential dependencies that force serialization). Optimize the constrained path. Parallelize the abundant path.
Lesson 66 Drill
Take a workflow you run manually or want to automate. Write out every step as a discrete subtask. For each pair of subtasks, answer: does step B need step A's output to begin?
Draw the dependency graph. Identify the critical path. Identify the parallel opportunities.
For each subtask, write the handoff contract: what does this subtask produce, in what format, verified how?
That graph and those contracts are your fleet blueprint. Building the agents comes after.
Bottom Line
Decomposition before dispatch. Every time.
The orchestrator's job begins before the first agent spawns: map the work, identify dependencies, define handoff contracts, mark the critical path. Agents execute decompositions — they do not create them.
Get the decomposition right and the fleet runs predictably. Get it wrong and you find out at merge time, when the damage is already done.