ASK KNOX
LESSON 123

The Agent SDK Loop — stop_reason, Tool Results, and Agentic Control Flow

The agentic loop is the heartbeat of every Claude agent. stop_reason is tool_use? Keep going. stop_reason is end_turn? You are done. Get this wrong and everything downstream fails.

12 min read · CCA Certification Prep

There is a single concept that separates functional agentic systems from broken ones. It is not model selection. It is not prompt engineering. It is not the choice of framework. It is whether your code correctly handles what comes back from the API.

The Claude API returns a stop_reason field with every response. When that field says "tool_use", the model is requesting that you execute a tool and give it the result. When it says "end_turn", the model is done. Your loop continues or terminates based entirely on this value. Nothing else. Not the text content. Not an iteration counter. Not a keyword you told the model to say when it finishes. The stop_reason field. That is the control signal.

This is Domain 1, Task Statement 1.1 of the CCA exam. It is the foundation of the largest domain, which is 27% of the exam. Get this wrong and the rest of the architecture does not matter.

The Loop, Step by Step

The agentic loop has five steps. They repeat until the model decides it is done.

Step 1: Send the request. You send a messages array and a tools array to the Claude API. The messages array contains the conversation history — user messages, assistant messages, and critically, any prior tool results. The tools array describes what the model can call.

Step 2: Inspect stop_reason. The response comes back with a stop_reason field. If it is "tool_use", the model wants you to execute one or more tools. If it is "end_turn", the model is finished. These are the two values that matter for agentic loops.

Step 3: Execute the tool. When stop_reason is "tool_use", the response content will contain one or more tool_use blocks. Each block has a name (which tool to call), an id (to match the result back), and input (the arguments). Your code executes each tool and collects the results.

Step 4: Append the result. You append the assistant's response to the messages array (including the tool_use blocks), then append a tool_result message for each tool call. The tool_use_id field links each result back to the request that generated it.

Step 5: Call the API again. With the updated messages array — now containing the tool results — you send another request. The model sees what the tools returned and decides what to do next: call another tool, call multiple tools, or finish.

The Code

Here is the minimal agentic loop in TypeScript. This is not pseudocode. This is the structure that every production agent uses, regardless of how many layers of abstraction sit on top.

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

// Define your tools
const tools: Anthropic.Tool[] = [
  {
    name: "get_weather",
    description: "Get the current weather for a given location.",
    input_schema: {
      type: "object" as const,
      properties: {
        location: { type: "string", description: "City and state, e.g. 'San Francisco, CA'" },
      },
      required: ["location"],
    },
  },
];

// Execute a tool call — your implementation goes here
function executeTool(name: string, input: Record<string, unknown>): string {
  if (name === "get_weather") {
    return JSON.stringify({ temp: 62, condition: "foggy", location: input.location });
  }
  return JSON.stringify({ error: "Unknown tool" });
}

// The agentic loop
async function agentLoop(userMessage: string): Promise<string> {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];

  while (true) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 4096,
      tools,
      messages,
    });

    // THE critical check — this is the entire control flow
    if (response.stop_reason === "end_turn") {
      // Model is done — extract and return the text
      // Type predicate so TypeScript narrows to a text block
      const textBlock = response.content.find(
        (b): b is Anthropic.TextBlock => b.type === "text",
      );
      return textBlock ? textBlock.text : "";
    }

    if (response.stop_reason !== "tool_use") {
      // max_tokens or stop_sequence: not handled by this minimal loop
      throw new Error(`Unhandled stop_reason: ${response.stop_reason}`);
    }

    // stop_reason is "tool_use" — execute all requested tools
    // First, append the assistant's full response to messages
    messages.push({ role: "assistant", content: response.content });

    // Then append tool results for each tool_use block
    const toolResults: Anthropic.ToolResultBlockParam[] = [];
    for (const block of response.content) {
      if (block.type === "tool_use") {
        const result = executeTool(block.name, block.input as Record<string, unknown>);
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: result,
        });
      }
    }
    messages.push({ role: "user", content: toolResults });
  }
}

Study this code until you can write it from memory. The structure does not change whether you have one tool or fifty. The while (true) loop. The stop_reason check. The tool execution. The message array append. The next API call. That is it.

stop_reason: The Complete Decision Tree

The API can return four stop_reason values. Two matter for agentic loops. Two are edge cases.

"tool_use" — The model wants tools executed. This is your continue signal. Extract every tool_use block from the response content, execute each one, append the results, and call the API again. Do not skip this step. Do not partially execute. Every tool_use block in the response must get a corresponding tool_result.

"end_turn" — The model is finished. This is your stop signal. Extract the text content from the response and return it. The loop is over. Do not second-guess this decision. Do not check whether "enough" tools were called. The model determined it had enough information to produce a final answer.

"max_tokens" — The response was truncated because it hit the token limit. This is not a normal loop event. It means your max_tokens parameter is too low for the task, or the model is generating an unusually long response. Handle it as an error or implement continuation logic.

"stop_sequence" — A custom stop sequence was triggered. This is rare in agentic loops and only relevant if you explicitly configured stop sequences. Most agentic implementations do not use this.

The Anti-Patterns (CCA Exam Distractors)

The CCA exam guide explicitly calls out three anti-patterns for agentic loop termination. These are the wrong answers that will appear as distractors on the exam. Know them so you can eliminate them immediately.

Anti-pattern 1: keyword detection in text output. This means checking the model's text output for keywords like "DONE", "COMPLETE", or "I have finished." The model's text content is output for the user, not a control signal for your code. The model might use the word "done" in the middle of explaining what it has done so far. It might finish without ever saying "done." The text is unreliable as a control mechanism because it was never designed to be one.

Anti-pattern 2: fixed iteration counts. Setting maxIterations = 5 and breaking when the counter hits the limit. Some tasks need one iteration. Some need fifteen. A hard cap either terminates the loop before the model is done (producing incomplete results) or wastes cycles running the loop when the model has already signaled completion via end_turn. Safety caps are fine as a guardrail — but stop_reason must be the primary termination signal.
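The acceptable version of a cap (a guardrail around a loop that still terminates on stop_reason) can be sketched as follows. Here `callModel` is a hypothetical stand-in for the real API call, so the control flow is visible without the SDK.

```typescript
// Hypothetical response shape, reduced to what the control flow needs
interface LoopResponse {
  stop_reason: "tool_use" | "end_turn";
  text: string;
}

const MAX_ITERATIONS = 20; // guardrail only, never the primary signal

async function guardedLoop(
  callModel: () => Promise<LoopResponse>,
): Promise<string> {
  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const response = await callModel();
    if (response.stop_reason === "end_turn") {
      return response.text; // primary termination: the model's own signal
    }
    // stop_reason is "tool_use": execute tools, append results (elided here)
  }
  // Hitting the cap means something is wrong; surface it, don't silently stop
  throw new Error(`Safety cap hit after ${MAX_ITERATIONS} iterations`);
}
```

The cap never produces a "result" — it raises. end_turn remains the only path to a normal return.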

Anti-pattern 3: treating text content as a completion signal. Claude can return both text and tool_use blocks in the same response. The presence of text does not mean the model is done. The model might explain what it is about to do (text) and then request the tool call (tool_use) in the same response. If your code sees text and stops, it will miss the tool call and return an incomplete result.

The anti-patterns all share the same root cause: they replace the API's explicit control signal with an unreliable proxy. The model already has a clean signal for "I am done" — stop_reason: "end_turn". Every alternative mechanism adds noise, ambiguity, and failure modes.

Tool Results and Conversation History

The second critical concept in Domain 1 is how tool results flow through the conversation. This is not incidental plumbing. It is the mechanism that gives the model the ability to reason about what tools returned and decide what to do next.

When the model returns a tool_use block, your code executes the tool and produces a result. That result must be appended to the messages array as a tool_result block. On the next API call, the model sees the full conversation history including the tool result. This is how the model learns what the tool returned. It does not have access to your tool's output through any other channel. If you do not append the result, the model cannot see it.

// The model's response contains tool_use blocks
// You MUST append them as the assistant's message first
messages.push({ role: "assistant", content: response.content });

// Then append the results as a user message with tool_result blocks
messages.push({
  role: "user",
  content: [
    {
      type: "tool_result",
      tool_use_id: "toolu_abc123",  // matches the tool_use block's id
      content: JSON.stringify({ customer: "Jane", status: "verified" }),
    },
  ],
});

The model reasons about tool results the same way it reasons about any other conversation content. It can compare results from multiple tools. It can decide a tool result is insufficient and call another tool for more information. It can synthesize results from several tools into a final answer. All of this happens because the full conversation history — including tool results — is available in the messages array on every iteration.

Model-Driven vs. Pre-Configured Decisions

The CCA exam draws a sharp line between two approaches to agentic control flow.

Model-driven decisions: The model looks at the conversation context, the available tools, and the task requirements, and decides which tool to call next. This is the default behavior in an agentic loop. The model might call tool A, look at the result, decide it needs tool B, call tool B, then synthesize both results into a final answer. The sequence was not predetermined — the model reasoned its way through it.

Pre-configured sequences: Your code decides which tools to call and in what order. This is prompt chaining — step 1 calls tool A, step 2 calls tool B, step 3 synthesizes. The model does not choose the sequence. Your code does.

Both patterns are valid. The exam tests whether you know when to use which. Model-driven decisions are appropriate when the task is open-ended and the optimal tool sequence depends on intermediate results. Pre-configured sequences are appropriate when the workflow is predictable and must follow a specific order for compliance or correctness reasons.
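A pre-configured sequence reduces to a plain pipeline. In this sketch the step functions are hypothetical stand-ins for individual model or tool calls; the point is that the code, not the model, fixes the order.

```typescript
type ChainStep = (input: string) => Promise<string>;

// Run steps in a fixed, code-determined order: no model-driven branching
async function runChain(input: string, steps: ChainStep[]): Promise<string> {
  let current = input;
  for (const step of steps) {
    current = await step(current); // each output feeds the next step
  }
  return current;
}
```

Each step would typically wrap one Claude call with a single tool. The structure stays this simple because no decision about ordering is ever delegated to the model.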

Subagents and Context Isolation

The Agent SDK provides the Task tool for spawning subagents. This is how a coordinator agent delegates work to specialized agents. The critical concept for the CCA exam: subagents do not inherit the coordinator's conversation history.

When a coordinator spawns a subagent via the Task tool, it must explicitly pass all relevant context in the subagent's prompt. The subagent starts with an empty conversation. If the coordinator has gathered information from previous tool calls that the subagent needs, that information must be included in the subagent's task description.

// Coordinator spawning a subagent — context must be explicit
const subagentTask = {
  description: `Analyze the following customer data and determine refund eligibility.

Customer ID: ${customerId}
Order history: ${JSON.stringify(orderHistory)}
Complaint: ${complaintText}

Return a JSON object with: eligible (boolean), reason (string), amount (number).`,
  allowedTools: ["lookup_refund_policy", "calculate_refund"],
};

The allowedTools field controls what the subagent can access. A synthesis agent should not have web search tools. A search agent should not have refund processing tools. Scoped tool access prevents cross-specialization misuse — one of the key principles the exam tests in Domain 2.
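Scoping can be enforced with a simple lookup before a subagent is spawned. The role names and tool names below are hypothetical, matching the examples in this section.

```typescript
// Hypothetical role-to-tools map; a real system would load this from config
const toolScopes: Record<string, string[]> = {
  search: ["web_search", "lookup_refund_policy"],
  synthesis: ["calculate_refund"],
};

// Check a requested tool against the role's scope before granting access
function isToolAllowed(role: string, tool: string): boolean {
  return (toolScopes[role] ?? []).includes(tool);
}
```

An unknown role gets an empty scope, so the check fails closed rather than open.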

Lesson 123 Drill

Before moving to the next lesson, prove your understanding by building:

  1. Implement a minimal agentic loop in TypeScript (or Python) with two tools — do not use a framework, use the raw API
  2. Add logging that prints stop_reason on every iteration — run a task and observe the sequence of values
  3. Intentionally break the loop by removing the tool_result append step — observe what happens
  4. Add a safety cap of 20 iterations as a guardrail (not primary control) — log when it triggers
  5. Modify the loop to handle max_tokens truncation by continuing the conversation
  6. Spawn a subagent from within the loop — explicitly pass context and verify the subagent cannot see the parent's conversation history

Bottom Line

The agentic loop is twelve lines of logic wrapped around a while (true) and a stop_reason check. It is the simplest important thing in the entire Agent SDK. Its simplicity is why it is easy to overcomplicate and why the CCA exam dedicates 27% of its weight to the domain that starts here.

stop_reason is "tool_use"? Execute the tools, append the results, loop again. stop_reason is "end_turn"? Return the response. That is the entire control flow. Every anti-pattern in agentic architecture comes from adding logic that contradicts or bypasses this signal.

The model drives the loop. Your code executes tools and checks stop_reason. Maintain that separation and the architecture stays clean. Violate it and you are debugging problems that should not exist.