LESSON 125

10 Anti-Patterns That Fail the CCA Exam

Every wrong answer on the CCA exam looks plausible. These 10 anti-patterns are the distractors Anthropic uses to separate practitioners from pretenders. Know them cold.


The CCA exam is not a knowledge test. It is a judgment test. Every question has four options, and three of them look reasonable to someone who has read the documentation but not built the systems. The distractors are not random — they are drawn from the specific anti-patterns that practitioners encounter in real production deployments.

These are the 10 mistakes that appear most frequently as wrong answers. Know what each one looks like, why it fails, and what the correct approach is. The exam does not reward recognition — it rewards the ability to distinguish between plausible and correct.

Anti-Pattern 1: Natural Language Parsing for Loop Termination

What it looks like: Checking whether the assistant's response contains phrases like "task complete," "I'm done," or "here is the final answer" to determine when the agentic loop should stop.

Why it fails: Natural language is probabilistic. The model might say "I've completed the analysis" in the middle of a multi-step task, triggering premature termination. Or it might never produce the expected phrase, causing an infinite loop. Regex and string matching on assistant text are fragile, version-dependent, and untestable.

The correct approach: Use stop_reason from the API response. Continue when stop_reason is "tool_use" (the model wants to call a tool). Terminate when stop_reason is "end_turn" (the model considers the task complete).

// WRONG: parsing natural language
while (!response.text.includes("task complete")) {
  response = await claude.sendMessage(messages);
}

// RIGHT: using stop_reason
while (response.stop_reason === "tool_use") {
  const toolResult = await executeTools(response.tool_calls);
  messages.push(response, toolResult);
  response = await claude.sendMessage(messages);
}
// Loop exits when stop_reason === "end_turn"

Anti-Pattern 2: Arbitrary Iteration Caps

What it looks like: Setting max_iterations = 10 as the primary mechanism to prevent infinite loops, terminating the agent after 10 tool calls regardless of task completion status.

Why it fails: The correct number of iterations depends on the task. A simple lookup takes 2 iterations. A complex research task might take 15. An arbitrary cap either terminates legitimate work early or is set so high it provides no protection.

The correct approach: Use stop_reason === "end_turn" as the primary termination signal. Use iteration caps only as a safety net — a secondary failsafe with a generous limit, not the primary control mechanism.
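A minimal sketch of this arrangement, building on the loop from Anti-Pattern 1. The sendMessage and executeTools functions are hypothetical stand-ins for your client code (here mocked so the sketch runs standalone), not real SDK calls:

```javascript
// Safety net only: generous, and never the primary termination control.
const MAX_ITERATIONS = 50;

// Mock client: requests tool calls twice, then ends the turn.
let calls = 0;
async function sendMessage(messages) {
  calls += 1;
  return calls < 3
    ? { stop_reason: "tool_use", tool_calls: [{ name: "lookup" }] }
    : { stop_reason: "end_turn", text: "Done." };
}
async function executeTools(toolCalls) {
  return { role: "tool", content: "result" };
}

async function runAgentLoop(messages) {
  let response = await sendMessage(messages);
  let iterations = 0;
  // PRIMARY signal: the model says it is done via stop_reason.
  while (response.stop_reason === "tool_use") {
    iterations += 1;
    // SECONDARY failsafe: trips only on a genuine runaway loop.
    if (iterations > MAX_ITERATIONS) {
      throw new Error("Safety net tripped: possible runaway agent loop");
    }
    const toolResult = await executeTools(response.tool_calls);
    messages.push(response, toolResult);
    response = await sendMessage(messages);
  }
  return response; // stop_reason === "end_turn"
}
```

The cap exists to catch pathological behavior, so its exact value matters far less than the fact that normal tasks never hit it.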

Anti-Pattern 3: Prompt-Based Business Rules

What it looks like: Enforcing financial limits, compliance rules, or safety constraints through system prompt instructions: "Never process a refund exceeding $500."

Why it fails: Prompt compliance is probabilistic. A model following instructions 99% of the time will violate the rule 1% of the time. When that rule is "do not process refunds over $500," the 1% failure has financial consequences. Prompt instructions are guidance, not enforcement.

The correct approach: Use Agent SDK hooks for deterministic enforcement.

// WRONG: prompt-based enforcement
const systemPrompt = "Never process refunds over $500.";

// RIGHT: programmatic hook enforcement
const hooks = {
  preToolCall: async (toolCall) => {
    if (toolCall.name === "process_refund"
        && toolCall.input.amount > 500) {
      return {
        blocked: true,
        reason: "Refund exceeds $500 limit",
        redirect: "escalate_to_human"
      };
    }
  }
};

Anti-Pattern 4: Self-Reported Confidence for Escalation

What it looks like: Having the model output a confidence score (1-10) and routing to human review when confidence falls below a threshold.

Why it fails: LLM self-reported confidence is poorly calibrated. The model that incorrectly handles a complex policy exception does not report low confidence — it reports high confidence, because it believes its reasoning is correct. The agent that most needs escalation is the one least likely to report that it does.

The correct approach: Define explicit escalation criteria with few-shot examples. Escalate when: the customer explicitly requests a human, the policy is ambiguous or silent on the request, the agent cannot make meaningful progress after two attempts, or tool results return multiple matches requiring disambiguation.
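Those criteria are deterministic, which means they can live in ordinary code rather than in the model's self-assessment. A sketch, with illustrative field names (userRequestedHuman, policyCoverage, and so on are assumptions, not a real SDK shape):

```javascript
// Explicit, testable escalation criteria instead of a self-reported score.
function shouldEscalate(state) {
  if (state.userRequestedHuman) return true;            // explicit request
  if (state.policyCoverage === "ambiguous"
      || state.policyCoverage === "silent") return true; // policy gap
  if (state.failedAttempts >= 2) return true;            // no meaningful progress
  if (state.toolMatches > 1) return true;                // needs disambiguation
  return false;
}
```

Because each criterion is observable, you can unit-test the escalation path — something a "rate your confidence 1-10" prompt can never offer.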

Anti-Pattern 5: Sentiment = Complexity

What it looks like: Using sentiment analysis to determine escalation decisions. Angry customer equals complex case equals escalation.

Why it fails: Sentiment does not correlate with complexity. An angry customer with a straightforward return needs resolution, not escalation. A polite customer requesting a competitor price match — when the policy only addresses own-site adjustments — needs escalation regardless of their tone. Conflating sentiment with complexity misroutes in both directions.

The correct approach: Acknowledge frustration while offering resolution when the issue is within the agent's capability. Escalate only when the customer reiterates their preference for a human or when the issue genuinely exceeds the agent's authority.
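One way to sketch that routing decision — capability and explicit requests drive the route, while sentiment only shapes the tone of the response. All field names here are illustrative:

```javascript
// Route on authority and explicit request, never on sentiment alone.
function routeTicket(ticket) {
  if (ticket.customerRequestedHuman) return "escalate";
  if (!ticket.withinAgentAuthority) return "escalate"; // e.g. policy is silent
  // Angry but straightforward: acknowledge the frustration, then resolve.
  return ticket.sentiment === "angry" ? "acknowledge_then_resolve" : "resolve";
}
```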

Anti-Pattern 6: Generic Error Messages

What it looks like: Returning "An error occurred" or "search unavailable" when a tool call fails.

Why it fails: Generic errors strip the coordinator of the context it needs to make recovery decisions. Was the failure transient (retry might work) or permanent (retrying is pointless)? Were there partial results worth using? What alternative approaches exist? Without this information, the coordinator either retries blindly or gives up entirely.

The correct approach: Return structured error context.

// WRONG: generic error
return { error: "Operation failed" };

// RIGHT: structured error context
return {
  error: {
    category: "transient",     // transient | validation | permission
    isRetryable: true,
    attempted: "search_web('AI healthcare')",
    partialResults: firstThreeResults,
    alternatives: ["try narrower query", "use cached results"],
    humanMessage: "Web search timed out after 30s"
  }
};

Anti-Pattern 7: Silent Error Suppression

What it looks like: Catching exceptions inside a subagent and returning an empty result set marked as successful. The coordinator sees a successful response with no data and assumes the search found nothing.

Why it fails: A successful query that returns no results is fundamentally different from a failed query. The first means "we looked and found nothing." The second means "we could not look." Silent suppression makes the coordinator treat failures as valid empty results, producing reports that silently omit entire domains without any indication that source material was unavailable.

The correct approach: Distinguish access failures from valid empty results. Propagate errors with context. Include what was attempted and partial results when available.
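A sketch of a subagent wrapper that keeps the two cases distinct. searchIndex is a hypothetical underlying search call; the result shape mirrors the structured error context from Anti-Pattern 6:

```javascript
// "We looked and found nothing" vs. "we could not look" are different results.
async function runSearch(query, searchIndex) {
  try {
    const results = await searchIndex(query);
    // Valid empty result: status is still "ok", results may be [].
    return { status: "ok", query, results };
  } catch (err) {
    // Access failure: propagate context instead of faking success.
    return {
      status: "error",
      query,
      attempted: `searchIndex(${JSON.stringify(query)})`,
      message: err.message,
      isRetryable: err.name === "TimeoutError"
    };
  }
}
```

The coordinator can now tell an empty domain apart from an unavailable one, and decide whether to retry, reroute, or flag the gap in its report.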

Anti-Pattern 8: Tool Overload

What it looks like: Giving a single agent access to 18+ tools because it "might need any of them."

Why it fails: Tool selection reliability degrades as the number of available tools increases. An agent with 18 tools must make more complex selection decisions, leading to misrouting — a synthesis agent with search tools will attempt searches instead of synthesis. Agents with tools outside their specialization tend to misuse them.

The correct approach: Restrict each subagent to 4-5 tools relevant to its role. If a synthesis agent occasionally needs fact verification, give it a scoped verify_fact tool — not the full search toolkit. Route complex cross-role needs through the coordinator.
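A sketch of role-scoped tool allowlists. The tool and role names are illustrative; the point is that each list stays small and matched to the role:

```javascript
// Each subagent sees only the tools for its role, nothing more.
const TOOLS_BY_ROLE = {
  researcher: ["search_web", "fetch_page", "read_pdf", "take_notes"],
  // Scoped verify_fact, not the researcher's full search toolkit.
  synthesizer: ["read_notes", "write_draft", "verify_fact"],
  coordinator: ["dispatch_subagent", "merge_results", "escalate_to_human"]
};

function toolsFor(role) {
  const tools = TOOLS_BY_ROLE[role];
  if (!tools) throw new Error(`Unknown role: ${role}`);
  return tools;
}
```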

Anti-Pattern 9: Same-Session Self-Review

What it looks like: Having the model review its own output within the same conversation session, or using extended thinking as a substitute for independent review.

Why it fails: The model retains its reasoning context from generation. The same assumptions, the same interpretation of requirements, the same blind spots that produced the original output are still active when the model "reviews" it. It is asking the same brain to check its own work without clearing the reasoning that led to the work.

The correct approach: Use a separate, independent Claude instance for review — one that has never seen the reasoning that produced the original output. For code review in CI/CD, use a different session than the one that generated the code.

// WRONG: self-review in same session
const code = await session.generate("Write a function...");
const review = await session.generate("Review the code above.");

// RIGHT: independent review instance
const code = await generatorSession.generate("Write a function...");
const review = await reviewerSession.generate(
  `Review this code for bugs and security issues:\n${code}`
);
// reviewerSession has no access to generatorSession's reasoning

Anti-Pattern 10: Aggregate Accuracy Metrics

What it looks like: Reporting "97% accuracy" across all document types and using this aggregate number to justify automating high-confidence extractions.

Why it fails: Aggregate metrics mask segment-specific failures. Your system might achieve 99% accuracy on invoices and 60% accuracy on handwritten receipts. The 97% aggregate hides the segment where the system is actively unreliable. Automating based on aggregate accuracy means automating the 60% segment alongside the 99% segment.

The correct approach: Analyze accuracy by document type and field segment before reducing human review. Implement stratified random sampling of high-confidence extractions for ongoing error rate measurement. Validate each segment independently meets the automation threshold.
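The segment analysis itself is simple to compute. A sketch, assuming labeled evaluation records of the illustrative shape { docType, correct }:

```javascript
// Per-segment accuracy from a labeled evaluation set.
function accuracyBySegment(records) {
  const bySegment = {};
  for (const { docType, correct } of records) {
    const s = (bySegment[docType] ??= { total: 0, correct: 0 });
    s.total += 1;
    if (correct) s.correct += 1;
  }
  const report = {};
  for (const [docType, s] of Object.entries(bySegment)) {
    report[docType] = s.correct / s.total;
  }
  return report;
}

// Gate automation per segment, never on the aggregate number.
function segmentsReadyForAutomation(report, threshold) {
  return Object.entries(report)
    .filter(([, accuracy]) => accuracy >= threshold)
    .map(([docType]) => docType);
}
```

A set with 90% accuracy on invoices and 60% on receipts can still aggregate to a comfortable-looking 80%; the per-segment report is what keeps the receipts segment from being automated.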

Lesson 125 Drill

For each anti-pattern, verify your systems do not exhibit it:

  1. Search your codebase for any string matching or regex applied to assistant response text for control flow decisions
  2. Check whether your agentic loops use stop_reason as the primary termination signal, not iteration counters
  3. Audit every business rule that appears in system prompts — determine which ones require deterministic enforcement via hooks
  4. Review your escalation logic: does it use explicit criteria or does it rely on sentiment or self-reported confidence?
  5. Inspect your error handling: does every tool return structured error context with category, retryability, and partial results?
  6. Count the tools assigned to each agent — flag any with more than 6

Bottom Line

The CCA exam is a judgment test, not a knowledge test. Every distractor is built from one of these anti-patterns — approaches that look correct if you have only read the documentation, but fail when you have built the systems. Natural language parsing for control flow. Prompts for business rules. Sentiment for escalation. Aggregate metrics for automation decisions.

The correct answers share a common thread: use structured signals over natural language signals. Use deterministic enforcement over probabilistic guidance. Use segment-level measurement over aggregate measurement. Use explicit criteria over inferred criteria.

Every anti-pattern in this lesson is a version of the same mistake: substituting something approximate for something precise. The CCA exam exists to test whether you know the difference.