Context Precision, Information Provenance, and CCA Practice Scenarios
The final 15% of the CCA tests context management — where you put information matters as much as what information you include. Plus: five practice scenarios across all domains.
The research agent synthesized 14 sources into a market analysis. The report was fluent, well-structured, and cited every claim. The client flagged three problems within an hour.
Problem one: a statistic cited as "$12.4 billion" in the report actually read "$12.4 million" in the source. Problem two: two sources disagreed on a growth rate — the synthesis silently picked one without mentioning the conflict. Problem three: one source was from 2022 and another from 2025, but the report treated both as current data.
None of these were model capability failures. They were context management failures. The information was in the context window. The model processed it. But the architecture did not enforce provenance tracking, conflict annotation, or temporal awareness.
Part 1: Context Management
The "Lost in the Middle" Effect
This is not a theoretical concern. It is a measured, replicated phenomenon across every major LLM architecture. When you place information in a long context window, the model's attention is not uniform.
The start of the context gets strong attention — the model processes it first, anchors on it. The end of the context gets strong attention — recency bias, the model's "working memory" is freshest here. The middle of the context — tokens 20K through 40K in a 100K window — is the dead zone. Information placed there is significantly more likely to be missed, misquoted, or silently dropped.
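A structural defense is to control placement when assembling the prompt: task framing and critical facts bracket the bulk reference material so nothing high-priority lands in the dead zone. A minimal sketch in Python (the function, argument names, and ordering policy are illustrative assumptions, not a real API):

```python
def assemble_context(instructions, critical_facts, bulk_documents, recap):
    """Order context so high-priority content avoids the mid-window dead zone.

    Attention is strongest at the start and end of the window, so the
    task framing and key facts bracket the bulk reference material, and
    the key facts are repeated near the end.
    """
    parts = [
        instructions,                                      # start: strong attention
        "KEY FACTS:\n" + "\n".join(critical_facts),
        *bulk_documents,                                   # middle: weakest attention
        "REMINDER OF KEY FACTS:\n" + "\n".join(critical_facts),
        recap,                                             # end: recency-boosted
    ]
    return "\n\n".join(parts)

prompt = assemble_context(
    instructions="Summarize the Q3 results.",
    critical_facts=["Revenue: $1.048B", "YoY growth: 23.7%"],
    bulk_documents=["<full 10-K excerpt>", "<analyst call transcript>"],
    recap="Answer using the key facts above; cite exact figures.",
)
```

The point is not the specific ordering but that ordering is a deliberate decision, made in code, rather than whatever sequence the retrieval step happened to produce.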
Progressive Summarization Risks
Every summarization pass loses detail. This is not a failure — it is the nature of compression. The problem is when the loss is invisible.
Pass 1: "Revenue grew 23.7% year-over-year in Q3 2025, from $847M to $1.048B."
Pass 2: "Revenue grew approximately 24% year-over-year."
Pass 3: "Revenue showed strong growth."
Each pass is individually reasonable. The chain produces information destruction. Specific numbers become approximations become qualitative judgments. Dates vanish. Source attribution disappears. By the time the synthesis agent processes the output, it is working with ghosts of the original data.
You cannot synthesize what you have already lost. The defense against progressive summarization is structural: extract transactional facts — amounts, dates, order numbers, percentages — into a persistent structured block that bypasses the summarization pipeline entirely.
// Anti-pattern: summarizing everything through the same pipeline
const summary = await summarize(longDocument); // loses specific numbers
// Correct: extract structured facts separately
const facts = await extractFacts(longDocument);
// { revenue_q3: 1048000000, yoy_growth: 0.237, period: "Q3 2025" }
const narrative = await summarize(longDocument);
// "Revenue showed strong year-over-year growth..."
// Synthesis gets BOTH: structured facts + narrative context
const synthesis = await synthesize({
  facts,
  narrative,
  source_url: longDocument.url,
  pub_date: longDocument.publishedAt,
});
Context Extraction and Scratchpad Files
When a tool returns 40+ fields per record and you only need 5, those 35 irrelevant fields accumulate in context. After 10 tool calls, you have consumed thousands of tokens on data that actively harms performance by diluting attention away from relevant information. The fix is structural: filter tool results down to the fields the agent actually needs before they enter context, and spill the full records to a scratchpad file the agent can re-read on demand.
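A sketch of this pattern, assuming a hypothetical tool result and file layout (the field names and `RELEVANT_FIELDS` set are illustrative):

```python
import json
from pathlib import Path

# Illustrative: the 5 fields this agent's role actually requires
RELEVANT_FIELDS = {"order_id", "status", "total", "customer_id", "created_at"}

def compact_for_context(record: dict, scratch_dir: Path) -> dict:
    """Keep only relevant fields in context; spill the full record to disk."""
    scratch_dir.mkdir(parents=True, exist_ok=True)
    path = scratch_dir / f"record_{record['order_id']}.json"
    path.write_text(json.dumps(record))            # full 40+ fields, on disk
    slim = {k: v for k, v in record.items() if k in RELEVANT_FIELDS}
    slim["_full_record"] = str(path)               # pointer, not payload
    return slim
```

The agent's context now carries five fields plus a file path per record; if a rare field turns out to matter later, a single tool call can reload the full record from the scratchpad.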
Part 2: Information Provenance
Every claim in a synthesis output must chain back to a verifiable source. This is not a nice-to-have. It is a CCA-tested requirement and a production necessity.
Claim-Source Mappings
The architecture requires subagents to output structured claim-source mappings — not just text summaries. Each finding carries:
- The claim — what is being asserted
- The evidence excerpt — the specific text from the source that supports the claim
- The source URL — where the evidence came from
- The publication date — when the source was published or data was collected
from dataclasses import dataclass

@dataclass
class ClaimMapping:
    claim: str
    evidence_excerpt: str
    source_url: str
    publication_date: str
    confidence: str  # "direct_quote" | "paraphrase" | "inference"

# The synthesis agent MUST preserve these mappings
# Not: "The market is growing rapidly"
# But: ClaimMapping(
#     claim="AI agent market projected to reach $65B by 2028",
#     evidence_excerpt="valued at $65.5 billion by 2028, CAGR of 43.8%",
#     source_url="https://example.com/ai-agents-report",
#     publication_date="2025-11-14",
#     confidence="direct_quote"
# )
Temporal Data Handling
Data has a timestamp. A market size from 2022 and a market size from 2025 are not conflicting data points — they are sequential data points. Without dates, the synthesis agent cannot distinguish temporal progression from factual disagreement.
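This distinction can be made mechanically, assuming each data point carries an ISO date string (the field names and the same-period rule are illustrative assumptions): values from different periods form a time series, while differing values for the same period are a genuine conflict.

```python
def classify_data_points(points):
    """Distinguish temporal progression from factual disagreement.

    points: list of dicts with 'value', 'as_of' (ISO date string), 'source'.
    Values from different periods are a series; differing values for the
    same period are flagged as a conflict.
    """
    ordered = sorted(points, key=lambda p: p["as_of"])
    by_period = {}
    for p in ordered:
        by_period.setdefault(p["as_of"], []).append(p)
    conflicts = {d: ps for d, ps in by_period.items()
                 if len({p["value"] for p in ps}) > 1}
    return {"series": ordered, "conflicts": conflicts}
```

With this in place, a 2022 market size and a 2025 market size come back as a two-point series with no conflicts, which is exactly the framing the synthesis agent needs.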
Conflict Annotation
When sources disagree, the correct behavior is preservation with attribution — not silent selection.
// Wrong: silently pick one value
{ market_size: "$47.3B" }

// Correct: preserve both with attribution
{
  market_size: {
    values: [
      { amount: "$47.3B", source: "Gartner Q2 2025", date: "2025-07" },
      { amount: "$52.1B", source: "McKinsey Annual 2025", date: "2025-11" }
    ],
    conflict_detected: true,
    likely_explanation: "Different measurement methodologies and timeframes"
  }
}
Coverage Gap Reporting
Equally important: explicitly noting what the research did NOT find. If three of five research questions returned no relevant sources, the synthesis should say so — not silently produce a report that appears comprehensive because it only covers what was found.
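Gap reporting can also be produced mechanically rather than left to the model's discretion, assuming research questions and findings are tracked as structured data (the shapes below are illustrative):

```python
def coverage_report(questions, findings):
    """Report which research questions were answered and which were not.

    questions: list of question IDs.
    findings: list of dicts, each carrying the 'question_id' it addresses.
    """
    answered = {f["question_id"] for f in findings}
    gaps = [q for q in questions if q not in answered]
    return {
        "answered": sorted(answered),
        "gaps": gaps,
        "coverage": f"{len(answered)}/{len(questions)} questions answered",
    }
```

Feeding the explicit `gaps` list into the synthesis prompt forces the report to acknowledge what was not found, instead of appearing comprehensive by only covering what was.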
Part 3: CCA Practice Scenarios
Five scenario-based questions, one per domain. These mirror the CCA exam format: a realistic production context, one correct answer, and three distractors that each represent a common misconception.
Domain 1 — Agentic Architecture (27%)
Scenario: Your customer support agent skips the identity verification step in 8% of cases, proceeding directly to order lookup using the customer's stated name.
Question: What change most effectively addresses this reliability issue?
A) Add stronger wording to the system prompt emphasizing verification is mandatory.
B) Add few-shot examples showing the agent always calling get_customer first.
C) Implement a programmatic prerequisite that blocks lookup_order until get_customer returns a verified ID.
D) Reduce the agent's temperature to decrease randomness in tool selection.
Domain 2 — Tool Design & MCP (18%)
Scenario: Your multi-agent system has a synthesis agent with access to 18 tools including web search, database queries, and file operations. The agent frequently misuses tools outside its specialization.
Question: What is the most effective change?
A) Add detailed "do not use" instructions for each irrelevant tool in the system prompt.
B) Restrict the synthesis agent's tool set to only the 4-5 tools relevant to its role.
C) Implement a routing classifier that approves each tool call before execution.
D) Increase the model size to improve tool selection accuracy.
Domain 3 — Claude Code Workflows (20%)
Scenario: You need to add a code review step to your CI/CD pipeline. The review should produce structured JSON findings that are posted as inline PR comments.
Question: Which approach is correct?
A) Run Claude Code interactively and parse its terminal output into JSON.
B) Use -p flag with --output-format json and --json-schema to produce structured findings.
C) Have Claude Code write findings to a file, then parse the file in a subsequent CI step.
D) Use the Assistants API instead of Claude Code for CI integration.
Domain 4 — Prompt Engineering & Structured Output (20%)
Scenario: Your extraction pipeline processes invoices. 15% of invoices lack a tax amount field. When the schema marks tax_amount as required, the model fabricates plausible tax values.
Question: What schema change addresses this?
A) Add a prompt instruction saying "do not fabricate values."
B) Make tax_amount nullable so the model can return null when the field is absent.
C) Remove tax_amount from the schema entirely.
D) Add a validation step that checks tax amounts against a lookup table.
Domain 5 — Context Management & Reliability (15%)
Scenario: Your research agent synthesizes findings from 14 sources into a report. Two sources report conflicting growth rates for the same metric: 23% (from a 2024 report) and 31% (from a 2025 report).
Question: What is the correct synthesis behavior?
A) Use the more recent value (31%) as it is more current.
B) Average the two values (27%) to produce a balanced estimate.
C) Preserve both values with source attribution and publication dates, flagging the conflict.
D) Exclude both values since they conflict, and note that the data is unreliable.
Lesson 129 Drill
Run through these five exercises to pressure-test your CCA readiness:
- Take a 10-page document and split it into three context positions (start, middle, end). Ask the same extraction question for each position. Measure which position produces the most accurate extraction.
- Build a claim-source mapping for any research output you have produced. Can every claim chain back to a dated, verifiable source?
- Find two sources that disagree on a statistic. Write the synthesis that preserves both with attribution. Then write the synthesis that silently picks one. Compare what is lost.
- Write a system prompt for a customer support agent with a programmatic prerequisite for identity verification. Then write one that relies only on prompt instructions. Test both against adversarial inputs.
- Review your own code review prompts. Do they rely on same-session self-review? Redesign them to use independent instances.
Bottom Line
Context management is not about fitting more tokens into the window. It is about positioning information where the model attends to it, preserving specific facts through summarization pipelines, maintaining provenance chains from claim to source, and annotating conflicts instead of silently resolving them.
The CCA tests these as architectural decisions, not prompt tricks. The model that processed 14 sources and silently dropped the conflict between two growth rates did not fail because it was incapable. It failed because the architecture did not require conflict annotation. The model that rounded "$47.3 billion" to "approximately $50B" did not fail because of a hallucination. It failed because the pipeline allowed progressive summarization to compress specific data into vague qualitative language.
Build the architecture that prevents these failures. The CCA tests whether you know how.