ASK KNOX
LESSON 246

CEO Triage & Authority Delegation

Structured reports, the deterministic rules engine that replaces LLM judgment in routing, auto-resolve vs escalate decision thresholds, authority ceilings that define what each tier can do, and exactly when to involve the human.

In a well-wired organization, work flows through a hierarchy. Specialist agents handle their domains. Teams execute defined skills. But above all of this sits the coordination layer that decides what gets worked on, in what order, by whom, and what gets escalated to the human.

This is the CEO agent: not a decision-maker about business strategy, but a triage and delegation engine that keeps the org running efficiently and escalates appropriately.

The CEO Agent's Job

The CEO agent is not a general-purpose AI. It is a triage system with one job: receive incoming information, apply deterministic rules, and produce one of three outcomes:

  1. Auto-resolve — the situation is understood, the response is standard, dispatch to the appropriate specialist
  2. Escalate internally — the situation requires higher-authority handling within the agent org
  3. Involve the human — the situation exceeds the org's authority ceiling or requires judgment the org cannot provide

Every incoming item — a specialist report, an alert, a scheduled event, a human directive — passes through this triage. The CEO agent does not execute work. It routes work.

Structured Reports: The Input to Triage

For triage to work, reports from specialists must be structured. A free-form status update that requires reading comprehension to interpret is not triageable at scale.

The structured report format:

from dataclasses import dataclass
from datetime import datetime


# Finding is defined first so that AgentReport's list[Finding] annotation resolves.
@dataclass
class Finding:
    severity: str          # "info", "warning", "error", "critical"
    category: str          # "performance", "security", "correctness", "cost", "risk"
    description: str
    impact: str
    suggested_action: str


@dataclass
class AgentReport:
    # Identity
    agent_id: str
    session_id: str
    report_type: str       # "status", "alert", "completion", "anomaly"
    timestamp: datetime

    # Summary (for fast scanning)
    headline: str          # 1 sentence max
    status: str            # "ok", "warning", "error", "critical"
    confidence: float      # 0.0 - 1.0

    # Findings
    findings: list[Finding]  # categorized by severity

    # Recommended action
    recommendation: str    # what the agent thinks should happen
    auto_resolvable: bool  # agent's own assessment that no human is needed
    blast_radius: str      # "none", "single-agent", "team", "org-wide"

    # Supporting data
    evidence: dict         # key data that supports the findings
    session_log_url: str = ""  # link to full session audit log (empty if not yet available)

When a specialist completes work or encounters an issue, it files a structured report:

# Trading agent detects an anomaly
report = AgentReport(
    agent_id="trading-agent-01",
    session_id=session.id,
    report_type="anomaly",
    timestamp=datetime.now(timezone.utc),  # utcnow() is deprecated since Python 3.12
    headline="Foresight open position count 43% above baseline",
    status="warning",
    confidence=0.91,
    findings=[
        Finding(
            severity="warning",
            category="risk",
            description="Current open positions: 18. Baseline: 12-13.",
            impact="Increased capital exposure, reduced ability to enter new positions",
            suggested_action="Review and close 3-5 lowest-conviction positions"
        )
    ],
    recommendation="Review open positions within 2 hours",
    auto_resolvable=False,
    blast_radius="single-agent",  # must be one of the values AgentReport documents
    evidence={"open_positions": 18, "baseline": 12.6, "deviation_pct": 42.9}
)

await broker.route(Message(
    sender_id="trading-agent-01",
    type="report",
    payload=report
))

The structure makes triage fast: the CEO agent reads status, confidence, auto_resolvable, and blast_radius to make the routing decision in milliseconds, without reading the full findings.
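Because routing keys on exact string values, a report whose fields fall outside the documented vocabularies (a blast radius such as "single-portfolio", say) would silently miss every rule that tests that field. The following is a minimal sketch of a pre-triage check, not part of the lesson's platform: the function name `validate_report_fields` and the use of a plain dict in place of `AgentReport` are assumptions made to keep the example self-contained.

```python
from typing import Any

# Allowed vocabularies, copied from the field comments on AgentReport.
ALLOWED_STATUS = {"ok", "warning", "error", "critical"}
ALLOWED_REPORT_TYPES = {"status", "alert", "completion", "anomaly"}
ALLOWED_BLAST_RADIUS = {"none", "single-agent", "team", "org-wide"}


def validate_report_fields(fields: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the report is triageable."""
    problems: list[str] = []
    if fields.get("status") not in ALLOWED_STATUS:
        problems.append(f"status: unknown value {fields.get('status')!r}")
    if fields.get("report_type") not in ALLOWED_REPORT_TYPES:
        problems.append(f"report_type: unknown value {fields.get('report_type')!r}")
    if fields.get("blast_radius") not in ALLOWED_BLAST_RADIUS:
        problems.append(f"blast_radius: unknown value {fields.get('blast_radius')!r}")
    confidence = fields.get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0.0 <= confidence <= 1.0:
        problems.append(f"confidence: expected a number in [0.0, 1.0], got {confidence!r}")
    return problems


problems = validate_report_fields({
    "status": "warning",
    "report_type": "anomaly",
    "blast_radius": "single-portfolio",  # not in the documented vocabulary
    "confidence": 0.91,
})
print(problems)
```

A report that fails this check could be sent straight to the fallback path rather than evaluated against rules it cannot match.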

The Deterministic Rules Engine

The triage rules engine applies a priority-ordered set of rules to each incoming report. No LLM calls. Every decision is a rule match.

class TriageEngine:
    """
    Deterministic triage — 12 rules, evaluated in priority order.
    No LLM reasoning in the routing hot path.
    """

    RULES: list[TriageRule] = [
        # Rule 1: Critical + high confidence → immediate human escalation
        TriageRule(
            id="R01",
            condition=lambda r: (
                r.status == "critical" and r.confidence >= 0.85
            ),
            action="escalate_human_immediate",
            priority=1
        ),

        # Rule 2: Critical + low confidence → escalate for investigation
        TriageRule(
            id="R02",
            condition=lambda r: (
                r.status == "critical" and r.confidence < 0.85
            ),
            action="escalate_human_investigate",
            priority=2
        ),

        # Rule 3: Org-wide blast radius → always involve human
        TriageRule(
            id="R03",
            condition=lambda r: r.blast_radius == "org-wide",
            action="escalate_human_immediate",
            priority=3
        ),

        # Rule 4: Warning + auto-resolvable + high confidence → auto-resolve
        TriageRule(
            id="R04",
            condition=lambda r: (
                r.status == "warning"
                and r.auto_resolvable
                and r.confidence >= 0.90
                and r.blast_radius in ("none", "single-agent")
            ),
            action="auto_resolve",
            priority=4
        ),

        # Rule 5: Completion report → acknowledge and archive
        TriageRule(
            id="R05",
            condition=lambda r: r.report_type == "completion",
            action="acknowledge_and_archive",
            priority=5
        ),

        # Rule 6: Warning + not auto-resolvable → route to VP level
        TriageRule(
            id="R06",
            condition=lambda r: (
                r.status == "warning" and not r.auto_resolvable
            ),
            action="escalate_vp",
            priority=6
        ),

        # Rules 7-12: cost threshold, anomaly patterns, etc.
    ]

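    def triage(self, report: AgentReport) -> TriageDecision:
        # Assumes `from datetime import datetime, timezone` at module level.
        # First matching rule wins, so evaluate in ascending priority order.
        for rule in sorted(self.RULES, key=lambda r: r.priority):
            if rule.condition(report):
                return TriageDecision(
                    action=rule.action,
                    rule_applied=rule.id,
                    report=report,
                    timestamp=datetime.now(timezone.utc),
                )
        # No rule matched: fail safe by sending the report to a human.
        return TriageDecision(
            action="escalate_human_investigate",
            rule_applied="R00-fallback",
            report=report,
            timestamp=datetime.now(timezone.utc),
        )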
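The lesson does not show `TriageRule` or `TriageDecision`. The sketch below assumes simple dataclass shapes for both and uses a hypothetical `MiniReport` stand-in that carries only the fields the rules read. It includes three of the rules above (R01, R04, R06) to show first-match-wins evaluation and the fallback; it is an illustration under those assumptions, not the platform's actual engine.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class MiniReport:
    """Hypothetical stand-in holding only the fields the rules inspect."""
    status: str
    confidence: float
    auto_resolvable: bool
    blast_radius: str
    report_type: str = "status"


@dataclass(frozen=True)
class TriageRule:
    id: str
    condition: Callable[[MiniReport], bool]
    action: str
    priority: int


@dataclass
class TriageDecision:
    action: str
    rule_applied: str
    report: MiniReport
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


# Deliberately listed out of order: evaluation order comes from `priority`.
RULES = [
    TriageRule("R06", lambda r: r.status == "warning" and not r.auto_resolvable,
               "escalate_vp", 6),
    TriageRule("R01", lambda r: r.status == "critical" and r.confidence >= 0.85,
               "escalate_human_immediate", 1),
    TriageRule("R04", lambda r: (r.status == "warning" and r.auto_resolvable
                                 and r.confidence >= 0.90
                                 and r.blast_radius in ("none", "single-agent")),
               "auto_resolve", 4),
]
ORDERED_RULES = sorted(RULES, key=lambda rule: rule.priority)  # sort once, not per call


def triage(report: MiniReport) -> TriageDecision:
    for rule in ORDERED_RULES:
        if rule.condition(report):
            return TriageDecision(rule.action, rule.id, report)
    # Nothing matched: fail safe and hand the report to a human.
    return TriageDecision("escalate_human_investigate", "R00-fallback", report)


anomaly = MiniReport(status="warning", confidence=0.91, auto_resolvable=False,
                     blast_radius="single-agent", report_type="anomaly")
decision = triage(anomaly)
print(decision.rule_applied, decision.action)  # R06 escalate_vp
```

Recording `rule_applied` on every decision is what makes the routing auditable: any outcome can be traced back to a single rule ID.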

Authority Ceilings: What Each Tier Can Do

Authority is not binary. It is a four-tier hierarchy that defines what each agent can do without escalation.

AUTHORITY_TIERS = {
    1: {
        "name": "Read-Only",
        "permitted": [
            "read_any_data",
            "query_memory",
            "generate_reports",
            "send_alerts",
        ],
        "forbidden": ["write_any_data", "execute_any_action"],
    },
    2: {
        "name": "Write",
        "permitted": [
            # All of Tier 1
            "write_code",
            "commit_to_feature_branch",
            "create_pr",
            "run_tests",
            "schedule_content",
            "update_config",
        ],
        "forbidden": [
            "deploy_to_production",
            "merge_to_main",
            "modify_production_database",
            "change_strategy_params",
        ],
    },
    3: {
        "name": "Deploy",
        "permitted": [
            # All of Tier 2
            "deploy_to_staging",
            "run_migrations_staging",
            "approve_pr",
            "merge_to_main_with_ci_pass",
        ],
        "forbidden": [
            "deploy_to_production",
            "modify_production_database",
            "change_live_financial_params",
        ],
    },
    4: {
        "name": "Production",
        "permitted": [
            # All of Tier 3
            "deploy_to_production",
            "run_migrations_production",
            "modify_live_params",
        ],
        # Hard blocks: no tier can do these
        "forbidden": [
            "wallet.private_key",
            "database.production.drop",
            "foresight.shared_modules.modify",
        ],
    },
}
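The "All of Tier N" comments imply that each tier inherits the permissions below it, but the lists themselves are not cumulative. One way to make the inheritance explicit is sketched here. The helper name `effective_permissions` is hypothetical, and the table is a trimmed copy with only a couple of entries per tier so the example stays short.

```python
# Trimmed copy of the tier table: a few "permitted" entries per tier only.
TIER_PERMITTED: dict[int, set[str]] = {
    1: {"read_any_data", "generate_reports"},
    2: {"write_code", "create_pr"},
    3: {"deploy_to_staging", "approve_pr"},
    4: {"deploy_to_production"},
}


def effective_permissions(tier: int) -> set[str]:
    """Union of the permitted actions of this tier and every tier below it."""
    if tier not in TIER_PERMITTED:
        raise ValueError(f"unknown authority tier: {tier}")
    allowed: set[str] = set()
    for level in range(1, tier + 1):
        allowed |= TIER_PERMITTED[level]
    return allowed


print(sorted(effective_permissions(2)))
# ['create_pr', 'generate_reports', 'read_any_data', 'write_code']
```

Treating the result as an allow-list means an action that appears in no tier at all is denied by default.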

An agent's authority ceiling is its tier plus its requires_approval list. Even a Tier 3 agent escalates before merging code that touches trading infrastructure — not because it lacks the technical authority, but because the blast radius warrants human review.

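# Hard blocks are the entries in the Tier 4 "forbidden" list: no tier can perform them.
HARD_BLOCKS = tuple(AUTHORITY_TIERS[4]["forbidden"])


def check_authority(agent: AgentCard, action: str) -> AuthorityDecision:
    # Hard blocks: no authority overrides
    if any(action.startswith(block) for block in HARD_BLOCKS):
        return AuthorityDecision(
            permitted=False,
            reason="hard-block",
            escalation_required=True
        )

    # Requires approval: escalate before executing
    if action in agent.requires_approval:
        return AuthorityDecision(
            permitted=False,
            reason="requires-approval",
            escalation_required=True
        )

    # Tier check (allow-list): the action must be granted by the agent's tier
    # or a lower one, and must not be forbidden at the agent's own tier.
    # Checking only "forbidden" would let a Tier 1 agent run any unlisted action.
    granted = {
        permitted_action
        for level, spec in AUTHORITY_TIERS.items()
        if level <= agent.authority_tier
        for permitted_action in spec["permitted"]
    }
    forbidden = AUTHORITY_TIERS[agent.authority_tier]["forbidden"]
    if action in forbidden or action not in granted:
        return AuthorityDecision(
            permitted=False,
            reason="tier-ceiling",
            escalation_required=True
        )

    return AuthorityDecision(permitted=True)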

Auto-Resolve vs Escalate

The auto-resolve vs escalate threshold is the most consequential calibration decision in an agent operations platform. Set it too aggressively and you get autonomous operation that surprises you. Set it too conservatively and you get a notification firehose that demands constant human attention.

The conservative default:

def should_auto_resolve(report: AgentReport, org_context: OrgContext) -> bool:
    """
    Returns True only when ALL conditions are met.
    Conservative by design: when in doubt, escalate.

    Note: this reads report.category, report.involves_financial_action and
    report.pattern_id, which the AgentReport dataclass shown earlier does not
    declare; they are assumed to be present on the report.
    """
    # Confidence must be very high (at least 0.90)
    if report.confidence < 0.90:
        return False

    # Blast radius must be narrow
    if report.blast_radius not in ("none", "single-agent"):
        return False

    # Must have a known resolution pattern
    if report.category not in org_context.known_resolutions:
        return False

    # Must not involve financial action
    if report.involves_financial_action:
        return False

    # The same pattern must have been resolved at least 3 times before
    prior_resolutions = org_context.resolution_history.count(
        category=report.category,
        pattern=report.pattern_id
    )
    if prior_resolutions < 3:
        return False

    return True

The five conditions before auto-resolving anything:

  1. Confidence of at least 0.90 (not 0.70, not 0.80)
  2. Blast radius is narrow (none or a single agent, not team or org-wide)
  3. The report's category has a known resolution pattern
  4. No financial action involved
  5. The same resolution has been validated at least 3 times before
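The fifth condition depends on some record of past resolutions that can be counted by category and pattern. The lesson does not show that store, so the class below is a hypothetical in-memory sketch of what `resolution_history.count(category=..., pattern=...)` might sit on top of; the `record` method and the example category and pattern names are assumptions.

```python
from collections import Counter


class ResolutionHistory:
    """In-memory tally of validated resolutions per (category, pattern) pair."""

    def __init__(self) -> None:
        self._validated: Counter = Counter()

    def record(self, category: str, pattern: str) -> None:
        """Call once each time a resolution has been confirmed to work."""
        self._validated[(category, pattern)] += 1

    def count(self, category: str, pattern: str) -> int:
        # Counter returns 0 for pairs that were never recorded.
        return self._validated[(category, pattern)]


MIN_PRIOR_RESOLUTIONS = 3  # threshold taken from the list above

history = ResolutionHistory()
for _ in range(2):
    history.record("performance", "queue-backlog")

print(history.count("performance", "queue-backlog") >= MIN_PRIOR_RESOLUTIONS)  # False
history.record("performance", "queue-backlog")
print(history.count("performance", "queue-backlog") >= MIN_PRIOR_RESOLUTIONS)  # True
```

A production version would presumably persist the tallies and count only resolutions that a human or a later health check confirmed, so that a bad fix cannot vouch for itself.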

The asymmetry is deliberate: a false escalation costs the human 5 minutes. A false auto-resolution can cost hours of incident response.
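The asymmetry can be turned into a rough break-even figure. In the sketch below the 5-minute escalation cost comes from the sentence above, while the 3-hour incident cost is an assumed stand-in for "hours"; the function name is hypothetical.

```python
FALSE_ESCALATION_MINUTES = 5         # from the text above
FALSE_AUTO_RESOLVE_MINUTES = 3 * 60  # assumption: "hours" taken as 3 hours


def break_even_error_rate(escalation_cost: float, incident_cost: float) -> float:
    """Auto-resolve error rate at which expected incident cost equals escalation cost.

    Escalating always costs `escalation_cost`; auto-resolving costs
    `incident_cost` only when it is wrong, so the two are equal when
    error_rate * incident_cost == escalation_cost.
    """
    return escalation_cost / incident_cost


rate = break_even_error_rate(FALSE_ESCALATION_MINUTES, FALSE_AUTO_RESOLVE_MINUTES)
print(f"{rate:.1%}")  # 2.8%
```

Under these assumed numbers, auto-resolution only saves time overall if it is wrong less than about 3% of the time, which is why the thresholds lean so far toward escalation.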

When to Involve the Human

The human is not in the loop for everything. They are in the loop for:

HUMAN_ESCALATION_TRIGGERS = [
    "status == 'critical'",
    "blast_radius == 'org-wide'",
    "involves_production_database",
    "involves_financial_transaction_above_threshold",
    "authority_ceiling_reached",
    "first_occurrence_unknown_pattern",
    "confidence < 0.70",
    "agent_health_critical",
    "hard_block_attempted",
]
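The triggers above are written as descriptive strings. To keep them deterministic in the same way as the triage rules, each could be paired with a predicate. This is a sketch under assumptions: the `TRIGGERS` mapping, the `fired_triggers` helper, and the plain-dict "facts" input are hypothetical, and only four of the nine triggers are shown.

```python
from typing import Any, Callable

Facts = dict[str, Any]

# Trigger name (as listed above) -> predicate over a dict of report facts.
TRIGGERS: dict[str, Callable[[Facts], bool]] = {
    "status == 'critical'": lambda f: f.get("status") == "critical",
    "blast_radius == 'org-wide'": lambda f: f.get("blast_radius") == "org-wide",
    # A missing confidence is treated as 0.0, so it escalates (fail safe).
    "confidence < 0.70": lambda f: f.get("confidence", 0.0) < 0.70,
    "hard_block_attempted": lambda f: bool(f.get("hard_block_attempted", False)),
}


def fired_triggers(facts: Facts) -> list[str]:
    """Names of every trigger whose predicate holds for these facts."""
    return [name for name, predicate in TRIGGERS.items() if predicate(facts)]


print(fired_triggers({"status": "warning", "confidence": 0.65, "blast_radius": "team"}))
# ['confidence < 0.70']
```

Returning the names of the triggers that fired, rather than a bare boolean, gives the human the reason for the escalation along with the escalation itself.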

Everything else is handled autonomously. The human receives a summary digest at configured intervals showing everything that was auto-resolved — not to review it, but to maintain situational awareness.

# Daily digest to human operator
async def send_daily_digest(org_state: OrgState) -> None:
    digest = DailyDigest(
        period="last_24h",
        directives_total=org_state.directives_count,
        auto_resolved=org_state.auto_resolved_count,
        escalated_internal=org_state.internal_escalations,
        escalated_human=org_state.human_escalations,
        pending_review=org_state.pending_human_review,
        anomalies=org_state.anomalies_detected,
        cost_total=org_state.total_cost_usd,
    )
    await notify_human(digest)
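`DailyDigest` is not defined in the lesson. A minimal sketch of what it might look like follows, assuming a plain dataclass with the same field names and a hypothetical `render` method that produces a one-line summary; the numbers in the usage example are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class DailyDigest:
    period: str
    directives_total: int
    auto_resolved: int
    escalated_internal: int
    escalated_human: int
    pending_review: int
    anomalies: int
    cost_total: float  # USD

    def render(self) -> str:
        """One-line, scannable summary for the human operator."""
        return (
            f"[{self.period}] directives={self.directives_total} "
            f"auto_resolved={self.auto_resolved} "
            f"internal={self.escalated_internal} "
            f"human={self.escalated_human} "
            f"pending={self.pending_review} "
            f"anomalies={self.anomalies} "
            f"cost=${self.cost_total:.2f}"
        )


# Illustrative values only.
digest = DailyDigest(
    period="last_24h",
    directives_total=42,
    auto_resolved=39,
    escalated_internal=2,
    escalated_human=1,
    pending_review=1,
    anomalies=3,
    cost_total=18.5,
)
print(digest.render())
# [last_24h] directives=42 auto_resolved=39 internal=2 human=1 pending=1 anomalies=3 cost=$18.50
```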

The goal is a system where the human wakes up to a digest, not a firehose. Most days: N items auto-resolved, zero or one escalations, costs within budget. The escalations that do arrive are genuine: the rules engine passed them up rather than auto-resolving them because they actually require human judgment.

Summary

  • The CEO agent is a triage and delegation engine — it routes work, it does not execute work
  • Structured reports enable fast triage: status, confidence, blast radius, and auto_resolvable are the key fields
  • The triage rules engine is deterministic — 12 rules, no LLM in the hot path
  • Authority tiers define what each agent can do autonomously; the requires_approval list adds action-level granularity
  • Auto-resolve thresholds are conservative: confidence ≥ 0.90, narrow blast radius, known pattern, no financial action, 3+ prior validations
  • Human escalation is reserved for genuinely uncertain or high-stakes situations — not everything

What's Next

The final piece of operational safety: what happens when agents go wrong? The next lesson covers behavioral health monitoring — detecting hallucination, doom spirals, stale execution, and the circuit breakers that prevent small failures from becoming large ones.