CEO Triage & Authority Delegation
Structured reports, the deterministic rules engine that replaces LLM judgment in routing, auto-resolve vs escalate decision thresholds, authority ceilings that define what each tier can do, and exactly when to involve the human.
In a well-wired organization, work flows through a hierarchy. Specialist agents handle their domains. Teams execute defined skills. But above all of this sits the coordination layer that decides what gets worked on, in what order, by whom, and what gets escalated to the human.
This is the CEO agent: not a decision-maker about business strategy, but a triage and delegation engine that keeps the org running efficiently and escalates appropriately.
The CEO Agent's Job
The CEO agent is not a general-purpose AI. It is a triage system with one job: receive incoming information, apply deterministic rules, and produce one of three outcomes:
- Auto-resolve — the situation is understood, the response is standard, dispatch to the appropriate specialist
- Escalate internally — the situation requires higher-authority handling within the agent org
- Involve the human — the situation exceeds the org's authority ceiling or requires judgment the org cannot provide
Every incoming item — a specialist report, an alert, a scheduled event, a human directive — passes through this triage. The CEO agent does not execute work. It routes work.
Structured Reports: The Input to Triage
For triage to work, reports from specialists must be structured. A free-form status update that requires reading comprehension to interpret is not triageable at scale.
The structured report format:
```python
from dataclasses import dataclass
from datetime import datetime

# Finding is defined first so AgentReport's annotation can reference it.
@dataclass
class Finding:
    severity: str          # "info", "warning", "error", "critical"
    category: str          # "performance", "security", "correctness", "cost", "risk"
    description: str
    impact: str
    suggested_action: str

@dataclass
class AgentReport:
    # Identity
    agent_id: str
    session_id: str
    report_type: str       # "status", "alert", "completion", "anomaly"
    timestamp: datetime

    # Summary (for fast scanning)
    headline: str          # 1 sentence max
    status: str            # "ok", "warning", "error", "critical"
    confidence: float      # 0.0 - 1.0

    # Findings
    findings: list[Finding]  # categorized by severity

    # Recommended action
    recommendation: str    # what the agent thinks should happen
    auto_resolvable: bool  # agent's assessment of whether this needs humans
    blast_radius: str      # "none", "single-agent", "team", "org-wide"

    # Supporting data
    evidence: dict         # key data that supports the findings
    session_log_url: str   # link to full session audit log
```
When a specialist completes work or encounters an issue, it files a structured report:
```python
# Trading agent detects an anomaly
report = AgentReport(
    agent_id="trading-agent-01",
    session_id=session.id,
    report_type="anomaly",
    timestamp=datetime.utcnow(),
    headline="Foresight open position count +40% above baseline",
    status="warning",
    confidence=0.91,
    findings=[
        Finding(
            severity="warning",
            category="risk",
            description="Current open positions: 18. Baseline: 12-13.",
            impact="Increased capital exposure, reduced ability to enter new positions",
            suggested_action="Review and close 3-5 lowest-conviction positions"
        )
    ],
    recommendation="Review open positions within 2 hours",
    auto_resolvable=False,
    blast_radius="single-agent",  # must be one of the declared values
    evidence={"open_positions": 18, "baseline": 12.6, "deviation_pct": 42.8},
    session_log_url=session.log_url  # illustrative — link to the session audit log
)

await broker.route(Message(
    sender_id="trading-agent-01",
    type="report",
    payload=report
))
```
The structure makes triage fast: the CEO agent reads status, confidence, auto_resolvable, and blast_radius to make the routing decision in milliseconds, without reading the full findings.
The Deterministic Rules Engine
The triage rules engine applies a priority-ordered set of rules to each incoming report. No LLM calls. Every decision is a rule match.
```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable

@dataclass
class TriageRule:
    id: str
    condition: Callable[[AgentReport], bool]
    action: str
    priority: int

@dataclass
class TriageDecision:
    action: str
    rule_applied: str
    report: AgentReport
    timestamp: datetime

class TriageEngine:
    """
    Deterministic triage — 12 rules, evaluated in priority order.
    No LLM reasoning in the routing hot path.
    """

    RULES: list[TriageRule] = [
        # Rule 1: Critical + high confidence → immediate human escalation
        TriageRule(
            id="R01",
            condition=lambda r: (
                r.status == "critical" and r.confidence >= 0.85
            ),
            action="escalate_human_immediate",
            priority=1
        ),
        # Rule 2: Critical + low confidence → escalate for investigation
        TriageRule(
            id="R02",
            condition=lambda r: (
                r.status == "critical" and r.confidence < 0.85
            ),
            action="escalate_human_investigate",
            priority=2
        ),
        # Rule 3: Org-wide blast radius → always involve human
        TriageRule(
            id="R03",
            condition=lambda r: r.blast_radius == "org-wide",
            action="escalate_human_immediate",
            priority=3
        ),
        # Rule 4: Warning + auto-resolvable + high confidence → auto-resolve
        TriageRule(
            id="R04",
            condition=lambda r: (
                r.status == "warning"
                and r.auto_resolvable
                and r.confidence >= 0.90
                and r.blast_radius in ("none", "single-agent")
            ),
            action="auto_resolve",
            priority=4
        ),
        # Rule 5: Completion report → acknowledge and archive
        TriageRule(
            id="R05",
            condition=lambda r: r.report_type == "completion",
            action="acknowledge_and_archive",
            priority=5
        ),
        # Rule 6: Warning + not auto-resolvable → route to VP level
        TriageRule(
            id="R06",
            condition=lambda r: (
                r.status == "warning" and not r.auto_resolvable
            ),
            action="escalate_vp",
            priority=6
        ),
        # Rules 7-12: cost threshold, anomaly patterns, etc.
    ]

    def triage(self, report: AgentReport) -> TriageDecision:
        for rule in sorted(self.RULES, key=lambda r: r.priority):
            if rule.condition(report):
                return TriageDecision(
                    action=rule.action,
                    rule_applied=rule.id,
                    report=report,
                    timestamp=datetime.utcnow()
                )
        # No rule matched — fail safe toward human review
        return TriageDecision(
            action="escalate_human_investigate",
            rule_applied="R00-fallback",
            report=report,
            timestamp=datetime.utcnow()
        )
```
Authority Ceilings: What Each Tier Can Do
Authority is not binary. It is a four-tier hierarchy that defines what each agent can do without escalation.
```python
AUTHORITY_TIERS = {
    1: {
        "name": "Read-Only",
        "permitted": [
            "read_any_data",
            "query_memory",
            "generate_reports",
            "send_alerts",
        ],
        "forbidden": ["write_any_data", "execute_any_action"],
    },
    2: {
        "name": "Write",
        "permitted": [
            # All of Tier 1
            "write_code",
            "commit_to_feature_branch",
            "create_pr",
            "run_tests",
            "schedule_content",
            "update_config",
        ],
        "forbidden": [
            "deploy_to_production",
            "merge_to_main",
            "modify_production_database",
            "change_strategy_params",
        ],
    },
    3: {
        "name": "Deploy",
        "permitted": [
            # All of Tier 2
            "deploy_to_staging",
            "run_migrations_staging",
            "approve_pr",
            "merge_to_main_with_ci_pass",
        ],
        "forbidden": [
            "deploy_to_production",
            "modify_production_database",
            "change_live_financial_params",
        ],
    },
    4: {
        "name": "Production",
        "permitted": [
            # All of Tier 3
            "deploy_to_production",
            "run_migrations_production",
            "modify_live_params",
        ],
        "forbidden": [
            # Hard blocks — no tier can do these
            "wallet.private_key",
            "database.production.drop",
            "foresight.shared_modules.modify",
        ],
    },
}
```
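The "# All of Tier 1" comments imply that each tier inherits everything below it. One way to make that inheritance explicit — a hypothetical helper, not part of the platform's code, shown here against a trimmed-down tier table — is to union the permitted lists up to the agent's tier:

```python
# Trimmed-down tier table for illustration (the real one has more actions).
AUTHORITY_TIERS = {
    1: {"permitted": ["read_any_data", "query_memory"]},
    2: {"permitted": ["write_code", "create_pr"]},
    3: {"permitted": ["deploy_to_staging", "approve_pr"]},
}

def effective_permissions(tier: int) -> set[str]:
    # Each tier inherits everything the tiers below it can do.
    perms: set[str] = set()
    for t in range(1, tier + 1):
        perms |= set(AUTHORITY_TIERS[t]["permitted"])
    return perms

# A Tier 2 agent can read, query, write code, and open PRs —
# but gains deploy_to_staging only at Tier 3.
print(sorted(effective_permissions(2)))
```

Materializing the union once per agent keeps the per-action check a simple set lookup instead of a walk up the hierarchy.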
An agent's authority ceiling is its tier plus its requires_approval list. Even a Tier 3 agent escalates before merging code that touches trading infrastructure — not because it lacks the technical authority, but because the blast radius warrants human review.
```python
# The hard-block list is Tier 4's "forbidden" entries — actions no tier may take.
HARD_BLOCKS = AUTHORITY_TIERS[4]["forbidden"]

def check_authority(agent: AgentCard, action: str) -> AuthorityDecision:
    tier = AUTHORITY_TIERS[agent.authority_tier]

    # Hard blocks — no authority overrides
    if any(action.startswith(block) for block in HARD_BLOCKS):
        return AuthorityDecision(
            permitted=False,
            reason="hard-block",
            escalation_required=True
        )

    # Requires approval — escalate before executing
    if action in agent.requires_approval:
        return AuthorityDecision(
            permitted=False,
            reason="requires-approval",
            escalation_required=True
        )

    # Tier check
    if action in tier["forbidden"]:
        return AuthorityDecision(
            permitted=False,
            reason="tier-ceiling",
            escalation_required=True
        )

    return AuthorityDecision(permitted=True)
```
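A self-contained sketch of the same three-stage check (the `AgentCard` fields, the tuple return, and the hard-block contents are simplifying assumptions for illustration):

```python
from dataclasses import dataclass, field

HARD_BLOCKS = ["wallet.private_key", "database.production.drop"]

AUTHORITY_TIERS = {
    2: {"forbidden": ["deploy_to_production", "merge_to_main"]},
}

@dataclass
class AgentCard:
    authority_tier: int
    requires_approval: list = field(default_factory=list)

def check_authority(agent: AgentCard, action: str) -> tuple[bool, str]:
    # 1. Hard blocks win over everything, including Tier 4.
    if any(action.startswith(block) for block in HARD_BLOCKS):
        return (False, "hard-block")
    # 2. Per-agent approval list: escalate before executing.
    if action in agent.requires_approval:
        return (False, "requires-approval")
    # 3. Tier ceiling.
    if action in AUTHORITY_TIERS[agent.authority_tier]["forbidden"]:
        return (False, "tier-ceiling")
    return (True, "permitted")

agent = AgentCard(authority_tier=2, requires_approval=["update_config"])
print(check_authority(agent, "wallet.private_key.read"))  # (False, 'hard-block')
print(check_authority(agent, "update_config"))            # (False, 'requires-approval')
print(check_authority(agent, "merge_to_main"))            # (False, 'tier-ceiling')
print(check_authority(agent, "write_code"))               # (True, 'permitted')
```

The ordering matters: hard blocks are checked before the approval list, so an agent can never "approve its way" into touching a blocked resource.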
Auto-Resolve vs Escalate
The auto-resolve vs escalate threshold is the most consequential calibration decision in an agent operations platform. Set it too aggressively and you get autonomous operation that surprises you. Set it too conservatively and you get a notification firehose that demands constant human attention.
The conservative default:
```python
def should_auto_resolve(report: AgentReport, org_context: OrgContext) -> bool:
    """
    Returns True only when ALL conditions are met.
    Conservative by design — when in doubt, escalate.

    category, involves_financial_action, and pattern_id are derived
    fields the triage layer attaches on top of the base AgentReport.
    """
    # Confidence must be very high
    if report.confidence < 0.90:
        return False

    # Blast radius must be narrow
    if report.blast_radius not in ("none", "single-agent"):
        return False

    # Must have a known resolution pattern
    if report.category not in org_context.known_resolutions:
        return False

    # Must not involve financial action
    if report.involves_financial_action:
        return False

    # Must not be a first-time occurrence of this pattern
    prior_resolutions = org_context.resolution_history.count(
        category=report.category,
        pattern=report.pattern_id
    )
    if prior_resolutions < 3:
        return False

    return True
```
The five conditions before auto-resolving anything:
- Confidence above 0.90 (not 0.70, not 0.80 — 0.90)
- Blast radius is narrow (single agent, not team or org-wide)
- The resolution pattern is known (not a first-time scenario)
- No financial action involved
- The same resolution has been validated at least 3 times before
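The five conditions collapse to a single conjunction. A compressed, self-contained variant of the gate (with `OrgContext` replaced by plain arguments — an illustrative simplification, not the platform's signature) makes the all-or-nothing behavior easy to test:

```python
def should_auto_resolve(confidence: float, blast_radius: str,
                        pattern_known: bool, financial: bool,
                        prior_resolutions: int) -> bool:
    # Every condition must hold; failing any single one forces escalation.
    return (confidence >= 0.90
            and blast_radius in ("none", "single-agent")
            and pattern_known
            and not financial
            and prior_resolutions >= 3)

print(should_auto_resolve(0.95, "single-agent", True, False, 5))  # True
print(should_auto_resolve(0.95, "single-agent", True, True, 5))   # False — financial
print(should_auto_resolve(0.89, "none", True, False, 5))          # False — confidence
print(should_auto_resolve(0.95, "none", True, False, 2))          # False — unproven pattern
```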
The asymmetry is deliberate: a false escalation costs the human 5 minutes. A false auto-resolution can cost hours of incident response.
When to Involve the Human
The human is not in the loop for everything. They are in the loop for:
```python
HUMAN_ESCALATION_TRIGGERS = [
    "status == 'critical'",
    "blast_radius == 'org-wide'",
    "involves_production_database",
    "involves_financial_transaction_above_threshold",
    "authority_ceiling_reached",
    "first_occurrence_unknown_pattern",
    "confidence < 0.70",
    "agent_health_critical",
    "hard_block_attempted",
]
```
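The trigger strings read like expressions, and one way to evaluate them — an assumed implementation, not the platform's — is to map each trigger name to a plain predicate over the report and collect every trigger that fires:

```python
from typing import Callable

# Hypothetical predicate table covering a subset of the trigger list.
TRIGGERS: dict[str, Callable[[dict], bool]] = {
    "critical_status": lambda r: r.get("status") == "critical",
    "org_wide_blast":  lambda r: r.get("blast_radius") == "org-wide",
    "low_confidence":  lambda r: r.get("confidence", 1.0) < 0.70,
    "hard_block":      lambda r: r.get("hard_block_attempted", False),
}

def needs_human(report: dict) -> list[str]:
    # Names of every trigger that fired; an empty list means autonomous handling.
    return [name for name, pred in TRIGGERS.items() if pred(report)]

print(needs_human({"status": "warning", "confidence": 0.65}))  # ['low_confidence']
print(needs_human({"status": "ok", "confidence": 0.95}))       # []
```

Returning every fired trigger, rather than a boolean, gives the escalation message its "why" for free.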
Everything else is handled autonomously. The human receives a summary digest at configured intervals showing everything that was auto-resolved — not to review it, but to maintain situational awareness.
```python
# Daily digest to human operator
async def send_daily_digest(org_state: OrgState) -> None:
    digest = DailyDigest(
        period="last_24h",
        directives_total=org_state.directives_count,
        auto_resolved=org_state.auto_resolved_count,
        escalated_internal=org_state.internal_escalations,
        escalated_human=org_state.human_escalations,
        pending_review=org_state.pending_human_review,
        anomalies=org_state.anomalies_detected,
        cost_total=org_state.total_cost_usd,
    )
    await notify_human(digest)
```
The goal is a system where the human wakes up to a digest, not a firehose. Most days: N items auto-resolved, zero or one escalations, costs within budget. The escalations that do arrive are genuine — they were not caught by the rules engine because they actually require human judgment.
Summary
- The CEO agent is a triage and delegation engine — it routes work, it does not execute work
- Structured reports enable fast triage: status, confidence, blast radius, and auto_resolvable are the key fields
- The triage rules engine is deterministic — 12 rules, no LLM in the hot path
- Authority tiers define what each agent can do autonomously; the requires_approval list adds action-level granularity
- Auto-resolve thresholds are conservative: confidence > 0.90, narrow blast radius, known pattern, no financial action, 3+ prior validations
- Human escalation is reserved for genuinely uncertain or high-stakes situations — not everything
What's Next
The final piece of operational safety: what happens when agents go wrong? The next lesson covers behavioral health monitoring — detecting hallucination, doom spirals, stale execution, and the circuit breakers that prevent small failures from becoming large ones.