
Model tier routing caps which models agents can use. Budget enforcement caps how much they can spend. The two controls are complementary, and both are necessary.

This lesson covers the BudgetEnforcer implementation: how per-agent budgets are defined, how the check works, and the one exception that allows critical sessions to bypass agent budgets without bypassing the global ceiling.

The Budget Map

Every agent in the principal-broker system has a daily spending allocation:

DAILY_BUDGETS_USD = {
    "openclaw": 3.00,
    "advisory-system": 2.00,
    "content-pipeline": 2.00,
    "analyst-system": 1.50,
    "vp-trading": 1.00,
    "foresight": 1.00,
    "sports-agent": 0.75,
    "political-agent": 0.75,
    "vp-engineering": 0.50,
    "vp-content": 0.50,
    "chief-of-staff": 0.50,
    "security-council": 0.50,
    "perpetuals-bot": 0.25,
    "platform-monitor": 0.25,
    "doc-syncer": 0.25,
    "cfo": 0.25,
    "cmo": 0.25,
    "cto": 0.25,
    "weather-agent": 0.50,
    "vp-product": 0.50,
}

The allocations reflect actual cost expectations for each agent's role. OpenClaw, which runs the most sessions and handles the most complex orchestration work, gets $3.00. The multi-agent advisory system, which runs deep market analysis sessions, gets $2.00. Utility agents like the documentation sync engine, the CFO reporter, and the CMO brief agent get $0.25 because they run well-defined tasks with predictable, low token counts.

Total of all allocations: $16.50. The global ceiling is $25.00. The $8.50 gap is intentional headroom for overrides during incidents and legitimately high-activity days.
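The allocated total and the headroom are worth recomputing whenever the map changes, rather than tracking them by hand. A minimal sketch, restating the map and the ceiling from this lesson; allocation_headroom is a hypothetical helper name, not part of the BudgetEnforcer:

```python
# Per-agent allocations restated from the map above.
DAILY_BUDGETS_USD = {
    "openclaw": 3.00, "advisory-system": 2.00, "content-pipeline": 2.00,
    "analyst-system": 1.50, "vp-trading": 1.00, "foresight": 1.00,
    "sports-agent": 0.75, "political-agent": 0.75, "vp-engineering": 0.50,
    "vp-content": 0.50, "chief-of-staff": 0.50, "security-council": 0.50,
    "perpetuals-bot": 0.25, "platform-monitor": 0.25, "doc-syncer": 0.25,
    "cfo": 0.25, "cmo": 0.25, "cto": 0.25,
    "weather-agent": 0.50, "vp-product": 0.50,
}
GLOBAL_DAILY_CEILING = 25.00


def allocation_headroom(budgets, ceiling):
    """Return (total allocated, headroom left under the ceiling). Hypothetical helper."""
    total = round(sum(budgets.values()), 2)
    return total, round(ceiling - total, 2)


total, headroom = allocation_headroom(DAILY_BUDGETS_USD, GLOBAL_DAILY_CEILING)
print(f"allocated ${total:.2f}, headroom ${headroom:.2f}")
```

Running a check like this in CI is one way to notice when new allocations start eating into the headroom.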

Default budget for unknown agents. The get_effective_budget() method falls back to $0.50 for any agent not in the map. The default is deliberately conservative: any new agent that needs more than $0.50/day must be given an explicit allocation.

def get_effective_budget(self, agent_id: str) -> float:
    """Return the effective budget for an agent (override takes precedence)."""
    if agent_id in self._overrides:
        return self._overrides[agent_id]
    return DAILY_BUDGETS_USD.get(agent_id, 0.50)
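To see the precedence rules in isolation, here is a minimal sketch. BudgetEnforcerSketch is a stand-in written for this example (the real class carries more state), and the two-entry map is trimmed for brevity:

```python
DAILY_BUDGETS_USD = {"openclaw": 3.00, "foresight": 1.00}  # trimmed copy of the map
DEFAULT_BUDGET_USD = 0.50  # fallback for agents missing from the map


class BudgetEnforcerSketch:
    """Stand-in holding only the override map, for illustration."""

    def __init__(self):
        self._overrides = {}

    def get_effective_budget(self, agent_id: str) -> float:
        # An operator override wins over the static map.
        if agent_id in self._overrides:
            return self._overrides[agent_id]
        return DAILY_BUDGETS_USD.get(agent_id, DEFAULT_BUDGET_USD)


enforcer = BudgetEnforcerSketch()
known = enforcer.get_effective_budget("foresight")          # value from the map
unknown = enforcer.get_effective_budget("brand-new-agent")  # falls back to default
enforcer._overrides["foresight"] = 4.00                     # simulate an operator override
overridden = enforcer.get_effective_budget("foresight")     # override takes precedence
```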

The Budget Check

The core of the enforcement layer is check_budget():

def check_budget(
    self, agent_id: str, is_critical: bool = False
) -> BudgetCheckResult:
    """
    Check if an agent can make an LLM call.
    Critical sessions (incident response) bypass the per-agent daily
    budget, but never the global ceiling.
    """
    agent_spend = self.cost_tracker.get_agent_spend(agent_id)
    global_spend = self.cost_tracker.get_total_spend()
    agent_budget = self.get_effective_budget(agent_id)

    # Global ceiling — applies to ALL sessions including critical
    if global_spend >= GLOBAL_DAILY_CEILING:
        return BudgetCheckResult(
            allowed=False,
            reason=(
                f"Global ceiling reached: ${global_spend:.2f} "
                f"/ ${GLOBAL_DAILY_CEILING:.2f}"
            ),
            agent_spend=agent_spend,
            agent_budget=agent_budget,
            global_spend=global_spend,
        )

    # Critical sessions bypass agent budget
    if is_critical:
        return BudgetCheckResult(
            allowed=True,
            agent_spend=agent_spend,
            agent_budget=agent_budget,
            global_spend=global_spend,
        )

    # Agent budget check
    if agent_spend >= agent_budget:
        return BudgetCheckResult(
            allowed=False,
            reason=(
                f"Agent budget exhausted: ${agent_spend:.2f} "
                f"/ ${agent_budget:.2f}"
            ),
            agent_spend=agent_spend,
            agent_budget=agent_budget,
            global_spend=global_spend,
        )

    # Warning at 80%
    warning = agent_spend >= (agent_budget * ALERT_AT_PCT)

    return BudgetCheckResult(
        allowed=True,
        warning=warning,
        reason=(
            f"Budget warning: ${agent_spend:.2f} / ${agent_budget:.2f} "
            f"({agent_spend / agent_budget * 100:.0f}%)"
            if warning
            else None
        ),
        agent_spend=agent_spend,
        agent_budget=agent_budget,
        global_spend=global_spend,
    )

The check order matters. It is not arbitrary:

Global ceiling first — this check blocks everything, including critical sessions
Critical bypass second — only reached if global ceiling has not been hit
Agent budget check third — only for non-critical sessions
Warning check last — only if allowed

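This ordering is what makes the global ceiling a hard backstop. An incident response session flagged as critical can burn through its agent budget, but once global spend reaches $25.00 every further call is refused, critical or not. Since check_budget() compares spend that has already been recorded, the call that crosses the line may finish slightly above it.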
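The effect of the ordering is easiest to see with concrete numbers. The following is a condensed sketch of the same decision order, taking spend figures as plain arguments instead of reading them from a cost tracker; the function name check and the trimmed result type are simplifications made for this example, and the warning step is omitted:

```python
from dataclasses import dataclass
from typing import Optional

GLOBAL_DAILY_CEILING = 25.00


@dataclass
class BudgetCheckResult:  # trimmed to the two fields this example needs
    allowed: bool
    reason: Optional[str] = None


def check(agent_spend: float, agent_budget: float,
          global_spend: float, is_critical: bool = False) -> BudgetCheckResult:
    """Condensed version of the check order (warning step omitted)."""
    # 1. Global ceiling: blocks every session, critical or not.
    if global_spend >= GLOBAL_DAILY_CEILING:
        return BudgetCheckResult(False, "global ceiling reached")
    # 2. Critical bypass: skips the per-agent budget only.
    if is_critical:
        return BudgetCheckResult(True)
    # 3. Per-agent budget: applies to non-critical sessions.
    if agent_spend >= agent_budget:
        return BudgetCheckResult(False, "agent budget exhausted")
    return BudgetCheckResult(True)


# Agent is over its $1.00 budget; the fleet is under the ceiling.
over_budget_normal = check(1.20, 1.00, global_spend=18.00)
over_budget_critical = check(1.20, 1.00, global_spend=18.00, is_critical=True)
# The fleet has hit the ceiling: even a critical session is blocked.
ceiling_critical = check(1.20, 1.00, global_spend=25.00, is_critical=True)
```

Swapping steps 1 and 2 would let the third call through, which is exactly the failure the ordering is meant to rule out.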

The BudgetCheckResult

The return value carries everything the caller needs to act:

from dataclasses import dataclass
from typing import Optional

@dataclass
class BudgetCheckResult:
    allowed: bool
    warning: bool = False
    reason: Optional[str] = None
    agent_spend: float = 0.0
    agent_budget: float = 0.0
    global_spend: float = 0.0

allowed — the binary enforcement decision. False means block the LLM call.

warning — True when spend is between 80% and 100% of budget. The call proceeds but the broker fires a notification event.

reason — human-readable explanation. Populated for warnings and blocks. Useful in operator dashboards and Discord alerts.

agent_spend / agent_budget / global_spend — the raw numbers for downstream display. A Discord notification that says "Foresight has spent $0.83 of its $1.00 daily budget" is more actionable than one that says "budget warning."
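As an illustration of how a caller might turn the result into an operator-facing message, here is a small sketch. The describe function and its message formats are assumptions made for this example; the lesson does not show the broker's actual notification code:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BudgetCheckResult:  # same fields as the dataclass above
    allowed: bool
    warning: bool = False
    reason: Optional[str] = None
    agent_spend: float = 0.0
    agent_budget: float = 0.0
    global_spend: float = 0.0


def describe(agent_name: str, result: BudgetCheckResult) -> Optional[str]:
    """Build a notification line, or return None when nothing needs reporting."""
    if not result.allowed:
        return f"BLOCKED {agent_name}: {result.reason}"
    if result.warning:
        return (
            f"{agent_name} has spent ${result.agent_spend:.2f} of its "
            f"${result.agent_budget:.2f} daily budget"
        )
    return None


warn = BudgetCheckResult(allowed=True, warning=True,
                         agent_spend=0.83, agent_budget=1.00, global_spend=9.40)
message = describe("Foresight", warn)
quiet = describe("Foresight", BudgetCheckResult(allowed=True))
```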

The Two-Level Alert Pattern

The 80% warning is not just a courtesy notification — it is a trigger for investigation. When an agent hits 80% of its daily budget before the day is half over, that is a signal worth examining:

Is a session running longer than expected?
Is a loop condition developing that has not yet triggered the loop detector?
Was a one-time batch job run that consumed unusual budget?
Is the agent type doing more work than its allocation anticipated?

The warning gives the operator a window to intervene before the hard stop. A hard stop mid-session can leave an agent in a bad state — conversation history truncated, task incomplete, resources locked. The warning allows graceful handling: finish the current turn, checkpoint state, and notify the operator.

In practice, most agents never hit the hard stop. The warning fires, the operator checks, the daily counter resets at UTC midnight, and normal operation resumes. The hard stop is the backstop for the cases where the warning was ignored or where spend accelerated too fast to catch at 80%.
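The "80% before the day is half over" signal can be made mechanical. A minimal sketch, assuming the daily counter resets at UTC midnight as described above; is_early_warning and fraction_of_utc_day are hypothetical helpers, not part of the BudgetEnforcer:

```python
from datetime import datetime, timezone

ALERT_AT_PCT = 0.80


def fraction_of_utc_day(now: datetime) -> float:
    """Fraction of the current UTC day that has elapsed (0.0 up to, not including, 1.0)."""
    now = now.astimezone(timezone.utc)
    seconds = now.hour * 3600 + now.minute * 60 + now.second
    return seconds / 86400


def is_early_warning(agent_spend: float, agent_budget: float, now: datetime) -> bool:
    """True when the 80% threshold is crossed before the UTC day is half over."""
    crossed = agent_spend >= agent_budget * ALERT_AT_PCT
    return crossed and fraction_of_utc_day(now) < 0.5


morning = datetime(2025, 1, 15, 9, 30, tzinfo=timezone.utc)
evening = datetime(2025, 1, 15, 21, 0, tzinfo=timezone.utc)
early = is_early_warning(0.85, 1.00, morning)  # 85% spent by 09:30 UTC
late = is_early_warning(0.85, 1.00, evening)   # same spend at 21:00 UTC
```

An operator could use a flag like this to prioritize which warnings to look at first.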

Critical Session Override

Incident response creates a genuine dilemma for budget systems. When production is down — a trading bot is misbehaving, a critical service has crashed, customer data may be at risk — you do not want a budget system to block the agents trying to fix it.

The is_critical=True parameter handles this:

# Critical sessions bypass agent budget
if is_critical:
    return BudgetCheckResult(
        allowed=True,
        agent_spend=agent_spend,
        agent_budget=agent_budget,
        global_spend=global_spend,
    )

A critical session can spend beyond its daily agent budget. It cannot spend beyond the global ceiling. This is the deliberate design: the global ceiling is the absolute backstop even in emergencies.

The concern with any "critical override" mechanism is abuse. If agents can flag themselves as critical to bypass budget checks, the budget system is meaningless. The principal-broker implementation addresses this with four safeguards:

The is_critical flag is set by the broker's incident state machine, not by the agent itself
When is_critical=True is used, a warning is logged
The cost is still recorded fully — critical session spend is visible in the CFO report
No agent card grants the authority to set its own is_critical flag

The override is a real escape hatch. It is narrow, logged, and not self-grantable.
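One way to keep the flag out of the agent's hands is to derive it from broker-owned state at the call site. The sketch below is an assumption about how that could look: the IncidentState values, the resolve_is_critical function, and the log messages are all hypothetical and are not taken from the principal-broker code:

```python
import logging
from enum import Enum

logger = logging.getLogger("budget")


class IncidentState(Enum):  # hypothetical states, for illustration only
    NORMAL = "normal"
    INVESTIGATING = "investigating"
    ACTIVE_INCIDENT = "active_incident"


def resolve_is_critical(state: IncidentState, agent_id: str,
                        agent_requested: bool = False) -> bool:
    """Derive is_critical from broker-owned state; the agent's own request is ignored."""
    is_critical = state is IncidentState.ACTIVE_INCIDENT
    if agent_requested and not is_critical:
        logger.warning("%s requested critical status; ignored", agent_id)
    if is_critical:
        logger.warning("critical budget bypass active for %s", agent_id)
    return is_critical


self_flagged = resolve_is_critical(IncidentState.NORMAL, "foresight",
                                   agent_requested=True)
broker_flagged = resolve_is_critical(IncidentState.ACTIVE_INCIDENT, "foresight")
```

The value returned here is what would be passed as is_critical to check_budget(), so an agent's own claim never reaches the enforcer.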

Budget Overrides

Beyond the critical session path, there is a second override mechanism: set_override(). This allows the operator to manually change an agent's daily budget for situations where the default allocation is insufficient:

def set_override(self, agent_id: str, budget: float) -> None:
    """Override the daily budget for a specific agent."""
    self._overrides[agent_id] = budget

def clear_override(self, agent_id: str) -> None:
    """Remove budget override for a specific agent, reverting to default."""
    self._overrides.pop(agent_id, None)

Overrides are in-memory and persist until explicitly cleared or the service restarts. They are exposed through the admin REST API (Lesson 209) with mandatory reason logging and an audit trail.

A typical use case: the advisory system is running a quarterly review analysis that will legitimately exceed its $2.00 daily budget. The operator sets a $5.00 override for the day, the analysis runs, the override is cleared. The CFO report shows the elevated spend with the override reason attached.
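Because overrides persist until cleared, a forgotten override quietly becomes the new budget. A minimal sketch of one way to guard against that, using a context manager that always clears the override on exit; temporary_override, the audit_log list, and BudgetEnforcerSketch are hypothetical names introduced for this example:

```python
from contextlib import contextmanager


class BudgetEnforcerSketch:
    """Stand-in exposing only the override methods shown above."""

    def __init__(self):
        self._overrides = {}

    def set_override(self, agent_id, budget):
        self._overrides[agent_id] = budget

    def clear_override(self, agent_id):
        self._overrides.pop(agent_id, None)


@contextmanager
def temporary_override(enforcer, agent_id, budget, reason, audit_log):
    """Apply an override for the duration of a block, then always clear it."""
    audit_log.append((agent_id, budget, reason))  # hypothetical audit record
    enforcer.set_override(agent_id, budget)
    try:
        yield
    finally:
        enforcer.clear_override(agent_id)


enforcer = BudgetEnforcerSketch()
audit_log = []
with temporary_override(enforcer, "advisory-system", 5.00,
                        "quarterly review analysis", audit_log):
    during = dict(enforcer._overrides)  # override is active inside the block
after = dict(enforcer._overrides)       # cleared on exit, even after an exception
```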

The Global Ceiling as System-Level Control

The $25.00 global ceiling is worth examining as an architectural choice. Why not simply rely on per-agent budgets?

Per-agent budgets are calibrated to individual agent expected behavior. They do not account for correlated failures — a situation where multiple agents hit loops simultaneously, or a system bug causes many agents to spin up unexpectedly.

The global ceiling is the last defense against systemic failures. If five agents each have a $1.00 daily budget and all five enter loops simultaneously, each per-agent budget stops its agent at $1.00, but the fleet has still burned $5.00. The per-agent checks see five separate, individually acceptable outcomes; only the global total shows the combined damage, and the ceiling bounds that total no matter how many agents, overrides, or critical sessions are involved.

More importantly, the global ceiling is a forcing function for organizational discipline. At $25.00/day, the agent fleet costs at most $9,125/year in LLM spend. That is a number a solo operator can reason about, budget for, and justify. Without the ceiling, the number is unbounded — and unbounded numbers are not numbers you can plan around.

GLOBAL_DAILY_CEILING = 25.00
ALERT_AT_PCT = 0.80

The 80% alert applies to per-agent budgets. The global ceiling has no warning threshold — when global spend hits $25.00, everything stops. The alert for the global state comes from the nightly CFO report, which shows ceiling utilization percentage. If the fleet regularly runs at 85-90% of the global ceiling, that is a signal to either increase the ceiling or investigate which agents are spending more than allocated.
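The utilization figure and the worst-case annual spend are both simple arithmetic on the ceiling. A short sketch follows; ceiling_utilization_pct and needs_review are hypothetical helpers, and the "three days at or above 85%" rule is an assumption chosen for illustration, not a threshold from the CFO report:

```python
GLOBAL_DAILY_CEILING = 25.00


def ceiling_utilization_pct(global_spend: float) -> float:
    """Share of the global daily ceiling consumed, as a percentage."""
    return global_spend / GLOBAL_DAILY_CEILING * 100


def needs_review(daily_spends, threshold_pct=85.0, min_days=3):
    """True when at least min_days of the given days ran at or above the threshold."""
    hot_days = [s for s in daily_spends if ceiling_utilization_pct(s) >= threshold_pct]
    return len(hot_days) >= min_days


max_annual_spend = GLOBAL_DAILY_CEILING * 365  # worst-case yearly LLM spend
week = [21.50, 22.75, 18.00, 23.10, 20.00, 12.40, 22.00]  # made-up daily totals
flag = needs_review(week)
```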

Calibrating Budgets for New Agents

When adding a new agent to the fleet, how do you set its initial budget? The approach:

Run the agent in a test environment and measure per-session cost at typical load
Multiply by expected sessions per day to get daily expected spend
Add a 50% buffer for variability
Round to the nearest $0.25

If an agent is expected to run 4 sessions per day at $0.15/session, the expected daily spend is $0.60. With 50% buffer: $0.90. Round to $1.00.
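The four steps translate directly into a small function. This is a sketch under the stated rules (50% buffer, nearest $0.25); initial_daily_budget is a hypothetical name, and Decimal is used so that rounding behaves predictably on values that sit exactly between two steps:

```python
from decimal import Decimal, ROUND_HALF_UP


def initial_daily_budget(cost_per_session, sessions_per_day,
                         buffer_pct="0.50", step="0.25"):
    """Expected daily spend plus buffer, rounded to the nearest step (ties round up)."""
    expected = Decimal(str(cost_per_session)) * sessions_per_day
    buffered = expected * (1 + Decimal(buffer_pct))
    # Count whole steps, rounding half up, then convert back to dollars.
    steps = (buffered / Decimal(step)).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return float(steps * Decimal(step))


# The worked example: 4 sessions/day at $0.15/session.
budget = initial_daily_budget(0.15, 4)
```

For the worked example this gives $0.60 expected, $0.90 with the buffer, and $1.00 after rounding.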

Start conservatively. It is easier to increase a budget after observing real usage than to explain why an agent spent $10 on its first day because the initial allocation was too generous.

Next: Loop Detection

Budget enforcement stops spend after an agent has used its allocation. Loop detection catches the failure mode earlier — before the spend accumulates. Lesson 207 covers Jaccard similarity scoring, consecutive-turn tracking, and the termination logic that cuts off loops at 5 consecutive similar turns.

Per-Agent Daily Budgets — 80% Warning, 100% Stop