Loop Detection — Jaccard Similarity and Session Termination
The Loop Detector uses Jaccard token overlap to score consecutive agent turns. Three consecutive similar turns trigger a VP warning; five trigger forced termination. Here is how it works and why it is the cheapest insurance in your agent stack.
Budget enforcement stops spend once it has accumulated. Loop detection stops it before it accumulates.
An agent in a reasoning loop will hit its daily budget eventually — but "eventually" might be $0.80 into a $1.00 budget, or $2.50 into an override that was granted for a legitimate extended session. The loop detector does not wait. It detects the behavioral signature of a loop — consecutive turns with high content similarity — and intervenes at the session level.
What a Loop Looks Like
Agent loops are not always obvious. They do not always look like exact repetition. Common patterns:
Direct repetition. The agent produces nearly identical turns: "I need to check the current price and make a trading decision." Each time slightly rephrased but semantically identical.
Alternating stuck states. The agent oscillates between two responses: state A ("I need more information"), state B ("Let me check the data"), state A, state B. Neither state makes progress.
Reformulation loops. The agent rephrases its task on each turn, never actually executing: "Let me approach this differently." "Perhaps I should start by..." "A better framing might be..." Each turn is the agent restarting the same failing attempt.
Tool retry loops. A tool call fails. The agent tries again with identical or near-identical input. The tool fails again. Repeat.
All of these produce a detectable signal: turns with high lexical overlap. The loop detector measures this signal.
Jaccard Similarity
The similarity metric is Jaccard token overlap:
@staticmethod
def _text_similarity(a: str, b: str) -> float:
    """
    Simple token overlap similarity (Jaccard).
    Production would use sentence-transformers embeddings.
    """
    if not a or not b:
        return 0.0
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    intersection = tokens_a & tokens_b
    union = tokens_a | tokens_b
    return len(intersection) / len(union)
Jaccard similarity is: |intersection| / |union|
The intersection is the set of tokens that appear in both strings. The union is the set of all tokens that appear in either string. A score of 1.0 means the strings have identical token sets. A score of 0.0 means they share no tokens.
Working through an example:
Turn A: "check price and decide trade"
tokens_a = {"check", "price", "and", "decide", "trade"}
Turn B: "check current price and make trade decision"
tokens_b = {"check", "current", "price", "and", "make", "trade", "decision"}
intersection = {"check", "price", "and", "trade"} → 4 tokens
union = {"check", "price", "and", "decide", "trade", "current", "make", "decision"} → 8 tokens
similarity = 4 / 8 = 0.50
A 0.50 score is below the 0.85 threshold — these turns would not trigger the loop counter. But:
Turn A: "check price and decide trade"
Turn C: "check price and decide trade action"
tokens_c = {"check", "price", "and", "decide", "trade", "action"}
intersection = {"check", "price", "and", "decide", "trade"} → 5 tokens
union = {"check", "price", "and", "decide", "trade", "action"} → 6 tokens
similarity = 5 / 6 = 0.833
Still below 0.85. Even another one-token variation stays under the threshold:
Turn D: "check price and decide on trade"
tokens_d = {"check", "price", "and", "decide", "on", "trade"}
intersection with Turn A = {"check", "price", "and", "decide", "trade"} → 5 tokens
union = {"check", "price", "and", "decide", "trade", "on"} → 6 tokens
similarity = 5 / 6 = 0.833
The 0.85 threshold is intentionally conservative. It allows natural variation in agent language without triggering false positives. Two turns must share almost all of their vocabulary to exceed 0.85.
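The worked examples above can be reproduced with a standalone copy of the similarity function (a re-implementation for illustration, not the detector's actual module):

```python
def text_similarity(a: str, b: str) -> float:
    """Jaccard token overlap, mirroring _text_similarity above."""
    if not a or not b:
        return 0.0
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

turn_a = "check price and decide trade"
print(text_similarity(turn_a, "check current price and make trade decision"))  # 0.5
print(text_similarity(turn_a, "check price and decide trade action"))  # 0.8333...
print(text_similarity(turn_a, turn_a))  # 1.0 — identical token sets
```

Note how only identical token sets reach 1.0; a single extra token on a five-token turn already drops the score to 5/6 ≈ 0.833, just under the threshold.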
The Check Turn Logic
Each turn goes through check_turn():
def check_turn(
    self, session_id: str, turn_content: str
) -> LoopCheckResult:
    """Check if the current turn is too similar to recent turns."""
    history = self._session_history[session_id]
    if not history:
        history.append(turn_content)
        return LoopCheckResult()

    # Compare against last N turns (window), not just the last one.
    # This catches alternating patterns (A, B, A, B...) too.
    window = history[-5:]  # last 5 turns
    max_similarity = max(
        self._text_similarity(prev, turn_content) for prev in window
    )
    history.append(turn_content)

    # Cap history to prevent unbounded memory growth
    if len(history) > MAX_HISTORY_PER_SESSION:
        self._session_history[session_id] = history[-MAX_HISTORY_PER_SESSION:]

    if max_similarity >= SIMILARITY_THRESHOLD:
        self._consecutive_similar[session_id] += 1
    else:
        self._consecutive_similar[session_id] = 0

    consecutive = self._consecutive_similar[session_id]
    if consecutive >= TERMINATE_CONSECUTIVE:
        return LoopCheckResult(
            looping=True,
            should_terminate=True,
            consecutive_similar=consecutive,
            reason=(
                f"Loop detected: {consecutive} consecutive similar "
                f"turns (>{SIMILARITY_THRESHOLD} similarity). "
                f"Session will be terminated."
            ),
        )
    if consecutive >= WARN_CONSECUTIVE:
        return LoopCheckResult(
            looping=True,
            should_terminate=False,
            consecutive_similar=consecutive,
            reason=(
                f"Loop warning: {consecutive} consecutive similar "
                f"turns. VP will be notified."
            ),
        )
    return LoopCheckResult(consecutive_similar=consecutive)
Three behaviors to note:
The first turn always passes. If there is no history for the session, the turn is added and returns clean. The detector needs at least one previous turn to compare against.
The window is the last 5 turns, not all history. Comparing against all history would flag an agent that legitimately returns to a topic it discussed ten turns ago. The window limits comparison to recent context.
The consecutive counter resets to 0 on any dissimilar turn. A session that has 3 similar turns, then one dissimilar turn, then 2 more similar turns is not looping — it has made genuine progress in the middle. The counter does not accumulate across the gap.
The Two-Stage Response
# Thresholds
SIMILARITY_THRESHOLD = 0.85
WARN_CONSECUTIVE = 3
TERMINATE_CONSECUTIVE = 5
At 3 consecutive similar turns: warning. At 5 consecutive similar turns: termination.
The two-stage design mirrors the budget system's 80%/100% pattern. The warning at 3 turns gives the broker the opportunity to intervene — notify the controlling VP agent, send a Discord alert, give the operator a chance to inspect. Many real loops are recoverable: the operator can send a clarifying message that gives the agent new information, breaking the loop.
The termination at 5 turns is the hard cut. Five consecutive turns with 0.85+ similarity is not ambiguous — the session has failed. Continuing it would burn budget without making progress. The broker terminates the session, logs the termination reason, and fires a session.terminated event.
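The two-stage escalation can be sketched in a compressed, self-contained form. This is an illustrative re-implementation — the names `MiniLoopDetector` and the `"ok"`/`"warn"`/`"terminate"` string results are stand-ins; the real detector returns `LoopCheckResult` objects:

```python
from collections import defaultdict

SIMILARITY_THRESHOLD = 0.85
WARN_CONSECUTIVE = 3
TERMINATE_CONSECUTIVE = 5


def text_similarity(a: str, b: str) -> float:
    # Jaccard token overlap, as defined earlier in this lesson.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


class MiniLoopDetector:
    """Stripped-down sketch of the two-stage warn/terminate logic."""

    def __init__(self):
        self._history = defaultdict(list)
        self._similar = defaultdict(int)

    def check_turn(self, session_id: str, content: str) -> str:
        history = self._history[session_id]
        if not history:  # first turn always passes
            history.append(content)
            return "ok"
        max_sim = max(text_similarity(p, content) for p in history[-5:])
        history.append(content)
        if max_sim >= SIMILARITY_THRESHOLD:
            self._similar[session_id] += 1
        else:
            self._similar[session_id] = 0  # dissimilar turn resets the counter
        n = self._similar[session_id]
        if n >= TERMINATE_CONSECUTIVE:
            return "terminate"
        if n >= WARN_CONSECUTIVE:
            return "warn"
        return "ok"


det = MiniLoopDetector()
stuck = "check price and decide trade"
results = [det.check_turn("sess-1", stuck) for _ in range(6)]
print(results)  # ['ok', 'ok', 'ok', 'warn', 'warn', 'terminate']
print(det.check_turn("sess-1", "completely different new topic entirely"))  # 'ok'
```

The first turn passes with no comparison, the counter reaches 3 on the fourth identical turn (warning), 5 on the sixth (termination), and a single dissimilar turn drops it straight back to zero.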
Memory Management
The session history is bounded:
MAX_HISTORY_PER_SESSION = 50
A session can accumulate at most 50 turns in the comparison history. This is not a loop detection threshold — it is a memory management limit. An agent that runs 200 turns (which would be unusual, given the turn limiter covered in Lesson 208) does not need all 200 turns in the comparison window. The comparison window is 5 turns. Keeping 50 in memory is generous storage with a cap.
When the history exceeds 50, it is trimmed to the last 50:
if len(history) > MAX_HISTORY_PER_SESSION:
    self._session_history[session_id] = history[-MAX_HISTORY_PER_SESSION:]
This is the rolling-window pattern. Old turns fall off. Current state is maintained. No unbounded growth.
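The trimming can be seen in isolation with a plain list (hypothetical turn labels):

```python
MAX_HISTORY_PER_SESSION = 50

history = [f"turn-{i}" for i in range(60)]  # 60 turns accumulated
if len(history) > MAX_HISTORY_PER_SESSION:
    history = history[-MAX_HISTORY_PER_SESSION:]

print(len(history))   # 50
print(history[0])     # 'turn-10' — the 10 oldest turns fell off
print(history[-1])    # 'turn-59' — the newest turn is kept
```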
Why Jaccard and Not Embeddings
The comment in the source code is explicit:
"""
Simple token overlap similarity (Jaccard).
Production would use sentence-transformers embeddings.
"""
Embeddings would be more semantically accurate. Two turns that express the same idea with completely different words — "analyze the price movement" vs "examine how prices changed" — would score as highly similar with embeddings, and near-zero with Jaccard.
Jaccard was chosen for three reasons:
- Zero dependencies. No model loading, no network calls, no inference latency. It runs in microseconds.
- Deterministic. Given the same inputs, it always produces the same output. Embeddings can vary with model versions.
- Good enough. Real loops — agents that are genuinely stuck — do tend to use similar vocabulary. The agent saying "check price and decide" ten times in a row does not need semantic understanding to detect.
The comment is a signal, not a criticism. If false negatives become a problem — loops that use varied vocabulary and evade Jaccard detection — upgrading to embeddings is a well-defined path. The interface (_text_similarity(a, b) -> float) is a single function swap.
What Happens on Termination
When should_terminate=True is returned, the broker:
- Fires a session.terminated event with the loop detection reason attached
- Sends a Discord alert to the operator channel with the session details
- Notifies the controlling VP agent
- Records the termination in the audit log
- Calls loop_detector.reset_session(session_id) to clear the history
The session is terminated cleanly. The broker does not let the agent issue one more turn. Once 5 consecutive similar turns are detected, the session is done.
The agent's state at termination is whatever it was when the 5th turn was received. The broker does not attempt to synthesize a clean exit. The operator can inspect the turn history in the audit log, diagnose why the agent got stuck, and restart a fresh session with whatever corrective context is needed.
Integration With Budget Enforcement
Loop detection and budget enforcement are complementary but independent. Both run on every turn. A loop caught at the 3-turn warning has burned 3 turns of budget; one that runs to the 5-turn termination has burned 5.
For a session running on Sonnet at $0.054/turn:
- Caught at warning (3 turns): $0.162 additional spend beyond legitimate turns
- Caught at termination (5 turns): $0.27 additional spend
These are small numbers per session. At scale — if an agent class is prone to loops and hits this pattern multiple times per day — the numbers compound. But the bigger value of loop detection is not the direct cost savings — it is the signal it provides.
An agent that hits the loop detector multiple times per week is telling you something about its prompt, its tooling, or the tasks it is being asked to perform. Loop frequency is an early indicator of agent quality problems, surfaced before they become P&L problems.
Session Isolation
Each session maintains independent state in the detector:
self._session_history: dict[str, list[str]] = defaultdict(list)
self._consecutive_similar: dict[str, int] = defaultdict(int)
Keyed by session_id, not agent_id. Two simultaneous sessions from the same agent do not interfere with each other's loop counters. A looping session does not affect a healthy parallel session.
When reset_session() is called — either on termination or when a session ends normally — the state for that session ID is cleared. No stale history accumulates between sessions.
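A sketch of that per-session state and reset, assuming a dict pair like the one above (`reset_session` is written here as a free function for brevity; in the source it is a method on the detector):

```python
from collections import defaultdict

# Per-session state, keyed by session_id as in the detector.
session_history: dict[str, list[str]] = defaultdict(list)
consecutive_similar: dict[str, int] = defaultdict(int)


def reset_session(session_id: str) -> None:
    # Drop all state for one session; parallel sessions are untouched.
    session_history.pop(session_id, None)
    consecutive_similar.pop(session_id, None)


session_history["sess-1"].append("check price and decide trade")
consecutive_similar["sess-1"] = 4  # one turn away from termination
session_history["sess-2"].append("summarize yesterday's fills")

reset_session("sess-1")
print("sess-1" in session_history)  # False — cleared
print(session_history["sess-2"])    # ["summarize yesterday's fills"] — unaffected
```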
Next: cost.attributed Events
Loop detection and budget enforcement both depend on accurate turn-level cost data. Lesson 208 covers the event layer that makes this data available: the cost.attributed event emitted on every LLM call, the emit callback pattern, and the CFO daily report structure that aggregates it.