ASK KNOX
beta
LESSON 276

The Review Loop — CodeRabbit Catches What Agents Miss

The calibrator continuity bug from PR #28 is a real example. The code-writing agent did not catch the 18.4 to 17.6 discontinuity at the 0.15 boundary. CodeRabbit did. When to trust the agent, when to verify manually, and why automated reviewers are non-optional for math-heavy changes.

7 min read

PR #28 was written by a coding sub-agent. The diff looked right. Tests passed. CI was green. The author was ready to merge.

Then CodeRabbit's review showed up with a comment on calibrator.py:

Discontinuity at branch boundary p = 0.15. Lower branch yields 18.4 at p = 0.15; upper branch yields 17.6 at the same point. The score drops as input quality increases slightly, which appears to violate the intended shape of this scoring function.

The code-writing agent had not caught it. The unit tests had not caught it. The human author had not caught it. The automated reviewer did — in seconds — and the PR was held until the formula was fixed.
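The shape of the bug is easy to reproduce. A minimal sketch of a piecewise scorer with exactly this cliff — the real branch formulas are not shown in the PR, so the linear coefficients below are invented purely to hit the reported 18.4 and 17.6 values at p = 0.15:

```python
def calibrate(p: float) -> float:
    """Hypothetical piecewise scoring function, NOT the real calibrator.
    Coefficients are chosen only to reproduce the reported mismatch:
    the lower branch reaches 18.4 at p = 0.15, the upper starts at 17.6."""
    if p < 0.15:
        return 16.0 + 16.0 * p   # lower branch: 18.4 as p -> 0.15
    return 20.0 - 16.0 * p       # upper branch: 17.6 at p = 0.15

# Score DROPS as input quality crosses the boundary upward:
print(calibrate(0.14999))  # just below the boundary
print(calibrate(0.15001))  # just above it -- noticeably lower
```

Both branches look plausible in isolation; only evaluating them at the shared boundary exposes the 0.8-point cliff.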

Why This Class of Bug Exists

Code-writing agents are trained to produce code that looks right. "Looks right" for a piecewise scoring function means symmetrical branches, consistent variable names, and plausible formulas. The agent has no strong prior that branch boundaries should be continuous; that is a mathematical property of the function, not a stylistic property of the code. The result is functions that compile, pass tests, and have subtle arithmetic cliffs at every branch point.

Unit tests miss these bugs because tests usually pick round numbers like 0.10 or 0.20 — not 0.14999 and 0.15001. The boundary is never exercised. The bug is invisible until a production signal happens to land exactly on it.
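A test that does exercise the boundary is short. Here is a sketch of a generic continuity probe — `is_continuous_at` and the two stand-in functions are hypothetical names for illustration, not part of any real test suite:

```python
import math

def is_continuous_at(fn, boundary: float, eps: float = 1e-6,
                     tol: float = 1e-3) -> bool:
    """Probe just below and just above `boundary` instead of round
    inputs like 0.10 or 0.20, and compare the two outputs."""
    return math.isclose(fn(boundary - eps), fn(boundary + eps), abs_tol=tol)

# Continuous stand-in: the two pieces agree at p = 0.15.
smooth = lambda p: 16.0 + 16.0 * min(p, 0.15) - 16.0 * max(p - 0.15, 0.0)
# Discontinuous stand-in reproducing the 18.4 -> 17.6 cliff.
buggy = lambda p: 16.0 + 16.0 * p if p < 0.15 else 20.0 - 16.0 * p
```

One assertion per branch boundary is enough; the probe fails loudly on the buggy version and passes on the smooth one.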

Why CodeRabbit Catches Them

Reviewer LLMs are tuned for a different task: finding anomalies in code by careful reading. They do not run the code; they analyze it as a mathematical object. For a piecewise function, that means:

  • Listing every branch boundary.
  • Substituting the boundary value into both the lower and upper branches.
  • Checking whether the outputs match.

This is a mechanical task that humans find tedious and skim past. LLMs do not skim. They run the analysis systematically on every branch boundary and flag any discontinuity.
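The checklist above is mechanical enough to script yourself, independent of any reviewer. A sketch, assuming branches are represented as (upper-bound, function) pairs — a hypothetical representation for illustration, not CodeRabbit's internals:

```python
def find_cliffs(branches, tol: float = 1e-9):
    """For each boundary between adjacent branches, evaluate the branch
    below and the branch above at that exact point and compare.
    `branches` is a sorted list of (upper_bound, fn) pairs; the last
    bound may be float('inf'). Returns (boundary, lower, upper) triples
    where the outputs disagree."""
    cliffs = []
    for (bound, lower_fn), (_, upper_fn) in zip(branches, branches[1:]):
        lo, hi = lower_fn(bound), upper_fn(bound)
        if abs(lo - hi) > tol:
            cliffs.append((bound, lo, hi))
    return cliffs

# Hypothetical branches matching the reported values at p = 0.15:
cliffs = find_cliffs([
    (0.15, lambda p: 16.0 + 16.0 * p),
    (float("inf"), lambda p: 20.0 - 16.0 * p),
])
```

Run against the stand-in branches, this flags the single boundary at 0.15 with its mismatched 18.4 and 17.6 outputs — the same finding CodeRabbit produced by reading the code.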

Inline Diagram — Three-Layer Review

THREE-LAYER REVIEW — AGENT, REVIEWER LLM, HUMAN
each layer catches what the others miss — skip none

When to Trust the Agent

Trust the agent for:

  • Mechanical refactors — rename, extract, move, reformat.
  • Targeted value changes — exactly the scenario PR #28 was dispatched into.
  • Boilerplate generation — scaffolds, test stubs, config files.

Do not trust the agent without review for:

  • Scoring math and piecewise functions — the exact class that produces continuity bugs.
  • Numerical edge cases — sign conventions, off-by-one, floating-point comparisons.
  • Security-sensitive logic — auth, access control, input validation.
  • Concurrency primitives — locks, retries, race conditions.

The dividing line is whether the agent's output has a mechanical verification path. Mechanical changes can be diff-checked. Mathematical changes need a reviewer that understands the math.

The Rule

Every PR involving math, boundaries, or numerical logic goes through the three-layer review: agent writes, CodeRabbit analyzes, human validates. Each layer is mandatory. The calibrator cliff is the cautionary tale — and the exact reason the review loop exists.