Rebalance Without Moving the Threshold

The Hermes calibrator was dead. The threshold was 70. Zero signals had cleared the bar in weeks. There were two ways to fix it.

Option A: Lower the threshold. Drop 70 to 50. Signals start clearing. Everyone goes to bed happy.

Option B: Restructure the weights so the effective ceiling rises above 70 without the calibrator.

PR #28 chose Option B. This lesson is about why — and what the walkthrough looks like.

The Case Against Lowering the Threshold

The threshold in a scoring system is the quality gate. It is the line above which a signal is considered strong enough to act on. Lowering it does not make signals better. It makes weaker signals eligible.

That is exactly the wrong intervention. The signals that were failing to clear 70 were failing because the composite could not represent their quality — not because they were weak. The correct fix is to reshape the composite, not to admit lower quality.

The deeper reason is the second-order effect: once you lower the threshold, you have lost the reference point. Future performance comparisons become impossible ("did the bot get better, or did we just lower the bar again?"). The threshold is your quality anchor. Do not touch it unless the underlying signal quality has actually changed.

The PR #28 Walkthrough

The rebalance changed three of the four component weights and left the threshold alone:

Component	Before	After	Delta	Reasoning
Grok	30	35	+5	Strong historical fire rate (82%), worth a small boost
Perplexity	30	50	+20	Highest signal quality per point, highest fire rate — scale it up
News	15	15	0	Already calibrated, no change needed
Calibrator	25	20	-5	Load-bearing bug fixed by reducing its max — even if it stays dead, the other components can now clear 70
Total	100	120	+20	Headroom bought
Threshold	70	70	0	Unchanged

The key move is the Perplexity boost. Before PR #28, Perplexity's 30-point max meant that even a near-perfect Perplexity score (say 28) combined with a strong Grok score (say 28) would get you to 56 — well below 70 — unless the calibrator also fired. After the rebalance, a near-perfect Perplexity score (47) plus a strong Grok score (32) gets you to 79. Threshold cleared without the calibrator firing at all.
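The arithmetic in the paragraph above can be checked with a few lines of Python. This is a minimal sketch, not the Hermes implementation: the `composite` helper, the dictionary layout, and the assumption that each component is capped at its maximum and summed are all illustrative. Only the weights, the threshold, and the example scores come from the text.

```python
# Per-component maxima taken from the PR #28 table.
OLD_MAX = {"grok": 30, "perplexity": 30, "news": 15, "calibrator": 25}
NEW_MAX = {"grok": 35, "perplexity": 50, "news": 15, "calibrator": 20}
THRESHOLD = 70


def composite(scores, maxima):
    """Sum component scores, capping each at its maximum (assumed behaviour)."""
    return sum(min(scores.get(name, 0), cap) for name, cap in maxima.items())


# Example scores from the text; news and the dead calibrator contribute 0 here.
before = composite({"grok": 28, "perplexity": 28}, OLD_MAX)
after = composite({"grok": 32, "perplexity": 47}, NEW_MAX)

print(before, before > THRESHOLD)  # 56 False
print(after, after > THRESHOLD)    # 79 True
```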

The calibrator's max dropped from 25 to 20. On the surface this looks like a demotion. But the real purpose was to shrink its share of the total so that the other components could realistically cover the threshold without it. A dead load-bearing component is a footgun; a dead additive component is just dead weight. The rebalance moved the calibrator from the first category to the second.
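One way to see the shift from load-bearing to additive is to compute what the other components must deliver when the calibrator contributes nothing. The sketch below uses only the table's weights; the helper names `ceiling_without` and `share` are hypothetical. Under the old weights the remaining components top out at 75, so clearing 70 required about 93% of their combined maximum. Under the new weights they top out at 100, so 70% is enough.

```python
OLD_MAX = {"grok": 30, "perplexity": 30, "news": 15, "calibrator": 25}
NEW_MAX = {"grok": 35, "perplexity": 50, "news": 15, "calibrator": 20}
THRESHOLD = 70


def ceiling_without(maxima, dead):
    """Highest reachable composite if the `dead` component scores 0."""
    return sum(cap for name, cap in maxima.items() if name != dead)


def share(maxima, name):
    """Fraction of the total maximum owned by one component."""
    return maxima[name] / sum(maxima.values())


for label, maxima in (("before", OLD_MAX), ("after", NEW_MAX)):
    ceiling = ceiling_without(maxima, "calibrator")
    # Columns: ceiling, fraction of it needed to reach 70, calibrator's share.
    print(label, ceiling, f"{THRESHOLD / ceiling:.0%}", f"{share(maxima, 'calibrator'):.0%}")
# before 75 93% 25%
# after 100 70% 17%
```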

The Validation

PR #28 was validated via the arithmetic backtest described in Lesson 260:

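-- Count signals from the last 14 days that would clear 70 under the new weights.
SELECT COUNT(*) FROM (
  SELECT
    (grok_score * 35.0 / 30.0)              -- Grok: max 30 -> 35
    + (perplexity_score * 50.0 / 30.0)      -- Perplexity: max 30 -> 50
    + news_score                            -- News: unchanged at 15
    + (calibration_score * 20.0 / 25.0) AS new_composite  -- Calibrator: max 25 -> 20
  FROM hermes_signals
  WHERE created_at > NOW() - INTERVAL '14 days'
) rescaled
WHERE new_composite > 70;  -- threshold unchanged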

Result on historical data: 8 signals would have cleared over the 14-day window, compared to 0 under the old weights. That count was the merge criterion: it had to land in a plausible range. If the rescale had produced too many (say 200) or too few (say 1), the rebalance would have needed another round.
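The same rescale can be mirrored in Python for spot-checking individual rows. This is a sketch under assumptions: the column names follow the query, the function names are hypothetical, and the sample rows are made up for illustration. They are not real `hermes_signals` data and do not reproduce the count of 8.

```python
THRESHOLD = 70


def rescale(row):
    """Re-express a signal scored under the old maxima in the new weights."""
    return (
        row["grok_score"] * 35.0 / 30.0
        + row["perplexity_score"] * 50.0 / 30.0
        + row["news_score"]
        + row["calibration_score"] * 20.0 / 25.0
    )


def count_cleared(rows, threshold=THRESHOLD):
    """Number of rows whose rescaled composite is strictly above the threshold."""
    return sum(1 for row in rows if rescale(row) > threshold)


# Made-up rows, for illustration only.
sample = [
    {"grok_score": 28, "perplexity_score": 28, "news_score": 0, "calibration_score": 0},
    {"grok_score": 15, "perplexity_score": 12, "news_score": 5, "calibration_score": 0},
]

print(count_cleared(sample))  # 1
```

The first sample row scored 56 under the old weights and rescales to roughly 79.3; the second rescales to 42.5 and stays below the bar.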

[Inline diagram: before / after distribution]

The Rule

When signals are not clearing, fix the math, not the threshold. Restructure component weights until the effective ceiling at realistic fire rates sits comfortably above the threshold. Validate with an arithmetic backtest. Ship with a target range for clearance count — not a target number. Preserve the bar.
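A target range can be written down as an explicit merge gate. The bounds below are hypothetical placeholders, chosen only so that the examples in this lesson (1 too few, 200 too many, 8 acceptable) come out as described; the actual range used for PR #28 is not stated here.

```python
# Hypothetical bounds for the backtest clearance count over the window.
LOW, HIGH = 3, 30


def within_target_range(cleared, low=LOW, high=HIGH):
    """True if the backtest clearance count is neither too sparse nor too noisy."""
    return low <= cleared <= high


for count in (0, 1, 8, 200):
    print(count, within_target_range(count))
# 0 False
# 1 False
# 8 True
# 200 False
```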