v0.9.3 region-tail — the corpus lever is exhausted
Verdict: not promoted. v0.9.3 added the region tail to the international German synth (the one variable vs v0.9.2). It fixed what it was built to fix, region tagging, and left the thing we actually needed unmoved: international-order locality. Three retrains now agree that the international ceiling is set by the postcode anchor's structure, not by the data. The follow-up (v0.9.4 dual-injection) is currently training. Tracking: #327.
The 2×2 (DE locality-match, 3,000 real OpenAddresses German addresses, --default-country DE)
| | anchor OFF | anchor ON |
|---|---|---|
| international (US) order | 47.1% | 44.7% |
| native DE | 48.3% | 83.6% |
No-regression: US 97.2%, FR 84.9% (both within ~1pp of v0.7.2). cross_pollution 0.00%.
vs v0.9.2 (both-order, no region tail): native 82.1 → 83.6 (+1.5, still beats Pelias's 78.7), international anchor-ON 44.5 → 44.7 (+0.2, flat), international anchor-OFF 48.4 → 47.1 (−1.3, read as flat).
What moved, and what didn't
The region tail is rendered correctly. A direct look at the synth confirms it: the tail of Davoser Straße, Berlin, Berlin 14199 is tagged Berlin/B-locality Berlin/B-region 14199/B-postcode, with region labeled and aligned. So the "rendering is broken" escape hatch in the pre-registered gate doesn't apply here.
And it worked, for region. International region-match rose from 30.8% (anchor off) to 38.3% (anchor on): the model is now learning to segment a City, Region Postcode tail it never saw before. But international locality-match sat flat at 44.7% with the anchor on, against 83.6% native. The region tail taught region. It did not teach locality.
PIP-containment (gold point inside the resolved polygon) confirms this is a real geographic miss, not a name-match artifact: 57% on international order against 96% native. The model is genuinely placing international-order German localities in the wrong spot, and the anchor isn't rescuing it.
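For readers unfamiliar with the check, here is a minimal sketch of what a PIP-containment rate measures, using a plain ray-casting test. The function names, the square polygon, and the sample points are all hypothetical; the actual evaluation presumably runs against real administrative polygons with a proper geometry library.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (lon, lat)


def point_in_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: True if `point` lies strictly inside `polygon`.

    `polygon` lists vertices in order; the closing edge is implied.
    Points exactly on an edge are not treated specially in this sketch.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def pip_containment(gold_points: List[Point],
                    resolved_polygons: List[Sequence[Point]]) -> float:
    """Fraction of gold points that fall inside their resolved polygon."""
    hits = sum(
        point_in_polygon(p, poly)
        for p, poly in zip(gold_points, resolved_polygons)
    )
    return hits / len(gold_points)


# Hypothetical data: one square "locality" polygon, one hit and one miss.
square = [(13.0, 52.0), (14.0, 52.0), (14.0, 53.0), (13.0, 53.0)]
gold = [(13.4, 52.5), (11.6, 48.1)]
rate = pip_containment(gold, [square, square])
```

A name match can succeed while the resolved polygon is the wrong place (two localities sharing a name), which is why a containment check of this kind is a useful second opinion.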
Why the gate fails, and where it points
The pre-registered gate asked two things. First, that the region tail lift the intrinsic international ceiling (anchor-OFF international ≥ native). It didn't — international anchor-OFF held at ~47% while native held ~48%, no movement from v0.9.2. Second, that the anchor-ON international gap close to 10pp or less. It's 38.9pp (83.6 native − 44.7 international). Both miss.
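The two gate conditions can be written out against the 2×2 numbers. This is a sketch of the arithmetic only: the dictionary layout and the function name are invented here, and it assumes "native" in the first condition means native anchor-OFF, as in the comparison above.

```python
# Locality-match (%) taken from the 2x2 table in this report.
locality = {
    ("international", "off"): 47.1,
    ("international", "on"): 44.7,
    ("native", "off"): 48.3,
    ("native", "on"): 83.6,
}

MAX_GAP_PP = 10.0  # condition 2: anchor-ON gap must close to 10pp or less


def evaluate_gate(m: dict) -> dict:
    """Evaluate both pre-registered conditions on a 2x2 of locality-match."""
    # Condition 1: intrinsic ceiling, international anchor-OFF >= native anchor-OFF.
    ceiling_ok = m[("international", "off")] >= m[("native", "off")]
    # Condition 2: anchor-ON gap between native and international order.
    gap_pp = round(m[("native", "on")] - m[("international", "on")], 1)
    gap_ok = gap_pp <= MAX_GAP_PP
    return {
        "ceiling_ok": ceiling_ok,
        "gap_pp": gap_pp,
        "gap_ok": gap_ok,
        "passed": ceiling_ok and gap_ok,
    }


result = evaluate_gate(locality)
print(result)
```

With the v0.9.3 numbers both conditions come out false and the gap is 38.9pp, matching the reading above.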
The diagnosis the gate was built to reach: the corpus lever is spent. Three results point the same way. Anchor-OFF parsing is already order-agnostic (47-48% in both orders). Forcing the country posterior to DE=1.0 on v0.9.2 left international unchanged at 44.5%. And now the region tail moves region but not locality. The per-token anchor fires at the trailing postcode, which in international order sits on the far side of the locality from where it's needed. The harm is positional, baked into where the anchor lives, not into the data feeding it.
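The positional point can be made concrete by comparing where the postcode sits relative to the locality in each order. The sketch below is illustrative only: the native rendering is an assumed German-order version of the same address (postcode before city), the labels are written by hand, and nothing here models the anchor itself.

```python
import re
from typing import List


def tokenize(address: str) -> List[str]:
    """Split on non-word characters; a stand-in for the real tokenizer."""
    return re.findall(r"\w+", address)


def postcode_offset(labels: List[str]) -> int:
    """Signed distance from the locality token to the postcode token.

    Negative: postcode comes before the locality.
    Positive: postcode comes after the locality.
    """
    return labels.index("postcode") - labels.index("locality")


# Assumed native German order: postcode directly precedes the city.
native_tokens = tokenize("Davoser Straße, 14199 Berlin")
native_labels = ["street", "street", "postcode", "locality"]

# International order as in the synth: City, Region Postcode.
intl_tokens = tokenize("Davoser Straße, Berlin, Berlin 14199")
intl_labels = ["street", "street", "locality", "region", "postcode"]

native_offset = postcode_offset(native_labels)
intl_offset = postcode_offset(intl_labels)
print(native_offset, intl_offset)
```

In the native order the postcode sits immediately before the locality; in international order it comes after it, with the region token in between.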
Decision
v0.9.3 is not promoted (the target metric, international locality, didn't move; nothing else regressed). The next experiment is v0.9.4 dual-injection: pool the per-token anchor and also inject it at position 0, an order-independent global cue the locality can attend back to regardless of word order. One variable vs v0.9.3 (model.inject_first_token=true), same corpus, de-risked by test_anchor_channel.py. DeepSeek signed off on the direction twice (2026-06-06 and again today). Its call cell: international anchor-ON locality must clear 55% (roughly a 10-point lift from 44.7%) to call dual-injection a win; if it still trails by more than 10pp, the always-on anchor design itself is the next thing to reconsider.
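One possible reading of "pool and inject at position 0" is sketched below in plain Python. Mean pooling and additive injection are assumptions made for the sketch, as are the function names; only the `inject_first_token` flag name comes from the config key mentioned above. The real model code may pool or combine differently.

```python
from typing import List

Vector = List[float]


def pool_anchor(per_token_anchor: List[Vector]) -> Vector:
    """Mean-pool per-token anchor vectors into one global vector (assumed pooling)."""
    n = len(per_token_anchor)
    dim = len(per_token_anchor[0])
    return [sum(vec[d] for vec in per_token_anchor) / n for d in range(dim)]


def dual_inject(token_states: List[Vector],
                per_token_anchor: List[Vector],
                inject_first_token: bool = True) -> List[Vector]:
    """Add the anchor per token and, optionally, the pooled anchor at position 0."""
    # Existing path: each token receives its own anchor signal.
    out = [
        [s + a for s, a in zip(state, anchor)]
        for state, anchor in zip(token_states, per_token_anchor)
    ]
    if inject_first_token:
        # New path: a global copy at position 0, independent of word order.
        pooled = pool_anchor(per_token_anchor)
        out[0] = [s + p for s, p in zip(out[0], pooled)]
    return out


# Toy example: three tokens, anchor fires only on the trailing (postcode) token.
states = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
anchor = [[0.0, 0.0], [0.0, 0.0], [3.0, 6.0]]

with_dual = dual_inject(states, anchor, inject_first_token=True)
without_dual = dual_inject(states, anchor, inject_first_token=False)
```

In the toy example the trailing token keeps its anchor either way, while position 0 carries a non-zero signal only when the flag is on. That is the order-independent cue the experiment is meant to test.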