Status

This page is the single source of truth for what works in Mailwoman right now.

Verified as of 2026-06-10 (night-10) — release 4.2.0

Numbers below are sourced from the shipped model-card.json and the parity scorecard (2026-06-10), which is the authoritative per-tag table. For which release added which capability, see Releases & capabilities.

Packages

All publishable workspaces release in lockstep; the current npm version is 4.2.0.

Package	Description
`mailwoman`	CLI + high-level `AddressParser` + runtime pipeline factory
`@mailwoman/core`	Tokenization, classification, solver, decoder, policy registry
`@mailwoman/codex`	Per-address-system postal reference data (USPS suffixes, ZIP types)
`@mailwoman/classifiers`	Rule-based classifiers
`@mailwoman/neural`	ONNX runtime + SentencePiece tokenizer + anchor inference
`@mailwoman/neural-weights-en-us` / `-fr-fr`	Trained model bundles
`@mailwoman/normalize` / `query-shape` / `locale-gate` / `kind-classifier` / `phrase-grouper`	Pipeline stages 1–2.7
`@mailwoman/resolver-wof-sqlite` / `-wasm`	WOF gazetteer resolver (Node SQLite / browser WASM)
`@mailwoman/neural-web`	Browser-side neural classifier (onnxruntime-web)

Model weights (4.2.0)

Property	Value
Lineage	`v1.0.2-consolidation-runB` @ step 20k — the v1.0 parity consolidation (unit + affix + country anchor + multi-locale), corpus `v0.4.12`
Architecture	29.3M params — 6-layer encoder, 384 hidden, 6 heads, 48K-vocab SentencePiece embedding
Label vocabulary	33 BIO labels over 16 component tags (incl. `unit`, `po_box`, `street_prefix`/`_suffix`, `intersection_a/b`)
Training corpus	`corpus-v0.4.12-consolidation` · tokenizer `v0.6.0-a0`
Anchors	Postcode anchor (PCB1) + gazetteer soft anchor (country candidate-tags), both live at inference
Decode	Viterbi with frozen BIO structural mask (CRF at inference; CE-only training)
Size	28.4 MB int8 ONNX (112.9 MB fp32) · opset 17 · max sequence 128

Per-tag health (summary)

The full per-tag table — two lenses, golden-dev and real-OOD — lives in the parity scorecard. The shape of it:

Healthy: postcode (98/99 US/FR), house_number (96/91), venue-US (90), street (79 US).
Fixed by the parity campaign: unit 0 → 92.3 on real-OOD designators (the 4.1.0 headline).
Shipped by the consolidation (v4.2.0): locality +12.8, region +10.7, country 0 → 89.8 (gazetteer soft anchor), affixes now exist (0 → 64.9/48.8 at P≈100), DE 90.9. Stated re-baselines: US street 76.2 (−2.3), unit 90.6 (−1.7) — capacity ceiling measured, see #492.
Open gaps: FR venue/region (#330), intersections (#487).

Capability-arena truth (whole-parse, vs the v0 rules system): rules still win on clean/canonical input, the neural parser wins decisively on noisy/degraded input (+21pp), both are weak on rare edge formats. The arbitration layer (#478) exists to make the pipeline take the better of both per component.

Runtime pipeline — stage status

Stage	Name	Status	Notes
1	Normalize	✅ default	NFC, whitespace collapse, locale-aware case folding
2	Locale gate	🚩 opt-in	Ships (`@mailwoman/locale-gate`); not wired as factory default
2.5	Kind classifier	✅ default	Rule-based; fast-paths bare postcodes and single localities
2.7	Phrase grouper	✅ default	Rule-based span proposals before classification
3	Token classify	✅ default	Neural encoder + rule classifiers, hybrid via policy registry (per-component modes)
4	Sequence correct	✅ default	Viterbi over the frozen BIO mask; postcode/unit label repairs
5	Reconcile	🚩 opt-in	Joint decoding with WOF concordance scoring (`forceJointReconcile` + `parseWithLogits`); argmax is the default
6	Resolve	🚩 opt-in	WOF SQLite gazetteer: admin/postcode-level candidates with coordinates, exact-match tiering, coordinate-first postcode disambiguation, opt-in `hierarchyCompletion`/`cityStateFallback`/`includeAncestors`

Confidence

Every node carries a confidence score; pass a calibrator (isotonic, shipped per-locale in the weights bundle as calibration.json) to make those scores honest probabilities — see Confidence calibration for what the numbers mean.

Browser demo

The live demo at mailwoman.sister.software/demo runs the int8 model (28.4 MB) via @mailwoman/neural-web plus a slim WOF "hot" database via @mailwoman/resolver-wof-wasm, with artifacts served from a versioned releases.json pointer. It calls the classifier's argmax path directly; joint reconcile is not active in the demo.

Model line (June 2026)

The shipped weights are 4.2.0 — the v1.0 consolidation flag-plant (#466, closed). The next model-side step is the architecture escalation (#492, operator-gated); the next pipeline step is arbitration (#478). The ordered work queue to the full geocoder is epic #488.

What is not supported today

Rooftop / address-point geocoding — the resolver returns place-level candidates. The address-point tier (#476) and interpolation (#483) are queued.
Reverse geocoding (#484) and autocomplete-as-a-product (#190).
Multi-locale ensemble (en-US + fr-FR weights in the same runtime).
Korean addresses (no open data path). Japanese addresses are supported via the postcode-route resolver (Geographic Rule Engine).
A supported server API — mailwoman/server ships an experimental Express server (/api/parse, /api/resolve); production service hardening is #485.

Packages​

Model weights (4.2.0)​

Per-tag health (summary)​

Runtime pipeline — stage status​

Confidence​

Browser demo​

Model line (June 2026)​

What is not supported today​