Status
This page is the single source of truth for what works in Mailwoman right now.
Numbers below are sourced from the shipped model-card.json and the
parity scorecard (2026-06-10), which is the
authoritative per-tag table. For which release added which capability, see
Releases & capabilities.
Packagesโ
All publishable workspaces release in lockstep; the current npm version is 4.2.0.
| Package | Description |
|---|---|
mailwoman | CLI + high-level AddressParser + runtime pipeline factory |
@mailwoman/core | Tokenization, classification, solver, decoder, policy registry |
@mailwoman/codex | Per-address-system postal reference data (USPS suffixes, ZIP types) |
@mailwoman/classifiers | Rule-based classifiers |
@mailwoman/neural | ONNX runtime + SentencePiece tokenizer + anchor inference |
@mailwoman/neural-weights-en-us / -fr-fr | Trained model bundles |
@mailwoman/normalize / query-shape / locale-gate / kind-classifier / phrase-grouper | Pipeline stages 1โ2.7 |
@mailwoman/resolver-wof-sqlite / -wasm | WOF gazetteer resolver (Node SQLite / browser WASM) |
@mailwoman/neural-web | Browser-side neural classifier (onnxruntime-web) |
Model weights (4.2.0)โ
| Property | Value |
|---|---|
| Lineage | v1.0.2-consolidation-runB @ step 20k โ the v1.0 parity consolidation (unit + affix + country anchor + multi-locale), corpus v0.4.12 |
| Architecture | 29.3M params โ 6-layer encoder, 384 hidden, 6 heads, 48K-vocab SentencePiece embedding |
| Label vocabulary | 33 BIO labels over 16 component tags (incl. unit, po_box, street_prefix/_suffix, intersection_a/b) |
| Training corpus | corpus-v0.4.12-consolidation ยท tokenizer v0.6.0-a0 |
| Anchors | Postcode anchor (PCB1) + gazetteer soft anchor (country candidate-tags), both live at inference |
| Decode | Viterbi with frozen BIO structural mask (CRF at inference; CE-only training) |
| Size | 28.4 MB int8 ONNX (112.9 MB fp32) ยท opset 17 ยท max sequence 128 |
Per-tag health (summary)โ
The full per-tag table โ two lenses, golden-dev and real-OOD โ lives in the parity scorecard. The shape of it:
- Healthy: postcode (98/99 US/FR), house_number (96/91), venue-US (90), street (79 US).
- Fixed by the parity campaign:
unit0 โ 92.3 on real-OOD designators (the 4.1.0 headline). - Shipped by the consolidation (v4.2.0): locality +12.8, region +10.7, country 0 โ 89.8 (gazetteer soft anchor), affixes now exist (0 โ 64.9/48.8 at Pโ100), DE 90.9. Stated re-baselines: US street 76.2 (โ2.3), unit 90.6 (โ1.7) โ capacity ceiling measured, see #492.
- Open gaps: FR venue/region (#330), intersections (#487).
Capability-arena truth (whole-parse, vs the v0 rules system): rules still win on clean/canonical input, the neural parser wins decisively on noisy/degraded input (+21pp), both are weak on rare edge formats. The arbitration layer (#478) exists to make the pipeline take the better of both per component.
Runtime pipeline โ stage statusโ
| Stage | Name | Status | Notes |
|---|---|---|---|
| 1 | Normalize | โ default | NFC, whitespace collapse, locale-aware case folding |
| 2 | Locale gate | ๐ฉ opt-in | Ships (@mailwoman/locale-gate); not wired as factory default |
| 2.5 | Kind classifier | โ default | Rule-based; fast-paths bare postcodes and single localities |
| 2.7 | Phrase grouper | โ default | Rule-based span proposals before classification |
| 3 | Token classify | โ default | Neural encoder + rule classifiers, hybrid via policy registry (per-component modes) |
| 4 | Sequence correct | โ default | Viterbi over the frozen BIO mask; postcode/unit label repairs |
| 5 | Reconcile | ๐ฉ opt-in | Joint decoding with WOF concordance scoring (forceJointReconcile + parseWithLogits); argmax is the default |
| 6 | Resolve | ๐ฉ opt-in | WOF SQLite gazetteer: admin/postcode-level candidates with coordinates, exact-match tiering, coordinate-first postcode disambiguation, opt-in hierarchyCompletion/cityStateFallback/includeAncestors |
Confidenceโ
Every node carries a confidence score; pass a calibrator (isotonic, shipped per-locale in the
weights bundle as calibration.json) to make those scores honest probabilities โ see
Confidence calibration for what the numbers mean.
Browser demoโ
The live demo at mailwoman.sister.software/demo runs the
int8 model (28.4 MB) via @mailwoman/neural-web plus a slim WOF "hot" database via
@mailwoman/resolver-wof-wasm, with artifacts served from a versioned releases.json pointer.
It calls the classifier's argmax path directly; joint reconcile is not active in the demo.
Model line (June 2026)โ
The shipped weights are 4.2.0 โ the v1.0 consolidation flag-plant (#466, closed). The next model-side step is the architecture escalation (#492, operator-gated); the next pipeline step is arbitration (#478). The ordered work queue to the full geocoder is epic #488.
What is not supported todayโ
- Rooftop / address-point geocoding โ the resolver returns place-level candidates. The address-point tier (#476) and interpolation (#483) are queued.
- Reverse geocoding (#484) and autocomplete-as-a-product (#190).
- Multi-locale ensemble (en-US + fr-FR weights in the same runtime).
- Korean addresses (no open data path). Japanese addresses are supported via the postcode-route resolver (Geographic Rule Engine).
- A supported server API โ
mailwoman/serverships an experimental Express server (/api/parse,/api/resolve); production service hardening is #485.