A walkthrough — NY-NY Steakhouse, Houston, TX
The knowledge ladder explains why the v0.5.0 pipeline grew two new information layers (Stage 2.7 phrase grouper, expanded Stage 5 reconcile). This article walks through what they actually do on one concrete input, end-to-end.
Dual-role places and hierarchy completion
Some places are two things at once. Berlin is a city and a federal state. Milano is a comune and the province named after it. Bristol is a city and a unitary authority. Utrecht is a city and a province. A census of our gazetteer found about 122 of these across nine countries — Italy alone has 70 (its provinces are nearly all named after their capital), then Spain (14), the UK (12), Japan (10), Korea (7), and a handful each for France, Germany, the Netherlands, and China.
FST gazetteer prior
The FST (finite-state transducer) gazetteer prior is the structure that lets the neural classifier benefit from everything Who's On First already knows. The neural model knows grammar; the gazetteer knows places. The FST is the bridge — pre-computed at build time so the classifier can consult it at inference time without paying gazetteer-lookup costs per token.
Gazetteer-first geocoding
The most radical simplification: don't parse at all. Treat the input as an information retrieval problem. Tokenize the string, try every token and n-gram against a gazetteer index, return the best placetype match. The parser is a search engine. The resolver is the search result.
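The lookup loop described above can be sketched in a few lines. This is a minimal illustration, not Mailwoman's implementation: the `GAZETTEER` dictionary, the `lookup` function, and the longest-span-first tie-break are all assumptions made for the example, and a real index would be built from a full gazetteer rather than a three-entry dict.

```python
# Hypothetical toy gazetteer: normalized name -> placetype.
GAZETTEER = {
    "houston": "locality",
    "tx": "region",
    "new york": "locality",
}


def ngrams(tokens, max_n=3):
    """Yield (start, end, text) for every n-gram up to max_n tokens, longest first."""
    for n in range(min(max_n, len(tokens)), 0, -1):
        for start in range(len(tokens) - n + 1):
            yield start, start + n, " ".join(tokens[start:start + n])


def lookup(text, max_n=3):
    """Return non-overlapping gazetteer hits in input order, preferring longer spans."""
    tokens = text.lower().replace(",", " ").split()
    taken = [False] * len(tokens)
    matches = []
    for start, end, gram in ngrams(tokens, max_n):
        placetype = GAZETTEER.get(gram)
        # Skip n-grams that overlap a span already claimed by a longer match.
        if placetype and not any(taken[start:end]):
            matches.append((start, gram, placetype))
            for i in range(start, end):
                taken[i] = True
    return [(gram, placetype) for _, gram, placetype in sorted(matches)]
```

Calling `lookup("New York, Houston TX")` matches "new york" as a single two-token span before the single tokens are tried, which is the behaviour the "every token and n-gram" phrasing implies.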
How it works now
Mailwoman v2 (current as of May 2026) runs addresses through a six-stage pipeline. Each stage adds one kind of knowledge the stages below it cannot easily derive. Rule classifiers and a neural classifier coexist. A policy registry decides whose vote wins for each address component.
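A policy registry of this kind can be pictured as a per-component lookup table. The sketch below is hypothetical: the `POLICY` entries, the `DEFAULT_SOURCE` fallback, and the `winning_vote` function are invented for illustration and do not describe Mailwoman's actual registry.

```python
from typing import Optional

# Hypothetical registry: address component -> classifier whose vote is preferred.
POLICY = {
    "postcode": "rule",
    "street": "neural",
}
DEFAULT_SOURCE = "neural"  # assumed fallback, not taken from Mailwoman


def winning_vote(component: str, votes: dict[str, Optional[str]]) -> Optional[str]:
    """Pick the preferred classifier's vote, falling back to any other vote."""
    preferred = POLICY.get(component, DEFAULT_SOURCE)
    if votes.get(preferred) is not None:
        return votes[preferred]
    # Preferred classifier abstained: accept any other non-empty vote.
    return next((v for v in votes.values() if v is not None), None)
```

With this table, a rule classifier would win a disagreement about a postcode, while the neural classifier would win one about a street name.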
Reconcile — why empty parses dominate without an inclusion bonus
A short article on the one non-obvious knob inside Stage 5's joint decoder. If you skip it, the reconciler converges on the empty parse for every input.
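One way to see why this can happen is to assume the decoder scores a parse by summing the log-probabilities of the spans it includes. That scoring rule is an assumption for illustration, as are the `best_parse` function and the example probabilities. Every log-probability is at most zero, so under that rule the empty parse (score 0) can never lose unless each included span also earns a bonus.

```python
import math
from itertools import combinations


def best_parse(span_probs, inclusion_bonus=0.0):
    """Exhaustively pick the subset of spans with the highest total score.

    Overlap constraints are ignored to keep the sketch short.
    """
    names = list(span_probs)
    best, best_score = (), 0.0  # the empty parse scores exactly 0
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            score = sum(math.log(span_probs[s]) + inclusion_bonus for s in subset)
            if score > best_score:
                best, best_score = subset, score
    return set(best)


# Hypothetical span confidences for one input.
probs = {"locality": 0.9, "region": 0.8, "venue": 0.2}
```

Here `best_parse(probs)` returns the empty set, while `best_parse(probs, inclusion_bonus=1.0)` keeps the two confident spans and still drops the weak one, because `log(0.2) + 1.0` remains negative.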
Resolver and Who's On First
Parsing answers "what kind of thing is each part of this string?". Resolving answers "where is the resulting place?". They are different jobs and Mailwoman keeps them apart on purpose.
Synonymy and homonymy — canonicalize, disambiguate, never drop a span
An address parser has to hold two facts at once that feel contradictory until you
The knowledge ladder
The staged pipeline is a contract: work is decomposed by what each layer knows. Every stage is the rightful home of a particular kind of information; pushing work to the wrong stage produces fragile systems that try to learn things from data that they could have looked up, or look up things that they could have learned. This article catalogues the layers, what each one knows, and the two layers we don't ship yet but should.
What is a concordance?
In Mailwoman's architecture, a concordance is the resolver's answer to the question: "Do these parsed components form a coherent place in the real world?" It is the mechanism that prevents the parser from emitting a parse that is structurally valid but geographically impossible — like "Paris, Texas" labelled as "Paris, Île-de-France" or "NY-NY Steakhouse, Houston TX" with "NY" tagged as a region.
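A coherence check of this kind can be sketched as a filter over candidate places. The `CANDIDATES` records and the `concordant` function below are hypothetical and cover only a locality/region pair; they stand in for whatever hierarchy data the resolver actually consults.

```python
# Hypothetical candidate records: each place lists the region that contains it.
CANDIDATES = {
    "paris": [
        {"name": "Paris", "region": "Île-de-France", "country": "FR"},
        {"name": "Paris", "region": "Texas", "country": "US"},
    ],
}


def concordant(locality, region):
    """Return the candidates for `locality` whose parent region matches `region`."""
    wanted = region.lower()
    return [
        candidate
        for candidate in CANDIDATES.get(locality.lower(), [])
        if candidate["region"].lower() == wanted
    ]
```

For the input "Paris, Texas", only the US record survives the filter; a pairing with no surviving candidate, such as "Paris" with a region that contains no Paris in this table, yields an empty list and would be rejected as incoherent.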
Who's On First — data model and gotchas
Who's On First (WOF) is the best open gazetteer available. It's also one of the strangest datasets you'll encounter as a developer. This article documents the gotchas — the structural quirks that trip up new consumers — and the tooling Mailwoman built to work around them.