Concept deep dives

Fifty articles, one per concept. Each is independent — read them in any order. If you want the problem framed before the solution, read understanding/ first.

New to Mailwoman? Start with How Mailwoman parses an address for the walkthrough — one address, followed through the whole parse — then What Mailwoman is for the product boundary: what the system claims to do, and what it deliberately doesn't. Everything else here is a deep dive behind one of those two pages. The parse handoff continues in How Mailwoman resolves a place; Data, locales, and coverage and Quality and evaluation cover what's supported where and how to trust the numbers.

The clusters

Parsing internals — tokenization, BIO labels, the Viterbi decoder, the FST priors, attention. What happens between a raw string and a labeled tree. Start at How the model reasons.
Training and corpus — how the training data gets built, validated, and turned into a model. Start at Training pipeline.
Resolver and gazetteer — turning labeled spans into coordinates against Who's On First. Start at Resolver and Who's On First.
Record matching — geocode-first entity resolution for messy records with no shared ID. Start at Geocode-first record matching.
Switching guides — coming from Pelias, libpostal, Nominatim, Photon, or OpenCage? These map the differences directly, starting with How Mailwoman compares.

Article shape

Each article takes about 5–10 minutes to read and follows the same shape: a short motivation, an explanation that defines its terms on first use, a diagram where structure helps, a short code or data example, pointers into the source for readers who want to go further, and a "See also" list.
