From Pelias to Mailwoman
A short lineage for readers who already know Pelias.
Pelias in one paragraph
Pelias is an open-source geocoder. It takes an input string like "350 5th Ave, New York, NY 10118" and returns a structured place. Under the hood, Pelias splits this into two jobs:
- Parsing: turn the string into labelled parts (house_number=350, street=5th Ave, locality=New York, region=NY, postcode=10118). For this, Pelias uses libpostal, a C library trained on OpenStreetMap data.
- Resolving: look up the parsed place in a gazetteer (a large place database, often Who's On First) and return coordinates plus IDs.
The two-job split is important. If we get a great parser but a weak gazetteer, results are still bad. If we get a great gazetteer but a weak parser, the gazetteer never sees the right query. Both halves have to be strong.
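The split can be sketched as two functions composed into one. This is a minimal illustration, not the Pelias or Mailwoman API: `ParsedAddress`, `Place`, `parse`, `resolve`, `geocode`, the toy gazetteer, and its coordinates are all hypothetical stand-ins.

```typescript
// Hypothetical shapes for illustration only.
type ParsedAddress = Record<string, string>;

interface Place {
  id: string;
  lat: number;
  lon: number;
}

// Stand-in parser: only recognises a trailing five-digit postcode.
// A real parser (libpostal, Mailwoman) labels every component.
function parse(input: string): ParsedAddress {
  const tokens = input.split(/[\s,]+/).filter((t) => t.length > 0);
  const last = tokens[tokens.length - 1];
  return last !== undefined && /^\d{5}$/.test(last) ? { postcode: last } : {};
}

// Stand-in gazetteer with one entry; the ID and coordinates are illustrative values.
const gazetteer = new Map<string, Place>([
  ["10118", { id: "example:10118", lat: 40.7484, lon: -73.9857 }],
]);

// Resolver: can only find what the parser handed it.
function resolve(parsed: ParsedAddress): Place | undefined {
  const key = parsed.postcode;
  return key === undefined ? undefined : gazetteer.get(key);
}

function geocode(input: string): Place | undefined {
  return resolve(parse(input));
}

const hit = geocode("350 5th Ave, New York, NY 10118");
const miss = geocode("350 5th Ave, New York");
```

In this sketch `miss` is `undefined` even though the gazetteer holds the right place: the parser found nothing to look up, so the resolver never saw a usable query.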
Where Mailwoman came from
Mailwoman v1 was an effort to replace libpostal with a TypeScript-native parser. The motivations were practical:
- libpostal is a 2 GB binary that's painful to ship in browsers, serverless functions, or edge runtimes.
- libpostal ships as a black box: its training data is OpenStreetMap, its model is a CRF (conditional random field; see concepts/crf-decoder.md), but there is no easy way to retrain it on your own data.
- Pelias's TypeScript ecosystem wanted a parser it could iterate on directly.
Mailwoman v1 used rule classifiers: hand-written code that looks at each token (word) of the input and decides what kind of address component it is. A rule like "if it starts with 5 digits, it is a US postcode" is one classifier. There are dozens of them, one per component type. They run in parallel, vote, and a solver picks the best combination. See how-it-used-to-work.md for the full story.
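To make the idea concrete, here is a minimal sketch of two rule classifiers and a vote. The types and the solver are hypothetical and much simpler than the v1 code: the postcode rule is tightened to exactly five digits, and the "solver" just keeps the highest-confidence vote per token instead of searching for the best combination.

```typescript
// Hypothetical shapes for illustration; not Mailwoman's actual classifier API.
type Label = "house_number" | "postcode";

interface Vote {
  tokenIndex: number;
  label: Label;
  confidence: number;
}

type Classifier = (token: string, index: number) => Vote | null;

// Simplified version of the rule quoted above: five digits looks like a US postcode.
const usPostcode: Classifier = (token, index) =>
  /^\d{5}$/.test(token) ? { tokenIndex: index, label: "postcode", confidence: 0.9 } : null;

// A competing rule: any short run of digits could be a house number.
const houseNumber: Classifier = (token, index) =>
  /^\d{1,5}$/.test(token) ? { tokenIndex: index, label: "house_number", confidence: 0.6 } : null;

const classifiers: Classifier[] = [usPostcode, houseNumber];

// Deliberately naive solver: keep the highest-confidence vote for each token.
function solve(tokens: string[]): Map<number, Vote> {
  const best = new Map<number, Vote>();
  tokens.forEach((token, index) => {
    for (const classify of classifiers) {
      const vote = classify(token, index);
      if (vote === null) continue;
      const current = best.get(index);
      if (current === undefined || vote.confidence > current.confidence) {
        best.set(index, vote);
      }
    }
  });
  return best;
}

const result = solve("350 5th Ave New York NY 10118".split(" "));
```

Here "350" receives only the house-number vote, while "10118" receives both votes and the postcode rule wins on confidence. Tokens such as "5th" or "York" get no vote at all, which is the long-tail problem in miniature.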
This worked, but it ran into the same limit every rule-based parser hits: the long tail. Real-world addresses have shapes the rules do not know about. "Saint Petersburg, FL" is two words but one city. "Mt Tabor Park" is a venue, not a street. Rules can describe these cases, but writing them all out is a never-ending project.
What changed in 2026
Mailwoman v2 (what you are reading docs for) keeps the rule classifiers but adds a neural classifier alongside them. Both run; both produce candidate labels; a per-component policy decides which one's vote counts more for each address component type. The migration is gradual on purpose: rules stay until the neural classifier's metrics prove it does better.
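One way to picture a per-component policy is as a table of weights, one row per component type. The sketch below is an assumption about the general shape, not Mailwoman's implementation: the `Candidate` type, the `policy` table, its weights, and the `pick` function are all invented for illustration.

```typescript
// Hypothetical shapes and weights; the real policy may look quite different.
type Component = "postcode" | "locality" | "street";
type Source = "rules" | "neural";

interface Candidate {
  component: Component;
  value: string;
  confidence: number;
  source: Source;
}

// How much each source's vote counts, per component type.
const policy: Record<Component, Record<Source, number>> = {
  postcode: { rules: 1.0, neural: 0.5 },
  locality: { rules: 0.5, neural: 1.0 },
  street: { rules: 0.7, neural: 1.0 },
};

// Keep, for each component, the candidate with the highest weighted confidence.
function pick(candidates: Candidate[]): Map<Component, Candidate> {
  const winners = new Map<Component, Candidate>();
  const score = (c: Candidate): number => c.confidence * policy[c.component][c.source];
  for (const candidate of candidates) {
    const current = winners.get(candidate.component);
    if (current === undefined || score(candidate) > score(current)) {
      winners.set(candidate.component, candidate);
    }
  }
  return winners;
}

const chosen = pick([
  { component: "locality", value: "Saint", confidence: 0.8, source: "rules" },
  { component: "locality", value: "Saint Petersburg", confidence: 0.7, source: "neural" },
  { component: "postcode", value: "33701", confidence: 0.9, source: "rules" },
  { component: "postcode", value: "33701", confidence: 0.95, source: "neural" },
]);
```

With these made-up weights the neural candidate wins for locality and the rule candidate wins for postcode. Shifting a component from rules to neural is then a one-line change to the table, which suits a gradual migration.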
The neural classifier is a small transformer model (about 9 million parameters; see concepts/neural-classification.md) trained on a 677-million-row corpus built from many open data sources. It ships in two pieces:
- @mailwoman/neural: the runtime that loads the model and runs inference (works in Node.js and the browser).
- @mailwoman/neural-weights-en-us and @mailwoman/neural-weights-fr-fr: the model files (one per locale).
The Pelias side of the equation (the gazetteer plus resolver) is also evolving. Mailwoman now ships its own resolver against Who's On First as a SQLite database, both server-side and in the browser via WebAssembly. See concepts/resolver-and-wof.md.
What stayed the same
- The two-job split. Parse, then resolve. Same as Pelias. Same as every modern geocoder.
- The output shape. Mailwoman emits parsed components with confidence scores and offsets, the same surface a Pelias consumer would expect.
- The CLI ergonomics. mailwoman parse <input> is the entry point.
- Multi-locale design. Mailwoman ships separate weight packages per locale (en-us, fr-fr), and the architecture treats locales as first-class (see plan/reference/ARCHITECTURE.md).
What we deliberately do differently: no span left behind
There is one Pelias behavior we do not carry forward. The Pelias parser's solver is lossy by design: spans that don't fit the winning interpretation are quietly demoted to "accessory" information and punted to the resolver. If your only goal is "resolve to a point", that's a reasonable economy. But it forecloses the geocoder-adjacent jobs (chiefly record linkage: deciding whether two rows in two datasets describe the same entity) because the parse no longer accounts for the whole string.
Mailwoman's parser labels every token (the BIO sequence covers the full input), and the design goal is a lossless, typed decomposition: every character classified or explicitly tagged unknown, each span carrying a canonical value. Two records match when their canonicalized span-sets agree, however much their surface forms differ. That is something a lossy parser structurally cannot offer. The full argument lives in concepts/synonymy-and-homonymy.md.
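A rough sketch of what "lossless" and "canonicalized span-sets agree" could mean in code follows. Everything here is an assumption made for illustration: the `Span` shape, the choice to treat whitespace and commas as separators that need no label, and the hand-written canonical values (real canonicalization is the hard part and is not shown).

```typescript
// Hypothetical span shape; offsets are character positions, end is exclusive.
interface Span {
  start: number;
  end: number;
  label: string; // "unknown" when no component type fits
  canonical: string;
}

// Lossless here means every character that is not a separator falls inside some span.
function isLossless(input: string, spans: Span[]): boolean {
  const covered = new Array<boolean>(input.length).fill(false);
  for (const span of spans) {
    for (let i = span.start; i < span.end; i++) covered[i] = true;
  }
  for (let i = 0; i < input.length; i++) {
    if (!covered[i] && !/[\s,]/.test(input.charAt(i))) return false;
  }
  return true;
}

// Order-independent key built from labels and canonical values only.
function spanKey(spans: Span[]): string {
  return spans.map((s) => `${s.label}=${s.canonical}`).sort().join("|");
}

function sameEntity(a: Span[], b: Span[]): boolean {
  return spanKey(a) === spanKey(b);
}

const inputA = "350 5th Ave";
const spansA: Span[] = [
  { start: 0, end: 3, label: "house_number", canonical: "350" },
  { start: 4, end: 11, label: "street", canonical: "5th avenue" },
];

const inputB = "350 Fifth Avenue";
const spansB: Span[] = [
  { start: 0, end: 3, label: "house_number", canonical: "350" },
  { start: 4, end: 16, label: "street", canonical: "5th avenue" },
];

// A lossy parse of the same string: the house number was dropped.
const lossyA: Span[] = spansA.slice(1);
```

In this sketch the two records match despite different surface text, because comparison happens on canonical values. The lossy parse fails both checks: it no longer covers the input, and its key no longer agrees with the other record.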
Next
- How it used to work: the rule-only era in detail
- How it works now: the hybrid
- What is an address?: the deep dive on the data model