The case for simple geocoders
Not every geocoding problem needs a neural parser. This series steel-mans the defensible compromises that work for most applications: the architectures that ship in days, cover 90% of addresses, and cost nothing to maintain. Each article describes one approach, when it works, what it loses, and where Mailwoman fits (or doesn't).
These are not straw men. They are the architectures most production geocoders actually use, and for many applications they are the right choice. Mailwoman exists for the applications where they are not.
The compromises
| Compromise | What you do | What you lose |
|---|---|---|
| Normalize-to-match | Strip, lowercase, abbreviate, fuzzy-match against a known database | Understanding of what anything means |
| Postcode-only | Parse the postcode, centroid the result | Everything finer than ~1 mile |
| Gazetteer-first | Skip parsing and treat the input as an information-retrieval query: try every token against a placename index | Streets, building numbers, venue names |
| Regex-anchored fields | Extract the 3-4 fields you care about, ignore the rest | The rest |
| Locality-only | Find the city, centroid it | Street-level routing, delivery-point accuracy |
| Human-in-the-loop | Don't parse; suggest, and let the user confirm | Automation, scale |
| Close-enough | Define your precision requirement, pick the cheapest approach, stop | Everything below your requirement |
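
As an illustration, the first row of the table, normalize-to-match, might look something like the minimal sketch below. The abbreviation table, the reference records, and the `geocode` helper are all hypothetical, and Python's standard `difflib` stands in for whatever fuzzy matcher a real system would use.

```python
import difflib
import re

# Hypothetical abbreviation table; a real one would be much larger.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "north": "n"}


def normalize(address: str) -> str:
    """Lowercase, replace punctuation with spaces, and abbreviate common words."""
    cleaned = re.sub(r"[^\w\s]", " ", address.lower())
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in cleaned.split()]
    return " ".join(tokens)


# Hypothetical reference database: normalized address -> record ID.
KNOWN = {
    normalize("123 Main Street, Springfield"): "addr-001",
    normalize("500 North Oak Avenue, Springfield"): "addr-002",
}


def geocode(address: str, cutoff: float = 0.85):
    """Return the record ID of the closest known address, or None if nothing is close."""
    key = normalize(address)
    matches = difflib.get_close_matches(key, list(KNOWN), n=1, cutoff=cutoff)
    return KNOWN[matches[0]] if matches else None
```

Note what the sketch never does: it has no idea which token is a street and which is a city. That is the "understanding of what anything means" the table lists as the cost.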
When simple is the right choice
The simple architectures win when:
- You are geocoding US addresses only.
- You need administrative-level accuracy (city, state, postcode), not street-level.
- Your volume is under 1 million addresses per month.
- You can fall back to a paid API or manual review for failures.
- You have a week, not a year.
- You do not need graceful degradation: a confident wrong answer is acceptable if it's rare enough.
These conditions describe most geocoding use cases. For those applications, the simple architectures are the right choice; Mailwoman exists for the rest.
When to choose Mailwoman
Mailwoman is the right choice when:
- You need street-level or venue-level parsing.
- You serve international users with non-Anglophone address formats.
- Your volume makes fallback costs material.
- You cannot use a third-party API for privacy, regulatory, or cost reasons.
- You need honest confidence: ambiguous inputs should surface their ambiguity, not produce a confident wrong answer.
- You are willing to invest in infrastructure for long-term accuracy gains.
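
To make the "honest confidence" item concrete, here is a minimal sketch of surfacing ambiguity instead of hiding it. It is not Mailwoman's API: the `Candidate` type, the `resolve` helper, the scores, and the `margin` threshold are all hypothetical, chosen only to show the shape of the idea.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    label: str
    score: float  # hypothetical confidence in [0, 1]


def resolve(candidates, margin=0.15):
    """Return (best, rivals); rivals are candidates within `margin` of the best score."""
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    if not ranked:
        return None, []
    best = ranked[0]
    rivals = [c for c in ranked[1:] if best.score - c.score < margin]
    return best, rivals


# Illustrative input: a bare "Springfield" with made-up scores.
best, rivals = resolve([
    Candidate("Springfield, IL", 0.48),
    Candidate("Springfield, MA", 0.45),
    Candidate("Springfield, MO", 0.07),
])
```

A non-empty `rivals` list tells the caller that the top answer is contested, so it can ask the user, fall back, or decline to answer rather than return a confident wrong result.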
See also
- Why a neural parser? (the affirmative case)
- The 90% trap (the economic argument for going past 90%)
- Falsehoods about addresses (the counterexamples that break simple architectures)