The case for simple geocoders

Not every geocoding problem needs a neural parser. This series steel-mans the reasonably defensible compromises that work for most applications: the architectures that ship in days, cover 90% of addresses, and cost nothing to maintain. Each article describes one approach, when it works, what it loses, and where Mailwoman fits (or doesn't).

These are not straw men. They are the architectures most production geocoders actually use, and for many applications they are the right choice. Mailwoman exists for the applications where they are not.

The compromises

| Compromise | What you do | What you lose |
| --- | --- | --- |
| Normalize-to-match | Strip, lowercase, abbreviate, fuzzy-match against a known database | Understanding of what anything means |
| Postcode-only | Parse the postcode, centroid the result | Everything finer than ~1 mile |
| Gazetteer-first | Skip parsing and treat it as information retrieval: try every token against a placename index | Streets, building numbers, venue names |
| Regex-anchored fields | Extract the 3-4 fields you care about, ignore the rest | The rest |
| Locality-only | Find the city, centroid it | Street-level routing, delivery-point accuracy |
| Human-in-the-loop | Don't parse; suggest, and let the user confirm | Automation, scale |
| Close-enough | Define your precision requirement, pick the cheapest approach, stop | Everything below your requirement |
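
To make the first row concrete, here is a minimal sketch of normalize-to-match in Python, using only the standard library. The abbreviation table, the `KNOWN_ADDRESSES` database, the record ids, and the `0.85` cutoff are all hypothetical values chosen for illustration; they are not taken from this series or from Mailwoman.

```python
import difflib
import re
from typing import Optional

# Hypothetical abbreviation table; a real one would be much longer.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "boulevard": "blvd",
    "road": "rd",
    "north": "n",
    "south": "s",
}

# Hypothetical "known database": normalized address -> record id.
KNOWN_ADDRESSES = {
    "123 n main st springfield il": "rec-001",
    "456 oak ave springfield il": "rec-002",
    "789 elm rd shelbyville il": "rec-003",
}


def normalize(raw: str) -> str:
    """Lowercase, replace punctuation with spaces, abbreviate common words."""
    cleaned = re.sub(r"[^\w\s]", " ", raw.lower())
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in cleaned.split()]
    return " ".join(tokens)


def match(raw: str, cutoff: float = 0.85) -> Optional[str]:
    """Return the record id of the closest known address, or None."""
    candidates = difflib.get_close_matches(
        normalize(raw), list(KNOWN_ADDRESSES), n=1, cutoff=cutoff
    )
    return KNOWN_ADDRESSES[candidates[0]] if candidates else None


print(match("123 North Main Street, Springfield, IL"))
print(match("1 Infinite Loop, Cupertino, CA"))
```

Note what the sketch never does: it has no idea which token is a street, a city, or a house number. It compares strings, which is exactly the loss the table names.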

When simple is the right choice

The simple architectures win when:

  • You are geocoding US addresses only.
  • You need administrative-level accuracy (city, state, postcode), not street-level.
  • Your volume is under 1 million addresses per month.
  • You can fall back to a paid API or manual review for failures.
  • You have a week, not a year.
  • You do not need graceful degradation: a confident wrong answer is acceptable if it's rare enough.

These conditions describe most geocoding use cases, and for those use cases the simple architectures are the right choice. Mailwoman exists for the applications where they are not.
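
As one illustration of how several of the conditions above combine (US-only input, administrative-level accuracy, a fallback for failures), here is a rough Python sketch of a postcode-only geocoder. The `ZIP_CENTROIDS` table, its approximate coordinates, and the `fallback` callable are assumptions made for the example; a real deployment would load centroids from a proper dataset and wire the fallback to a paid API or a manual review queue.

```python
import re
from typing import Callable, Optional, Tuple

LatLon = Tuple[float, float]

# Hypothetical centroid table: ZIP code -> approximate (latitude, longitude).
# The values are rough placeholders, not authoritative data.
ZIP_CENTROIDS = {
    "62701": (39.80, -89.65),
    "10001": (40.75, -73.99),
}

# Simplifying assumption: the ZIP or ZIP+4 is the last thing in the string.
ZIP_RE = re.compile(r"\b(\d{5})(?:-\d{4})?\s*$")


def geocode_postcode_only(raw: str) -> Optional[LatLon]:
    """Return the centroid for a trailing US ZIP code, or None."""
    found = ZIP_RE.search(raw.strip())
    if found is None:
        return None
    return ZIP_CENTROIDS.get(found.group(1))


def geocode(raw: str, fallback: Callable[[str], Optional[LatLon]]) -> Optional[LatLon]:
    """Try the cheap path first; hand failures to the fallback."""
    result = geocode_postcode_only(raw)
    return result if result is not None else fallback(raw)


def manual_review(raw: str) -> Optional[LatLon]:
    """Stand-in fallback: a real one might enqueue the address for a human."""
    return None


print(geocode("123 N Main St, Springfield, IL 62701", manual_review))
print(geocode("Somewhere without a postcode", manual_review))
```

Every address in a ZIP code resolves to the same point, and the result carries no confidence signal. Both are acceptable under the conditions listed above, and both stop being acceptable once street-level accuracy matters.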

When to choose Mailwoman

Mailwoman is the right choice when:

  • You need street-level or venue-level parsing.
  • You serve international users with non-Anglophone address formats.
  • Your volume makes fallback costs material.
  • You cannot use a third-party API for privacy, regulatory, or cost reasons.
  • You need honest confidence: ambiguous inputs should surface their ambiguity, not produce a confident wrong answer.
  • You are willing to invest in infrastructure for long-term accuracy gains.

See also