Addresses that break geocoders
"What is an address?" covered the data model. This article goes the other way: a tour of address shapes that consistently break parsers and resolvers, with concrete examples for each. If you are building or evaluating a geocoder, this is the failure-mode catalogue worth keeping next to your test suite.
Falsehoods about postcodes
The falsehoods
How can a building have two addresses?
A single physical building can have multiple valid addresses. For commercial buildings, multifamily housing, corner properties, and any structure that touches more than one administrative system, that is the normal state of affairs. A parser that assumes "one building = one address" will fail silently on a large fraction of real-world queries.
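One way to avoid the "one building = one address" assumption is to treat addresses as aliases that point at a building record, rather than as the record's key. The sketch below is a minimal illustration of that idea; the class name BuildingIndex, its methods, the building id, and the sample addresses are all hypothetical and do not come from any particular geocoder.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


class BuildingIndex:
    """Hypothetical many-to-one index: several addresses may name one building."""

    def __init__(self) -> None:
        self._building_by_address: Dict[str, str] = {}
        self._addresses_by_building: Dict[str, Set[str]] = defaultdict(set)

    @staticmethod
    def _key(address: str) -> str:
        # Deliberately crude normalisation: lowercase and collapse whitespace.
        return " ".join(address.lower().split())

    def add(self, building_id: str, address: str) -> None:
        key = self._key(address)
        self._building_by_address[key] = building_id
        self._addresses_by_building[building_id].add(key)

    def resolve(self, address: str) -> Optional[str]:
        # Any registered alias resolves to the same building id.
        return self._building_by_address.get(self._key(address))

    def aliases(self, building_id: str) -> Set[str]:
        # Return a copy so callers cannot mutate the index.
        return set(self._addresses_by_building.get(building_id, set()))


# Made-up corner property with an entrance on each street.
index = BuildingIndex()
index.add("bldg-1", "100 Main St")
index.add("bldg-1", "1 Oak Ave")
```

With this shape, a query for either street address returns the same building, and the full alias set stays available for display or deduplication.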
How humans break addresses
Users do not type addresses the way gazetteers store them. They type what they know, in the order they think of it, with the spellings their keyboard supports, and they trust autocomplete suggestions they never verified. A parser that only handles well-formed addresses fails on real input.
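The "spellings their keyboard supports" problem can be partly absorbed by comparing addresses on a folded key instead of on the raw text. The following is a rough sketch using only Python's standard unicodedata module; the function name fold_for_matching is hypothetical, and the folding rule (drop combining marks after case folding) is an assumption that is too aggressive for some languages.

```python
import unicodedata


def fold_for_matching(text: str) -> str:
    """Build a comparison key: case-folded, diacritics removed, whitespace collapsed.

    Intended for matching only; keep the original string for display.
    """
    # casefold() handles case and expansions such as the German sharp s -> "ss".
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    # Drop combining marks (accents, umlauts, tildes) left by decomposition.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.split())
```

Under this key, "São Paulo" and "Sao  paulo" compare equal, as do "Müllerstraße" and "Mullerstrasse". It does nothing for transposed fields or wrong autocomplete picks, which need separate handling.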
Night 3 postmortem — v0.7 kickoff (calibration + postcode repair)
Posture entering: v0.6.x HELD. v0.7 plan (calibration + postcode fix)
Postcode-only diagnostic (v0.6.0)
Date: 2026-05-29
Postcode-only geocoding
The fastest geocoder extracts the postcode, looks it up in a postcode-to-coordinate table, and returns the centroid. It doesn't parse the street, the building number, or the locality. It doesn't need to. The postcode is the most machine-readable part of any address, and for many applications, postcode-level accuracy is sufficient.
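A minimal sketch of that approach, assuming US-style five-digit ZIP Codes (optionally with a +4 suffix) and an in-memory lookup table. The table contents, the coordinates (rough illustrative values, not survey data), and the names CENTROIDS, ZIP_RE, and geocode_by_postcode are all assumptions made for the example.

```python
import re
from typing import Optional, Tuple

# Hypothetical postcode-to-centroid table with approximate, illustrative values.
# A real table would be loaded from a postal or census dataset.
CENTROIDS = {
    "90210": (34.10, -118.41),
    "10001": (40.75, -73.99),
}

# Assumes US-style ZIP Codes: five digits, optionally followed by "-" and four digits.
ZIP_RE = re.compile(r"\b(\d{5})(?:-\d{4})?\b")


def geocode_by_postcode(address: str) -> Optional[Tuple[float, float]]:
    """Return the centroid for the postcode found in the address, or None."""
    matches = ZIP_RE.findall(address)
    if not matches:
        return None
    # Take the last match: a five-digit street number can appear earlier in the line.
    return CENTROIDS.get(matches[-1])
```

Taking the last match is a heuristic for addresses written with the postcode at the end. Other countries need their own pattern and their own table.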
Regex-anchored fields
Most applications don't need a full address parse. They need three or four fields: the postcode, the state, the street number, maybe the street name. Extract those with regexes. Ignore everything else. The unparsed tokens are "the rest" — available for display, label printing, and downstream processing, but not part of the geocoding decision.
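As a rough illustration, the sketch below anchors one regex at the end of the string (state plus postcode) and one at the start (street number), and keeps whatever is left as "the rest". The patterns assume US-style addresses with a two-letter state code directly before a five-digit ZIP; the names STREET_NUMBER_RE, STATE_POSTCODE_RE, and extract_fields, and the sample address in the usage note, are hypothetical.

```python
import re
from typing import Dict, Optional

# Anchored at the start: digits with an optional single-letter suffix, e.g. "221B".
STREET_NUMBER_RE = re.compile(r"^\s*(?P<street_number>\d+[A-Za-z]?)\b")

# Anchored at the end: two-letter state code followed by ZIP or ZIP+4.
STATE_POSTCODE_RE = re.compile(
    r"\b(?P<state>[A-Z]{2})\s+(?P<postcode>\d{5}(?:-\d{4})?)\s*$"
)


def extract_fields(address: str) -> Dict[str, Optional[str]]:
    """Pull out the anchored fields; everything else is returned under 'rest'."""
    fields: Dict[str, Optional[str]] = {
        "street_number": None,
        "state": None,
        "postcode": None,
    }
    rest = address

    tail = STATE_POSTCODE_RE.search(rest)
    if tail:
        fields["state"] = tail.group("state")
        fields["postcode"] = tail.group("postcode")
        rest = rest[: tail.start()]

    head = STREET_NUMBER_RE.match(rest)
    if head:
        fields["street_number"] = head.group("street_number")
        rest = rest[head.end():]

    # Unparsed tokens stay available for display and downstream processing.
    fields["rest"] = rest.strip(" ,")
    return fields
```

For a made-up input such as "221B Baker Street, Springfield, IL 62704", this yields street number "221B", state "IL", postcode "62704", and the rest "Baker Street, Springfield". An input with no recognisable anchors comes back with every field empty and the whole string in the rest, which lets the caller decide whether to fall back to a fuller parse.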
The database fallacy
There is a persistent belief in geocoding: "If we just had a database of all addresses, this problem would be trivial." The belief is wrong in three independent ways. Together, they make the database approach not merely incomplete but structurally inadequate.
What is a postcode?
A postcode is a routing instruction, not a geographic area. It tells a postal service how to sort and deliver mail. It does not tell a geocoder where a building is, what municipality it belongs to, or what polygon contains it. Confusing these two things is the most common error in address geocoding — and the source of a surprising fraction of production bugs.
What is a ZIP Code?
The US ZIP Code is the most influential postal code system in the world — not because it is the best, but because US-origin address data dominates geocoding training sets and shapes what parsers expect. Understanding its structure is essential for understanding why US-trained parsers fail on international addresses.