Eval report — step-001800
- entries evaluated: 4535
- full-parse exact match: 0.1074
- mean token confidence: 0.8566
Per-component F1
| tag | precision | recall | f1 | support |
|---|---|---|---|---|
| country | 0.2955 | 0.2653 | 0.2796 | 245 |
| region | 0.4411 | 0.1098 | 0.1759 | 3205 |
| locality | 0.2330 | 0.3089 | 0.2657 | 3357 |
| dependent_locality | 0.0000 | 0.0000 | 0.0000 | 40 |
| postcode | 0.8426 | 0.6846 | 0.7554 | 2980 |
| subregion | 0.0000 | 0.0000 | 0.0000 | 0 |
| cedex | 0.0000 | 0.0000 | 0.0000 | 1 |
| venue | 0.3862 | 0.4024 | 0.3941 | 1101 |
| street | 0.3326 | 0.2217 | 0.2660 | 2928 |
| house_number | 0.7316 | 0.8433 | 0.7835 | 1742 |
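
The f1 column is consistent with the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below rechecks a few rows copied from the table; the helper name `f1_score` and the convention of returning 0 when both inputs are 0 are assumptions for illustration, not taken from the evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero (assumed convention)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported f1) copied from the table above.
rows = {
    "country": (0.2955, 0.2653, 0.2796),
    "region": (0.4411, 0.1098, 0.1759),
    "postcode": (0.8426, 0.6846, 0.7554),
    "house_number": (0.7316, 0.8433, 0.7835),
    "cedex": (0.0000, 0.0000, 0.0000),
}

for tag, (p, r, reported) in rows.items():
    recomputed = f1_score(p, r)
    # Inputs are rounded to 4 decimals, so allow a small tolerance.
    assert abs(recomputed - reported) < 5e-4, tag
    print(f"{tag}: recomputed={recomputed:.4f} reported={reported:.4f}")
```

Rows with zero or near-zero support (subregion, cedex, dependent_locality) report 0.0000 across the board, so their scores say little about the model either way.
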
Calibration (confidence bucket → accuracy)
| bucket | n | accuracy |
|---|---|---|
| 0.0–0.1 | 0 | 0.0000 |
| 0.1–0.2 | 19 | 0.5789 |
| 0.2–0.3 | 316 | 0.3323 |
| 0.3–0.4 | 1213 | 0.3776 |
| 0.4–0.5 | 2754 | 0.3413 |
| 0.5–0.6 | 3647 | 0.3548 |
| 0.6–0.7 | 3659 | 0.3884 |
| 0.7–0.8 | 4610 | 0.4063 |
| 0.8–0.9 | 7716 | 0.4396 |
| 0.9–1.0 | 37314 | 0.5988 |
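
One way to summarize the calibration table in a single number is an expected calibration error (ECE): the count-weighted average gap between each bucket's accuracy and its confidence. The report does not list per-bucket mean confidence, so this sketch uses each bucket's midpoint as a stand-in; that substitution, and the names `buckets` and `expected_calibration_error`, are assumptions for illustration rather than part of the evaluation pipeline.

```python
# (bucket low edge, bucket high edge, n, accuracy) copied from the table above.
buckets = [
    (0.0, 0.1, 0, 0.0000),
    (0.1, 0.2, 19, 0.5789),
    (0.2, 0.3, 316, 0.3323),
    (0.3, 0.4, 1213, 0.3776),
    (0.4, 0.5, 2754, 0.3413),
    (0.5, 0.6, 3647, 0.3548),
    (0.6, 0.7, 3659, 0.3884),
    (0.7, 0.8, 4610, 0.4063),
    (0.8, 0.9, 7716, 0.4396),
    (0.9, 1.0, 37314, 0.5988),
]


def expected_calibration_error(rows):
    """Count-weighted mean of |accuracy - confidence| over buckets.

    Confidence is approximated by the bucket midpoint (an assumption),
    because per-bucket mean confidence is not in the report.
    """
    total = sum(n for _, _, n, _ in rows)
    if total == 0:
        return 0.0
    ece = 0.0
    for low, high, n, accuracy in rows:
        midpoint = (low + high) / 2.0
        ece += (n / total) * abs(accuracy - midpoint)  # empty buckets contribute 0
    return ece


total_tokens = sum(n for _, _, n, _ in buckets)
print(f"tokens: {total_tokens}, midpoint ECE: {expected_calibration_error(buckets):.4f}")
```

Most of the mass sits in the 0.9–1.0 bucket, where accuracy is 0.5988, so that bucket dominates the result and points toward overconfidence.
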