Eval report — step-001800

  • entries evaluated: 4535
  • full-parse exact match: 0.1074
  • mean token confidence: 0.8566

Per-component F1

tag                 precision  recall  f1      support
country             0.2955     0.2653  0.2796  245
region              0.4411     0.1098  0.1759  3205
locality            0.2330     0.3089  0.2657  3357
dependent_locality  0.0000     0.0000  0.0000  40
postcode            0.8426     0.6846  0.7554  2980
subregion           0.0000     0.0000  0.0000  0
cedex               0.0000     0.0000  0.0000  1
venue               0.3862     0.4024  0.3941  1101
street              0.3326     0.2217  0.2660  2928
house_number        0.7316     0.8433  0.7835  1742
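
The f1 column is consistent with the usual harmonic mean of precision and recall. The sketch below recomputes it from the rounded precision and recall values in the table and checks agreement to within rounding. The function name `f1_score` and the convention of returning 0.0 when both inputs are zero are assumptions for illustration, not something the report states.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall.

    Assumption: F1 is defined as 0.0 when precision + recall is zero,
    which matches the all-zero rows in the table.
    """
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported_f1) copied from the per-component table.
rows = {
    "country": (0.2955, 0.2653, 0.2796),
    "region": (0.4411, 0.1098, 0.1759),
    "locality": (0.2330, 0.3089, 0.2657),
    "dependent_locality": (0.0000, 0.0000, 0.0000),
    "postcode": (0.8426, 0.6846, 0.7554),
    "subregion": (0.0000, 0.0000, 0.0000),
    "cedex": (0.0000, 0.0000, 0.0000),
    "venue": (0.3862, 0.4024, 0.3941),
    "street": (0.3326, 0.2217, 0.2660),
    "house_number": (0.7316, 0.8433, 0.7835),
}

# Largest gap between recomputed and reported F1; inputs are rounded to
# four decimals, so a small discrepancy is expected.
max_gap = max(abs(f1_score(p, r) - f1) for p, r, f1 in rows.values())
print(f"largest |recomputed - reported| f1 gap: {max_gap:.5f}")
```

Note that subregion has support 0 and cedex has support 1, so their zero scores carry little or no information about the model.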

Calibration (confidence bucket → accuracy)

bucket   n      accuracy
0.0–0.1  0      0.0000
0.1–0.2  19     0.5789
0.2–0.3  316    0.3323
0.3–0.4  1213   0.3776
0.4–0.5  2754   0.3413
0.5–0.6  3647   0.3548
0.6–0.7  3659   0.3884
0.7–0.8  4610   0.4063
0.8–0.9  7716   0.4396
0.9–1.0  37314  0.5988
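
One way to summarize the calibration table in a single number is an expected calibration error (ECE): the count-weighted average gap between each bucket's confidence and its accuracy. The report does not give the mean confidence per bucket, so the sketch below uses each bucket's midpoint as a stand-in. That is an assumption, and the result is only an approximation. The name `expected_calibration_error` is hypothetical and not part of the evaluation tooling.

```python
# (bucket_low, bucket_high, n, accuracy) copied from the calibration table.
buckets = [
    (0.0, 0.1, 0, 0.0000),
    (0.1, 0.2, 19, 0.5789),
    (0.2, 0.3, 316, 0.3323),
    (0.3, 0.4, 1213, 0.3776),
    (0.4, 0.5, 2754, 0.3413),
    (0.5, 0.6, 3647, 0.3548),
    (0.6, 0.7, 3659, 0.3884),
    (0.7, 0.8, 4610, 0.4063),
    (0.8, 0.9, 7716, 0.4396),
    (0.9, 1.0, 37314, 0.5988),
]


def expected_calibration_error(rows):
    """Count-weighted mean of |accuracy - confidence| over buckets.

    Assumption: the bucket midpoint approximates the bucket's mean
    confidence, because the report only lists bucket boundaries.
    """
    total = sum(n for _, _, n, _ in rows)
    if total == 0:
        return 0.0
    ece = 0.0
    for low, high, n, accuracy in rows:
        midpoint = (low + high) / 2.0
        ece += (n / total) * abs(accuracy - midpoint)
    return ece


total_n = sum(n for _, _, n, _ in buckets)
ece = expected_calibration_error(buckets)
print(f"items across buckets: {total_n}")
print(f"approximate ECE (midpoint assumption): {ece:.4f}")
```

In every bucket from 0.4 upward the accuracy sits below the bucket's lower bound, so the model is overconfident there regardless of the midpoint approximation.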