Eval report — step-002200

  • entries evaluated: 4535
  • full-parse exact match: 0.0818
  • mean token confidence: 0.8063

Per-component F1

tag                 precision  recall  f1      support
country             0.2143     0.2082  0.2112      245
region              0.3430     0.1298  0.1883     3205
locality            0.2478     0.3053  0.2736     3357
dependent_locality  0.0050     0.1000  0.0096       40
postcode            0.8324     0.5916  0.6916     2980
subregion           0.0000     0.0000  0.0000        0
cedex               0.0000     0.0000  0.0000        1
venue               0.3765     0.4015  0.3886     1101
street              0.3559     0.2616  0.3016     2928
house_number        0.7446     0.8335  0.7866     1742
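
The f1 column can be sanity-checked against precision and recall. The sketch below assumes f1 is the usual harmonic mean of the two, which the report does not state; `f1_score` and `ROWS` are illustrative names, not part of the eval tooling. Since the table values are rounded to four decimals, recomputed scores are compared with a small tolerance.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of the f1 column)."""
    if precision + recall == 0:
        return 0.0  # avoids division by zero for tags with no correct predictions
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported f1) copied from a few rows of the table above.
ROWS = {
    "postcode": (0.8324, 0.5916, 0.6916),
    "house_number": (0.7446, 0.8335, 0.7866),
    "dependent_locality": (0.0050, 0.1000, 0.0096),
    "subregion": (0.0000, 0.0000, 0.0000),
}

# Tags whose recomputed f1 differs from the reported value by more than rounding noise.
mismatches = [
    tag
    for tag, (p, r, reported) in ROWS.items()
    if abs(f1_score(p, r) - reported) > 1e-3
]
print("mismatching tags:", mismatches)
```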

Calibration (confidence bucket → accuracy)

bucket       n  accuracy
0.0–0.1      0  0.0000
0.1–0.2     21  0.2381
0.2–0.3    666  0.2913
0.3–0.4   2416  0.3266
0.4–0.5   4308  0.3152
0.5–0.6   4777  0.3383
0.6–0.7   4943  0.3569
0.7–0.8   5534  0.3825
0.8–0.9   8066  0.4363
0.9–1.0  30517  0.6440
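
The calibration table can be collapsed into a single expected calibration error (ECE) figure. The sketch below is an approximation under one stated assumption: the report does not give the mean confidence per bucket, so each bucket's midpoint stands in for it. `BUCKETS` and `expected_calibration_error` are illustrative names, not part of the eval tooling.

```python
# (lower edge, upper edge, n, accuracy) copied from the calibration table.
BUCKETS = [
    (0.0, 0.1, 0, 0.0000),
    (0.1, 0.2, 21, 0.2381),
    (0.2, 0.3, 666, 0.2913),
    (0.3, 0.4, 2416, 0.3266),
    (0.4, 0.5, 4308, 0.3152),
    (0.5, 0.6, 4777, 0.3383),
    (0.6, 0.7, 4943, 0.3569),
    (0.7, 0.8, 5534, 0.3825),
    (0.8, 0.9, 8066, 0.4363),
    (0.9, 1.0, 30517, 0.6440),
]


def expected_calibration_error(buckets):
    """Return (total n, ECE), weighting each bucket's |confidence - accuracy| gap by n."""
    total = sum(n for _, _, n, _ in buckets)
    ece = 0.0
    for low, high, n, accuracy in buckets:
        if n == 0:
            continue  # empty buckets contribute nothing
        # Assumption: the bucket midpoint approximates the bucket's mean confidence.
        midpoint = (low + high) / 2
        ece += (n / total) * abs(midpoint - accuracy)
    return total, ece


total, ece = expected_calibration_error(BUCKETS)
print(f"tokens: {total}, approximate ECE: {ece:.4f}")
```

In every bucket from 0.3–0.4 upward the accuracy sits below the bucket's lower edge, so the gap measured here reflects overconfidence rather than underconfidence.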