Every win probability we publish is calibrated against real ATP & WTA matches. When we give a player a 60% chance, players in that spot should win about 60% of the time. Here's the proof, good or bad.
We sort every graded match into 10 buckets by the win probability we gave the favorite, then check what fraction actually won. A perfectly honest model sits on the diagonal: the dots should hug the dashed line.
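As a rough sketch of the bucketing described above, the Python below groups graded matches into 10 buckets by the favorite's probability and computes the four table columns. The equal-width bucket edges over 50% to 100%, the name `calibration_table`, and the reading of "Miss" as actual minus predicted are all assumptions made for illustration; the site's own pipeline may bucket differently.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    predicted: float  # mean probability given to the favorite in this bucket
    won: float        # fraction of those favorites that actually won
    miss: float       # won - predicted (assumed meaning of "Miss")
    matches: int      # number of graded matches in the bucket


def calibration_table(probs, outcomes, n_buckets=10, lo=0.5, hi=1.0):
    """Group matches by favorite win probability and compare to results.

    probs:    probability given to the favorite, each in [lo, hi]
    outcomes: 1 if the favorite won, 0 otherwise
    """
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")

    width = (hi - lo) / n_buckets
    prob_sum = [0.0] * n_buckets
    wins = [0] * n_buckets
    count = [0] * n_buckets

    for p, won in zip(probs, outcomes):
        if not lo <= p <= hi:
            raise ValueError(f"probability {p} outside [{lo}, {hi}]")
        # p == hi would index one past the end, so clamp to the last bucket.
        idx = min(int((p - lo) / width), n_buckets - 1)
        prob_sum[idx] += p
        wins[idx] += won
        count[idx] += 1

    rows = []
    for i in range(n_buckets):
        if count[i] == 0:
            continue  # skip empty buckets rather than divide by zero
        predicted = prob_sum[i] / count[i]
        actual = wins[i] / count[i]
        rows.append(Bucket(predicted, actual, actual - predicted, count[i]))
    return rows
```

Plotting `won` against `predicted` for each returned row gives the dots; the dashed diagonal is the line where the two are equal.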
| Predicted | Won (actual) | Miss | Matches |
|---|---|---|---|
Brier score is the average squared error between the probability we gave and what happened (lower is better; 0.25 is what calling every match a coin flip would score, 0 is perfect). Straight-up accuracy is how often the side we favored actually won. Calibration is the most important of the three: it means that, across the whole range, our stated probabilities match real-world frequencies.
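For readers who want to check the arithmetic, here is a minimal sketch of the two headline metrics in Python. The function names and the tie rule (a 50% pick counts as favoring the listed side) are assumptions for illustration, not the site's code.

```python
def brier_score(probs, outcomes):
    """Mean squared error between stated probability and result (0 or 1)."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def accuracy(probs, outcomes):
    """Share of matches where the side given at least 50% actually won.

    probs:    probability that the listed player wins
    outcomes: 1 if the listed player won, 0 otherwise
    """
    if not probs or len(probs) != len(outcomes):
        raise ValueError("need equal-length, non-empty inputs")
    # Assumption: exactly 0.5 is treated as favoring the listed player.
    correct = sum((p >= 0.5) == (y == 1) for p, y in zip(probs, outcomes))
    return correct / len(probs)
```

Calling every match 50/50 gives `brier_score([0.5, 0.5], [1, 0]) == 0.25`, which is where the coin-flip benchmark comes from.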
These are measured on the same surface-aware power-rating + serve/return blend we use to price every match on the site — not a curated subset.