A Deep Dive into Predictive Performance

The 2024 Tour de France provided a great testing ground for our advanced predictive models, which are rooted in the dynamics of the peloton and tailored to each stage type. This post-Tour analysis highlights our methodology and key findings, showing how our predictions compared with actual race outcomes. By examining performance across the different stage types, we offer insights into the factors driving elite cycling performance and the potential applications of our models for future race strategies.

Crafting the Models

Our approach to modeling the Tour is grounded in the dynamics of the peloton. The peloton's collective power and aerodynamic efficiency play a crucial role in our predictions. By analyzing typical pull power and understanding the impact of drafting, we create a baseline that reflects the average conditions of the race.
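To make the idea of a power-and-drafting baseline concrete, here is a minimal sketch using the standard steady-state cycling power balance (aerodynamic drag, rolling resistance, and gravity). It is not the BBS model itself: the function name, the rider parameters, and the choice to represent drafting as a 35% reduction in effective CdA are all assumptions made for illustration.

```python
def steady_speed(power_w, cda_m2, mass_kg=78.0, crr=0.004, grade=0.0,
                 rho=1.2, drivetrain_eff=0.97):
    """Return the steady-state speed (m/s) a rider holds at a given power.

    Assumes a flat or uphill road (grade >= 0) so that required power
    rises monotonically with speed and bisection is valid.
    All default values are hypothetical, not BBS parameters.
    """
    g = 9.81
    wheel_power = power_w * drivetrain_eff

    def required_power(v):
        aero = 0.5 * rho * cda_m2 * v ** 3      # aerodynamic drag
        rolling = crr * mass_kg * g * v         # rolling resistance
        gravity = mass_kg * g * grade * v       # climbing
        return aero + rolling + gravity

    lo, hi = 0.0, 40.0                          # search window in m/s
    for _ in range(60):
        mid = (lo + hi) / 2
        if required_power(mid) < wheel_power:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical pull power of 300 W, alone versus sheltered in the group.
solo_speed = steady_speed(300.0, cda_m2=0.32)
drafting_speed = steady_speed(300.0, cda_m2=0.32 * 0.65)
print(f"solo: {solo_speed * 3.6:.1f} km/h, drafting: {drafting_speed * 3.6:.1f} km/h")
```

Under these assumed numbers the sheltered rider travels noticeably faster at the same power, which is why a peloton-based baseline differs from a solo one.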

For the mountainous stages, we elevate our model to mirror the extraordinary efforts of the top climbers, capturing the essence of what it takes to lead in these segments.

In time trials, precision is key. Our models are designed to predict what it takes to reach the podium, focusing on optimal pacing strategies and power outputs that align with elite performance standards. While we refrain from using specific rider values due to our partnerships, our generalized approach still delivers highly competitive predictions.

Key Findings from the 2024 Tour de France

Our models were put to the test across various stages of the Tour de France, revealing intriguing insights into our predictive capabilities. Here's a breakdown of how our models performed across different stage types and what we learned along the way.

Performance by Stage Type

  • Mountain Stages: Our models excelled in predicting mountain stages, often coming closer to the winning times than the median times. For instance, in Stage 4, a challenging mountain stage, our model was only 2:34 behind the winning time, demonstrating its ability to capture the intense efforts required in these segments.
  • Hilly Stages: The results for hilly stages were mixed. In Stage 17 (classified as a mountain stage in the table below), our model was 13:21 behind the winning time but only 4:09 ahead of the median time, indicating that our predictions were more aligned with overall field performance on this undulating terrain. However, for Stage 8, the model was only 16 seconds off both the winning and median times, and for Stage 18 the model was within 13 seconds of the winning time of Victor Campenaerts, a documented BBS enthusiast!
  • Flat Stages: Our flat stage predictions were consistently faster than both the winning and median times, by roughly 5 to 12 minutes. This is likely due to teams letting off the gas more in flat stages and setting things up for the sprinters, as the difference between winning and median times was negligible in these stages, with no breakaways in flat stages this year.
  • Time Trials: Our models were originally designed for time trials. In Stage 7, a critical time trial, our model was merely 0:28 behind the winning time, and for Stage 21, the model was 1:01 behind the winning time. That said, the power strategy provided by the model would have resulted in a podium finish in both races.

Modeled Time Compared to Winning and Median Times

Differences are actual time minus modeled time, so a negative value means the model was slower than the actual time.

Stage  Type        Date        Modeled Time  Winning Time  Median Time  Diff to Winning  Diff to Median
1      Hilly       06/29/2024  5:18:19       5:07:22       5:15:01      -0:10:57         -0:03:18
2      Hilly       06/30/2024  4:58:29       4:43:42       4:52:35      -0:14:47         -0:05:54
3      Flat        07/01/2024  5:18:46       5:26:48       5:27:35      0:08:02          0:08:49
4      Mountain    07/02/2024  3:49:12       3:46:38       4:07:34      -0:02:34         0:18:22
5      Flat        07/03/2024  4:00:36       4:08:46       4:08:47      0:08:10          0:08:11
6      Flat        07/04/2024  3:26:49       3:31:55       3:35:38      0:05:06          0:08:49
7      Time Trial  07/05/2024  0:29:20       0:28:52       0:32:43      -0:00:28         0:03:51
8      Hilly       07/06/2024  4:04:34       4:04:50       4:04:50      0:00:16          0:00:16
9      Hilly       07/07/2024  4:33:15       4:19:43       4:31:25      -0:13:32         -0:01:50
10     Flat        07/09/2024  4:07:50       4:20:06       4:20:06      0:12:16          0:12:16
11     Mountain    07/10/2024  5:15:01       4:58:00       5:27:23      -0:17:01         0:29:23
12     Hilly       07/11/2024  4:32:08       4:17:15       4:18:21      -0:14:53         -0:13:47
13     Hilly       07/12/2024  3:32:06       3:23:09       3:25:38      -0:08:57         -0:06:28
14     Mountain    07/13/2024  4:11:46       4:01:51       4:32:43      -0:09:55         0:30:52
15     Mountain    07/14/2024  5:15:01       5:13:55       5:54:55      -0:01:06         0:41:00
16     Flat        07/16/2024  4:04:33       4:11:27       4:11:27      0:06:54          0:06:54
17     Mountain    07/17/2024  4:19:34       4:06:13       4:23:43      -0:13:21         0:04:09
18     Hilly       07/18/2024  4:10:07       4:10:20       4:24:00      0:00:13          0:13:40
19     Mountain    07/19/2024  4:01:17       4:04:03       4:44:37      0:02:46          0:43:20
20     Mountain    07/20/2024  3:58:56       4:04:22       4:33:16      0:05:26          0:34:20
21     Time Trial  07/21/2024  3:58:56       4:04:22       4:33:16      0:05:26          0:43:20
Total                          79:50:39      77:51:52      82:24:57     -1:58:47         2:34:18
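The "Diff to Winning" column can be reproduced from the modeled and winning times in the table. The sketch below shows one way to do it; the helper names are hypothetical, and the sign convention (actual minus modeled) is inferred from the table values rather than stated in the post.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_diff(seconds):
    """Format a signed number of seconds as '[-]H:MM:SS'."""
    sign = "-" if seconds < 0 else ""
    seconds = abs(seconds)
    return f"{sign}{seconds // 3600}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


def diff_to_actual(modeled, actual):
    """Actual minus modeled: negative means the model was slower."""
    return format_diff(to_seconds(actual) - to_seconds(modeled))


# Stage 4 (mountain) and Stage 3 (flat), using the times from the table.
print(diff_to_actual("3:49:12", "3:46:38"))  # -0:02:34
print(diff_to_actual("5:18:46", "5:26:48"))  # 0:08:02
```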

Overall, Pogacar was 2.54% faster than our modeled time, and the median total time was 3.15% slower than our modeled time.
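As a worked check, the 2.54% figure matches the gap between the two totals in the table expressed as a fraction of the winning total. The choice of denominator is an inference from those totals, not something the post states.

```latex
\[
\frac{T_{\text{model}} - T_{\text{win}}}{T_{\text{win}}}
= \frac{287{,}439\ \text{s} - 280{,}312\ \text{s}}{280{,}312\ \text{s}}
= \frac{7{,}127}{280{,}312}
\approx 0.0254 = 2.54\%
\]
```

Here 79:50:39 is 287,439 seconds and 77:51:52 is 280,312 seconds.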

The analysis of our model's performance across the stages revealed several key trends:

  • Closer to Winning Times: On average, our predictions were more closely aligned with the winning times, particularly in mountain and flat stages. This indicates that our models are tuned to predict the efforts of leading competitors.
  • Median Time Deviations: The largest deviations from the median times were observed in mountain stages, suggesting that our model emphasizes top-tier performances rather than the average field. For example, in Stage 14, a grueling mountain stage, our model was 9:55 behind the winning time but significantly ahead of the median time by 30:52.
  • Stage-Specific Insights: In certain stages our model's predictions were exceptionally accurate. In Stage 8, a hilly stage, our model was within 16 seconds of both the winning and median times, showcasing its precision. In Stage 18, the model was also exceptionally close, only 13 seconds off the winning time; however, the median time was much slower for this stage.

Median Predictions and Tour Progression

Our analysis also uncovered how median predictions varied throughout the Tour and across different stage types:

  • Tour Progression: As the Tour progressed, the gap between our predictions and the median times widened slightly, reflecting the increasing unpredictability and fatigue impacting the peloton.
  • Stage Variations: The median predictions showed significant variation across stage types. For example, in Stage 20, the final mountain stage, our model was 34:20 ahead of the median time, while in Stage 8 it was within 16 seconds of the median time. This variability underscores the complex dynamics of the Tour, influenced by factors such as stage profile, weather conditions, and rider strategies.

BBS Modeled Time vs Winning Time

[Figure: modeled time vs. actual time]

Root Mean Squared Error (RMSE)

Value: 9.23 minutes

Interpretation:

  • RMSE is the square root of the mean squared error between the predicted times and the actual winning times.
  • An RMSE of 554.10 seconds means that a typical prediction misses the actual winning time by about 9.23 minutes, with larger misses weighted more heavily.
  • Because the errors are squared before averaging, RMSE is more sensitive to large errors than MAE.
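A minimal sketch of the RMSE calculation, using only four rows of the table (Stages 3, 4, 7, and 8) converted to seconds. Because it covers a subset, its result is not the 554.10 seconds reported for the full Tour; the function name is our own.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Stages 3, 4, 7, 8 from the table, in seconds (illustrative subset only).
modeled = [19126, 13752, 1760, 14674]
winning = [19608, 13598, 1732, 14690]

subset_rmse = rmse(modeled, winning)
print(f"RMSE on subset: {subset_rmse:.1f} s")
```

On this subset the single 8:02 miss on Stage 3 dominates the result, which illustrates how squaring amplifies the largest errors.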

Mean Absolute Error (MAE)

Value: 7.51 minutes

Interpretation:

  • MAE measures the average magnitude of the errors between the predicted times and the actual winning times without considering their direction (positive or negative).
  • An MAE of 450.52 seconds means that, on average, the model's predictions are off by about 7.51 minutes from the actual winning times.
  • MAE provides a straightforward interpretation of the average prediction error.
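The same calculation for MAE, again as a sketch over four table rows (Stages 3, 4, 7, and 8) rather than the full data set behind the 450.52-second figure. The function name is hypothetical.

```python
def mae(predicted, actual):
    """Mean absolute error between two equal-length sequences."""
    absolute_errors = [abs(p - a) for p, a in zip(predicted, actual)]
    return sum(absolute_errors) / len(absolute_errors)


# Stages 3, 4, 7, 8 from the table, in seconds (illustrative subset only).
modeled = [19126, 13752, 1760, 14674]
winning = [19608, 13598, 1732, 14690]

subset_mae = mae(modeled, winning)
print(f"MAE on subset: {subset_mae:.1f} s ({subset_mae / 60:.2f} min)")
```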

Mean Absolute Percentage Error (MAPE)

Value: 3.04%

Interpretation:

  • MAPE expresses the error as a percentage of the actual winning times, providing a normalized measure of accuracy.
  • A MAPE of 3.04% means that, on average, the model's predictions deviate from the actual winning times by 3.04% of the winning time.
  • Lower MAPE values indicate higher accuracy. A MAPE of 3.04% suggests the model performs quite well in predicting the winning times relative to their magnitude.
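MAPE divides each absolute error by the actual winning time before averaging, so a 28-second miss in a 29-minute time trial counts for more than a 28-second miss in a five-hour stage. The sketch below uses the same four-row subset of the table (Stages 3, 4, 7, and 8), so it will not reproduce the 3.04% reported for all stages.

```python
def mape(predicted, actual):
    """Mean absolute percentage error, relative to the actual values."""
    ratios = [abs(p - a) / a for p, a in zip(predicted, actual)]
    return 100.0 * sum(ratios) / len(ratios)


# Stages 3, 4, 7, 8 from the table, in seconds (illustrative subset only).
modeled = [19126, 13752, 1760, 14674]
winning = [19608, 13598, 1732, 14690]

subset_mape = mape(modeled, winning)
print(f"MAPE on subset: {subset_mape:.2f}%")
```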

R-squared (R²)

Value: 0.98

Interpretation:

  • R² measures the proportion of the variance in the actual winning times that is predictable from the modeled times.
  • An R² value of 0.98 indicates that 98% of the variance in the winning times can be explained by the model’s predictions.
  • Higher R² values (close to 1) suggest a strong correlation between the predicted and actual times, indicating the model fits the data very well.
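One common definition of R² is one minus the ratio of the residual sum of squares to the total sum of squares of the actual values. The post does not say which definition was used, so treat the sketch below as an assumption; it again uses four table rows (Stages 3, 4, 7, and 8) rather than the full data set.

```python
def r_squared(predicted, actual):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Stages 3, 4, 7, 8 from the table, in seconds (illustrative subset only).
modeled = [19126, 13752, 1760, 14674]
winning = [19608, 13598, 1732, 14690]

subset_r2 = r_squared(modeled, winning)
print(f"R^2 on subset: {subset_r2:.4f}")
```

Because stage durations range from under half an hour to more than five hours, the total sum of squares is large, so prediction errors of a few minutes leave R² close to 1.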

Summary

  • Accuracy: The model shows high accuracy in predicting winning times, as indicated by a low MAPE of 3.04%.
  • Error Magnitude: The errors, while present, are reasonably small with RMSE and MAE values of about 9.23 minutes and 7.51 minutes, respectively.
  • Model Fit: The high R² value (0.98) demonstrates that the model explains almost all the variability in the actual winning times, indicating a very good fit.

Insights

  • The model performs exceptionally well overall, with strong predictive accuracy and a good fit to the actual data.
  • The small average errors (MAE and RMSE) suggest that while there are discrepancies, they are generally not large.
  • The high R² and low MAPE confirm that the model is reliable and provides accurate predictions for the winning times.

Going Deeper To Win on Race Day

Our BBS Tour de France modeling showcases the power of advanced predictive analytics in cycling. By focusing on typical pull power, the aerodynamics of group riding, and elite efforts on key climbs, we deliver predictions that are, on average, closer to the winning times than to the median times. While the median predictions varied, our models demonstrated robust performance across all stage types, offering valuable insights into the dynamics of the world's most prestigious cycling race. These types of predictions are, of course, a 10,000-foot starting point for deeper Tour-level race analytics. By using this type of modeling and our Time Analysis tools, coaches and athletes can determine season goals, refine power targets, and even plan specific course attack strategies!

Below is a detailed analysis of the last 4.5 km of Stage 14, where Pogacar accelerates away from Vingegaard to cement his lead on the way to his third Tour de France victory. As the Time Analysis tool shows, because Pogacar raises his power and accelerates at this point on the course, his rivals have minimal time to react: if they try to pull back time a kilometer later by pushing higher power, the course there yields far less time gained per distance. Pogacar, in contrast, can conserve a bit of energy from 149 km to 150 km before making another hard effort, eliminating any possibility of being reeled in and ensuring the stage win.

[Figure: 2024 TDF Analysis]

In Stage 14, Pogacar attacked right around 147 km into the race. According to our Time Analysis tool and its Time Delta/Distance chart, this is right where he gains the most time per distance.
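To show what a time-gained-per-distance figure could look like in practice, here is a small sketch that compares two riders' cumulative split times over consecutive segments. The function name and the split times are entirely hypothetical and are not taken from the Time Analysis tool or from the race; the definition of "time delta per distance" used here is also an assumption.

```python
def time_gain_per_km(distances_km, leader_times_s, chaser_times_s):
    """Seconds gained by the leader per kilometre on each segment.

    Inputs are cumulative values at the same distance markers.
    A positive result means the leader gained time on that segment.
    """
    gains = []
    for i in range(1, len(distances_km)):
        segment_km = distances_km[i] - distances_km[i - 1]
        leader_split = leader_times_s[i] - leader_times_s[i - 1]
        chaser_split = chaser_times_s[i] - chaser_times_s[i - 1]
        gains.append((chaser_split - leader_split) / segment_km)
    return gains


# Hypothetical splits (seconds elapsed since the 147 km marker).
markers_km = [147, 148, 149, 150]
leader_s = [0, 170, 350, 525]
chaser_s = [0, 182, 356, 533]

gains = time_gain_per_km(markers_km, leader_s, chaser_s)
best_segment = gains.index(max(gains))
print(gains, "largest gain starts at km", markers_km[best_segment])
```

In this made-up example the largest gain per kilometre comes on the first segment, which is the kind of pattern the chart is used to spot.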