Which Stroke Is Truly the Decider in the Individual Medley? (Stats Included)


The individual medley events can be some of the most exciting races. With every stroke change, there is the potential for the shape of the race to completely change. All of a swimmer’s strengths and weaknesses are on display, and this opens up many different possibilities for how a race can go.

We often think that the breaststroke leg is where the biggest lead changes happen. But this is not always the case. For example, at the 2016 U.S. Olympic Trials, Jay Litherland made his big move on Ryan Lochte on the freestyle leg. And over the years, there have been plenty of IMers who had breaststroke as their “weaker” stroke (most notably, Michael Phelps). So, which of the strokes really makes the biggest difference in the IM? Let’s find out if there is a clear answer.

Methodology

Splits were taken from the top 16 finishers in the 200 and 400 IM at the 2017-19 NCAA championships (both men and women), meaning that all the data is from short course yards. Prelims and finals swims were included, in order to balance swimmers who fell off in finals or went easier in prelims. This approach provided 96 data points in each event.

To determine which splits varied the most, we will look at the spread and the standard deviation of the splits for each stroke. Together, these show where swimmers differ from one another the most, on average.

Spread

I first looked at the range, the gap between the fastest and slowest split, in each data set. Results (in seconds) are in the table below.

Stroke    W 200 IM   M 200 IM   W 400 IM   M 400 IM
Fly       2.63       2.01       4.33       3.81
Back      3.48       3.10       5.96       4.87
Breast    4.28       3.75       7.91       6.78
Free      2.46       2.23       6.25       5.36

[Photo: Brooke Forde. Photo Courtesy: Peter H. Bick]

Already, we see that breaststroke has the largest range in all four events. However, the data gets more interesting when looking at the second-largest range. In the 200 IMs, backstroke had the second-largest range, but in the 400 IMs, it was freestyle. This was ultimately because of a couple of slower times in the 400 IMs, indicating that swimmers were fading. For example, in the women’s 400 IM, there was one freestyle split of 1:00.42, but nobody else was slower than 59 seconds.
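
To illustrate why a single fading swimmer can inflate the range, here is a minimal Python sketch. Only the 1:00.42 split comes from the data described above; the other splits are made up for illustration and are not the actual NCAA times.

```python
# Freestyle splits in seconds. Only 60.42 (1:00.42) comes from the article;
# the other values are hypothetical, chosen to sit under 59 seconds.
typical_splits = [57.20, 57.90, 58.10, 58.40, 58.80]
with_fader = typical_splits + [60.42]


def split_range(splits):
    """Gap between the slowest and fastest split, in seconds."""
    return max(splits) - min(splits)


# Round to hundredths, the precision swim times are recorded in.
range_without = round(split_range(typical_splits), 2)
range_with = round(split_range(with_fader), 2)
print(range_without, range_with)
```

In this made-up sample, one slow swim roughly doubles the range even though the rest of the field is unchanged, which is why the range alone can overstate how much a stroke typically varies.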

For a better measure of the typical spread, I looked at the difference between the 25th and 75th percentile times (the interquartile range). Results (in seconds) are in the table below.

Stroke    W 200 IM   M 200 IM   W 400 IM   M 400 IM
Fly       0.59       0.43       1.15       0.95
Back      1.26       0.92       1.66       1.59
Breast    1.47       1.04       2.36       1.75
Free      0.62       0.62       1.21       1.43

The results here paint a clearer picture. While breaststroke still has the largest spread, backstroke is the second highest in all four events. Freestyle, meanwhile, is now much closer to having the smallest spread. This indicates that freestyle splits are mostly packed into a smaller range, but have some splits that are much faster or slower than the average.
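
The gap between the 25th and 75th percentile times can be sketched in a few lines of Python. The splits below are hypothetical, and the "inclusive" percentile convention is an assumption, since the article does not say which convention was used.

```python
import statistics

# Hypothetical butterfly splits in seconds (not the NCAA data); 27.5 is a
# deliberately slow outlier.
fly_splits = [24.0, 24.2, 24.5, 24.6, 24.8, 25.0, 25.3, 25.9, 27.5]

# Quartile cut points. method="inclusive" is one of several percentile
# conventions; the article does not state which one was used.
q1, median, q3 = statistics.quantiles(fly_splits, n=4, method="inclusive")

iqr = q3 - q1                                   # 25th-to-75th percentile gap
full_range = max(fly_splits) - min(fly_splits)  # fastest-to-slowest gap
print(round(iqr, 2), round(full_range, 2))
```

Because the percentile gap only looks at the middle half of the field, the single slow split widens the full range a great deal while leaving the percentile gap almost untouched.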

Standard Deviation

For a fuller picture of how much the splits vary around the mean, I looked at the standard deviations. The standard deviation reflects how far every split typically falls from the mean, rather than just the difference between two arbitrarily chosen data points. Results (in seconds) are in the table below.

Stroke    W 200 IM   M 200 IM   W 400 IM   M 400 IM
Fly       0.50       0.32       0.84       0.78
Back      0.83       0.62       1.22       1.14
Breast    0.93       0.71       1.61       1.37
Free      0.47       0.47       1.04       0.99

These results back up the idea that breaststroke is the most variable of the strokes. Backstroke is a close second, especially in the 200 distances, with freestyle seeming to be the third most variable and butterfly the least variable.
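
As a rough sketch of how the standard deviation compares two strokes, the Python below uses made-up splits, not the NCAA data. It assumes the sample standard deviation (`statistics.stdev`); the article does not say whether the sample or population version was used.

```python
import statistics

# Hypothetical 50-yard splits in seconds, invented for illustration only.
breast_splits = [30.0, 31.0, 32.0, 33.0, 34.0]
fly_splits = [24.0, 24.2, 24.4, 24.6, 24.8]

# Sample standard deviation (divides by n - 1). Use statistics.pstdev
# instead if the splits are treated as the whole population.
breast_sd = statistics.stdev(breast_splits)
fly_sd = statistics.stdev(fly_splits)

most_variable = "Breast" if breast_sd > fly_sd else "Fly"
print(round(breast_sd, 2), round(fly_sd, 2), most_variable)
```

Unlike the range, every split contributes to this figure, so one unusual swim shifts it far less than it shifts the gap between the fastest and slowest times.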

Analysis

All three measures indicated that breaststroke has the most variation of the four strokes in the IM. This means that we can safely say that breaststroke is the stroke that does the most, on average, to change the shape of a race.

The data only comprises three years, but the trends were clear from the first 16 data points, and there is no reason to believe they would change by adding more years into the mix. It might have been interesting to include all swimmers rather than just those in the top 16. This would have added more variability to freestyle specifically, because many of those additional swimmers were fading toward the ends of their races. However, I mainly wanted to look at races where swimmers were at their best.

Another limitation is the applicability of the data to long course swimming. Especially with short course being underwater-dominated, the variability in swimming speed may not be captured well with this data. Further analysis could be done to see if these trends hold in long course.

Conclusion

Ultimately, while breaststroke is the stroke with the most variability, that does not mean that you have to be a great breaststroker to swim IM. The IM races are great because they allow you to find a formula that works best with your strengths and weaknesses. For example, take a look at these (partial) results from the men’s 400 IM prelims at the 2017 NCAAs.

Swimmer            Fly     Back    Breast    Free    Total
Gunnar Bentz       50.79   56.46   1:01.60   49.77   3:38.62
Jonathan Roberts   48.79   53.78   1:04.39   51.95   3:38.91

These two swimmers swam completely different races. In fact, none of their splits are even remotely close. But in the end, they both swam 3:38 and made the A final. You do not need to be great at any one stroke to be an outstanding IMer; you just have to be strong in all four strokes.
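
The leg-by-leg gaps between these two swims can be worked out with a short Python sketch. The splits are the ones in the table above; the helper name `to_seconds` is just an illustrative choice.

```python
def to_seconds(split):
    """Convert a split written as 'M:SS.ss' or 'SS.ss' into seconds."""
    if ":" in split:
        minutes, seconds = split.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(split)


legs = ["Fly", "Back", "Breast", "Free"]
bentz = [to_seconds(s) for s in ["50.79", "56.46", "1:01.60", "49.77"]]
roberts = [to_seconds(s) for s in ["48.79", "53.78", "1:04.39", "51.95"]]

# Positive gap: Bentz was slower on that leg. Negative: Bentz was faster.
gaps = {leg: round(b - r, 2) for leg, b, r in zip(legs, bentz, roberts)}
totals = (round(sum(bentz), 2), round(sum(roberts), 2))

for leg in legs:
    print(f"{leg:<7}{gaps[leg]:+.2f}")
print(totals)
```

Every leg differs by at least two seconds, with Roberts ahead on fly and back and Bentz ahead on breast and free, yet the gaps nearly cancel and the final times end up 0.29 seconds apart.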

10 comments

  1. Swim fan

    Wonder if same stats and conclusions would hold up in long course?

  2. Warren

    An even better test would be to switch the order of strokes so that each stroke could be measured in each position. My guess is when the butterfly leg is last, you would see the greatest variance in times.

    • Frederik Frederiksen

      Fun analysis that shows us that the best IM swimmers are all-round.
      It would be great to see the results as % though; as the breast leg is always the slowest, it makes sense that it has the largest difference in seconds.

    • Joe Helmer

      With fly as the final leg of the 400 IM…
      Even though my best event was the 2fly, and my second best was the 4IM; that sounds grim. I would have still swum the 4IM if the order was altered, but what an unpopular event it would have been.

  3. Retired swim coach

    Since breaststroke is the slowest stroke it seems only logical that it would have the largest range of time differential. (For example, compare the 100/200 back/breast in those same meets and you will find a larger time spread among the finalists in back races than in breast.) I think this study needs to have that factor included.

    • Hunter Kroll

      Thanks for the reply! I went back to the data and compared their coefficients of variation (CV), the ratio of the standard deviation to the mean. By this measure, I found that for the 200 IMs, backstroke had the highest CV, but for 400 IMs, breaststroke remained the most variable. Not sure why that is, but it could have something to do with more pure breaststrokers swimming the 200 IM on day one of the NCAAs (e.g., Ian Finnerty’s backstroke split was 1.4 seconds higher than the mean). I’m interested to dig into the long course results as well to see if they tell a different story.

  4. Anthony Iacopetti

    Great analysis. One major flaw, as is the case with much of the analytics in all areas of study, is the data points. In order to truly gain an understanding of what is going on, significantly more data points are required. Predictive modeling and analytical modeling require vast data points. Everything needs to be considered and catalogued: stroke rate, distance per stroke, strength of each swimmer in each stroke of every event, height, weight, age (grade), proximity of race to other events, health, etc., plus external influences such as lane position, adjacent competition speed and wake generation, team point necessity, etc.

    You also need to look at the individual abilities of each swimmer in each stroke. For example, one reason Phelps was such a dominant IMer is because of his elite ability in fly, especially the 200 fly. And let’s not forget, he was a 1:56 in 200 meter back. Lochte was the best ever, by world record standards, because of his dominance in backstroke. And just like Phelps, he was a 51 low in 100 meter fly. Katinka Hosszu had similar strengths to Lochte. Probably a better flyer comparatively. The common theme here is that great flyers and backstrokers expend less energy in the front half of the race, therefore allowing them to perform better in breaststroke and ultimately have the legs and lungs to close out the freestyle leg. Change the order of the race and you may have a different result.

    Again, I love the analysis. It is well documented and supported, with a thorough consideration of many aspects of the race, but as is the case with lots of analytics, it’s simply lacking in enough data points to gather meaningful conclusions. Many of these problems will be solved as “computer vision” will provide us with many of these data points (it will see the race like a human, but record the data like a computer) in real time, and analytic data bots/algorithms will harvest quantitative data from other aspects of the event. At that point the AI/machine learning software will take the data and run millions of simulations to give us our answer.

    My guess is it will tell us the best IMers need to be GREAT flyers or backstrokers (or both), be competent in breast, and the freestyle leg will be just fine.

    • Noah Crawford

      Not that your points aren’t valid, because I 100% agree with them. I just think you might be misinterpreting what he’s concluding in this analysis. Hunter’s research here wasn’t to find which stroke makes you the best IMer, because that is so subjective that it’s almost not worth exploring (trust me, my lived experience as a decent mid-grade IMer can point to that). His conclusion is that breaststroke has the most variability of all 4 strokes, and therefore is the most likely to affect the outcome of the race. Phelps, Lochte, Hosszu were all incredible in the fly, back, and free, while average at best in breaststroke. But, if the research extends to them, that would indicate that, in the IM, their fly/back/free legs are not astronomically faster than the people around them. Indeed, if you look at the results of the men’s 400 IM from London (the first I could quickly find), you see that the range of times in the fly leg is 3.9 seconds, 3.41 seconds for the back leg, and 2.54 in the freestyle. These seem like large ranges until you consider that the breaststroke leg had a range of 4.82 seconds. Women’s 400 IM had similar ranges, with the exception of Ye Shiwen’s monster freestyle leg.

      At the end of the day, no one stroke will make an IMer great, at least in my opinion. It just seems to be that, if your “weak” stroke is breaststroke, you’d find more success in the event by getting your breaststroke time closer to average than by trying to run away on another leg, because the ability of people to do breaststroke in the IM varies much more than the other 3 strokes.

  5. kevin j

    I think the obvious direction would have been a correlation matrix of all the splits with finish time. Or similarly a linear regression. That would get at which stroke is the best predictor of finish time, which seems to be what you set out to answer. Ranges and standard deviations are helpful but don’t show the link.

  6. Robert Macartney

    I have known excellent IM swimmers who were not very good breaststrokers. However, they were dominant in the other three strokes. Nevertheless, the very best IM’ers I have known were excellent breaststrokers.
    It is my opinion that since the breaststroke is the third leg of the race, where one is beginning to tire, the lack of efficiency in a weak stroke is enhanced by being tired and increases fatigue because your efforts are less effective.
