Sunday, June 3, 2018

Swim the Charles!

Yesterday I raced in the Charles River One-Mile Swim, setting a new personal record with a time of 29:48. The water was 68-69 °F, just right for my 2-5 mm wetsuit. It was also surprisingly opaque: looking down, I couldn't see my feet! Visually, it was like swimming in dark beer. In a way, it was pretty when the sun shone through splashed water, illuminating its deep golden-brown color. It's a bit like how urban haze can make sunsets more spectacular. The race organizers closely monitored water quality leading up to the event to ensure that levels of pollutants and bacteria were sufficiently low. The main issue is that rainfall causes runoff, which temporarily degrades water quality. With no significant rain preceding the event, we were good to go.

crowd starting to trickle in

post-race merriment

coat hanger-like course shape

Here's how the results stacked up:

The race was staggered into two waves based on estimated finish time. I was placed into the first wave. Finishing with a very average time, I was slower than most in the first wave and faster than most in the second wave.

Where did people come from? Primarily Cambridge, Boston, Somerville, Brookline, and Belmont, in that order.

The biggest surprise I found in the data was how closely a person's bib number correlated with their race time. The trend is very clear in the plot above. In fact, the data indicate that a person's bib number is a significantly better predictor of their performance than any other factor I looked at, including their age and how far they traveled to the race. How interesting!
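One simple way to compare candidate predictors like this is to rank them by the strength of their linear correlation with finish time. The sketch below is only an illustration of that idea, not necessarily how the comparison above was made; the function names and the `columns` dictionary layout are made up for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rank_predictors(columns, times):
    """Sort predictors by |correlation| with finish time, strongest first.

    columns: hypothetical dict mapping a predictor name (e.g. "bib", "age")
             to a list of values, one per swimmer, in the same order as times.
    """
    scores = {name: pearson(values, times) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
```

Calling `rank_predictors({"bib": bibs, "age": ages, "distance": distances}, times)` would return the predictors in order of correlation strength, with the sign showing the direction of the trend.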

Looking to dig deeper, I performed a gender-specific multiple linear regression (treating men and women separately) to predict finish time from age, bib number, and distance traveled to the race, with each athlete's hometown located using Google's Geocoding API. The error between predicted and actual times had a mean of about 0 minutes and a standard deviation of about 3.9 minutes, which is not bad considering the sparseness of the data. I then ranked the results by "surprising-ness", taking the normalized regression error as a metric for how surprising a performance was:

Top 10 Surprising Performances

Rank  Name              Age (years)  State  Gender  Time (min)  Normalized reg. error (%)
1     Kirkham Wood      63           CA     M       24.71       26.4
2     Rafael Irizarry   46           MA     M       26.38       21.8
3     Don Haut          52           MA     M       23.62       20.1
4     Donald Kaiser     67           MA     M       27.64       20.0
5     Darryl Starr      49           RI     M       24.02       19.1
6     Ursula Hester     47           CO     F       29.77       18.4
7     Len Van Greuning  50           MA     M       22.18       18.1
8     Alex Meyer        29           MA     M       18.81       17.9
9     Louis Harwood     28           MA     M       29.99       16.9
10    Haruka Uchida     22           MA     F       23.74       16.2
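For the curious, a ranking like the one above can be sketched in a few lines of plain Python. This is a minimal illustration, not the exact code behind the table. It assumes one least-squares model per gender, and it assumes the normalized error is (predicted - actual) / predicted, expressed as a percentage, so that a positive value means faster than predicted. The field names (`age`, `bib`, `dist_km`, `time_min`) are hypothetical.

```python
def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    rows: list of feature tuples; targets: list of finish times.
    Returns [intercept, coef_1, coef_2, ...].
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]  # add intercept column
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(coeffs, row):
    """Predicted finish time for one feature tuple."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


def rank_by_surprise(swimmers):
    """Return (normalized error %, name) pairs, most surprising first.

    swimmers: list of dicts with hypothetical keys
              name, gender, age, bib, dist_km, time_min.
    """
    results = []
    for gender in sorted({s["gender"] for s in swimmers}):
        group = [s for s in swimmers if s["gender"] == gender]
        features = [(s["age"], s["bib"], s["dist_km"]) for s in group]
        times = [s["time_min"] for s in group]
        coeffs = fit_linear(features, times)  # one model per gender (assumption)
        for s, row in zip(group, features):
            predicted = predict(coeffs, row)
            # Positive error: the swimmer beat the model's prediction.
            error_pct = 100.0 * (predicted - s["time_min"]) / predicted
            results.append((error_pct, s["name"]))
    return sorted(results, reverse=True)
```

Solving the normal equations directly is fine for a handful of features and a few hundred swimmers; for anything larger, a numerical library's least-squares routine would be the more robust choice.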

This is neat, but it's easily gamed: a person could, for example, significantly increase their "surprising-ness" by registering at the last minute, which makes them appear statistically as more of an underdog. To address this, I also characterized pure athletic performance by excluding bib number and hometown data and looking only at gender and age, which increased the error standard deviation to 5.6 minutes. In effect, this is similar to awarding prizes by division, but in a continuous sense rather than a discrete one with arbitrarily defined brackets such as "Men 30-39".

Top 10 Athletic Performances

Rank  Name                Age (years)  State  Gender  Time (min)  Normalized reg. error (%)
1     Alex Meyer          29           MA     M       18.81       31.70
2     Eric Nilsson        30           MA     M       18.93       31.55
3     Anton McKee         24           MA     M       19.77       26.62
4     Jessica Stokes      41           MA     F       22.18       26.61
5     Jen Olsen           47           MA     F       22.63       26.39
6     Len Van Greuning    50           MA     M       22.18       26.14
7     Kathleen Tetreault  56           MA     F       23.52       25.39
8     Gail Fricano        44           MA     F       22.77       25.31
9     Christophe Graefe   44           MA     M       22.09       24.65
10    Ed Baker            39           MA     M       21.91       23.70

A major aim of this swim is changing public opinion on the feasibility of the Charles as a public recreation area for swimming. A great deal of work has been done to study, improve, and monitor its water quality. A 2016 report said that the prospect of a permanent swimming facility on the Charles "could potentially be feasible" and included the tantalizing image below. It's a very intriguing possibility. Getting the river to a point where urban swimming is once again feasible would have significant benefits beyond enabling the creation of a forward-looking yet retro swimming site.

artist's rendering of an urban swimming site on the Charles

