crowd starting to trickle in
post-race merriment
coat hanger-like course shape
Here's how the results stacked up:
The race was staggered into two waves based on estimated finish time. I was placed into the first wave. Finishing with a very average time, I was slower than most in the first wave and faster than most in the second wave.
Where did people come from? Primarily Cambridge, Boston, Somerville, Brookline, and Belmont, in that order.
The biggest surprise I found in the data was how closely a person's bib number correlated with their race time. The trend is very clear in the plot above. In fact, the data indicates that a person's bib number is a significantly better predictor of their performance than any other factor, including their age and how far they traveled to the race. How interesting!
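One simple way to compare how well each factor tracks finish time is the Pearson correlation coefficient. The snippet below is only a sketch of that idea, not the analysis behind the plot: the numbers are invented for illustration, and the helper name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented example values, NOT the real race results.
bibs = [12, 55, 77, 140, 210, 300]
ages = [28, 23, 42, 35, 51, 60]
minutes = [24.1, 25.0, 27.3, 31.0, 35.8, 40.2]

# A value near +1 or -1 means the factor tracks finish time closely.
for label, values in [("bib number", bibs), ("age", ages)]:
    print(f"{label}: r = {pearson_r(values, minutes):+.2f}")
```

A value of r close to zero would mean the factor says little about finish time on its own.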
Looking to dig deeper, I fit a gender-specific multiple linear regression for finish time based on age, bib number, and the distance each athlete traveled to the race from their hometown (computed using Google's Geocoding API). The error between predicted and actual times had a mean of about 0 minutes and a standard deviation of about 3.9 minutes, which is not bad considering the sparseness of the data. I then ranked the results by "surprising-ness", taking the normalized regression error as a metric for how surprising a performance was:
Top 10 Surprising Performances
This is neat, but it's easily gamed: for example, a person could significantly increase their "surprising-ness" by registering at the last minute, which would make them appear statistically as more of an underdog. To address this, I also characterized pure athletic performance by excluding bib number and hometown data, looking only at gender and age, which increased the error standard deviation to 5.6 minutes. In effect, this is similar to awarding prizes by division, but in a continuous sense rather than a discrete one with arbitrarily defined brackets such as "Men 30-39".
Top 10 Athletic Performances
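For anyone curious how rankings like the two above could be produced, here is a minimal, dependency-free Python sketch. It is not the code used for this post. The athlete records are invented, the field names (`age`, `bib`, `distance_km`, `minutes`) and helper functions are hypothetical, and it assumes one reading of the method: a separate least-squares fit per gender, with each residual divided by the residual standard deviation.

```python
from statistics import pstdev


def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept term
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)]
         + [sum(x[i] * t for x, t in zip(X, targets))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(coeffs, row):
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


def surprise_scores(athletes, features=("age", "bib", "distance_km")):
    """Fit one model per gender and score athletes by normalized residual."""
    scores = {}
    for gender in sorted({a["gender"] for a in athletes}):
        group = [a for a in athletes if a["gender"] == gender]
        rows = [tuple(a[f] for f in features) for a in group]
        times = [a["minutes"] for a in group]
        coeffs = fit_least_squares(rows, times)
        residuals = [t - predict(coeffs, r) for r, t in zip(rows, times)]
        sigma = pstdev(residuals)
        for a, res in zip(group, residuals):
            # Assumption: finishing faster than predicted counts as positive.
            scores[a["name"]] = -res / sigma
    return scores


# Invented records for illustration only, NOT the real race results.
raw = [
    ("Ana", "F", 28, 12, 2.0, 24.1), ("Bea", "F", 35, 140, 5.5, 31.0),
    ("Cat", "F", 42, 77, 1.2, 27.3), ("Dee", "F", 51, 210, 12.0, 35.8),
    ("Eve", "F", 23, 55, 3.3, 25.0), ("Fay", "F", 60, 300, 8.1, 40.2),
    ("Gus", "M", 31, 20, 4.0, 22.5), ("Hal", "M", 45, 160, 2.2, 30.1),
    ("Ian", "M", 27, 95, 9.0, 26.4), ("Jon", "M", 58, 250, 1.0, 36.9),
    ("Kai", "M", 39, 33, 15.0, 24.8), ("Lou", "M", 64, 280, 6.5, 41.0),
]
keys = ("name", "gender", "age", "bib", "distance_km", "minutes")
athletes = [dict(zip(keys, record)) for record in raw]

surprising = surprise_scores(athletes)                  # all three factors
athletic = surprise_scores(athletes, features=("age",))  # gender and age only

for title, scores in [("Surprising", surprising), ("Athletic", athletic)]:
    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:3]
    print(title, [(name, round(score, 2)) for name, score in top])
```

Passing `features=("age",)` mirrors the second ranking, which drops bib number and hometown distance. With real data, a library routine such as NumPy's `numpy.linalg.lstsq` would be a more robust choice than hand-rolled elimination.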
A major aspect of this swim is changing public opinion on the feasibility of the Charles as a public recreation area for swimming. A great deal of work has been done to study, improve, and monitor its water quality. A 2016 report said that the prospect of a permanent swimming facility on the Charles "could potentially be feasible" and included the tantalizing image below. It's a very intriguing possibility. Getting the river to a point where urban swimming is once again feasible would have significant benefits beyond enabling the creation of a forward-looking yet retro swimming site.
artist's rendering of an urban swimming site on the Charles