Thursday, March 15, 2018

Predicting March Madness with Elo ratings and Monte Carlo simulations

I was recently invited to join a March Madness bracket pool at my office. I quickly realized that this was a great opportunity to apply some of the concepts that I learned about in reading Kasparov's recent book Deep Thinking, discussed in my previous post. My original idea was to apply Elo ratings, originally developed for chess, to inform my bracket. The scope of this effort quickly grew as I realized that I could wrap additional layers of abstraction around an Elo calculator, incorporating Monte Carlo simulations and sensitivity analysis, to make a fully-automated bracket generator. In this post, I'll discuss my approach, method, and results.

About Elo ratings

Elo ratings were originally created for the purpose of quantifying a player's skill level in chess over time relative to their peers. The Elo rating system has many nice properties, such as the ability to predict game outcomes based on prior skill levels, and its drop-in/drop-out asynchronous nature. For its simplicity and power, the rating scale has found use in many sports beyond chess. You can read about the math behind it on Wikipedia.

Graph showing relationship between Elo difference and win probability; source

Elo vs. NCAA RPI

In ranking its teams, the NCAA uses a ranking system called RPI, or rating percentage index, which is an oddly arbitrary system. Historically, it seems to have good predictive power, but this may be somewhat deceiving. The act of receiving a high seed seems to be inherently advantageous regardless of team skill due to the bracket structure, as Jon Bois humorously shows:

The primary free parameter in the Elo rating system is its so-called K factor, which controls the speed at which ratings respond to apparent increases in player skill. In comparison, the RPI system has three weighting parameters, which are more or less arbitrary, and whose meaning is less obvious in a higher-level sense. As Wikipedia puts it, RPI "lacks theoretical justification from a statistical standpoint" and the "heavy emphasis upon strength of schedule [may give] an unfair advantage to teams from major conferences". My analysis supports this criticism of the RPI system - more on this below. More critically for this effort, it also has limited predictive power, so in my model, I use Elo exclusively and neglect NCAA ranking except as it influences the initial bracket structure.

Generating Elo ratings

Elo ratings are generated by examining win/loss records. I decided to limit my scope to only the 2017-2018 season, neglecting any prior data, to maintain simplicity and avoid the additional modeling assumptions that would be required.

I scraped a complete game log for all of the active teams and parsed them to remove duplicates and non-D1 games. In total, the scraper extracted 5,401 unique games between 351 D1 teams spanning 2017-11-10 to 2018-03-10. I also manually added the play-in "First Four" games, which are effectively regular-season games. For those interested, this dataset is available on my GitHub.

With a log of all unique games, calculating Elo ratings is straightforward. Teams are initialized with an average score, and their score is adjusted as they win and lose games according to Elo's equations. In essence, the equations predict the probability of each team winning based on their rating difference, then compare the actual result to the prediction, then update the teams' ratings accordingly. Its simple and self-correcting nature reminds me of Bayes' Theorem, another useful statistical tool for making predictions based on observations.
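
To make the update step concrete, here is a minimal MATLAB sketch of a single rating adjustment. It is not lifted from my program - the function name and example values are just illustrative - but it follows Elo's standard equations:

    function [Ra, Rb] = eloUpdate(Ra, Rb, scoreA, K)
    % Update two Elo ratings after one game.
    % scoreA is 1 if team A won and 0 if team A lost; K controls responsiveness.
    expectedA = 1 / (1 + 10^((Rb - Ra) / 400));   % predicted win probability for A
    Ra = Ra + K * (scoreA - expectedA);           % reward A if it beat the prediction
    Rb = Rb + K * ((1 - scoreA) - (1 - expectedA));
    end

For example, if two 1500-rated teams meet with K = 25, the winner moves to 1512.5 and the loser drops to 1487.5.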

Elo over 2017-2018 season plus First Four, selected teams highlighted

Comparing Elo and NCAA rankings

Elo provides a check against the NCAA ranking system. Which teams are most overrated? Most underrated? Here is what the data says about the NCAA's top 25 teams:

Green: underrated; red: overrated, per Elo assessment
For Elo, K = 25 and mean = 1500; NCAA source: Google

Interestingly, the comparison produces a jarring mix of strong agreements and strong disagreements. Most overrated is Florida, ranked at #23 with a 20-12 record. Most underrated are Houston and Saint Mary's (CA), ranked #21 and #25, with records of 26-7 and 28-5, respectively. The systems agree that Virginia and Villanova are the strongest teams. Here are my top 25 teams sorted by regular-season Elo:

Notably, Saint Mary's (CA), ranked #16 in Elo, did not qualify for the Round of 64 - a victim of the NCAA's archaic ranking system.

Determining postseason probabilities

Elo's equations combined with teams' ratings can be used to calculate the probability of any team beating any other team. 
For example, in the first game, Virginia, rated 1761, plays UMBC, rated 1604. Per Elo, Virginia's win probability is:

[1+10^((1604-1761)/400)]^-1 = 71.2%

So Virginia is most likely to win. At this point, a call to a random number generator bounded on 0 to 1 can be used to simulate a game: if the result is < 0.712, Virginia wins; otherwise, UMBC wins. Virginia is most likely to advance here, but we can't rule out an upset either. How to track both outcomes? For this, I turned to Monte Carlo simulations, which let us re-run this game, and all downstream games, an arbitrarily large number of times until a coherent picture of the overall probabilities emerges. The result is the probability of any team reaching any round, encompassing both team skill and bracket structure. FiveThirtyEight has done a typically excellent job of visualizing this. My own visualization of the same data is shown below, though our models and probabilities differ.
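
Here is a minimal MATLAB sketch of that idea, reduced to a single matchup rather than a full bracket; the ratings are the ones quoted above, and the simulation count is arbitrary:

    Ra = 1761;                                % Virginia
    Rb = 1604;                                % UMBC
    pA = 1 / (1 + 10^((Rb - Ra) / 400));      % Elo win probability for Virginia (~71.2%)

    nSims = 1e5;
    winsA = 0;
    for i = 1:nSims
        if rand() < pA                        % uniform draw on [0, 1)
            winsA = winsA + 1;                % Virginia wins this simulated game
        end
    end
    fprintf('Virginia advanced in %.1f%% of %d simulations\n', 100 * winsA / nSims, nSims);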

As a clerical note, I experimented with two approaches for dealing with Elo ratings after simulated games: "static" Elo, where the ratings are unchanged after games, and "dynamic" Elo, where the ratings are updated based on the simulated outcome. The difference between the two ended up being fairly negligible - more on this later.

Incorporation of Monte Carlo simulations

Each game in March Madness can have two possible outcomes, and there are 32+16+8+4+2+1 = 63 games in total. So the total number of brackets is 2^63, or 9.22 x 10^18. This enormous state space makes exhaustive analysis impossible, and also explains why Warren Buffett has not yet had to pay out his jackpot prize for a perfect bracket, and is not likely to. However, exhaustive analysis, even if it was possible, is not necessary - Monte Carlo simulations provide a sufficiently accurate approximation.

My program simulates about 5,000 tournaments per second on a Dell Precision 5520. It monitors the maximum probability of any team becoming champion, as a figure of merit, and considers the probabilities converged when this value changes by less than 0.025% between 5-second intervals. This takes around 100,000 tournaments, or 20 seconds.
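
The convergence loop looks roughly like the sketch below. Here, runMoreTournaments is a hypothetical stand-in for a batch of full bracket simulations, and the batch size is illustrative:

    champCounts = zeros(64, 1);                   % simulated championships per team
    nTotal = 0;                                   % tournaments run so far
    prevMax = inf;
    while true
        % hypothetical helper: simulate another batch of complete tournaments
        [champCounts, nTotal] = runMoreTournaments(champCounts, nTotal, 25000);
        currMax = max(champCounts) / nTotal;      % figure of merit: top championship probability
        if abs(currMax - prevMax) < 2.5e-4        % changed by less than 0.025% since last check
            break
        end
        prevMax = currMax;
    end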

Visualizing probabilities

Click to enlarge - Elo K: 25, Elo mean: 1500, dynamic Elo in postseason

Colors: green denotes above-average probability for that round; red is below-average 

The probabilities are then sorted according to each team's probability of winning the championship - more on why the probabilities are sorted this way below. The low probability of even the most likely outcomes - e.g. the strongest teams have only a 5-10% chance of winning the championship - makes clear how difficult the outcome is to predict. The ultimate odds are relatively slim for all but the top teams, and absolutely slim for all teams.

Down-selecting to a single bracket

In moving from probabilities to a final bracket, I considered how the bracket will be scored and more generally, what the ultimate goal is. In my case, my goal is to maximize my bracket's score, as scored by ESPN, so I can win the office pool. ESPN's rules indicate:
  • Round of 32: 10 points per pick
  • Round of 16: 20 points per pick
  • Round of 8: 40 points per pick
  • Round of 4: 80 points per pick
  • Round of 2: 160 points per pick
  • Round of 1: 320 points per pick
This scoring system makes the optimal strategy a bit counter-intuitive. Suppose every game is decided at random with a coin toss. The probability of correctly selecting the champion, worth 320 points, is 1/64, or 1.6%. One round upstream, the probability of correctly selecting both teams in the Round of 2, also worth 320 points, is (1/32)^2, or 0.1%. The percentages get progressively worse further upstream. In short, the odds of capturing a full round's 320 points get worse the further upstream you target, so the optimal strategy is to target as far downstream as possible - in effect, to build your bracket in reverse chronological order.
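
As a quick sanity check on that argument - this is not part of the bracket generator - a few lines of MATLAB reproduce the 1/64 and 0.1% figures above and show how fast the odds of sweeping a round's full 320 points collapse for earlier rounds:

    % Round 6 is the championship pick; round r has 2^(6-r) picks, each needing r wins.
    for r = 6:-1:1
        picks = 2^(6 - r);
        pPerPick = 0.5^r;                 % chance one coin-flip pick survives r games
        pSweep = pPerPick^picks;          % chance every pick in the round is correct
        fprintf('Round %d: %2d picks, sweep probability = %.3g\n', r, picks, pSweep);
    end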

For my bracket, I use the overall probability of winning the championship as the metric to determine which teams advance. If the rounds were equally weighted, or equivalently, if I were competing for overall percentage, I would simply advance the team with the greater Elo rating in each game, and Monte Carlo simulation would not be necessary.

In summary, my bracket does not represent my best guess at the tournament's outcome, but rather a series of choices made to maximize score - an important distinction. 

Sensitivity analysis

To probe the sensitivity of my results to my modeling assumptions, I wrapped an additional layer of abstraction around the bracket generator and varied three of the key model parameters, borrowing historically stable values from chess to guide the parameter ranges:
  1. Elo mean - 1000 and 1500, the two standard values I've seen in literature
  2. Elo K value - 10, 25, and 40, per FIDE's standards
  3. Elo rating in playoffs - either "dynamic" or "static"
Since it wasn't obvious which of these models was best, I decided to analyze all permutations, 12 in total, and weigh them equally. In other words, each bracket gets a vote, and the majority vote wins.
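
The sweep itself is just a nested loop over the parameter grid. Here is a rough sketch, where generateBracket is a hypothetical stand-in for the simulate-and-downselect pipeline described above:

    means    = [1000 1500];
    Ks       = [10 25 40];
    dynamics = [false true];                  % static vs. dynamic postseason Elo
    brackets = {};
    for m = means
        for k = Ks
            for d = dynamics
                brackets{end+1} = generateBracket(m, k, d);  % 12 brackets in total
            end
        end
    end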

In general, the models are in unanimous agreement about the best choice for most bracket positions, including everything from the Round of 4 onward, with only a single dissenting vote in the Round of 8. Further upstream, minor disagreements progressively creep in. After extracting a consensus from the brackets, I surveyed which models, if any, produced a bracket that exactly matched the consensus bracket - in other words, which model was most "centrist". I found that my initial set of model parameters - Elo mean: 1500, Elo K: 25, dynamic postseason Elo - provided an exact match. So the sensitivity analysis confirms that these settings are reasonably stable.

My bracket

Bringing it all together, here is my bracket:

Source Code

Full source code and data logs, minus the game scraper, are available on my GitHub.

Sunday, February 25, 2018

Tic-Tac-Toe, and musings about Kasparov/Deep Blue

I'm currently reading Deep Thinking by Garry Kasparov, best known for his historic series of chess games against IBM's Deep Blue supercomputer in 1996 and 1997. These games are considered to be a collective watershed moment in the history of man/machine interaction for demonstrating that computers had finally exceeded human skill in the game of chess. In his book, Kasparov grieves that Deep Blue's victory was not the result of a scientific breakthrough about intelligence, but was rather an inevitability given the predictable relationship between computation speed and chess Elo rating, combined with Moore's Law. Kasparov's loss is fascinating for both its historical importance and its contemporary relevance, as computers continue reaching into increasingly "human" domains, such as driving cars, writing stories, diagnosing illnesses, and composing music.

In 1997, it shocked the public that a supercomputer could beat the chess world champion. In 2018, it's even more shocking that Kasparov held his own, achieving a win and several draws, against a supercomputer capable of evaluating more than 200,000,000 positions per second. This is absolutely astonishing to me, and difficult to even comprehend.

Inspired by this reading, I wrote an AI to play a much simpler game, tic-tac-toe. Unlike with chess, the game tree of tic-tac-toe can be explored fully in a reasonable amount of time because the game's state space is many orders of magnitude smaller. Heuristic rules for perfect play are straightforward to grasp, and perfect play always results in a draw. My interest in tic-tac-toe has less to do with the game itself and more to do with the game as a simple testbed for general two-player perfect-information games like chess, the more complex of which are unlikely to ever be solved. Interestingly, this leaves room for technical and creative innovation in AI design, as evidenced by the recent dramatic overtaking of Stockfish by AlphaZero in computer chess - an engine radically different in design from its contemporaries.

My tic-tac-toe program uses random playouts to evaluate possible moves. A value matrix is incremented if the random playout results in victory, decremented if a loss, and unchanged if a draw. On my 2011 Dell E6420 laptop, the program executes about 700 playouts per second, which is very modest. Nonetheless, this is sufficient for the program to loosely deduce many higher-level game heuristics, such as:

  1. It's best to start play at the center, or secondarily the corners 
  2. A winning move out-ranks all other moves 
  3. If the opponent is one move away from winning, they must be blocked 
  4. Playing on the non-corner edges is weak unless blocking or winning 
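
For concreteness, here is a minimal sketch of the random-playout valuation described above. It is not my exact program - the board encoding and helper names are my own - but it implements the same scoring rule: +1 for a playout the mover goes on to win, -1 for a loss, 0 for a draw, with NaN marking occupied squares as in the value matrices shown below:

    function values = evaluateMoves(board, player, nPlayouts)
    % Score each empty square by random playouts from the resulting position.
    % board: 3x3 matrix with 0 = empty, 1 and -1 = the two players; player: whose turn it is.
    values = nan(3, 3);                               % NaN marks occupied squares
    for idx = find(board(:) == 0)'
        total = 0;
        for p = 1:nPlayouts
            b = board;
            b(idx) = player;                          % try this move
            turn = -player;
            while any(b(:) == 0) && checkWin(b) == 0
                open = find(b(:) == 0);
                b(open(randi(numel(open)))) = turn;   % play a random legal move
                turn = -turn;
            end
            total = total + player * checkWin(b);     % +1 win, -1 loss, 0 draw
        end
        values(idx) = total;
    end
    end

    function w = checkWin(b)
    % Return 1 or -1 if that player has three in a row, 0 otherwise.
    lines = [b; b'; diag(b)'; diag(fliplr(b))'];
    s = sum(lines, 2);
    if any(s == 3)
        w = 1;
    elseif any(s == -3)
        w = -1;
    else
        w = 0;
    end
    end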
To visualize the computer's "thoughts", I displayed its value matrix alongside the game board - red for the worst move, green for the best, and gray for occupied spaces. It's fascinating to watch these "thoughts" develop in a way that seems to mirror my own far more casual thinking while playing.

Most games result in a draw:

Here, the computer capitalizes on my second-move blunder to create a fork, achieving a victory:

Here, the computer fails to block, allowing me to win.

I probed this error on move 6 further by giving the computer a full 30 seconds to think, instead of the 2.5 seconds shown above. The value matrix came back as:

  NaN   NaN   NaN
-8907   NaN -4317
  NaN -4442 -8824

indicating that it considered the correct move, the block, a toss-up with its blunder move. Replaying the scenario repeatedly for debugging purposes, I noted that the computer chose to block about 50% of the time. Working through the game tree, I saw why the computer blunders: the blocking move results in zero paths for O to win and one path for X to win, while the blunder move results in one path for O to win and two paths for X to win. The net score between the two moves is equal at -1, so the computer essentially flips a coin in deciding between them.

So this is not a bug, but rather a limitation of the random playout valuation method in this particular game. It's easy enough to hard-code the computer to block, but it's much more interesting to see what the computer can and cannot work out on its own with a primitive valuation function. The short answer seems to be: a lot, but not everything.

Full source code can be found on my GitHub:

More games can be seen in this video:

Monday, October 30, 2017

"Fog of War" Maps with Real-World Data


In this post, I describe an approach for generating "fog of war" maps from real-world location data collected by Google via your mobile phone, in the style of classic real-time strategy video games, implemented in MATLAB.


Growing up, I was a big fan of real-time strategy (RTS) video games, with my favorites in the genre being Age of Empires II, Rise of Nations, Age of Mythology, and Impossible Creatures. A standard feature of these games is called "fog of war", a borrowed term for a game mechanic that hides areas of the map that you have not visited yet and/or are not currently occupying to simulate limitations on human view distance and encourage exploration. Typically, you have full visibility to areas you are currently occupying, and limited visibility to areas you passed through but have since vacated, as seen in the image below.

screenshot from Age of Empires showing the "fog of war" mechanic; source: Giant Bomb

Switching gears, I recently stumbled upon Google Maps Timeline, which allows Google users to peruse their personal location history as recorded by their mobile phones. I immediately found the data fascinating as a way of characterizing my transportation habits with uncanny precision. Putting the two ideas together, I reasoned that I ought to write a program to make a fog of war map with my personal real-world location data.


My personal data encompasses about 120,000 latitude/longitude points dating back to May 2017, or about a point every 1-2 minutes. At a high level, my MATLAB program works like this:

  1. Parse the JSON location data from Google Takeout to extract only the lat/lon points. Other data such as confidence intervals and transportation mode are discarded. 
  2. Define a geographic region of interest with a geographic center, in plain English, and a zoom level. 
  3. Use Google's Geocoding API to translate the geographic center to a lat/lon point. 
  4. Use a 3rd party MATLAB function, plot_google_map.m, to retrieve a map image and lat/lon vectors using the Google Static Maps API. 
  5. Crop the lat/lon points to include only those inside the region defined by the map. 
  6. Generate an opacity map for where to "lift" the fog by plotting a circular marker at each location, then applying a series of Gaussian blurs at varying opacity values (see the sketch below). 
  7. Superimpose the map image with the opacity map to generate a shaded map image.
With this setup, the length scale of the unshaded area remains constant with varying zoom, which is not physically realistic, but provides a more interesting and useful map. In this way, as the zoom decreases, the fog of war becomes more general, and as you zoom in, more specific. Otherwise, the un-shaded regions would become nearly invisible at lower zoom levels, a reflection of how small the lengthscale of human vision is relative to the lengthscale of the Earth.
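
To illustrate step 6, here is a minimal sketch of the fog mask, assuming the cropped points have already been converted to pixel coordinates px and py on a map image mapImg of size h x w x 3. A single Gaussian blur stands in for the series of blurs, and the marker radius, blur radius, and opacity floor are illustrative rather than my tuned values:

    visited = zeros(h, w);
    visited(sub2ind([h w], round(py), round(px))) = 1;   % mark each visited pixel
    visited = imdilate(visited, strel('disk', 8));       % circular marker at each point
    lift = imgaussfilt(visited, 12);                     % soft halo around visited areas
    lift = min(lift / max(lift(:)), 1);                  % normalize to [0, 1]
    alpha = 0.2 + 0.8 * lift;                            % fog never fully opaque, fully lifted where visited
    shaded = uint8(double(mapImg) .* repmat(alpha, 1, 1, 3));
    imshow(shaded)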


Cambridge, zoom: 15

Boston, zoom: 11

By switching from maptype "hybrid" to "satellite", you can generate maps that look more like what you would see in a standard RTS game, although they are less practically useful for exploring data:

Boston, zoom: 11, maptype: satellite


These maps produced fascinating insights about my location and transportation habits. The fog of war visualization emphasizes only whether or not you have visited a particular location, and not how long/frequently. A heatmap might be a good way of incorporating a duration weighting into the standard fog of war visualization. The low sample frequency, ~1 sample every 1-2 minutes, limits the usefulness of this visualization to areas that you visit most frequently. Areas that I visited on vacation for only a few days produce maps that are disjointed and not very informative. Some online sources say that sample frequency can be increased somewhat despite the frequency not being directly user-adjustable. Overall, I see this method as being a useful tool to aid future urban exploration as I push into new territory and expand my map.

Source Code

Full source code is available via my GitHub.

Saturday, September 30, 2017

Boston Sharkfest 2017

On 9/17/2017 I raced in the 6th annual Boston Sharkfest, a 1-mile open-water swim in Boston Harbor. It was an absolute blast and a very unusual swim even by Sharkfest standards. The race was scheduled to start at 8:50 am with a course that went from East Boston to Southie via Boston Harbor, but it was so foggy that it was impossible to see more than a few hundred feet across the harbor, making the passage too dangerous to attempt.

photo near South Station that morning

seagull at the edge of Southie with a couple ships barely visible through the fog

looking out across Fort Point Channel at Boston proper

swimmers waiting in line to pick up their chips and swim caps

The race organizers decided to re-route the race down and back Fort Point Channel, which divides Southie from Boston proper. We were told to loop around "a giant pyramid" in the middle of the channel, though no one seemed to know why it was there. I later learned that the pyramid is a public art installation dating back to 2014 by artist Don Eyles.

Don Eyles' pyramid in Fort Point Channel, said to be 10 feet tall.

approximate map of race course, 1.12 miles long as shown, according to

a wave of swimmers departing for the swim
credit to Steve Bolter via

We were told that the water had been measured at 66 °F, although it seemed colder when I first jumped in. An older man next to me, after jumping in, kept saying "Oh, Jesus Christ. This is 66? Jesus Christ..." I had been on the fence about wearing my new wetsuit, preferring the feel and simplicity of jammers, but was very glad to be wearing it when I hit the frigid water.

The course felt distinctly urban - we swam under three bridges, with some early risers stopping to watch from above and tall buildings rising up on both sides. The channel was a good width - not so narrow as to feel overcrowded, and not so wide that it was easy to lose your bearings. For a course that the organizers seemed to put together only minutes prior to the race, it all seemed to work out pretty well, and it was a beautiful swim. I made sure to touch the backside of Eyles' pyramid at the halfway point, noting that its material, surprisingly, was soft - probably something similar to polystyrene. I heard a few people afterwards saying they felt a bit grimy from the questionable water quality, but I didn't notice anything. I had expected the water to be slightly brackish due to its proximity to the mouths of the Charles and the Mystic, but it tasted like regular seawater to me - quite salty.

Here are some graphs I made with the race results, with my result overlaid:

A surprising result here - non-wetsuit swimmers tended to finish faster than wetsuit swimmers, even though wetsuits are known to increase speed by increasing buoyancy and reducing drag. I suspect there is an underlying selection bias here, i.e. the sort of person to brave the cold water without a wetsuit is probably going to be in strong physical shape and tend to perform well.

This heatmap shows two semi-distinct clusters - one centered at age 20 with a 23-minute finish, and another at age 49 with a 28-minute finish. Interpolating between them, you can see that in general, people tend to slow down about 5 minutes per 25 years of age, or about 1% per year.

Overall, this was a great race and I'm looking forward to doing it again next year, hopefully across the harbor if weather allows, and perhaps a bit faster too.

Friday, August 25, 2017

Optimizing softball fielding positions

In April 2017, I signed up for a coed recreation softball league through HubSports, which runs several recreational sports leagues throughout the greater Boston area. When signing up online, I checked a box indicating that I would be willing to serve as team captain for whatever team the organizers put together. A few days later, the teams were arranged and I was selected as the captain of the "Danehy Dozen".

One of the first things I did was send out a survey to my teammates asking for everyone's preferences on fielding positions. There was a lot of variety—some people only liked a couple positions, some people liked playing only infield or outfield, and some had no preferences at all. I quickly realized that with 10 fielding positions—our league plays with 4 outfielders—and ~13 players, the number of possible fielding lineups was immense at 13!/3!, or more than 1 billion! I reasoned that I could save time and improve player satisfaction by writing a program to determine optimal fielding lineups for maximum player happiness, based on the survey results.

The first implementation of the program, written in MATLAB, worked like this:
  1. Generate a random lineup.
  2. If fewer than 3 women are on the field, discard the lineup and start again from step 1. This is a coed league requirement to avoid penalties.
  3. Score the lineup, giving 1 point if a player likes their position, 0 if they are neutral, and -1 if they dislike it.
  4. If the lineup's score is equal to or greater than the current best lineup found, output it as text.
  5. Loop back to step 1.
This worked well, and generally after 10-15 seconds of searching, the program stumbled upon an arrangement where everyone liked where they were playing. However, I noticed a problem: since the program sought only to maximize score, players who liked more positions were more likely to be placed in lineups, and conversely, players who were choosier were more likely to be benched. Since I also wanted to equalize play time, this was not an acceptable behavior.

To fix this, I altered the scoring step, focusing on individual player satisfaction—i.e., for this player in this position, are they as happy as they can be? I represented maximum happiness as the most-positive response they gave, as some players were neutral on everything. I also shifted from a -1 to 1 scale to a 1 to 3 scale, since happiness is like length in that it can't be negative. The new scoring step was:

  1. ...
  2. ...
  3. Score the lineup, giving 1/N points if the player dislikes their position, 2/N if they are neutral, and 3/N if they like it, where N represents the most-positive response that player gave on a 1-3 scale (see the sketch after this list).
  4. ...
  5. ...
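
Here is a rough sketch of the search loop with the revised scoring. The variable names and array shapes are my own - prefs holds the survey answers on the 1-3 scale, isFemale flags the women on the roster - and it is meant to illustrate the normalization, not reproduce my program line for line:

    nPositions = 10;
    bestScore = -inf;
    for trial = 1:1e6
        order = randperm(size(prefs, 1));
        lineup = order(1:nPositions);                 % players assigned to positions 1..10
        if sum(isFemale(lineup)) < 3
            continue                                  % discard lineups that would draw a penalty
        end
        N = max(prefs(lineup, :), [], 2);             % each player's most-positive response
        picks = prefs(sub2ind(size(prefs), lineup', (1:nPositions)'));
        score = sum(picks ./ N);                      % happiness relative to each player's own ceiling
        if score >= bestScore
            bestScore = score;
            disp(lineup)                              % report any lineup at least as good as the best so far
        end
    end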
This new scoring method both maximized player happiness and equalized play time - a success! However, during our first game I realized there was still room for improvement when the program inadvertently placed newer players in high-pressure positions like shortstop and left field. I also began to realize that part of maximizing player happiness is maximizing the team's win probability, since everyone likes to win and people generally prefer to play positions where they are most likely to succeed.

To this end, I made one additional change. Prior to each game, I would open the cell array representing player preferences and update it with my own subjective assessments based on player performance in practices and games. I primarily used this step to steer the program away from putting players in positions where I thought they would not perform well. As the season progressed and I refined my ideas of who would play best where, a stability emerged in our lineups: players tended to play the same 1-2 positions, which they liked and were well suited to play.

We ended up finishing 3rd out of 5 teams in our league during the regular season, with a win-loss record of 3-3. The real magic happened during the playoffs, where we unseated the #2 and #1 teams in a doubleheader to claim the championship! Overall I was very pleased with this optimization approach. I think it helped build team spirit by generating fair and successful lineups, leading to player satisfaction and the emergence of a team identity, a virtuous cycle that continues into our second season together.

The Danehy Dozen, 2017 champions of HubSports' Danehy Monday night spring league

Wednesday, August 23, 2017

Digital un-varnishing

A post-processing algorithm for improved color accuracy in photos of paintings

Taking good pictures of oil paintings is surprisingly challenging. When I photograph paintings, my goal is to take pictures that are sufficiently accurate and lifelike that my monitor/screen appears to be the painting itself, rather than a picture of the painting. Improving the hardware of your photography setup - lighting, camera, lenses - will cause the most significant improvements in this respect, and these hardware aspects are well-documented on the internet, so I won't go into them. Instead, my focus in this post is on the digital post-processing of pictures, and specifically on a technique and algorithm that I use in my personal work once I've taken the best picture I can with my setup.

The idea came to me when I connected two observations:

  1. In painting restoration, one of the first steps a restorer will usually take is removing the discolored varnish from the surface of the painting with a solvent. By removing this brownish/yellowish translucent film, the painting looks much closer to the way it did when it left the artist's easel.
  2. Pictures of paintings, particularly with imperfect photography setups, often have a characteristic neutral-colored haze that looks much like discolored varnish, although the color of the haze is usually closer to the average color of the painting rather than the yellow-brown color of discolored varnish.

My idea was to somehow invent an algorithm that would digitally un-varnish my pictures. To get some physical intuition, shown below are examples of paintings before and after removal of discolored varnish. Removing this translucent film dramatically improves the apparent color vibrancy and value contrast of the painting to near their original levels.

Two paintings before and after restoration, source:

From an image processing perspective, the restorer is removing a translucent overlay from the original image, i.e. the paint itself. With an RGB color image, if the color of the overlay and its opacity are known, then the overlay can be digitally removed using simple arithmetic. First, consider the process in reverse, where a translucent overlay is intentionally added to an original image:

A neutral brown-gray, RGB [115 99 87], is overlaid at 35% opacity, source: author

The original image on the left has more contrast, color variation, and clarity than the image on the right. These are all desirable qualities for pictures of paintings.

From an image processing perspective, adding a translucent overlay is a simple operation: the color of each pixel in the new image is a weighted average of the color of that pixel in the original image and in the overlay, where the weights are derived from the opacity of the overlay. The algorithm that I developed does the same process, but in reverse. The input is a hazy image that I want to improve. The assumption is that the input image is composed of two superimposed images: the original image and a translucent overlay. I have experimented with two approaches for calculating the overlay image:
  1. Calculate the average color of the input image, and set every pixel in the overlay image to this color.
  2. Perform a Gaussian blur on the input image, with the blur radius as a parameter to be tuned.
The first method is simpler and faster, but the second method sometimes yields better results. It's mostly a matter of taste and what best suits the input image. In the limit where the Gaussian blur radius approaches infinity, method 2 approaches method 1, although the calculation time would also approach infinity. Once you have an overlay image to subtract, the only remaining step is to select an opacity - 15-30% seems to work best - and then the digitally un-varnished output image can be calculated.
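
In code, the whole un-varnishing step is only a few lines. Here is a sketch of method 1, the uniform overlay; the file name and the 20% opacity are placeholders, and for method 2 the overlay would instead be a Gaussian-blurred copy of the input:

    img = double(imread('painting.jpg')) / 255;   % hazy input photo, scaled to [0, 1]
    a = 0.2;                                      % assumed overlay opacity
    avgColor = mean(mean(img, 1), 2);             % 1x1x3 average color of the input
    overlay = repmat(avgColor, size(img, 1), size(img, 2));
    % Adding an overlay is: hazy = (1 - a)*original + a*overlay.
    % Solving for the original removes it:
    restored = (img - a * overlay) / (1 - a);
    restored = min(max(restored, 0), 1);          % clip any values pushed past black or white
    imshow(restored)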

I wrote a MATLAB function that uses a GUI for adjusting the two main parameters, the Gaussian blur radius and the overlay opacity. It also has functionality to zoom, pan, compare, and export. It also shows a real-time PDF (probability density function) and CDF (cumulative distribution function) for a grayscale version of the original and modified images to visualize changes to the color distribution. Here is a screenshot:

GUI in action, painting by Michaël Borremans

Basically, what this algorithm does to the image PDF and CDF is stretch them both horizontally so that the image covers a broader dynamic range while preserving the average brightness of the original image. If the opacity is set too high, the PDF and CDF can stretch to the extreme ends and saturation to white and/or black can occur.

Graph showing effect of increasing overlay opacity on image CDF

This graph shows the CDF for a particular painting as the overlay opacity is increased. Note that black saturation, color intensity = 0, starts happening at around 25% opacity, and that the mean of the CDF remains fixed at the mean color intensity, denoted by the dashed line, as the transform is applied.

It's worth mentioning that this process is most useful when you can actually physically see the original painting yourself in good lighting, otherwise you are only guessing what the original painting actually looks like.

Here are some before/after comparisons:

Overlay opacity: 17%, blur radius: ∞ (used method 1), source: author

Overlay opacity: 24%, blur radius: 300 pixels, source: Winslow Homer

Overlay opacity: 24%, blur radius: 67 pixels, source: Michaël Borremans

The difference is fairly subtle but, in my experience/opinion, worthwhile, especially if the paintings are your own and you want the best pictures possible. Here are a couple of videos of my MATLAB program in action:

You can download my MATLAB code here.

Note: an original draft of this post was written in March 2016, and completed/published in August 2017.

Boston Triathlon 2017

On 7/30/2017, I did my first triathlon, The Boston Triathlon, sprint-length distance - 0.5 mile swim, 10 mile bike ride, 5.5 km run. It was a great time! Here is a photo of me from the bike portion:

I scraped the results database and made some graphs to see what I could learn about the athletes that completed this event. Here are some highlights:

My swim ranked 166 out of 574, putting me in roughly the top 29%.

Here is a histogram with everyone's overall time. I am just about at the top of the bell curve.

Interestingly, the correlation of age vs. overall time was weak - good news for aging triathletes.

Plotting individual event times against each other as a scatter plot showed much stronger correlation.

Here's the age breakdown of people that finished the race.

You can download my parsed results database here. The file extension is .mat, for MATLAB.