Sunday, March 8, 2020

Prioritizing Color Over Value

A discussion on value-priority vs. color-priority in color theory and painting, and a quantitative method for shifting between them in digital photos.



Motivation

As a painter, I work from both life and from photos. Both approaches have key advantages and drawbacks. Painting from life is the best method for accurately capturing a scene's colors, but is somewhat limited to static scenes. Painting from photos can capture dynamic scenes, but has limited color accuracy. As an engineer, I'm interested in unifying the approaches in a way that preserves the best aspects of each.

Let me start by defining my terminology:


  • Painting photo: a photo of a painting that accurately matches the color of the physical painting
  • Camera photo: a photo straight from a camera without modification
  • Reference photo: a camera photo, augmented digitally for better use as a painting reference

Because all these photo types reside within the same digital color gamut, as I'll show, nothing prevents a reference photo from having the same desirable characteristics as a painting photo. However, camera photos make for poor reference photos due to their inaccurate and uninspiring portrayal of color. Therefore, I'm interested in how to turn camera photos into reference photos using image processing. You can think of this approach as an alternate path that parallels the traditional/academic process, both leading to the same end result.



I've previously written about two image processing techniques toward this goal - Histogram Diffusion for color differentiability, and 3D Focal Point Blur for focal realism. In this article, I'll discuss another approach, inspired by artistic color theory.

Background

In his 2009 book Landscape Painting, Mitchell Albala describes the concept of "color identity":
"Extremes of value profoundly affect a color's ability to be read as color...A color's chromatic identity is most visible when its value is neither too dark nor too light but is in the middle-value range."
Albala illustrates this concept of "color identity" with a chart, reproduced below, showing various colors transitioning from very light (near-white) values to very dark (near-black) values. Colors that are very light or very dark have nearly no perceptible "color". In other words, they read almost as pure values, or grayscale. Another way to think of this is how "nameable" a color is. This broad definition is what Albala means by "color identity".



To clarify the terminology:


  • Hue: color, independent of lightness and intensity
  • Saturation: how vibrant, intense, or colorful a color is
  • Value: how light or dark a color is

Albala illustrates this concept by comparing two schools of landscape painting:

1: 17th century Dutch "Golden Age" landscapes are characterized by "value-priority", maximizing value contrast (sometimes called chiaroscuro) at the cost of low color identity.


Meindert Hobbema, Entrance to a Village, c. 1665


I've copied the Levels histogram from GIMP (an open source Photoshop equivalent), a visualization of how the image utilizes the value gamut. This painting shows a strong bimodal (two-peak) distribution, which creates a value-priority effect.

2: Impressionist landscapes are characterized by "color-priority", maximizing color identity at the cost of low value contrast.

Claude Monet, Antibes Seen from Las Salis, 1888




The Levels histogram shows a clear difference: just one peak. The colors are much closer in value and are differentiated primarily by hue and saturation, creating a color-priority effect.

To simplify, you can think of these two schools as favoring differing regions of the color chart to get their ideas across:


Which is more realistic? More aesthetically pleasing? More correct? These are fundamentally subjective questions, best left to personal taste. As Albala notes, "[both] successfully depict natural light", "[although] the Impressionist approach to color is perhaps more in tune with contemporary color sensibilities".

My take is that neither image is particularly realistic, but I personally find the Impressionist approach much more painterly and compelling, and closer to "correct".


Albala has since expanded on this topic with a couple of excellent articles on his blog.


Quantifying Color Identity

Digital color provides a quantitative framework to examine and enact Albala's view of color identity. But it first requires some careful consideration of color space nuance.



RGB Color Space

Let's start at square one with RGB color, the standard for storing and displaying digital images. RGB color has three channels corresponding to red, green, and blue light, and within a channel, a pixel can take any of 256 values, 0-255. Therefore, there are 256^3, or about 16.8 million, colors available.

RGB's core purpose is telling monitors how to display color, and it's very effective for this. However, its utility in addressing color identity is very limited, because the intuitive properties of hue, saturation, and value - the way we think about color - are all interdependent with respect to the axes of R, G, and B.

HSV Color Space

Invented in the 1970s, the HSV (hue, saturation, value) color space was designed to provide a more intuitive framework for working with digital color. Whereas RGB is a cube, HSV is a cylinder; hue describes the angle, saturation the radius, and value the height. It's computationally easy to transform between RGB and HSV, which was initially a significant advantage but is largely moot in the current era. HSV's key shortcoming is that saturation and value are not actually independent - just close to it. This can be observed from a constant-hue chart, which corresponds to half of a cross section of the HSV cylinder. Here is H=130°, chosen arbitrarily:



Columns theoretically have constant saturation, and rows theoretically have constant value. To test the latter, we can convert the chart to grayscale, either by explicitly switching the image mode or desaturating it completely. Both provide the same result:



Examine the grayscale chart and you'll see, upon close inspection, a couple of key shortcomings:


  1. As you move horizontally within a row, value decreases (darkens) slightly. The actual disparity between min and max across rows is about 5%.
  2. As you move vertically within a column, the rate at which value changes is not uniform. Note for example the steep gradient from 10-15% value.

While very subtle in this example, these issues significantly impede HSV's ability to address color identity.
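If you want to reproduce this test yourself, here's a rough MATLAB sketch. The grid resolution is arbitrary, and rgb2gray stands in for GIMP's grayscale conversion, so the exact drift percentage may differ slightly:

```matlab
% Build a constant-hue chart (H = 130 deg) over saturation and value, then
% convert to grayscale to check whether rows really have constant value.
H = 130/360;                                % MATLAB's HSV uses 0-1 for hue
[S, V] = meshgrid(0:0.05:1, 1:-0.05:0);     % columns = saturation, rows = value
chart = hsv2rgb(cat(3, H*ones(size(S)), S, V));
gray = rgb2gray(chart);                     % luminance-weighted desaturation

% If value were truly independent of saturation, every row would be constant.
rowDrift = max(gray, [], 2) - min(gray, [], 2);
fprintf('Maximum within-row value drift: %.1f%%\n', 100*max(rowDrift));
```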

Lab Color Space

The Lab color space, invented in 1976, solves these problems with full independence of hue, saturation, and value, as well as perceptual uniformity. Like RGB, Lab is a 3D orthogonal color space:


  • L corresponds to a color's value, or lightness/darkness
  • a corresponds to a theoretical green-magenta/red axis
  • b corresponds to a theoretical blue-yellow axis

Whereas RGB is discretized 0-255, Lab is continuous and unlimited. Therefore, Lab can theoretically describe "imaginary" colors that cannot be displayed in RGB, or describe infinite colors between RGB discretizations. It is also (approximately) perceptually uniform, unlike RGB and HSV, meaning that Euclidean (straight-line) distance corresponds linearly to perceived color difference in the human eye, which is a significantly non-linear sensor.

Every color in RGB color space can be converted to Lab, and vice versa. Because the equations that perform this conversion are non-linear, the neat RGB cube becomes non-trivially deformed in Lab color space:



In this visualization, a uniform, dense point cloud is created inside the RGB cube, then converted to Lab, and its boundary is approximated using a concave hull algorithm. The mesh for each boundary is overlaid on the volume. The non-linearity of human vision causes the initially-uniform point cloud to be significantly non-uniform in Lab color space, which is reflected in its boundary mesh.
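As a rough sketch of how such a visualization can be built - here assuming MATLAB's rgb2lab for the conversion and alphaShape as the concave hull approximation:

```matlab
% Uniform point cloud spanning the RGB cube, converted to Lab.
[R, G, B] = ndgrid(linspace(0, 1, 24));          % 24^3 ~ 14,000 points
lab = rgb2lab([R(:), G(:), B(:)]);               % sRGB -> CIELAB (D65 white point)

% Approximate the (concave) boundary of the Lab gamut volume and plot it.
shp = alphaShape(lab(:,2), lab(:,3), lab(:,1));  % a, b, L as x, y, z
figure; plot(shp); axis equal
xlabel('a'); ylabel('b'); zlabel('L');
```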

These visualizations are very colorful - in fact, they could not be any more colorful, because their outer surfaces contain their most highly saturated colors. However, keep in mind that this visualization shows only the outer surfaces, which contain a small fraction of the complete color gamut. To view these less-colorful colors, you have to look inside these volumes. This can be achieved with a cross-section:



Here, I section the RGB cube along its R axis arbitrarily, and the Lab volume along its L axis. On the Lab gamut, I overlay the color space's "neutral axis", which occurs where a=0 and b=0. In other words, all colors on this line have zero hue because they are pure shades of gray. Note that the neutral axis intersects pure black and pure white. Hue and saturation are defined simply as:


  • Hue is a color's angle from the neutral axis, atan2d(b,a)
  • Saturation is a color's distance from the neutral axis, sqrt(a^2+b^2)

With this visualization, you can show that Lab has true value independence by converting any frame to grayscale:



Note that in Lab, there is only a single, constant shade of gray within the cross-section boundary - not so with RGB which has no such constraint.

This framework provides a basis for corroborating Albala's notion of color identity. Since the colors with the strongest color identity occur on the outer surface of the gamut volume, we can deduce that saturation is a proxy for color identity. Further, we can take the section area as a proxy for color identity potential at that value:



This is an aggregate result over all hues; as you can see from the cross-section, different hues achieve maximum color identity at different L values. This can be compared with the color swatch chart from earlier:




Here I've superimposed the earlier color swatch chart with the color identity potential plot. The swatches don't necessarily occur at the correct L values, so the result is merely suggestive, but the qualitative fit between the two is clearly strong.
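The potential curve itself can be computed directly from the Lab point cloud. A minimal sketch, using MATLAB's boundary function to estimate the area of each cross-section (the slab thickness and L sampling are arbitrary choices):

```matlab
% Proxy for color identity potential: the area of the Lab gamut cross-section
% at each lightness value.
[R, G, B] = ndgrid(linspace(0, 1, 32));
lab = rgb2lab([R(:), G(:), B(:)]);

Lvals = 5:5:95;
sectionArea = nan(size(Lvals));
for i = 1:numel(Lvals)
    slab = abs(lab(:,1) - Lvals(i)) < 2.5;      % thin slab around this L value
    if nnz(slab) >= 3
        [~, sectionArea(i)] = boundary(lab(slab,2), lab(slab,3));  % area in the a-b plane
    end
end
figure; plot(Lvals, sectionArea, '-o');
xlabel('L'); ylabel('Cross-section area (color identity potential)');
```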

Why Photos Tend Toward Value-Priority and Drab Colors

Modern digital cameras in "automatic" mode are so adept at adjusting their settings under the hood that it's easy to think of their images as objective truth. However, much of this process is actually based on heuristics that are designed to provide decent results over a wide range of conditions. This is ideal for the typical consumer who simply wants to point and shoot with minimal fuss.

However, for those who paint from photos, the automatic settings often provide seriously sub-optimal results - namely, they tend to minimize color identity. This is both deliberate and reasonable: creating a value-prioritized image is algorithmically easy and will usually look decent, whereas creating a color-prioritized image is not straightforward and will be prone to pitfalls. Allow me to illustrate:

Suppose you are the camera's setting adjustment algorithm. You look out at the scene and see this distribution of light:


Note that a scale and units are not provided; a camera is not a measurement tool. You simply see that there are some darker regions, some midtones, and some lighter regions. So that the image can be saved and displayed, your job is to fit this distribution into a fixed bounding box:


How do you do this? Approaches vary by manufacturer, but one certainty is that you don't want to saturate any part of the spectrum. This would mean "pushing" any part of the distribution outside of the bounding box, resulting in pure white or pure black. For example:


No good - the lower end is saturated.


No good - the upper end is saturated.


Maybe okay, but there's so much leftover space. What to do with it? You can shift the distribution left or right by as much as 25% without saturating. So which of these 50 different solutions is best? There simply isn't a good, general-purpose, algorithmic way to make this decision. So what generally happens is that the algorithm simply scales and stretches the distribution so that it's just shy of saturation on both ends.


This requires just a couple hard-coded parameters to work correctly, and is simple, open-loop (requiring no feedback), and almost always produces decent results. Wide distributions are safer and easier to work with, because by design they fit neatly into the bounding box.
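In MATLAB terms, this "stretch to just shy of saturation" behavior looks roughly like stretchlim plus imadjust - my stand-ins for whatever a particular camera actually does under the hood:

```matlab
% Emulate the "scale and stretch until just shy of saturation" heuristic:
% map the 1st/99th percentiles of each channel to near-black/near-white.
img = im2double(imread('peppers.png'));      % any RGB image
lims = stretchlim(img, [0.01 0.99]);         % per-channel low/high clip points
stretched = imadjust(img, lims, []);         % linear stretch into the full 0-1 box
figure; imshowpair(img, stretched, 'montage');
```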

This distribution comes from a photo of mine from Yellowstone:


Does the distribution or photo look familiar? In the sense of having a wide, bimodal (two-peak) distribution with few midtones, it's actually very similar to the drab Dutch landscape from earlier. This is no coincidence - the pictures were, in effect, recorded with a very similar process.

So, digital photos have a tendency to be drab for the same reason that 17th century Dutch landscapes did. And by drab, I mean lacking in color identity. What can be done about this?

How to Shift From Value-Priority to Color-Priority

The Lab color space works well for shifting photos from value-priority to color-priority, or vice versa, for the reasons discussed earlier. To summarize the process in geometric terms:

  1. Store a photo as an RGB point cloud
  2. Convert the RGB point cloud to Lab color space
  3. Manipulate the Lab point cloud using linear transforms
  4. Convert the Lab point cloud back to RGB for display
  5. Repeat from Step #3 until satisfied with result

The linear transforms in Step #3 are simple. For each of the three properties of hue, saturation, and value, the linear transform is defined by an offset and a scale that are applied globally to the entire point cloud. Continuing with the geometric analogy, these transforms as a whole enable you to move, stretch, compress, twist, and expand the point cloud in an intuitive and freeform way. It's easy to push parts of the point cloud outside the limited RGB gamut, so the Lab-to-RGB conversion process accounts for this by saturating (clamping) its input before conversion.
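Here's a minimal sketch of Step #3 - the function name, parameter names, and the final clamp are my own shorthand, not necessarily how the GUI implements it:

```matlab
function out = shiftPriority(img, Lscale, Loffset, Sscale, Soffset, Hoffset)
% Apply global linear transforms to an RGB image in Lab space: scale/offset
% lightness, scale/offset saturation (distance from the neutral axis), and
% rotate hue (angle about the neutral axis).
lab = rgb2lab(im2double(img));
L = lab(:,:,1);  a = lab(:,:,2);  b = lab(:,:,3);

L   = L*Lscale + Loffset;             % value transform
sat = hypot(a, b)*Sscale + Soffset;   % saturation transform
hue = atan2d(b, a) + Hoffset;         % hue transform
a = sat.*cosd(hue);
b = sat.*sind(hue);

out = lab2rgb(cat(3, L, a, b));
out = min(max(out, 0), 1);            % clamp out-of-gamut values for display
end
```

For example, shiftPriority(img, 0.7, 15, 1.1, 8, 0) compresses the value range toward the midtones and pushes saturation outward; the numbers are purely illustrative.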

Because these adjustments are best done by eye, they require a good deal of tweaking to get right. To facilitate this, I wrote a GUI to quickly get the settings right before exporting. Here's an overview:



A control panel lets you define the scale and offset for lightness (i.e. value), hue, and saturation on sliding scales. There's also a slider to control the quality of the preview, and buttons to restore defaults, compare before/after images, and to export the image at full resolution.




A histogram panel shows the result of the scaling operations, before and after. This is most helpful for shifting value to where the desired level of color identity can be achieved. The functionality for scaling hue is fun to play with, because it can produce wild and impossible color palettes, but for my purposes I leave it as-is.



Lastly, there's a preview panel which shows a downsampled copy of the image with the transforms applied. An alphanumeric string across the top conveys the transform parameters, which are also appended to the filename on export.

There's some skill and taste involved in manipulating photos with this technique. In general, I try to keep the peaks of the value histogram between 20 and 80%, outside of which there is minimal color identity. With the saturation scaling, the offset helps "push" neutrals outward for greater color identity, while the scale helps accentuate the already-vibrant regions of an image.

Demo Gallery

Here are some photos that I've attempted to shift from value-priority to color-priority using this technique, all taken with my 2003 Canon Rebel without any prior modifications. In each photo, the left half is the original and the right half is the result.


And to circle back to the original concept, here are a couple examples of shifting a painting's color gamut to align with a different school of painting:

Recasting a Dutch landscape as an Impressionist painting:


Meindert Hobbema, The Avenue at Middelharnis, 1689

Recasting an Impressionist painting as a Dutch landscape:


Claude Monet, Coquelicots [Poppy Field], 1873


Source Code

Freely available on my GitHub.

Acknowledgements


  • Mitchell Albala provided corrections and suggestions for the Motivation and Background sections.

Saturday, February 15, 2020

Paint Palette to Color Gamut

Overview

In mid-2019, I published "How Paints Mix", outlining a method for mathematical simulation of color in paint mixing. This method can be applied in two distinct ways:
  1. Color Mixing: Predict the digital color of an arbitrary mixture of primary paints.
  2. Color Decomposition: Decompose a digital color into a primary paint mixing recipe.
My original article focused heavily on Mixing, leaving Decomposition as an afterthought. In this article, I'll lay out a much-improved decomposition method that works by translating an analog paint palette to a digital color gamut.

Decomposition by Random Optimization

Previously, to decompose a digital color, I used a simple random optimization method:
  1. User provides a "target" digital color to decompose.
  2. Start a timer.
  3. Randomly generate a valid recipe and evaluate its color distance to the target.
  4. If the color distance is less than the previous best, save the result and reset the timer.
  5. Repeat #3-4 until 3 seconds have elapsed since the last timer reset.
  6. Return the best known solution.
This algorithm is simple and works well enough, but has the drawbacks of being relatively slow (~5 seconds per decomposition), not repeatable, and not necessarily optimal.
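In sketch form, the loop looked roughly like this, with randomRecipe and mixColor standing in for the palette-specific recipe generator and the mixing model from "How Paints Mix":

```matlab
% Sketch of decomposition by random optimization. randomRecipe() and
% mixColor() are placeholders for the recipe generator and mixing model.
targetLab = rgb2lab(targetRGB);               % targetRGB: 1x3 vector in [0,1]
bestDist = inf;
lastImprovement = tic;
while toc(lastImprovement) < 3                % stop 3 s after the last improvement
    recipe = randomRecipe();                  % valid recipe: sums to 1, <= 4 ingredients
    d = norm(mixColor(recipe) - targetLab);   % color distance in Lab
    if d < bestDist
        bestDist = d;
        bestRecipe = recipe;
        lastImprovement = tic;                % reset the timer
    end
end
```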

Size of Solution Space

I initially implemented random optimization because I estimated that the solution space of possible recipes for a given palette was far too large to search exhaustively. Considering my baseline values for painting:
  • Number of primary paints: 7
  • Maximum number of ingredients per recipe: 4
  • Mixing precision: 5% or 1/20
I reasoned that if any primary can have 21 possible values (0%, 5%, 10%,...100%), then the solution space could be as large as 21^7 or 1,800,000,000 recipes. While technically true that this represents an upper bound, this turns out to be an extremely inaccurate estimate.

When you account for the fact that recipes must sum to 100% and include no more than 4 ingredients with non-zero fractions, the solution space shrinks by an incredible 5 orders of magnitude to about 35,000 recipes. This solution space is then possible to generate and search exhaustively.

As a clerical note: I use the term "primary" loosely, to mean any paint in its pure, unmixed state, i.e. straight from the tube. Here are my primaries, all from Winsor & Newton's "Winton" line of oil paints:


Creating a Cookbook

Realizing that there is already a convenient term to describe an exhaustive collection of recipes, I began using the term "cookbook" to describe this database of colors. The main challenge of creating a cookbook is generating the set of all possible recipes. My method works as follows:
  1. Generate an ingredient subset matrix. This is an n-choose-k aka binomial coefficient problem. For 7 primaries and a 4 ingredient max, there are nchoosek(7,4) = 35 ingredient subsets, i.e. 35 unique ways of picking 4 ingredients from a set of 7 when order is irrelevant. This result looks like:
    • [1 2 3 4] Row 1
    • [1 2 3 5] Row 2
    • [1 2 3 6] Row 3
    • [1 2 3 7] Row 4
    • [1 2 4 5] Row 5
    • ...
    • [1 4 6 7] Row 19
    • [1 5 6 7] Row 20
    • [2 3 4 5] Row 21
    • [2 3 4 6] Row 22
    • ...
    • [3 4 6 7] Row 33
    • [3 5 6 7] Row 34
    • [4 5 6 7] Row 35 (end)
  2. Generate a fraction vector representing the set of all permitted values for a single ingredient. For 5% precision, this result looks like:
    • [0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 ... 0.90 0.95 1.00]
  3. Generate a matrix for all possible 4-element permutations of the fraction vector, with repetition permitted. I use permn for this purpose. This result looks like:
    • [0.00 0.00 0.00 0.00] Row 1
    • [0.00 0.00 0.00 0.05] Row 2
    • [0.00 0.00 0.00 0.10] Row 3
    • ...
    • [0.45 0.05 0.15 0.35] Row 83,861
    • ...
    • [0.45 0.85 0.10 0.10] Row 90,891
    • ...
    • [1.00 1.00 1.00 0.95] Row 194,480
    • [1.00 1.00 1.00 1.00] Row 194,481 (end)
  4. Delete all rows of the permutation matrix that do not sum to 1.00 or 100%. This winnows the list from 194,481 rows to 1,596 rows. This result looks like:
    • [0.00 0.00 0.00 1.00] Row 1
    • [0.00 0.00 0.05 0.95] Row 2
    • [0.00 0.00 0.10 0.90] Row 3
    • ...
    • [0.20 0.20 0.05 0.55] Row 796
    • [0.20 0.20 0.10 0.50] Row 797
    • [0.20 0.20 0.15 0.45] Row 798
    • ...
    • [0.95 0.00 0.05 0.00] Row 1,594
    • [0.95 0.05 0.00 0.00] Row 1,595
    • [1.00 0.00 0.00 0.00] Row 1,596 (end)
  5. Create a recipe matrix by copying the permutation matrix for each row of the ingredient subset matrix. This result looks like:
    • [0.00 0.00 0.00 1.00 0.00 0.00 0.00] Row 1
    • [0.00 0.00 0.05 0.95 0.00 0.00 0.00] Row 2
    • [0.00 0.00 0.10 0.90 0.00 0.00 0.00] Row 3
    • ...
    • [0.05 0.00 0.25 0.00 0.00 0.10 0.60] Row 24,243
    • [0.05 0.00 0.25 0.00 0.00 0.15 0.55] Row 24,244
    • [0.05 0.00 0.25 0.00 0.00 0.20 0.50] Row 24,245
    • ...
    • [0.00 0.00 0.00 0.95 0.00 0.05 0.00] Row 55,858
    • [0.00 0.00 0.00 0.95 0.05 0.00 0.00] Row 55,859
    • [0.00 0.00 0.00 1.00 0.00 0.00 0.00] Row 55,860 (end)
  6. Eliminate duplicate rows. This reduces the row count from 55,860 to 35,651.
  7. Simulate each recipe, recording its reflectance spectrum and color in XYZ, RGB, and Lab formats.
  8. Return the completed cookbook, totaling 17.3 MB (or 2.4 MB if excluding reflectance spectra which are not strictly necessary). The entire process of generating the cookbook for this set of parameters takes about 18 seconds.
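For reference, steps 1-6 condense to a few lines of MATLAB (permn is the File Exchange function mentioned above; the mixing simulation of step 7 is omitted):

```matlab
nPrimaries = 7;  maxIngredients = 4;  precision = 0.05;

subsets   = nchoosek(1:nPrimaries, maxIngredients);   % step 1: 35 x 4
fractions = 0:precision:1;                            % step 2: 21 values
P = permn(fractions, maxIngredients);                 % step 3: 194,481 x 4
P = P(abs(sum(P, 2) - 1) < 1e-9, :);                  % step 4: rows summing to 1 -> 1,596

% Step 5: expand each 4-ingredient row into a full 7-primary recipe
% for every ingredient subset (35 x 1,596 = 55,860 rows).
recipes = zeros(size(subsets, 1) * size(P, 1), nPrimaries);
row = 0;
for s = 1:size(subsets, 1)
    block = zeros(size(P, 1), nPrimaries);
    block(:, subsets(s, :)) = P;
    recipes(row + (1:size(P, 1)), :) = block;
    row = row + size(P, 1);
end

recipes = unique(recipes, 'rows');                    % step 6: 55,860 -> 35,651
% Step 7 would then simulate each row, e.g. cookbookLab(i,:) = simulateRecipe(recipes(i,:)).
```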

Decomposition by Cookbook

With the cookbook complete, decomposition is very fast, accurate, and repeatable. A target color in RGB format is converted to Lab color space, its color distance to each cookbook recipe is evaluated, and the recipe with minimum color distance to the target is returned. This happens in about 0.04 seconds, permitting decomposition of about 24 colors per second, or ~125x the speed of random optimization.
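In code, the lookup reduces to a brute-force nearest-neighbor search over the cookbook's Lab colors (recipes and cookbookLab as generated above; the exact implementation may differ):

```matlab
% Decompose a target RGB color by nearest-neighbor search in Lab space.
targetLab = rgb2lab(targetRGB);                      % targetRGB: 1x3 vector in [0,1]
d = sqrt(sum((cookbookLab - targetLab).^2, 2));      % Euclidean distance to every recipe
[minDist, idx] = min(d);
bestRecipe = recipes(idx, :);                        % mixing fractions for the primaries
```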

There are a few finer details required to make this process work optimally:
  • If supplying target colors from an RGB image, the image's value domain must be aligned with the cookbook's value domain. In other words, the image must be scaled such that its pure black matches the darkest color in the cookbook, and so forth for the lightest. This assumes the palette can achieve a near-black and a near-white color. I use imadjust for this purpose, and it is functionally equivalent to a Levels adjustment in Photoshop/GIMP.
  • Lab colors must be derived in a consistent fashion between image and cookbook. In other words, since image colors can only be derived by RGB > XYZ > Lab, then the cookbook Lab colors should be derived by Reflectance > XYZ > RGB > XYZ > Lab, rather than Reflectance > XYZ > Lab. This is counter-intuitive but greatly improves color matching accuracy. I use Bruce Lindbloom's color conversion equations and an illuminant of D65.
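For the first point, the alignment can be approximated with something like the following - a crude sketch of the idea; the actual imadjust call will depend on the cookbook and image at hand:

```matlab
% Map the image's darkest/lightest values onto the cookbook's darkest/lightest
% achievable lightness, so both share the same value domain.
Lmin = min(cookbookLab(:,1));   Lmax = max(cookbookLab(:,1));
img  = im2double(img);
inLims  = stretchlim(rgb2gray(img), 0);              % image's actual black/white points
outLims = [Lmin; Lmax] / 100;                        % cookbook's value range, scaled to 0-1
aligned = imadjust(img, [inLims inLims inLims], [outLims outLims outLims]);
```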

Other Uses for a Cookbook

Although my initial interest in cookbooks was color decomposition, I realized that they had many other interesting and helpful applications. As a whole, they describe the complete color gamut (set of possible colors) for a given palette. What does this look like? How does it change as a function of which primaries are permitted? How does mixing precision affect it?

Digital color is 3-dimensional, whether working in RGB, HSV, XYZ, or Lab color space. This makes visualizing it on a 2D monitor non-trivial. Here, I'll show two possible visualization methods, one theoretical and one applied.

Theoretical: Point Cloud Cross-Sections

Each recipe corresponds to a single point in 3D color space, so a cookbook corresponds to a point cloud in 3D space. To fully view this, a cross-section plane can be swept through the point cloud at varying lightness (L) values, and shown from an isometric and front view. For the baseline parameter set:


Notice that there is much higher recipe density at lower L (lightness) values. This is because oil primaries tend to be very dark to maximize their economy, and because paint mixes tend to darken more readily than lighten. To reduce the spatial gaps between recipes, especially on the lighter end of the color gamut, the precision can be increased from 1/20 to 1/40:


In reality, paint mixing is continuous rather than discretized by a fixed precision, so the gaps between recipes in this scatter-plot visualization are an artifact of this simplifying assumption.
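For those curious, the sweeping cross-section views can be generated with a loop like this (cookbookLab from step 7 above; the slab half-width is an arbitrary choice):

```matlab
% Sweep a cross-section plane through the cookbook's Lab point cloud.
for L0 = 10:10:90
    slab = abs(cookbookLab(:,1) - L0) < 5;             % recipes near this lightness
    c = min(max(lab2rgb(cookbookLab(slab,:)), 0), 1);  % display colors, clamped to RGB
    scatter(cookbookLab(slab,2), cookbookLab(slab,3), 20, c, 'filled');
    axis([-110 110 -110 110]); axis square
    xlabel('a'); ylabel('b'); title(sprintf('L = %d', L0));
    drawnow; pause(0.5)
end
```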

For the sake of curiosity, the cookbook may be calculated such that it allows only 2 ingredients per recipe.


Here, distinct nodes representing pure yellow and pure white (the two brightest primaries) can be seen, with paths sweeping toward them from the other primaries. Blue can be seen transitioning through green on its way to yellow, and burnt sienna nearly maintains a=0 and b=0 as it transitions to white, a truly neutral gray.

Suppose the ingredient limit is increased from 2 to 3:


The color gamut fills in considerably, but there are still significant gaps. Overshooting the 4-ingredient baseline, the color gamut may be evaluated for 5-ingredient recipes:


Notice that the 5-ingredient color gamut is not significantly different from the 4-ingredient color gamut. There is more complexity, but the result is effectively the same. This corroborates my initial intuition of 4 ingredients per recipe being the "sweet spot" of balancing accuracy with complexity. It's also in line with traditional oil painting wisdom that a recipe needs at most 3 primaries plus white, for 4 ingredients total.

Paring the palette down just to these essential primaries - yellow, red, blue, white:



As expected, this color gamut has the same bounding box as with 7 primaries, and is nearly as full. This shows that the non-essential primaries - the browns and green - are not strictly necessary from a chromatic perspective, but practically helpful to have on hand.

Suppose that the palette includes only neutral or cool primaries:



And similarly for neutral or warm primaries:




Applied: Image Fit

Another way of visualizing a cookbook's color gamut is comparing it against a reference image. For paint mixing applications, this gives some intuition about how well a palette can recreate a provided image. Suppose this is the reference image:


By evaluating the minimum color distance between image and cookbook for each pixel, the palette can be tailored to the choice of image. Returning to the baseline parameter set of 7 primaries, 4 ingredients, and 5% precision, and downsampling the image for speed, the result is as follows:


Overall, the fit is very good - the maximum color error is ~1.7 JND, or just noticeable differences, and most of the image is below 1.0 JND. Unsurprisingly, the largest errors occur in the brightest region of the image. Suppose both browns - burnt sienna and burnt umber - are eliminated from the palette. The result is as follows:


The fit is noticeably worse, suggesting that at least one of the browns should be retained. Applying my artistic intuition, suppose that burnt umber is returned to the palette:


And this is a good result. The maximum error is no higher than with the full palette, while the number of primaries has been reduced from 7 to 6, and correspondingly the cookbook recipe count has been reduced from 35,000 to 16,000. This result indicates that there is no need for burnt sienna in this particular image. Its elimination reduces the amount of paint required, and speeds up decomposition calculations by almost a factor of 2.

These palette evaluations are computationally expensive, hence the need for heavy downsampling. The results echo the traditional oil painting wisdom of using a limited palette consisting of chromatically orthogonal pigments. This method provides a quantitative rationale for this centuries-old technique.
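For completeness, the per-pixel evaluation boils down to a loop like this (the downsample size is arbitrary, and the error map is shown in raw Delta-E rather than converted to JND):

```matlab
% Evaluate how well the cookbook covers a reference image: for each pixel of a
% heavily downsampled copy, find the distance to the nearest cookbook color.
small  = imresize(im2double(img), [60 NaN]);         % downsample for speed
pixLab = reshape(rgb2lab(small), [], 3);
err = zeros(size(pixLab, 1), 1);
for i = 1:size(pixLab, 1)
    err(i) = sqrt(min(sum((cookbookLab - pixLab(i,:)).^2, 2)));
end
errMap = reshape(err, size(small, 1), size(small, 2));
figure; imagesc(errMap); axis image; colorbar; title('Color error (\DeltaE)');
```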

Conclusion

Cookbooks are an extremely powerful tool for color decomposition, and provide quantitative insight into the traditional oil painting concept of a limited palette.