Monday, November 18, 2019

Histogram Diffusion Image Filter

Summary

An overview of an experimental image processing algorithm that uses 1D heat diffusion and exact histogram matching to augment the chromatic realism (i.e. the perceived color accuracy) of an RGB photo.

Background

A dream of mine since childhood has been a camera that can capture the exact image seen by one's eyes. No such device yet exists, but this dream explains my interest in painting that developed later in life. It's my belief that painting has persisted in the modern world because it retains an essential advantage over photography: it is the superior medium for accurately representing images seen by eye - in other words, the human visual experience.

Photography has its place too, but its images are cheap and inhuman. Although it invariably captures forms with perfect accuracy, it is clumsy and narrow with hue, saturation, and value, the essential ingredients for describing color. Compounding this is its unnatural way of focusing on a plane, instead of a point as eyes do. In other words, a photo is like a superimposition of an infinite set of eye-images, each focused on a different point on the same focal plane. Photos are great for quickly capturing appearances, but not how it feels to see. Here, painting at its best is unsurpassed.

Despite the shortcomings of photography, it remains a powerful and useful tool for artists. An interest of mine is how photography can be used as a basis for painting, and how various technical approaches can be used to mitigate its shortcomings. In this article, I'll describe an image filter that may be useful for augmenting the chromatic realism of a reference image.

Hypothesis

Over a series of recent walks through the local fall foliage, I was thinking about cognitive aspects of color perception. It stands to reason that the human eye is good at differentiating objects by subtle differences in color, as these were useful skills in improving the evolutionary fitness of our ancestors. The human eye can perceive only a limited band of wavelengths - roughly 380 to 750 nm. I reasoned that in general, an image would be maximally differentiable if these wavelengths were represented equally - in other words, if a histogram describing the frequencies of visible light was flat.

Histogram equalization achieves this ideal of a flat histogram and maximum differentiability, but at the cost of realism. The problem is that most scenes don't actually have a flat histogram, so enforcing one causes unrealistic and harsh shifts in color.


Examples of "naive" histogram equalization applied independently to images' R, G, B channels
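
For reference, here is a minimal Python sketch of the kind of "naive" per-channel equalization described above. It is an illustration, not the code used to produce the figures: the function names (equalize_channel, equalize_rgb) and the 256-level assumption are mine, and each channel is treated as a flat list of integers.

```python
def equalize_channel(values, levels=256):
    """Naive histogram equalization of one channel (flat list of ints)."""
    # Count how often each level occurs.
    hist = [0] * levels
    for v in values:
        hist[v] += 1

    # Cumulative distribution: number of pixels at or below each level.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    # Remap levels so the output CDF is as close to a straight line as possible.
    cdf_min = next(c for c in cdf if c > 0)
    total = len(values)
    if total == cdf_min:
        return list(values)  # constant channel: nothing to spread out
    lut = [max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
           for c in cdf]
    return [lut[v] for v in values]


def equalize_rgb(pixels):
    """Equalize R, G, B independently; pixels is a list of (r, g, b) tuples."""
    channels = [equalize_channel([p[i] for p in pixels]) for i in range(3)]
    return list(zip(*channels))
```

Because the three channels are remapped independently, their relative balance changes, which is one source of the harsh color shifts mentioned above.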

I concluded that a flat histogram has "too much" differentiability, while a raw histogram has "too little", with neither feeling quite realistic. I wondered if some intermediate histogram would offer an ideal balance between the two extremes, and realized that 1D diffusion, as if by the heat equation, offered a solution for continuously interpolating between these extremes. This is because the steady-state solution for 1D diffusion is a flat line, while its initial condition may be any arbitrary vector of values. The hypothesis then emerged:

Diffusing an image's histogram, as if by the heat equation, can be used to augment its chromatic realism.

Implementation

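The 1D heat equation states that the derivative of temperature with respect to time is proportional to its second derivative with respect to space, i.e. its curvature:

$$\frac{\partial T}{\partial t} = \alpha \frac{\partial^2 T}{\partial x^2}$$

Here T is temperature, t is time, x is position, and α is the constant of proportionality (the diffusivity).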

For an initial condition of T, this can be solved easily using the finite-difference method. Applying this to images requires replacing temperature, T, with "level count", i.e. how often each value occurs within each channel, while the level itself plays the role of position. This step has no real dimensional basis, but is convenient and easy. Working in RGB color space, each channel can be operated on independently as if it were a grayscale image. These arguments can also be generalized to alternate color spaces such as HSV.
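
As a rough sketch of what the finite-difference step can look like, the Python below builds a channel's level-count histogram and advances it with an explicit (forward-time, centered-space) update. This is an assumed discretization, not necessarily the one in the MATLAB source: the names channel_histogram, diffuse_step, and diffuse are hypothetical, r stands for α·Δt/Δx², and the end bins are simply held constant as a placeholder for the boundary handling discussed next.

```python
def channel_histogram(values, levels=256):
    """Level counts for one channel given as a flat list of ints."""
    hist = [0.0] * levels
    for v in values:
        hist[v] += 1.0
    return hist


def diffuse_step(hist, r=0.25):
    """One explicit step of the 1D heat equation on a histogram.

    r = alpha * dt / dx**2 (assumed parameter); the explicit scheme is
    stable only for r <= 0.5. The two end bins are left unchanged here.
    """
    new = list(hist)
    for i in range(1, len(hist) - 1):
        # Discrete curvature: left neighbour - 2 * centre + right neighbour.
        new[i] = hist[i] + r * (hist[i - 1] - 2.0 * hist[i] + hist[i + 1])
    return new


def diffuse(hist, steps, r=0.25):
    """Apply diffuse_step repeatedly; more steps means a flatter histogram."""
    for _ in range(steps):
        hist = diffuse_step(hist, r)
    return hist
```

The number of steps plays the role of diffusion runtime: zero steps returns the raw histogram, and a very large number approaches the steady state.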

Less trivial yet critically important is how boundary conditions are handled. The boundaries of an image's histogram represent its saturation points. It's best to avoid saturating an image when taking or editing a photo, as it causes a loss of information and an overall loss of realism. In the most severe cases, when all three channels are saturated, the resulting color is pure white or black, and no visual information is encoded. This commonly occurs as a result of over- or under-exposing an image. However, moderate saturation may be acceptable in some cases. I define "moderate saturation" as a channel's boundary values being 10% or less of its peak value.

To permit diffusion while discouraging saturation, I enforce these constraints:
  • If a boundary is ≤10% saturated, fix its value over time, i.e. set it as a Dirichlet boundary.
  • If a boundary is >10% saturated, insulate it so that it tends to drift downward over time, i.e. set it as a Neumann boundary.
These constraints ensure, for typical images, that no additional saturation occurs as a result of the diffusion. The steady-state solutions of this equation with these boundary conditions are always straight lines, practically but not exactly flat, very much like canonical histogram equalization. In practice, best results are achieved well before reaching diffusive steady-state.
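
The boundary rule above can be sketched in Python as follows. Several details are assumptions on my part rather than facts about the MATLAB source: the boundary type is chosen once from the initial histogram, the insulated (Neumann) end is treated as a zero-flux wall in a cell-centered scheme, and the names boundary_kind and diffuse_with_boundaries are hypothetical.

```python
def boundary_kind(edge_count, peak_count, threshold=0.10):
    """'dirichlet' if the edge bin is at most 10% of the peak, else 'neumann'."""
    return "dirichlet" if edge_count <= threshold * peak_count else "neumann"


def diffuse_with_boundaries(hist, steps, r=0.25, threshold=0.10):
    """Explicit 1D diffusion of a histogram with per-end boundary conditions.

    r = alpha * dt / dx**2 (assumed parameter, must be <= 0.5 for stability).
    Returns the diffused histogram and the boundary kind chosen for each end.
    """
    hist = [float(h) for h in hist]
    peak = max(hist)
    left = boundary_kind(hist[0], peak, threshold)
    right = boundary_kind(hist[-1], peak, threshold)

    for _ in range(steps):
        new = list(hist)
        for i in range(1, len(hist) - 1):
            new[i] = hist[i] + r * (hist[i - 1] - 2.0 * hist[i] + hist[i + 1])
        # Dirichlet ends keep their value. Neumann ends pass nothing through
        # the outer wall, so they only exchange counts with the inner neighbour.
        if left == "neumann":
            new[0] = hist[0] + r * (hist[1] - hist[0])
        if right == "neumann":
            new[-1] = hist[-1] + r * (hist[-2] - hist[-1])
        hist = new
    return hist, left, right
```

With both ends insulated the total count is conserved; with a fixed end it is not, so the diffused histogram generally has to be rescaled to the pixel count before it is used as a target.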

To transform the image according to the diffused histograms, I used an existing exact histogram matching algorithm. This adds a small amount of imperceptible noise to turn the discrete-valued image into a continuously-valued one, which enables every pixel to be sorted in strictly ascending/descending order. The image can then be transformed so that it exactly reproduces the target histogram, here the diffused one.
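
To make the idea concrete, here is a minimal Python sketch of exact histogram matching by noisy sorting. It is not the existing algorithm referred to above, only an illustration of the same principle. The names integer_counts and exact_histogram_match are hypothetical, the noise is uniform in [0, 1) so that it breaks ties without reordering distinct integer levels, and the target is rescaled to whole-pixel counts because a diffused histogram is generally not integer-valued.

```python
import random


def integer_counts(target_hist, n_pixels):
    """Scale a non-negative target histogram to integers summing to n_pixels."""
    total = sum(target_hist)
    scaled = [h * n_pixels / total for h in target_hist]
    counts = [int(s) for s in scaled]
    # Hand the leftover pixels to the bins with the largest fractional parts.
    leftover = n_pixels - sum(counts)
    order = sorted(range(len(scaled)),
                   key=lambda i: scaled[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def exact_histogram_match(values, target_hist, seed=0):
    """Remap one channel (flat list of ints) to match target_hist exactly."""
    rng = random.Random(seed)
    # Sub-level noise makes every pixel value distinct, giving a strict order.
    keys = [v + rng.random() for v in values]
    order = sorted(range(len(values)), key=lambda i: keys[i])

    # Walk the sorted pixels, filling each output level with its quota.
    counts = integer_counts(target_hist, len(values))
    out = [0] * len(values)
    pos = 0
    for level, count in enumerate(counts):
        for i in order[pos:pos + count]:
            out[i] = level
        pos += count
    return out
```

Pixels that shared a level in the original may land on different output levels, which is where the algorithm is effectively guessing.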

Results

All photos in this section were taken with my 2003 Canon Rebel. No modifications were made prior to diffusing the histograms. The diffusion runtimes are equal between channels, but not between images - this was qualitatively fine-tuned by eye.


This is a simple example to start with. The original's colors are very tightly clustered around a central value, so the histograms resemble bell curves and diffusion essentially amounts to a conventional horizontal scaling of the histogram.


In R (Red Channel) and G (Green Channel), fixing the left-hand boundary pushes the histogram to the right as it diffuses, brightening those channels and finding definition in the nearly-saturated shadows. In B (Blue Channel), the insulated left-hand boundary is allowed to drift downward, and the exact histogram matching algorithm injects noise, guessing at the actual values of these saturated pixels. This guesswork is well-hidden in the result, because B contributes almost nothing to a color's perceived value, i.e. its perceived brightness/darkness. This is due to a quirk of human vision which places disproportionate emphasis on the Green Channel.
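
To put a rough number on how little blue contributes to perceived brightness, the sketch below uses the Rec. 709 relative-luminance weights for linear RGB. The choice of that standard is my assumption for illustration; the function name relative_luminance is hypothetical and inputs are taken to be in the range 0 to 1.

```python
# Rec. 709 weights for linear R, G, B (assumed here for illustration).
WEIGHTS = (0.2126, 0.7152, 0.0722)


def relative_luminance(r, g, b):
    """Weighted sum of linear R, G, B components, each in [0, 1]."""
    wr, wg, wb = WEIGHTS
    return wr * r + wg * g + wb * b
```

Under these weights pure blue carries about 7% of the luminance of white, against roughly 72% for pure green.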


As with the desert mountain photo, but without the haze, diffusion essentially scales the histogram horizontally, accentuating the greener spots and highlighting some contrasting patches of pale-copper/salmon soil. Note that diffusion enables augmentation of the Blue Channel, where histogram scaling is infeasible due to near-saturation.



This image is more bi-modal in nature, due to the clustering around both sky and ground. Because the original image covers the available gamut relatively fully, conventional histogram scaling cannot be used. However, diffusion encourages fuller use of the mid-range gamut between these two clusters, reducing the warm glow and increasing the overall contrast, especially in the ground cover and clouds. A curious and persistent bug occurs here - sparse regions of tree foliage set against the sky turn whitish, as if snow-covered.



Another example of an initially well-utilized color gamut that can be augmented by diffusion.


An especially good result wherein R and G are both fixed, while B is insulated on the low end. The fog is thinned and the underlying colors are better illuminated.




Another bimodal gamut where scaling is inappropriate due to near-saturation on the low end initially. Diffusion preserves the means of each cluster, while diffusing them individually to find subtle hues in the sky and better distinguish rock from greenery.

Conclusions

Histogram diffusion is a potentially useful image processing filter for augmenting chromatic realism. It especially provides an advantage over conventional histogram scaling for images that are one or more of:

  • Bi- or multi-modal
  • Nearly-saturated initially, making histogram scaling infeasible
  • Strongly dissimilar between color channels, requiring per-channel treatment

Initial experiments applying this same methodology to HSV color space did not seem to produce significantly different/better results, but more exploration could be done in this and other color spaces.


Source Code

Freely available on my GitHub in MATLAB.
