Metal Shaders: Vibrance

In this blog post we will cover how to increase vibrance in your images without blowing out the already bright colors that might be present. The code comes from the Metal GPUImage 3 framework.

Saturation vs Vibrance

In a previous blog post, I covered a saturation filter. As a quick recap, the saturation filter blends the pixel's luminance value with its original color. This blend is applied uniformly to every pixel, regardless of how saturated that pixel already is.

This means that if you already have some saturated colors in your image, they can get blown out with a saturation adjustment:

Blown out colors caused by a saturation adjustment.

Vibrance, on the other hand, is a slightly less blunt instrument. The equation takes into consideration how saturated the original color already is; the more saturated it is, the less the equation impacts it.

Shader Code

Here is the Vibrance fragment function (with the VibranceUniform structure omitted):

fragment half4 vibranceFragment(
	SingleInputVertexIO fragmentInput [[stage_in]],
	texture2d<half> inputTexture [[texture(0)]],
	constant VibranceUniform& uniform [[ buffer(1) ]])
{
	constexpr sampler quadSampler;
	half4 color = inputTexture.sample(quadSampler, fragmentInput.textureCoordinate);
	
	half average = (color.r + color.g + color.b) / 3.0;
	half mx = max(color.r, max(color.g, color.b));
	half amt = (mx - average) * (-uniform.vibrance * 3.0);
	color.rgb = mix(color.rgb, half3(mx), amt);
	
	return half4(color);
}

First, we find the average intensity of the current pixel by adding the red, green, and blue values together and dividing the sum by three.

Next, we determine which color is most prominent in the currently sampled pixel. This is done with a nested max() call: the inner call compares green and blue, and the outer call pits the winner against red, so whichever channel is largest emerges as the victor.

Next we need to determine the weight we will apply to our final color mix. This is determined using the following components:

  • The maximum color value
  • The average color value
  • The user input value from the UI determining the degree of vibrance

You need to determine the difference between the brightest color and the average color. The larger this difference, the more of an impact the shader has on that specific pixel.

Like the saturation shader, the vibrance function uses a mix call, here blending toward the maximum color value. However, instead of using the slider value directly as the mix amount, it uses the weight calculated above.
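To make the weighting concrete, here is a minimal CPU-side sketch of the same per-pixel math in Swift. The pixel values and the vibrance setting below are made up purely for illustration:

// Per-pixel vibrance, mirroring the fragment function above.
func applyVibrance(r: Float, g: Float, b: Float, vibrance: Float) -> (Float, Float, Float) {
    let average = (r + g + b) / 3.0
    let mx = max(r, max(g, b))
    let amt = (mx - average) * (-vibrance * 3.0)
    // mix(x, y, a) = x + (y - x) * a, blending each channel toward the maximum.
    func mix(_ x: Float, _ y: Float, _ a: Float) -> Float { x + (y - x) * a }
    return (mix(r, mx, amt), mix(g, mx, amt), mix(b, mx, amt))
}

// A gray pixel (mx == average) is untouched, no matter the vibrance setting.
print(applyVibrance(r: 0.5, g: 0.5, b: 0.5, vibrance: 0.4)) // (0.5, 0.5, 0.5)

// For a colorful pixel, the non-maximum channels are pushed down while the
// maximum channel stays put, so bright colors are not blown out.
print(applyVibrance(r: 0.7, g: 0.5, b: 0.3, vibrance: 0.4)) // ~(0.7, 0.452, 0.204)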

Increased vibrance without the blow out.

Conclusions

Color operations on images are interesting in that there are varying degrees of fidgeting you can do with the operations. There are blunt force operations that tend to be good enough for quick and dirty processing, but there are also options available if you want more control over the output of your image.

Metal Shaders: Color Adjustments

The first few shaders I’ve covered in this series utilized a constant that was applied uniformly to the entire image. However, for many of the algorithms we use in image processing, we want to control the degree of effect applied to the image; we don’t always want all or nothing. The purpose of this post is to cover several adjustment filters present in GPUImage 3:

  • Brightness
  • Contrast
  • Exposure
  • Gamma
  • Saturation
  • Red-Green-Blue Channel Adjustment

Before we jump into these shaders, I would like to briefly cover how you can receive user input to determine the percentage of effect you would like to apply to the image.

Encoding Parameters in Metal

There are two different ways to get parameters to the GPU:

  • Hard code them to constant memory space
  • Encode them to buffers to be accessed by the GPU

In the previous post about luminance we utilized the first method. The luminance algorithm doesn’t change, and it is used by multiple shaders in the GPUImage library.

For our adjustments in this section, we have a slider value to pass from the UI to the GPU. This value is encoded into a buffer that can be accessed by the GPU. We are already encoding the image we are processing as a texture. You can read more about encoding here.

Here is an example of one of our Swift classes defining a new shader operation with a single parameter to encode:

public class BrightnessAdjustment: BasicOperation {
    public var brightness:Float = 0.0 { didSet { uniformSettings[0] = brightness } }

    public init() {
        super.init(fragmentFunctionName:"brightnessFragment", numberOfInputs:1)

        uniformSettings.appendUniform(0.0)
    }
}

Pay close attention to this line:

uniformSettings.appendUniform(0.0)

This is where we are encoding a value into our uniform buffer. We want to start out with the value set to 0.0. We want the uniform to update and respond to user input, so we take advantage of Swift’s didSet functionality:

public var brightness:Float = 0.0 { didSet { uniformSettings[0] = brightness } }

Any time the brightness variable changes, the value of the uniform setting gets updated to the current value. Since we only have one value, it is stored at index [0] of the buffer.
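If it helps to see the didSet mechanics in isolation, here is a stripped-down, framework-free sketch of the same pattern (the ToyBrightnessAdjustment type here is purely illustrative and not part of GPUImage):

// The property's didSet keeps slot 0 of a plain Float array in sync,
// just like uniformSettings does in the real class.
final class ToyBrightnessAdjustment {
    private var uniformSettings: [Float] = []
    var brightness: Float = 0.0 { didSet { uniformSettings[0] = brightness } }

    init() {
        uniformSettings.append(0.0) // stands in for uniformSettings.appendUniform(0.0)
    }

    func currentUniforms() -> [Float] { uniformSettings }
}

let adjustment = ToyBrightnessAdjustment()
adjustment.brightness = 0.25        // e.g. driven by a slider callback
print(adjustment.currentUniforms()) // [0.25]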

So we have a buffer with a value of 0.0 that can be accessed by the shader. But the shader doesn’t know what this value correlates to. In order to do that, we need to set up a custom data structure:

typedef struct
{
    float brightness;
} BrightnessUniform;

We use this in the function signature to help the shader “decode” what this value is used for in our fragment equations:

fragment half4 brightnessFragment(
    SingleInputVertexIO fragmentInput [[stage_in]],
    texture2d<half> inputTexture [[texture(0)]],
    constant BrightnessUniform& uniform [[ buffer(1) ]])

All of the shaders I detail in this blog post follow this pattern. The only real change between them is the name of the fragment function, the uniform structure, and the constant passed into the fragment function. Let’s look at the math that goes into these effects next.

Brightness

Brightness is the overall intensity of the colors in an image. Here is the brightness filter’s code:

fragment half4 brightnessFragment(
    SingleInputVertexIO fragmentInput [[stage_in]],
    texture2d<half> inputTexture [[texture(0)]],
    constant BrightnessUniform& uniform [[ buffer(1) ]])
{
    constexpr sampler quadSampler;
    half4 color = inputTexture.sample(quadSampler, fragmentInput.textureCoordinate);
	
    return half4(color.rgb + uniform.brightness, color.a);
}

To brighten the image, we increase the intensity of each channel by the same amount. The brightness shader is very similar to the saturation filter, except the adjustment is not weighted: the red, green, and blue values are each augmented identically. This isn’t a particularly refined color adjustment algorithm, but it gets the job done. Changing the brightness does not fundamentally change the dynamic range of the image.
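As a quick worked example with made-up channel values, the same per-channel addition looks like this on the CPU:

// Brightness shifts every channel by the same offset.
func adjustBrightness(r: Float, g: Float, b: Float, brightness: Float) -> (Float, Float, Float) {
    return (r + brightness, g + brightness, b + brightness)
}

print(adjustBrightness(r: 0.2, g: 0.4, b: 0.6, brightness: 0.1)) // ~(0.3, 0.5, 0.7)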

Brightness Filter

Contrast

While brightness represents the overall intensity of an image, contrast represents the difference between the lightest parts of an image and the darkest parts. The larger the difference between these values, the more contrast you have in an image. A black and white graphic novel page has incredible contrast because each point on the page is either all or nothing.

Here is our contrast filter:

fragment half4 contrastFragment(
    SingleInputVertexIO fragmentInput [[stage_in]],
    texture2d<half> inputTexture [[texture(0)]],
    constant ContrastUniform& uniform [[ buffer(1) ]])
{
    constexpr sampler quadSampler;
    half4 color = inputTexture.sample(quadSampler, fragmentInput.textureCoordinate);
	
    return half4(((color.rgb - half3(0.5)) * uniform.contrast + half3(0.5)), color.a);
}

While brightness was an additive operation, contrast is a multiplicative one. The operation pivots around middle gray: each channel’s distance from 0.5 is scaled by the contrast value, so values near the midpoint barely move while values far from it are pushed further toward black or white. This means that the larger the contrast value you use, the wider the disparity will be between the darkest and lightest pixel values.
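A small worked example (with values chosen purely for illustration) shows the pivot around the midpoint:

// Contrast scales each channel's distance from middle gray (0.5).
func adjustContrast(_ value: Float, contrast: Float) -> Float {
    return (value - 0.5) * contrast + 0.5
}

// With a contrast of 2.0, a dark pixel gets darker and a bright pixel gets
// brighter, while middle gray stays exactly where it was.
print(adjustContrast(0.25, contrast: 2.0)) // 0.0
print(adjustContrast(0.5,  contrast: 2.0)) // 0.5
print(adjustContrast(0.75, contrast: 2.0)) // 1.0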

Contrast Filter

Exposure

In photography, exposure is the amount of light you allow through the lens. If you are in a low light situation, such as astrophotography, you want the exposure set very high. In full light situations, like a mid-afternoon picnic, you need to tamp down the exposure to avoid having your image be blown out.

Here is the shader that we use to emulate exposure:

fragment half4 exposureFragment(
    SingleInputVertexIO fragmentInput [[stage_in]],
    texture2d<half> inputTexture [[texture(0)]],
    constant ExposureUniform& uniform [[ buffer(1) ]])
{
    constexpr sampler quadSampler;
    half4 color = inputTexture.sample(quadSampler, fragmentInput.textureCoordinate);
	
    return half4((color.rgb * pow(2.0, uniform.exposure)), color.a);
}

This formula utilizes a new math function: the pow function. The pow function takes two parameters:

  • The base value
  • The power (exponent) to which the base is raised

So, for example, if you had

pow(2, 8);

the result would be 256 (two raised to the eighth power).

The shader takes the color value passed in and multiplies it by two raised to the power of the exposure value. The exposure value can be anywhere between 0.0 and 1.0. Any nonzero number raised to the zero power is one, so because of how we clamp the exposure values, the multiplier will always land between 1.0 and 2.0.
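To see what that multiplier works out to for a few exposure settings, here is a quick Swift sketch; the sample channel value of 0.4 is made up for illustration:

import Foundation

// Exposure multiplies every channel by 2 raised to the exposure value.
func adjustExposure(_ value: Double, exposure: Double) -> Double {
    return value * pow(2.0, exposure)
}

print(adjustExposure(0.4, exposure: 0.0)) // 0.4   (2^0 = 1, no change)
print(adjustExposure(0.4, exposure: 0.5)) // ~0.57 (2^0.5 is about 1.41)
print(adjustExposure(0.4, exposure: 1.0)) // 0.8   (2^1 = 2, twice as bright)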

Exposure Filter. Notice how easy it is to “blow out” the image.

Gamma

Light, like sound, is not experienced by humans linearly. Small amounts of light are perceived as much brighter than they really are, while similar proportional increases at the bright end of the range do not appear significantly brighter.

Here is a good link to an article about what gamma is.

fragment half4 gammaFragment(
    SingleInputVertexIO fragmentInput [[stage_in]],
    texture2d<half> inputTexture [[texture(0)]],
    constant GammaUniform& uniform [[ buffer(1) ]])
{
    constexpr sampler quadSampler;
    half4 color = inputTexture.sample(quadSampler, fragmentInput.textureCoordinate);
	
    return half4(pow(color.rgb, half3(uniform.gamma)), color.a);
}

For our gamma correction, we pass in a value set by the user and raise each color channel to that power.
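Here is a quick worked example of that exponent in action, with made-up channel values. It also shows why the midtones brighten without the highlights being pushed past 1.0:

import Foundation

// Gamma raises each channel to a power. Values below 1.0 brighten the
// midtones and values above 1.0 darken them, while 0.0 and 1.0 stay fixed.
func adjustGamma(_ value: Double, gamma: Double) -> Double {
    return pow(value, gamma)
}

print(adjustGamma(0.5, gamma: 0.5)) // ~0.71 (midtone brightened)
print(adjustGamma(0.5, gamma: 2.0)) // 0.25  (midtone darkened)
print(adjustGamma(1.0, gamma: 0.5)) // 1.0   (white is never pushed past 1.0)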

Gamma Filter. Notice how much brighter this image is than with the brightness and exposure filters, without blowing out the image.

Saturation

Saturation is how much chrominance is present in an image. In our earlier post about luminance we discussed how to create a monochromatic image. We are going to take this a step further and allow the user to adjust the amount of color they want in their image. Here is the shader:

fragment half4 saturationFragment(
    SingleInputVertexIO fragmentInput [[stage_in]],
    texture2d<half> inputTexture [[texture(0)]],
    constant SaturationUniform& uniform [[ buffer(1) ]])
{
    constexpr sampler quadSampler;
    half4 color = inputTexture.sample(quadSampler, fragmentInput.textureCoordinate);

    half luminance = dot(color.rgb, luminanceWeighting);

    return half4(mix(half3(luminance), color.rgb, half(uniform.saturation)), color.a);
}

First we compute the pixel’s luminance using the same three-component weighting vector we defined back in our luminance shader. We will use this value to adjust the color saturation.

Back when we simply wanted the luminance, we applied that value uniformly to all three color channels. Now we need to blend a portion of that gray value with a portion of the original color. For this, we need another Metal function that is new to this series: mix.

mix takes three parameters:

  • First color value
  • Second color value
  • Percentage of first color

The actual math behind mix looks like this:

T mix(T x, T y, T a)
x + (y - x) * a

In other words, the difference between the second and first values is scaled by the percentage and added back onto the first value. In our case, the first value is the pixel’s luminance and the second is its original color, so the saturation slider determines how much of the original color is blended back on top of the gray luminance value.
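Here is the same blend spelled out in Swift, with a made-up luminance and red value, so you can see exactly what the saturation slider is doing:

// mix(x, y, a) = x + (y - x) * a, written out so the arithmetic is visible.
func mix(_ x: Float, _ y: Float, _ a: Float) -> Float {
    return x + (y - x) * a
}

let luminance: Float = 0.55 // the weighted gray value for this pixel
let red: Float = 0.80       // the pixel's original red channel

print(mix(luminance, red, 0.0)) // 0.55  - fully desaturated (pure gray)
print(mix(luminance, red, 0.5)) // 0.675 - halfway between gray and the original
print(mix(luminance, red, 1.0)) // 0.8   - the original color, untouched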

This shader is a good example of how many shaders in GPUImage are composed and build upon smaller, simpler shaders. One reason for this series of blog posts starting with very simple shaders is to show the reader how to intuitively build more complex shaders.

Saturation Filter

RGB

So far all of our shader functions have affected all color channels equally. One powerful aspect of having multiple color channels is that they can be adjusted independently.

In order to adjust each channel independently, we need more than one uniform setting:

public class RGBAdjustment: BasicOperation {
    public var red:Float = 1.0 { didSet { uniformSettings[0] = red } }
    public var green:Float = 1.0 { didSet { uniformSettings[1] = green } }
    public var blue:Float = 1.0 { didSet { uniformSettings[2] = blue } }

    public init() {
        super.init(fragmentFunctionName:"rgbAdjustmentFragment", numberOfInputs:1)

        uniformSettings.appendUniform(1.0)
        uniformSettings.appendUniform(1.0)
        uniformSettings.appendUniform(1.0)
    }
}

This could be slightly confusing, so I’ll break it down a little. We have a uniform variable for each of the three color channels, and each is connected to its own slider in the UI. Any time one of those sliders changes, it updates the value of the specific variable it is attached to. This is the same pattern as our previous shaders, but if you look at the initializer, we append three identical uniforms. We have to explicitly set the uniforms once at launch, which is why those three identical lines appear in the Swift file. Up above, in the public variables, we explicitly associate the red, green, and blue slider variables with a “slot” in the uniform buffer. Once the uniforms are initially set, we no longer care about the code in the initializer; the sliders take over responsibility for their own specific slots.

In order to keep this straight on the GPU side, we create a data structure mirroring these public variables so the GPU knows how the data it is being sent is laid out:

typedef struct
{
    float redAdjustment;
    float greenAdjustment;
    float blueAdjustment;
} RGBAdjustmentUniform;

This buffer of data is again passed into the shader as a parameter:

fragment half4 rgbAdjustmentFragment(
    SingleInputVertexIO fragmentInput [[stage_in]],
    texture2d<half> inputTexture [[texture(0)]],
    constant RGBAdjustmentUniform& uniform [[ buffer(1) ]])
{
    constexpr sampler quadSampler;
    half4 color = inputTexture.sample(quadSampler, fragmentInput.textureCoordinate);
	
    return half4(color.r * uniform.redAdjustment,
                 color.g * uniform.greenAdjustment,
                 color.b * uniform.blueAdjustment,
                 color.a);
}

Each color channel is multiplied by the percentage associated with the slider in the UI. We access each color channel by name using dot notation.
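With made-up slider values, the per-pixel effect looks like this:

// Each channel is scaled independently by its slider value.
func adjustRGB(r: Float, g: Float, b: Float,
               redAdjustment: Float, greenAdjustment: Float, blueAdjustment: Float) -> (Float, Float, Float) {
    return (r * redAdjustment, g * greenAdjustment, b * blueAdjustment)
}

// Warming up a neutral gray pixel by boosting red and pulling back blue.
print(adjustRGB(r: 0.5, g: 0.5, b: 0.5,
                redAdjustment: 1.2, greenAdjustment: 1.0, blueAdjustment: 0.8)) // (0.6, 0.5, 0.4)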

Maxed Out Red Channel
Maxed Out Green Channel
Maxed Out Blue Channel

Conclusions

Many of the most common image processing functions we take for granted in programs like Photoshop are surprisingly simple. From these simple building blocks we can construct many large and impressive effects. These posts might seem like humble beginnings, but big things are built from small pieces.

Metal Shaders: Luminance

Another shader in the GPUImage framework that doesn’t take any inputs other than the current pixel color is the luminance filter. This blog post will not only go over this very simple shader, but will also go into the science behind how the algorithm was developed.

How Humans Perceive Color

Luminance is the overall brightness of a specific image. If you look at a black and white image, you’re looking at the light present in the image minus the color. The luminance filter should accurately represent the image minus the color.

You might think that you would need an equal representation of red, green, and blue in the image, but humans perceive the brightness of different colors differently. Specifically, humans are most sensitive to colors within the green spectrum.

In fact, video data is commonly encoded using the YCbCr color space, which emphasizes the Y component representing luminance. Many image sensors have twice as many green sensors as red or blue. This will be covered in more detail in later blog posts.

Fun fact: In the 1960 movie Psycho, the substance used for blood during the stabbing scene in the shower was chocolate syrup. A blood-red substance does not show up well on black and white film, so they needed a darker substance to read convincingly on screen.

The flip side of luminance is chrominance. Chrominance is the saturation and color of the specific pixel. Human beings are more sensitive to brightness than saturation. This makes sense evolutionarily. If you’re wandering around at night where there are a lot of predators, your vision needs to be sensitive to differences in brightness and movement rather than being sensitive to how green a plant is. Color can give you information about whether a plant is poisonous or not, but fine differences in shades of red don’t really give you an advantage the way that brightness does.

Luminance Shader

In GPUImage 3, we have a file called OperationShaderTypes.h. This file contains our shared vertex structures and common constant values used in multiple shaders. One of those constants is our luminance weighting:

constant half3 luminanceWeighting = half3(0.2125, 0.7154, 0.0721); 

This algorithm came from Graphics Shaders: Theory and Practice. We use these values in the luminance fragment shader:

fragment half4 luminanceFragment(
	SingleInputVertexIO fragmentInput [[stage_in]],
	texture2d<half> inputTexture [[texture(0)]])
{
	constexpr sampler quadSampler;
	half4 color = inputTexture.sample(quadSampler,
				fragmentInput.textureCoordinate);
	half luminance = dot(color.rgb, luminanceWeighting);
	
	return half4(half3(luminance), color.a);
}

The meat of this function is this line:

half luminance = dot(color.rgb, luminanceWeighting);

One of the better explanations I have found of the dot product is on this site. What this line of code does is take the red, green, and blue values of the input pixel, multiply each by its luminance weight, and sum the results. The red value is multiplied by 0.2125. The green is weighted by a whopping 0.7154, while the blue only gets 0.0721. These three weights added together equal 1.

If you were to calculate the luminance of white, it would look something like this:

(1.0 * 0.2125) + (1.0 * 0.7154) + (1.0 * 0.0721) = 1.0
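Running the same weighted sum for pure red, pure green, and pure blue (a small Swift sketch with the weights written out by hand) shows how differently each channel is treated:

// The dot product from the shader, written out channel by channel.
func luminance(r: Float, g: Float, b: Float) -> Float {
    return r * 0.2125 + g * 0.7154 + b * 0.0721
}

print(luminance(r: 1.0, g: 0.0, b: 0.0)) // 0.2125 - pure red becomes a dark gray
print(luminance(r: 0.0, g: 1.0, b: 0.0)) // 0.7154 - pure green becomes a bright gray
print(luminance(r: 0.0, g: 0.0, b: 1.0)) // 0.0721 - pure blue becomes a very dark gray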

The weights are chosen so that the result can never exceed 1.0. If red, green, and blue were weighted equally, a desaturated red would come out as exactly the same shade of gray as a desaturated green or blue. Even though they would all be the same gray, the result would look off to our eyes because we perceive the brightness of these colors differently, as seen below:

This luminance value is then written to all three color channels to ensure the final pixel is a shade of gray:

return half4(half3(luminance), color.a);

Conclusions

This is still a relatively simple shader. Many of the shaders in the GPUImage framework build off of luminance, so understanding this concept will help you later with more complex shaders.

Next, we’re going to start covering shaders that require properties to be shared between the CPU and the GPU and you will learn how to encode those values into buffers.

Colorized version of the featured desaturated image

Metal Shaders: Color Inversion

This is the first in a series of blog posts about the math behind image manipulation filters used in GPUImage 3. I am hoping that these posts will give the reader a good foundation around common shader functions that can be used to build more complex shaders.

One of the most basic color manipulation functions is color inversion. Color inversion requires just a single input: the original pixel color. The color is inverted and the value is returned to the rest of the rendering pipeline.

How Does Color Inversion Work?

In order to work with and understand shaders and image/graphics manipulation, you need to understand how the computer processes images.

If you’ve ever worked with Photoshop, you’ve probably worked a bit with the color tools. You were given options for color choice based on a red, green, and blue value between 0 and 255.

Most of the colors we can see can be expressed as a mixture of red, green, and blue. Further, most of those values can be expressed using 8 bits of data for each color component. 8 bits represents 256 values, hence the range from 0 to 255.

Color Representation in Metal

In the Metal Shading Language, the fragment function returns a half4 value. half4 is a four element data structure composed of floats at half precision. A regular float has 32 bits of precision and a half float has 16 bits of precision. Metal is natively optimized for 16 bit data types, so use those when possible.

You might be wondering why we have a half4 if we only have red, green, and blue values. The final value is for the alpha channel, which controls the opacity of the color output.

Apple’s Cocoa frameworks represent colors as a percentage between 0.0 and 1.0. This means that to get the inverse of the percentage of each color, you simply need to subtract the value from 1.0. You can take my word for it, or we can look over a few simple examples of this in practice.

White is created by outputting 100% of red, green, and blue. This is represented as:

half4 = (1.0, 1.0, 1.0, 1.0);

Subtract each of those values from 1.0 and you wind up with:

half4 = (0.0, 0.0, 0.0, 1.0)

That example is pretty easy and self explanatory. Let’s look at a slightly more complex example. Let’s invert blue. To have pure blue on the screen, you have 100% blue and 0% green and red:

half4 = (0.0, 0.0, 1.0, 1.0)

Each of these values is subtracted from one:

Red = 1.0 - 0.0 = 1.0
Green = 1.0 - 0.0 = 1.0
Blue = 1.0 - 1.0 = 0.0

The inverted blue value is:

half4 = (1.0, 1.0, 0.0, 1.0)

This results in yellow:

So far all of these examples have been of either 0% or 100%. Does this still work at values in the middle? Absolutely.

half4 = (0.5, 0.5, 0.5, 1.0)

This is middle gray. Inverting middle gray should leave it exactly the same. Let’s try it out:

Each of these values is subtracted from one:

Red = 1.0 - 0.5 = 0.5
Green = 1.0 - 0.5 = 0.5
Blue = 1.0 - 0.5 = 0.5

As you can see, none of these values changed, which is as it should be:

half4 = (0.5, 0.5, 0.5, 1.0)
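The whole rule fits in a couple of lines of Swift; here it is applied to the three examples above:

// Inversion: subtract each color channel from 1.0 and leave alpha alone.
func invert(r: Float, g: Float, b: Float, a: Float) -> (Float, Float, Float, Float) {
    return (1.0 - r, 1.0 - g, 1.0 - b, a)
}

print(invert(r: 1.0, g: 1.0, b: 1.0, a: 1.0)) // white       -> black
print(invert(r: 0.0, g: 0.0, b: 1.0, a: 1.0)) // blue        -> yellow
print(invert(r: 0.5, g: 0.5, b: 0.5, a: 1.0)) // middle gray -> middle gray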

Color Inversion Shader

For GPUImage, we didn’t create a separate vertex shader for every fragment shader. Many classes of shaders need the same inputs, so we set up a single vertex shader for all fragment shaders that require a single input. The output type of that vertex function is SingleInputVertexIO:

struct SingleInputVertexIO
{
	float4 position [[position]];
	float2 textureCoordinate [[user(texturecoord)]];
};

Each of our single input shaders requires the current vertex position and the texture coordinate. Here is the single-input vertex function that produces this output:

vertex SingleInputVertexIO oneInputVertex(
	device packed_float2 *position [[buffer(0)]],
	device packed_float2 *texturecoord [[buffer(1)]],
	uint vid [[vertex_id]])
{
	SingleInputVertexIO outputVertices;
	
	outputVertices.position = float4(position[vid], 0, 1.0);
	outputVertices.textureCoordinate = texturecoord[vid];
	
	return outputVertices;
}

The vertex function pulls in the positions and texture coordinates that were encoded into buffers on the CPU side. Since most of our processing happens in the individual fragment shaders, the purpose of this vertex shader is basically to pass those values through to the fragment function.

The final color inversion shader from GPUImage is here:

fragment half4 colorInversionFragment(
	SingleInputVertexIO fragmentInput [[stage_in]],
	texture2d<half> inputTexture [[texture(0)]])
{
	constexpr sampler quadSampler;
	half4 color = inputTexture.sample(
		quadSampler, 
		fragmentInput.textureCoordinate);
	
	return half4((1.0 - color.rgb), color.a);
}

The fragment function has two parameters:

  • The current interpolated position
  • The current texture

The texture tells us what image we’re processing and the position tells us which specific pixel the fragment shader will be processing.

First, we create a sampler to sample from the texture. Next, we create a variable to hold the current color by sampling the texture at the specific coordinate we received in the parameters.

Finally, we are doing our color inversion calculation. The first three values are the red, green, and blue values we are inverting. The final value is the alpha/opacity value. We do not want to invert that value, so that is simply passed through as is.

Conclusion

Sorry if this post feels like beating a dead horse by stating the obvious. Personally, I had to change the way I think about programming to grok how to create shaders. Breaking a shader down into these simple components helped me see that there is a reason each formula exists, rather than just copying and pasting an algorithm found online.

With graphics, everything is expressed mathematically. It’s important to realize that the people who wrote these algorithms were attempting to create an effect and had to think about how to accomplish that mathematically. These aren’t magic. Every shader I go through for the rest of this series builds on the ideas I express here.