I wanted to accomplish something relatively simple: Given an image with a black background, replace all the black pixels with transparent ones. In pictures, take a picture like:

import cv2
import numpy as np
from matplotlib import pyplot as plt

And turn it into:

My Google-fu failed me spectacularly: after 2 hours of trawling StackOverflow, I had turned up nothing. Only through lots of trial and error (being the NumPy noob that I am) did I finally stumble onto the solution:

image = cv2.imread('./Desktop/maze.png')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGBA)
image[np.all(image == [0, 0, 0, 255], axis=2)] = [0, 0, 0, 0]

Now if you’re just interested in the solution, then you can stop reading here. However, if you are like me and are not at all satisfied until you’ve dug deeper, teased apart this terse one-liner, and learned some cool NumPy tricks along the way, then read on!

Step 1: Adding the alpha channel

The first order of business is to create the alpha channel for the image. OpenCV to the rescue:

image = cv2.imread('./Desktop/maze.png') # works for jpeg too.
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGBA)

Let’s check that the image does indeed have 4 channels:

image.shape

Gives

  (720, 966, 4)
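
If you want to see what the conversion does to the data without needing an image file, here is a NumPy-only sketch of roughly the same transformation: reverse the channel order from BGR to RGB, then append a fully opaque alpha channel. The tiny 1x2 array bgr is made up for illustration, and this only mimics the effect of cv2.COLOR_BGR2RGBA on 8-bit images; it is not how OpenCV implements it.

```python
import numpy as np

# Hypothetical 1x2 image in OpenCV's BGR order: one blue pixel, one black pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 0]]], dtype=np.uint8)

rgb = bgr[..., ::-1]                                  # BGR -> RGB
alpha = np.full(bgr.shape[:2] + (1,), 255, np.uint8)  # fully opaque alpha
rgba = np.concatenate([rgb, alpha], axis=2)           # stack along the channel axis

print(rgba.shape)  # (1, 2, 4)
print(rgba[0, 0])  # [  0   0 255 255]
```

The black pixel ends up as [0, 0, 0, 255], which is exactly the value we’ll be hunting for later.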

Interlude 1: Boolean Masking

Here comes the interesting bit. Before we go any further, we need to cover some NumPy concepts. The first is boolean masking. Given an array

arr = np.arange(3) # array([0, 1, 2])

and a mask:

mask = np.array([True, False, True])

We can apply a mask to the NumPy array to turn on or off values:

arr[mask]

Produces

array([0, 2])

Note that the mask needs exactly one value per element. A mask with four values will not work on an array with only three, and the error message makes this obvious:

arr[[True, False, True, False]]

Results in:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-18-0766bdb07090> in <module>()
----> 1 arr[[True, False, True, False]]

IndexError: boolean index did not match indexed array along dimension 0; dimension is 3 but corresponding boolean dimension is 4
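
A mask doesn’t only select values; it can also sit on the left-hand side of an assignment, which is the trick the final solution relies on. A minimal sketch using the same three-element array (the replacement value -1 is arbitrary):

```python
import numpy as np

arr = np.arange(3)                    # array([0, 1, 2])
mask = np.array([True, False, True])

arr[mask] = -1                        # overwrite only the positions where mask is True
print(arr)                            # [-1  1 -1]
```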

Let’s now repeat this exercise but with higher dimensions. We’ll create a 3D array:

arr = np.arange(24).reshape(2, 3, 4)

Side note: It took me too long to realize the relationship between the value given to np.arange() and np.reshape(). In the previous line, 2 * 3 * 4 = 24. Any other combination that doesn’t result in 24 would result in an error (try it!).
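
Here is a small sketch of that relationship. It also shows -1, which asks NumPy to work out one dimension for you; the variable name flat is just for illustration.

```python
import numpy as np

flat = np.arange(24)

# -1 tells NumPy to infer that dimension: 24 / (2 * 3) = 4.
print(flat.reshape(2, 3, -1).shape)  # (2, 3, 4)

try:
    flat.reshape(2, 3, 5)            # 2 * 3 * 5 = 30, not 24
except ValueError as err:
    print("reshape failed:", err)
```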

arr looks like:

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

You can think about the array as having 2 rows, 3 columns, and each column consisting of 4 elements. Now, let’s say I want to apply a mask to get rid of [4, 5, 6, 7] and [20, 21, 22, 23]. How would we do that?

First, let’s note that the shape of arr is (2, 3, 4). Now let’s create a mask:

mask = np.array([[True, False, True], [True, True, False]])

And apply the mask to arr:

arr[mask]

This results in:

array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

How did I come up with that mask? If I wanted to turn off individual elements, then I would have needed a mask of shape (2, 3, 4), exactly the same shape as arr. Here, however, we want to mask some of the columns. (They’re shown here horizontally, so you need to tilt your head 90 degrees to the left if you’re having trouble visualizing.)

This means that the mask’s shape has to be (2, 3). This makes sense because the shape tells us that there are 2 * 3 = 6 “columns” in total.
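
To make the difference concrete, here is a quick sketch comparing a full-shape mask with a (2, 3) mask on the same array. The names element_mask and column_mask are mine, and "keep the even numbers" is just an arbitrary condition for building a full-shape mask.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# Mask with the same shape as arr: picks individual elements, so the result is 1-D.
element_mask = arr % 2 == 0
print(arr[element_mask].shape)  # (12,)

# Mask matching only the first two axes: picks whole 4-element "columns".
column_mask = np.array([[True, False, True], [True, True, False]])
print(arr[column_mask].shape)   # (4, 4)
```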

Now it should be easy to figure out the mask if we wanted to exclude

[[12, 13, 14, 15],
 [16, 17, 18, 19],
 [20, 21, 22, 23]]

We will require a mask of (2,). Note that it is not (2, 1) since we need to drop one dimension. Also, [True, False] has indeed a shape of (2,):

mask = np.array([True, False])

mask.shape gives:

(2,)

Now applying that mask

arr[mask]

gives

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]]])
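
If you’re curious what happens with a (2, 1) mask instead, here is a small sketch. As far as I can tell, NumPy does not broadcast boolean index arrays, so the mask’s shape has to match the leading axes of arr exactly, and the mismatch surfaces as an IndexError.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# Shape (2,) matches the first axis, so one whole "row" is kept.
print(arr[np.array([True, False])].shape)  # (1, 3, 4)

try:
    arr[np.array([[True], [False]])]       # shape (2, 1) vs. leading axes (2, 3)
except IndexError as err:
    print("IndexError:", err)
```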

Interlude 2: numpy.all()

In the previous section, we wrote out the masks by hand. Clearly no one is going to do that for an image. What we need is a function that will generate the mask for us. But before we do that, we need to know what we’re looking out for.

The objective is to make every black pixel transparent. That is, turn [0, 0, 0, 255] (a black pixel) into a transparent one: [0, 0, 0, 0]. To do this, we need to find the locations where a black pixel occurs and set the pixels at those locations to be transparent.

Enter numpy.all:

Test whether all array elements along a given axis evaluate to True.

This is exactly what we need. What’s the given axis though?

arr = np.arange(0, 10, 2)

arr is

array([0, 2, 4, 6, 8])

Here we’re asking if every element along axis = 0 (i.e. the only axis here and axes start from 0) is even:

np.all(arr % 2 == 0, axis = 0)  # Returns True!

Now to up the ante with a 2-D array:

arr = np.arange(0, 10).reshape(2, 5)

arr is

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

Now, is every element in the inner arrays, that is, [0, 1, 2, 3, 4] and [5, 6, 7, 8, 9] greater than or equal to 5?

np.all(arr >= 5, axis=1)

The result shouldn’t be too surprising:

array([False,  True])

Notice that we’re using axis = 1, since we’re looking at the inner arrays. This raises the question: what happens when we use axis = 0? In that case the test runs down each column, comparing the elements that share the same position across the rows:

np.all(arr >= 2, axis=0) 

We get:

array([False, False,  True,  True,  True])

Is every element in [0, 5] greater than or equal to 2? (Nope.) What about [1, 6]? (No.) Then [2, 7]? Yes! Same goes for [3, 8] and [4, 9]. It takes a bit of getting used to, but it makes sense once you think about it. Fortunately for our use case, we’ll just consider the last axis because we want to compare against the pixel values.

You might be wondering why arr >= 5 even works at all, or arr % 2 == 0 for that matter. The reason is broadcasting. In simple terms, the >= or % operation is applied to every element of the array. In both cases, the result of the comparison is a boolean mask. For example:

arr >= 5

Returns:

array([[False, False, False, False, False],
       [ True,  True,  True,  True,  True]])

Notice that the second array is all True. That is why np.all(arr >= 5, axis=1) returns array([False, True]).
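
The same two ideas, broadcasting a comparison and then collapsing the last axis with np.all, carry over to pixels. Here is a minimal sketch on a made-up 2x2 RGBA image (the array img and its colors are purely illustrative):

```python
import numpy as np

# Hypothetical 2x2 RGBA image: black, red / black, white (all fully opaque).
img = np.array([[[0, 0, 0, 255], [255, 0, 0, 255]],
                [[0, 0, 0, 255], [255, 255, 255, 255]]], dtype=np.uint8)

matches = img == [0, 0, 0, 255]  # compared against every pixel, shape (2, 2, 4)
mask = np.all(matches, axis=2)   # True only where all 4 channels match

print(mask)
# [[ True False]
#  [ True False]]
```

The red pixel matches on three of its four channels, but np.all still reports False for it, which is what we want.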

Step 2: Getting the Mask

Almost there. Now we have enough knowledge to generate the mask needed to select all the black pixels. First, extract the background color values:

bg_color = image[0][0]

is

array([  0,   0,   1, 255], dtype=uint8)

Notice here that it is not completely black: [0, 0, 1, 255] instead of [0, 0, 0, 255]. Now we can create the mask:

mask = np.all(image == bg_color, axis=2)

The contents of mask

array([[ True,  True,  True, ...,  True,  True,  True],
       [ True,  True,  True, ...,  True,  True,  True],
       [ True,  True,  True, ...,  True,  True,  True],
       ...,
       [ True,  True,  True, ...,  True,  True,  True],
       [ True,  True, False, ...,  True,  True,  True],
       [ True,  True, False, ...,  True,  True,  True]])
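
Since the background here turned out to be [0, 0, 1, 255] rather than pure black, an exact match against a single color can miss pixels that are only almost black. One possible workaround, which is my own variation and not what the code above does, is to treat any pixel whose color channels are all below a small cutoff as background. The threshold value of 10 and the sample array are assumptions for illustration.

```python
import numpy as np

# Hypothetical 1x3 RGBA image: pure black, near-black, and white.
img = np.array([[[0, 0, 0, 255], [0, 0, 1, 255], [255, 255, 255, 255]]],
               dtype=np.uint8)

threshold = 10  # assumed cutoff for "dark enough"; tune per image
near_black = np.all(img[..., :3] <= threshold, axis=2)  # ignore the alpha channel

print(near_black)        # [[ True  True False]]
print(near_black.sum())  # 2
```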

Step 3: Apply the Mask and Replace the Pixels

Finally, we can put everything together and set the black pixels to transparent:

image[mask] = [0, 0, 0, 0]
plt.imshow(image)

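Side note: When you save this image, it must be in a format that supports transparency, such as PNG. JPEG has no alpha channel, so depending on the library, saving as a JPEG would either fail with an error or discard the transparency.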
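
One more thing to watch for if you save with OpenCV: cv2.imwrite expects channels in BGR(A) order, while our image is RGBA after the earlier conversion. Below is a NumPy-only sketch of the reordering on a made-up one-pixel image; cv2.cvtColor(image, cv2.COLOR_RGBA2BGRA) should do the same job. The output file name in the comment is a placeholder.

```python
import numpy as np

# Hypothetical 1x1 RGBA image: a fully transparent red pixel.
rgba = np.array([[[255, 0, 0, 0]]], dtype=np.uint8)

# Swap the red and blue channels and keep alpha last: RGBA -> BGRA.
bgra = rgba[..., [2, 1, 0, 3]]
print(bgra[0, 0])  # [  0   0 255   0]

# With OpenCV installed, this could then be written out as a PNG:
# cv2.imwrite('maze_transparent.png', bgra)  # placeholder file name
```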

Summary

What was seemingly a simple operation, making a background transparent, took us on quite a journey through some interesting NumPy functions and clarified some fundamental concepts along the way.

I hope you enjoyed reading this and even learnt a thing or two!