I was using MacOS’s Preview to sign a PDF document and came across this amazing feature that lets you create a digital version of your signature by clicking a photo of it through your webcam. This intrigued me towards thinking what kind of technology goes into building a nifty feature like that. I haven’t been able to get all my answers yet but the following are a list of few things that I have learnt till now.
Setting up the Environment
I have primarily used Python 2.7.x for the experimentation. The following libraries are required :
- mahotas - scikit-image - matplotlib - numpy
These libraries can be found on pip or any such python package manager.
Reading an image into memory
Image can be read by using mahotas.
The image is stored in the form of a multidimensional numpy array of pixels. This is useful when further processing is done on the image.
Displaying the image using pyplot
Displaying an image through various stages of processing can be useful and give insights into how the process affects the image.
Converting an image from RGB to Gray scale
It’s difficult to handle a colour image during processing because the image becomes a 3-Dimensional array, this is usually solved by converting the image to a gray scale (“removing” the third dimension) where varying intensity of the gray colour represents the image.
This function returns a numpy array of dtype float64. Care should be taken to convert it back into integer before performing tasks that require integer type values.
Converting an image array from float64 to uint8
Through the stages of processing, data type casting is a much used technique because of the varied kind of algorithms used.
- np.uint8 : Unsigned integer (0 to 255)
- float64 : Double precision float: sign bit, 11 bits exponent, 52 bits mantissa
In order to detect an object, its borders should be detected first. For that purpose we use Gradient Vectors of pixels of an image. Roughly speaking, gradient vector is the “change in x and y direction” of a pixel. Chris McCormick explains it beautifully in his blog.
Gradient vector can be calculated by using numpy’s gradient() method. This method returns two vectors of change in gradient x and gradient y respectively.
Since gradient vector is the "change in x and y direction",
the change will be maximum around borders in an image
The following was the outcome of the experiment:
As can be seen we extract useful features of the image, step by step removing/ignoring the information we don’t need.
We followed the following steps :
- Removed the colours by converting the image to gray scale
- Replaced the pixel values by calculating pixel gradient
- Calculated the gradient vector’s magnitude by the formula :
g = (gx ** 2 + gy ** 2) ** .5
- On plotting this magnitude, we can an image with borders clearly marked.