Face Recognition using Deep Learning CNN in Python

CNN case study

Convolutional Neural Networks(CNN) changed the way we used to learn images. It made it very very easy! CNN mimics the way humans see images, by focussing on one portion of the image at a time and scanning the whole image.

CNN boils down every image as a vector of numbers, which can be learned by the fully connected Dense layers of ANN. More information about CNN can be found here.

Below diagram summarises the overall flow of CNN algorithm.

CNN face recognition case study

In this case study, I will show you how to implement a face recognition model using CNN. You can use this template to create an image classification model on any group of images by putting them in a folder and creating a class.

Getting Images for the case study

You can download the data required for this case study here.

The data contains cropped face images of 16 people divided into Training and testing. We will train the CNN model using the images in the Training folder and then test the model by using the unseen images from the testing folder, to check if the model is able to recognise the face number of the unseen images or not.

Reading image data for CNN


Creating a mapping for index and face names

The above class_index dictionary has face names as keys and the numeric mapping as values. We need to swap it, because the classifier model will return the answer as the numeric mapping and we need to get the face-name out of it.

Also, since this is a multi-class classification problem, we are counting the number of unique faces, as that will be used as the number of output neurons in the output layer of fully connected ANN classifier.

Face id mapping for CNN


Creating the CNN face recognition model

In the below code snippet, I have created a CNN model with

  • 2 hidden layers of convolution
  • 2 hidden layers of max pooling
  • 1 layer of flattening
  • 1 Hidden ANN layer
  • 1 output layer with 16-neurons (one for each face)

You can increase or decrease the convolution, max pooling, and hidden ANN layers and the number of neurons in it.

Just keep in mind, the more layers/neurons you add, the slower the model becomes.

Also, when you have large amount of images, in the tune of 50K and above, then your laptop’ CPU might not be efficient to learn those many images. You will have to get a GPU enabled laptop, or use cloud services like AWS or Google Cloud.

Since the data we have used for the demonstration is small containing only 244 images for training, you can run it on your laptop easily 🙂

Apart from selecting the best number of layers and the number of neurons in it, for each layer, there are some hyper parameters which needs to be tuned as well.

Take a quick look at some of the important hyperparameters

  • Filters=32: This number indicates how many filters we are using to look at the image pixels during the convolution step. Some filters may catch sharp edges, some filters may catch color variations some filters may catch outlines, etc. In the end, we get important information from the images. In the first layer the number of filters=32 is commonly used, then increasing the power of 2. Like in the next layer it is 64, in the next layer, it is 128 so on and so forth.
  • kernel_size=(5,5): This indicates the size of the sliding window during convolution, in this case study we are using 5X5 pixels sliding window.
  • strides=(1, 1): How fast or slow should the sliding window move during convolution. We are using the lowest setting of 1X1 pixels. Means slide the convolution window of 5X5 (kernal_size) by 1 pixel in the x-axis and 1 pixel in the y-axis until the whole image is scanned.
  • input_shape=(64,64,3): Images are nothing but matrix of RGB color codes. during our data pre-processing we have compressed the images to 64X64, hence the expected shape is 64X64X3. Means 3 arrays of 64X64, one for RGB colors each.
  • kernel_initializer=’uniform’: When the Neurons start their computation, some algorithm has to decide the value for each weight. This parameter specifies that. You can choose different values for it like ‘normal’ or ‘glorot_uniform’.
  • activation=’relu’: This specifies the activation function for the calculations inside each neuron. You can choose values like ‘relu’, ‘tanh’, ‘sigmoid’, etc.
  • optimizer=’adam’: This parameter helps to find the optimum values of each weight in the neural network. ‘adam’ is one of the most useful optimizers, another one is ‘rmsprop’
  • batch_size=10: This specifies how many rows will be passed to the Network in one go after which the SSE calculation will begin and the neural network will start adjusting its weights based on the errors.
    When all the rows are passed in the batches of 10 rows each as specified in this parameter, then we call that 1-epoch. Or one full data cycle. This is also known as mini-batch gradient descent. A small value of batch_size will make the LSTM look at the data slowly, like 2 rows at a time or 4 rows at a time which could lead to overfitting, as compared to a large value like 20 or 50 rows at a time, which will make the LSTM look at the data fast which could lead to underfitting. Hence a proper value must be chosen using hyperparameter tuning.
  • Epochs=10: The same activity of adjusting weights continues for 10 times, as specified by this parameter. In simple terms, the LSTM looks at the full training data 10 times and adjusts its weights.

Fitting the CNN model on the training data


Testing the CNN classifier on unseen images

Using any one of the images from the testing data folder, we can check if the model is able to recognize the face.

Prediction for a single face

The model has predicted this face correctly! You can try for other faces and see if it gets recognized. You can also add your own pics and train the model again.


Conclusion

You can modify this template to create a classification model for any group of images. Just put the images of each category in its respective folder and train the model.

The CNN algorithm has helped us create many great applications around us! Facebook is the perfect example! It has trained its DeepFace CNN model on millions of images and has an accuracy of 97% to recognize anyone on Facebook. This may surpass even humans! as you can remember only a few faces 🙂

CNN is being used in the medical industry as well to help doctors get an early prediction about benign or malignant cancer using the tumor images. Similarly, get an idea about typhoid by looking at the X-ray images, etc.

The usage of CNN are many, and developing fast around us!

I hope after reading this post, you are little more confident about implementing CNN algorithm for some use cases in your projects!

Farukh is an innovator in solving industry problems using Artificial intelligence. His expertise is backed with 10 years of industry experience. Being a senior data scientist he is responsible for designing the AI/ML solution to provide maximum gains for the clients. As a thought leader, his focus is on solving the key business problems of the CPG Industry. He has worked across different domains like Telecom, Insurance, and Logistics. He has worked with global tech leaders including Infosys, IBM, and Persistent systems. His passion to teach got him to start this blog!

5 thoughts on “Face Recognition using Deep Learning CNN in Python”

    1. Farukh Hashmi

      Hi Ola,

      Yes, the test folder which has been used in the example for single predictions was totally unseen by the model.

      For other implementations, just make sure the target size of the image is same as the training data while passing a new image to check.

Leave a Reply!

Your email address will not be published. Required fields are marked *