Extract Face Embedding from Image

Rupesh
3 min read · Feb 9, 2021


Extract an identity vector for individual faces

Here I am going to discuss how to extract the face encoding, or face embedding, from an image using pre-trained models that are available in open source. I have also attached the code; refer to this git repository.

Face Embeddings:

It analyses the given image and returns a numerical vector that represents each detected face in the image. These vectors come in different sizes, such as 64, 128, 256, or 512 dimensions. Here we are going to discuss a model that returns a vector of size 128.

Using these embeddings, we can perform face recognition, face verification, and face matching applications.

It is a deep learning-based method to represent the identity of individual faces. The architecture named FaceNet is used to extract the face embedding; to know more about it, refer to this link.
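For example, face verification usually comes down to comparing the distance between two embeddings against a threshold. Here is a minimal sketch; the 0.6 cut-off is a commonly quoted value for 128-d embeddings rather than something from this article, so treat it as an assumption and tune it on your own data:

import numpy as np

def is_same_person(embedding_a, embedding_b, threshold=0.6):
    # Smaller Euclidean distance means the two faces are more likely the same identity
    distance = np.linalg.norm(np.array(embedding_a) - np.array(embedding_b))
    return distance <= threshold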

Dlib :

We can use Dlib to locate faces in an image, as discussed in the previous blog. We can also use it to extract the face encoding vector for each face in the image. The model named dlib_face_recognition_resnet_model_v1.dat is used to extract encodings in the Dlib module.

Here we need to supply the locations of the faces in the given image. The advantage of Dlib is that it is a lightweight model: it runs even on a CPU with low computational power and has a lower inference time compared to other models. As a bonus, Dlib has a face detection module built in, sketched below.
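As a rough sketch of the built-in detector (the image path is a placeholder):

import dlib

# Dlib's built-in HOG-based frontal face detector
detector = dlib.get_frontal_face_detector()

img = dlib.load_rgb_image('person.jpg')  # hypothetical image path
face_locations = detector(img, 1)        # list of dlib.rectangle objects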

# Load the face encoding model using Dlib
face_encoder = dlib.face_recognition_model_v1('dlib_face_recognition_resnet_model_v1.dat')

Function to get encodings :

import numpy as np

def encodings(img, face_locations, pose_predictor, face_encoder):
    # Find facial landmarks for each detected face location
    predictors = [pose_predictor(img, face_location) for face_location in face_locations]
    # Compute the 128-d descriptor for each set of landmarks
    return [np.array(face_encoder.compute_face_descriptor(img, predictor, 1)) for predictor in predictors]

Source code is available here. The pose predictor model can be downloaded here.
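Putting the Dlib pieces together, a minimal end-to-end sketch (file and image names are placeholders) might look like this:

pose_predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')
face_encoder = dlib.face_recognition_model_v1('dlib_face_recognition_resnet_model_v1.dat')
detector = dlib.get_frontal_face_detector()

img = dlib.load_rgb_image('person.jpg')  # hypothetical image path
face_locations = detector(img, 1)
face_embeddings = encodings(img, face_locations, pose_predictor, face_encoder)
print(len(face_embeddings))        # number of faces found
print(face_embeddings[0].shape)    # (128,)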

TensorFlow Models :

There is also a pre-trained TensorFlow model that can be used to extract encodings of faces in images. The architecture is the same in both cases, but the loss function and the training data differ. The encoding vector it returns will therefore be different from the previous model's, although it is also 128-dimensional.

For this model we need to pass only a cropped image that contains just the face, because the model does not detect faces itself: it simply reads the image, passes it through the network, and returns a vector which is the face encoding.

To handle this, we need to crop the faces from the image and pass them into the model. To detect and crop the faces, use any one of the methods discussed in the previous blog; a rough cropping sketch follows.
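This sketch assumes each face location is available as a (top, right, bottom, left) bounding box; if you are working with Dlib rectangles, convert them with .top(), .right(), .bottom(), and .left() first:

def crop_faces(img, face_locations):
    # img is a NumPy array of shape (H, W, 3);
    # each location is assumed to be a (top, right, bottom, left) tuple
    return [img[top:bottom, left:right] for (top, right, bottom, left) in face_locations]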

In order to load the model...

import tensorflow as tf

def load_model(modelpath):
    # Load a frozen TensorFlow 1.x graph (.pb file) into a new Graph object
    detection_graph = tf.Graph()
    with detection_graph.as_default():
        od_graph_def = tf.GraphDef()
        with tf.gfile.GFile(modelpath, 'rb') as fid:
            serialized_graph = fid.read()
            od_graph_def.ParseFromString(serialized_graph)
            tf.import_graph_def(od_graph_def, name='')
    return detection_graph
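Note that this uses the TensorFlow 1.x graph API (tf.GraphDef, tf.gfile, tf.Session). If you are on TensorFlow 2.x, the usual workaround is to go through the v1 compatibility module, roughly like this:

import tensorflow.compat.v1 as tf

# Run the TF1-style graph/session code unchanged under TensorFlow 2.x
tf.disable_v2_behavior()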

To get embeddings...

def get_embedding(graph, img):
    # Preprocess the cropped face(s) into the array the network expects
    input_array = preprocess_input_img(img)
    with graph.as_default():
        with tf.Session() as sess:
            # Look up the input and output tensors by name in the FaceNet graph
            images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
            embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
            phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
            embedding_size = embeddings.get_shape()[1]
            feed_dict = {images_placeholder: input_array, phase_train_placeholder: False}
            embeddings = sess.run(embeddings, feed_dict=feed_dict)
    return embeddings

We need to pass the model graph and the image (preprocessed, if required).
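The code above assumes a preprocess_input_img helper that is not shown here. As a hypothetical sketch, FaceNet-style models commonly expect a fixed input size (often 160x160) and per-image whitening:

import cv2
import numpy as np

def preprocess_input_img(img, size=160):
    # Resize the cropped face to the network's expected input size (an assumption; check your model)
    face = cv2.resize(img, (size, size)).astype(np.float32)
    # Prewhiten: per-image zero-mean, unit-variance normalisation
    std_adj = max(face.std(), 1.0 / np.sqrt(face.size))
    face = (face - face.mean()) / std_adj
    # Add a batch dimension: shape (1, size, size, 3)
    return np.expand_dims(face, axis=0)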

We can also pass multiple images at a time; the batch size should be set based on the machine's capacity.

It runs on both CPU and GPU, though it runs faster on a GPU.

The source code is here. The model can be downloaded here. A few other FaceNet TensorFlow models are also available to try if needed.

Comparing these models, Dlib is easier to use and less complex for extracting face embeddings than the TensorFlow models. But when high computational power is available, the TF models perform well.

Here I have attached the source code link; use it for reference.
