0:00

The triplet loss is one good way to learn the parameters of a ConvNet for face recognition, but there is another way to learn these parameters. Let me show you how face recognition can also be posed as a straight binary classification problem.

Another way to train a neural network is to take this pair of neural networks, this Siamese network, and have both of them compute embeddings, maybe 128-dimensional embeddings or even higher-dimensional ones, and then feed these embeddings into a logistic regression unit to make a prediction, where the target output is 1 if both images are of the same person and 0 if they are of different persons. So this is a way to treat face recognition as just a binary classification problem, and it is an alternative to the triplet loss for training a system like this.

Now, what does this final logistic regression unit actually do? The output ŷ will be a sigmoid function applied to some set of features, but rather than just feeding in these encodings directly, what you can do is take the differences between the encodings. Let me show you what I mean. Let's say I write

ŷ = σ( Σ_{k=1}^{128} w_k | f(x^(i))_k − f(x^(j))_k | + b )

In this notation, f(x^(i)) is the encoding of the image x^(i), and the subscript k means selecting the k-th component of this vector. So this is taking the element-wise absolute differences between the two encodings. What you can do is think of these 128 numbers as features that you then feed into logistic regression. That final logistic regression unit has parameters w_k and b, just like a normal logistic regression unit, and you would train appropriate weights on these 128 features in order to predict whether the two images are of the same person or of different persons. So this is one pretty reasonable way to learn to predict 1 or 0, same person or different persons.
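In code, the prediction just described can be sketched like this (plain Python; the embeddings `f_i`, `f_j` and the parameters `w`, `b` are random stand-ins for the ConvNet outputs and the trained weights of the logistic unit):

```python
import math
import random

random.seed(0)

DIM = 128  # dimensionality of each face embedding

# Stand-in embeddings for two face images; in the real system these
# would be f(x_i) and f(x_j) computed by the shared ConvNet.
f_i = [random.random() for _ in range(DIM)]
f_j = [random.random() for _ in range(DIM)]

# Parameters of the final logistic regression unit (untrained here).
w = [random.uniform(-0.1, 0.1) for _ in range(DIM)]
b = 0.0

# The element-wise absolute differences serve as the 128 features.
features = [abs(a - c) for a, c in zip(f_i, f_j)]

# y_hat = sigmoid( sum_k w_k * |f(x_i)_k - f(x_j)_k| + b )
z = sum(wk * xk for wk, xk in zip(w, features)) + b
y_hat = 1.0 / (1.0 + math.exp(-z))

print(f"same-person probability: {y_hat:.3f}")
```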

There are a few other variations on how you can compute this formula that I underlined in green. For example, another formula is

( f(x^(i))_k − f(x^(j))_k )² / ( f(x^(i))_k + f(x^(j))_k )

This is sometimes called the chi-square formula (this is the Greek letter χ), or the chi-square similarity. This and other variations are explored in the DeepFace paper, which I have referenced earlier as well.
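A sketch of the chi-square variant, again with stand-in embeddings (the small `eps` guard against division by zero is my addition, not something from the video):

```python
import random

random.seed(0)

DIM = 128
f_i = [random.random() for _ in range(DIM)]
f_j = [random.random() for _ in range(DIM)]

eps = 1e-8  # guard against division by zero (my addition)

# chi-square features: (f(x_i)_k - f(x_j)_k)^2 / (f(x_i)_k + f(x_j)_k)
chi = [(a - c) ** 2 / (a + c + eps) for a, c in zip(f_i, f_j)]

print(f"{len(chi)} chi-square features, all non-negative: "
      f"{all(v >= 0 for v in chi)}")
```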

In this learning formulation, the input is a pair of images, so this is really your training input x, and the output y is either 0 or 1 depending on whether you are inputting a pair of similar or dissimilar images. And same as before, you are training a Siamese network, which means that this neural network up here has parameters that are the same as, or really tied to, the parameters of this lower neural network. This system can work pretty well as well.

Lastly, let me mention one computational trick that can help your deployment significantly. Say this is the new image, an employee walking in hoping that the doorway will open for them, and this is an image from your database. Then instead of having to compute the database image's embedding every single time, you can precompute it. So when the new employee walks in, you use the upper ConvNet to compute their encoding, compare it to the precomputed encoding, and use that to make a prediction ŷ. Because you don't need to store the raw images, and because, if you have a very large database of employees, you don't need to compute these encodings every single time for every employee in your database, this idea of precomputing some of the encodings can save significant computation. And this type of precomputation works both for this type of Siamese network architecture, where you pose face recognition as a binary classification problem, and for when you are learning encodings, maybe using the triplet loss function, as described in the last couple of videos.

So, just to wrap up: to treat face verification as supervised learning, you create a training set of pairs of images, now instead of triplets, where the target label is 1 when the two pictures are of the same person and 0 when they are pictures of different persons. And you use different pairs to train the neural network, to train the Siamese network, using backpropagation. So this version that you just saw, of treating face verification, and by extension face recognition, as a binary classification problem, works quite well as well. And with that, I hope you now know what it would take to train your own face verification or face recognition system that can do one-shot learning.
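As a rough sketch of this wrap-up, here is how pair-based supervised training of the final logistic unit might look, with the Siamese embeddings held fixed for simplicity (a full system would also backpropagate into the shared ConvNet; the embedding size, learning rate, and synthetic data below are all illustrative assumptions):

```python
import math
import random

random.seed(0)

DIM = 8    # small embedding size to keep the sketch fast (128 in the video)
LR = 0.5   # learning rate (assumption)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_pair(same):
    """Synthetic pair of embeddings; same-person pairs are close together."""
    base = [random.random() for _ in range(DIM)]
    noise = 0.05 if same else 1.0
    other = [x + random.uniform(-noise, noise) for x in base]
    return base, other

# Training set of (pair, label): label 1 = same person, 0 = different.
pairs = [(make_pair(s), s) for s in ([1, 0] * 50)]

w = [0.0] * DIM
b = 0.0
for _ in range(200):
    for (f_i, f_j), y in pairs:
        x = [abs(a - c) for a, c in zip(f_i, f_j)]  # the |diff| features
        y_hat = sigmoid(sum(wk * xk for wk, xk in zip(w, x)) + b)
        g = y_hat - y  # gradient of binary cross-entropy w.r.t. z
        w = [wk - LR * g * xk for wk, xk in zip(w, x)]
        b -= LR * g

# Same-person pairs should now score higher than different-person pairs.
(fa, fb), _ = pairs[0]  # a same-person pair
(fc, fd), _ = pairs[1]  # a different-person pair
p_same = sigmoid(sum(wk * abs(a - c) for wk, a, c in zip(w, fa, fb)) + b)
p_diff = sigmoid(sum(wk * abs(a - c) for wk, a, c in zip(w, fc, fd)) + b)
print(f"same: {p_same:.3f}  different: {p_diff:.3f}")
```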