I’m going to try to summarize my work on Neural Action over the past few months.
Neural Network (Gaze Estimation)
I changed the model structure to also take an ROI image of the face as input.
This made my NN's results robust to face rotations. I will post images of the model structure later too.
While changing the model structure, I also did really A LOT of parameter optimization.
I added SELU (scaled exponential linear units). It is an amazing, brand-new self-normalizing activation function for NNs. I love it a lot. I mean A LOT.
When I used BN (batch normalization) for NN training, my training error was ~8.5 degrees; after applying the magical SELU, the training error dropped to ~6 degrees. That is roughly a 30% reduction… 🙂
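For reference, here is a minimal NumPy sketch of the SELU activation itself. It is only an illustration, not the Neural Action training code; the two constants are the standard published SELU values, and the function name `selu` is my own choice for this example.

```python
import numpy as np

# Standard SELU constants (alpha and scale) from the SELU definition.
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    """SELU: scale * x for x > 0, scale * alpha * (exp(x) - 1) otherwise."""
    x = np.asarray(x, dtype=np.float64)
    # Clamp to <= 0 before exp() so large positive inputs cannot overflow.
    negative_branch = ALPHA * (np.exp(np.minimum(x, 0.0)) - 1.0)
    return SCALE * np.where(x > 0, x, negative_branch)

# Feed in standard-normal inputs and look at the output statistics.
rng = np.random.default_rng(0)
activations = selu(rng.standard_normal(1_000_000))
print("mean:", activations.mean(), "variance:", activations.var())
```

With zero-mean, unit-variance inputs the outputs should stay close to zero mean and unit variance. That is the "self-normalizing" property, and it is why SELU can stand in for BN here.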
I added weight decay to the training loss. Weight decay is a weight regularization technique: it adds the sum of the L2 losses of the weights to the training loss, so minimizing the loss keeps the weights small (preventing them from exploding).
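A minimal sketch of that loss term in NumPy. I am assuming that "L2 loss" means half the sum of squared weights (the convention of TensorFlow's `tf.nn.l2_loss`); the function names, the example weights and the decay rate are made up for illustration.

```python
import numpy as np

def l2_loss(w):
    """Half the sum of squared entries of one weight array (assumed convention)."""
    return 0.5 * np.sum(np.square(w))

def loss_with_weight_decay(data_loss, weights, decay_rate):
    """Add decay_rate * (sum of L2 losses over all weight arrays) to the data loss."""
    penalty = sum(l2_loss(w) for w in weights)
    return data_loss + decay_rate * penalty

# Hypothetical weights and decay rate, just to show the arithmetic.
weights = [np.array([[1.0, -2.0], [0.5, 0.0]]), np.array([3.0])]
total = loss_with_weight_decay(data_loss=0.25, weights=weights, decay_rate=1e-3)
print(total)  # 0.25 + 0.001 * (2.625 + 4.5)
```

Because the penalty grows with the square of every weight, the optimizer is pushed toward smaller weights unless a large weight really pays for itself in the data loss.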
I added learning rate decay, which lowers the learning rate as training progresses. L.R. decay helps the NN settle into a lower loss. I applied exponential learning rate decay every 12 epochs.
I confirmed that this reduced my NN error by ~2 degrees.
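A small sketch of a stepwise exponential schedule with a 12-epoch interval. The base learning rate and decay factor below are placeholder values (the post does not state the real ones), and the function name is hypothetical.

```python
def decayed_learning_rate(epoch, base_lr=1e-3, decay_rate=0.5, decay_epochs=12):
    """Multiply the learning rate by decay_rate once every decay_epochs epochs.

    base_lr and decay_rate are placeholder values, not the ones used in training.
    """
    return base_lr * decay_rate ** (epoch // decay_epochs)

# The rate stays flat within each 12-epoch window, then drops.
for epoch in (0, 11, 12, 24, 36):
    print(epoch, decayed_learning_rate(epoch))
```

The integer division gives the "staircase" behaviour: epochs 0 to 11 share one rate, epochs 12 to 23 the next, and so on.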
Hmm, I did parameter optimization too, but I am not sure what I actually did in this part. The parameters of a NN are really chaotic. We can’t explore them all by brute force with just 1 machine.
Ex) Loss functions, learning rate, conv channels, filter size, input size, output functions, hidden node counts, batch size, weight decay rate, learning rate decay rate, decay steps, optimizer algorithm, layer depths, feature sharing. And there are many more parameters if you add more NN optimization techniques.
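To put a number on why brute force is hopeless, here is a tiny sketch that counts the runs a full grid search would need over the 14 parameters listed above. The number of candidate values per parameter is a made-up assumption.

```python
# The 14 hyperparameters named in the list above.
PARAMETERS = [
    "loss function", "learning rate", "conv channels", "filter size",
    "input size", "output function", "hidden node count", "batch size",
    "weight decay rate", "learning rate decay rate", "decay steps",
    "optimizer algorithm", "layer depth", "feature sharing",
]

def grid_size(values_per_parameter, parameter_count=len(PARAMETERS)):
    """Number of training runs in a full grid with the same number of values per parameter."""
    return values_per_parameter ** parameter_count

# Hypothetical: only 3 candidate values for each parameter.
print(grid_size(3))
```

Even with only 3 candidate values each, that is 3^14 = 4,782,969 training runs, which is far beyond what a single machine can do.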
SELU vs. ReLU (with BN) Comparison
Mean error is the error between the estimated gaze vector and the target gaze vector. You can convert it to degrees like this.
$$e_{\text{degree}} = \tan^{-1}\left(e_{\text{error}}\right)$$
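The same conversion as a short Python sketch. It assumes the arctangent result (which is in radians) is then converted to degrees; the function name is hypothetical.

```python
import math

def error_to_degrees(vector_error):
    """Convert a gaze vector error to an angle in degrees via arctangent."""
    # math.atan returns radians, so convert explicitly to degrees.
    return math.degrees(math.atan(vector_error))

# Sanity check: an error of 1.0 corresponds to tan(45 degrees).
print(error_to_degrees(1.0))
```

For small errors the arctangent is almost linear, so the angle is roughly the vector error multiplied by 57.3.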
- ReLU Error/Loss Graph. You can see a larger overfitting gap than with SELU.
Both graphs are at epoch 103. I couldn’t get past epoch 103 because bugs occurred while training my ReLU NN.
But the SELU one kept training until epoch 250.
This is the final graph of the SELU training run.
Facial Landmark Detector Changes
OpenFace (by Tadas)
I wrote about a C# wrapper for it in my previous post. It is the latest facial landmark detector.
Compared to the previous backend, “flandmark”, its accuracy is really high. Actually, compared to OpenFace, flandmark’s performance and accuracy are almost useless 🙁
Comparison with flandmark
- This is the translate/test look vector graph.
- And this is the graph for OpenFace.
You can see that flandmark has a lot of detection errors, which should affect the training dataset generator too.
So by changing the landmark library to OpenFace, I was able to improve landmark/dataset accuracy.