Calibrate Tracking Result With Linear Regression

I applied linear regression to the gaze tracking output, and the tracking result is getting really nice now. The tracking error is only about 1~2 cm, so I think we can use this in a real application.

Actually, I tried to implement support vector regression, but it was way too hard for me… I will try it later.

I think I did my best at calibrating the tracking result. Now, to improve the result further, I have to feed more varied data to the NN!

I Should Apply CapsNet For My Network.

Prof. Hinton, one of the pioneers of deep learning, released a brand-new neural network for image processing called the capsule network (CapsNet).

It manages image data as capsules, each of which contains some high-level image feature such as a stroke.

This seems like a nice solution for my network, because the gaze tracking task is very spatially dependent.

Actually, I planned to implement this next year, but I saw other people on Facebook already starting to use CapsNet.

I need to update alwayyys.

Gaze tracking is now acceptable quality!

You can see the tracking test from the 30-second mark.

I applied the updated neural network and manually applied a simple linear regression, just like ptNew = (ptOld + biasVec) * scale.
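The calibration formula above can be fitted automatically instead of by hand. Here is a minimal sketch, assuming a set of raw gaze points and the matching on-screen target points is available; the function names `fit_calibration` and `apply_calibration` are hypothetical and not from the project. It fits one scale and one bias per axis with an ordinary least-squares line.

```python
import numpy as np

def fit_calibration(pt_old, pt_target):
    """Fit per-axis bias and scale so that (pt_old + bias) * scale ~= pt_target."""
    pt_old = np.asarray(pt_old, dtype=float)
    pt_target = np.asarray(pt_target, dtype=float)
    num_axes = pt_old.shape[1]
    bias = np.empty(num_axes)
    scale = np.empty(num_axes)
    for axis in range(num_axes):
        # Least-squares line: target = a * old + b.
        # Since (old + bias) * scale = scale * old + scale * bias,
        # we get scale = a and bias = b / a.
        a, b = np.polyfit(pt_old[:, axis], pt_target[:, axis], deg=1)
        scale[axis] = a
        bias[axis] = b / a
    return bias, scale

def apply_calibration(pt_old, bias, scale):
    """Apply ptNew = (ptOld + biasVec) * scale to one point or an array of points."""
    return (np.asarray(pt_old, dtype=float) + bias) * scale
```

Typical usage would be to collect a few calibration points, call `fit_calibration` once, and then run every new gaze estimate through `apply_calibration`.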

Next, I will implement a machine learning technique for calibration, such as support vector regression. :3

I think I will use an NN for it too. An NN is the simplest option if I have lots of data and a nice machine. A simple multi-layer perceptron is easy.

What’s wrong with CNN regression?

In my koi17 project, I am using a convolutional neural network regression model.

My model is based on the standard VGG16 architecture.

But the performance of the NN is not enough for my goal, and it also seems underfitted.

So I decided to stack more layers and change the architecture to ResNet. I stacked 4 residual blocks.

In a usual classification problem, ResNet should work much better than VGG… But in my case, it didn't.

ResNet worked worse than VGG. Then I tested a shallow DenseNet too, but it showed higher bias, and the test accuracy suggested overfitting as well…

I think a deeper CNN is not a good solution for a regression problem, because a ConvNet ignores geometric placement.


I have no solution to this problem yet…


-Update 2017-11-16

I found the problem with my model: I left too narrow a space for the CNN. The last 5~6 layers received a 4×4 image, and since the filter size was 3×3, each filter covered 9 of the 16 nodes, which is almost fully connected. This increased the fully connected ratio in my network, so it overfitted.

I removed some pooling layers at the top of the model so that the last layer gets an 8×8 image to compute. That solved the problem.
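The arithmetic behind this fix can be checked with a small sketch. The 64×64 input size and the 2×2 pooling are assumptions made only for illustration (the post does not state them), and both helper names are hypothetical.

```python
def feature_map_size(input_size, num_pools, pool=2):
    """Spatial size after num_pools pooling layers (stride equal to pool size)."""
    size = input_size
    for _ in range(num_pools):
        size //= pool
    return size

def kernel_coverage(map_size, kernel=3):
    """Fraction of a map_size x map_size feature map that one kernel window sees."""
    return (kernel * kernel) / (map_size * map_size)

# Assumed 64x64 input: four poolings give a 4x4 map, three poolings give 8x8.
small = feature_map_size(64, num_pools=4)
large = feature_map_size(64, num_pools=3)
print(small, kernel_coverage(small))  # 9 of 16 nodes per 3x3 window
print(large, kernel_coverage(large))  # 9 of 64 nodes per 3x3 window
```

On the 4×4 map a single 3×3 filter sees more than half of all nodes, while on the 8×8 map it sees about 14%, so the convolution behaves much less like a fully connected layer.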

Progress Report: Optimize Neural Network And Some Landmark Detectors.

I’m going to try to summarize my work on Neural Action over the past few months.

Neural Network (Gaze Estimation)

Model Changes

I changed the model structure to use an ROI image of the face too.

It made my NN result robust to face rotations. I will post model structure images later too.

While changing the model structure, I also did really A LOT of parameter optimization.

Model Optimizations

SELU Implemented

I added SELU (scaled exponential linear units). It is an amazing, brand-new self-normalizing activation function for NNs. I love it a lot. I mean A LOT.

When I used BN (batch normalization) for NN training, my training error was ~8.5 degrees; after applying the magical SELU, the training error went down to ~6 degrees. That is roughly a 30% improvement… 🙂
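For reference, here is a minimal NumPy sketch of the SELU activation itself, using the standard constants from the self-normalizing networks paper. It only illustrates the function and is not the project's training code.

```python
import numpy as np

# Standard SELU constants (alpha and lambda/scale).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805

def selu(x):
    """SELU: scale * x for x > 0, scale * alpha * (exp(x) - 1) otherwise."""
    x = np.asarray(x, dtype=float)
    # np.minimum keeps expm1 from overflowing on large positive inputs,
    # since np.where evaluates both branches.
    negative_branch = SELU_ALPHA * np.expm1(np.minimum(x, 0.0))
    return SELU_SCALE * np.where(x > 0, x, negative_branch)
```

Positive inputs are simply scaled, while negative inputs saturate toward about -1.758, which is what pushes activations toward zero mean and unit variance without a separate normalization layer.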

Weight Decay

I added weight decay to the training loss. Weight decay is a weight regularization technique. It adds the sum of the L2 loss of the weights to the training loss so that the weights stay small (preventing the weights from exploding).
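A minimal sketch of how such a penalty can be added to a loss, written in plain NumPy. The decay rate of 1e-4 and the function names are assumptions for illustration only; the 0.5 factor follows the sum(w**2) / 2 convention that TensorFlow's `tf.nn.l2_loss` uses.

```python
import numpy as np

def l2_penalty(weights, decay_rate):
    """decay_rate times the sum over all weight tensors of sum(w**2) / 2."""
    return decay_rate * sum(0.5 * np.sum(np.square(w)) for w in weights)

def total_loss(data_loss, weights, decay_rate=1e-4):
    """Training loss = task loss + weight decay penalty (decay_rate is assumed)."""
    return data_loss + l2_penalty(weights, decay_rate)
```

Because the penalty grows with the squared magnitude of every weight, gradient descent on `total_loss` pulls large weights back toward zero at each step.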

Learning-rate Decay

I added learning rate decay, which lowers the learning rate during training. It helps the NN settle into a lower loss. I applied exponential learning rate decay every 12 epochs.

I confirmed that this reduced my NN error by ~2 degrees.
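A stepped exponential schedule like this can be sketched in a few lines. Only the 12-epoch interval comes from the post; the initial learning rate of 1e-3 and the decay rate of 0.5 are placeholder assumptions, and the function name is hypothetical.

```python
def decayed_learning_rate(epoch, initial_lr=1e-3, decay_rate=0.5, decay_epochs=12):
    """Multiply the learning rate by decay_rate once every decay_epochs epochs."""
    return initial_lr * decay_rate ** (epoch // decay_epochs)

# Print the rate at the start of each decay step.
for epoch in (0, 12, 24, 36):
    print(epoch, decayed_learning_rate(epoch))
```

The integer division keeps the rate constant within each 12-epoch window, so the optimizer takes large steps early and progressively smaller ones as the loss flattens.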

Parameter Optimizations

Hmm, I did parameter optimization too, but I am not sure what exactly I did in this section. The parameters of an NN are really chaotic. We can’t search them all by brute force with just 1 machine.

Ex) Loss functions, learning rate, conv channels, filter size, input size, output functions, hidden node counts, batch size, weight decay rate, learning rate decay rate, decay steps, optimizer algorithm, layer depths, feature sharing. And there are many more parameters if you add more NN optimization techniques.

SELU vs RELU (with BN) Comparison

The mean error is the error between the estimated gaze vector and the ground-truth gaze vector. You can convert it to degrees like this.

$e_{\mathrm{degree}} = \tan^{-1}(e_{\mathrm{error}})$
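As a small worked example of this conversion, here is a sketch using Python's standard library. The function name is hypothetical; note that `math.atan` returns radians, so the result still has to be converted to degrees.

```python
import math

def error_to_degrees(error):
    """Convert a gaze vector error to an angle in degrees via arctangent."""
    return math.degrees(math.atan(error))

# An error equal to tan(6 degrees) maps back to about 6 degrees.
print(error_to_degrees(math.tan(math.radians(6.0))))
```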

  • RELU Error/Loss Graph. You can see a larger overfitting gap than with SELU.

  • SELU Error/Loss Graph

Both graphs are at epoch 103. I couldn’t get more than 103 epochs because bugs occurred while training my RELU NN.

But the SELU one kept training until epoch 250.

This is the final graph of the SELU training run.

Facial Landmark Detector Changes

OpenFace (by Tadas)

I wrote about its C# wrapper in my previous post. It is a recent facial landmark detector.

Compared to the previous backend, “flandmark”, it is really accurate. Actually, compared to OpenFace, flandmark’s performance and accuracy are almost useless 🙁

Comparison With Flandmark

  • This is the translation/test look vector graph of Flandmark.

  • And this is the graph of OpenFace.

You can see that Flandmark has a lot of detection errors, so it must affect the training dataset generator too.

So by changing the landmark library to OpenFace, I could improve the landmark/dataset accuracy.

Finally OpenFace Porting Is Finished

Yep, I finally finished porting OpenFace to my C# Vision library ;3

The ported library is called SharpFace (Sharp for C#, hehe).

OpenFace is an open-source facial landmark detector made by Tadas.

I ported it to C# to use as the facial landmark detector of my competition project.

I previously used the Flandmark library, and I got a huge accuracy gain after changing the library to OpenFace.

See these graphs.

This is the graph of the translation vector and rotation vector while using Flandmark + Kalman filter.

You can see huge detection errors in the values.

And these are the current detection results! See the clean graph… And this is even without Kalman filters.

Hehe, now my next task is to regenerate my whole dataset with it and then retrain the NN.


Long time no see, my blog.

I’m still alive here. I think I will post about the competition later too. It ended in September, but I need to work on the next one, so I don’t have time to keep updating the blog ;-;

And I usually live on GitHub these days. I spent my whole Thanksgiving holiday on GitHub. 🙂

You are welcome to visit my GitHub repos.

Progress Report, Vision: Cross-Platform Tensorflow, OpenCV Wrapping Library

For “Neural Action”, my Koi project, I needed OpenCV and Tensorflow with cross-platform and C# language support.

So I created an OpenCV wrapper for Xamarin, the C# cross-platform framework, and I ported the legendary migueldeicaz’s TensorFlowSharp project to Xamarin.

The result of this combination is a computer vision library with Tensorflow!

In my project, I will use both, for eye gaze estimation with a CNN and word suggestion with an RNN (seq2seq).

I am really worried about whether this approach will work properly, because there are no successful examples yet…


GitHub Links


Tensorflow for Xamarin:

p.s. There are no examples in the TensorFlowSharp repository. You can find some application implementations in the Vision repository.

p.s. Currently, TFSharp for Xamarin only supports Windows, Android, and UWP, because I don’t have a Mac 🙁