My best wishes for a new year that fulfills your dreams and hopes. Happy new year!
I’m going to keep coding this year too.
I applied linear regression to gaze tracking. The tracking result is getting really nice now. It has only about 1~2 cm of tracking error, so I think we can use this in a real application.
Actually, I tried to implement support vector regression, but it was way too hard for me… I will try it later.
I think I did my best at calibrating the tracking result. Now, to improve the result further, I have to feed more varied data to the NN!
Prof. Hinton, one of the pioneers of deep learning, released a brand-new neural network for image processing called the Capsule Network.
It manages image data as capsules that contain high-level image features such as strokes.
This seems like a nice solution for my network, because the gaze tracking task is very spatially dependent.
Actually I planned to implement this next year, but I saw on Facebook that other people are already starting to use CapsNet.
I need to update alwayyys.
You can see the tracking test from the 30-second mark.
I applied the updated neural network, and then I manually applied a simple linear regression, just like ptNew = (ptOld + biasVec) * scale.
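The manual correction ptNew = (ptOld + biasVec) * scale can be sketched in a few lines. This is only a minimal sketch, not the project's actual code: the function names `fit_axis` and `apply_calibration` and the sample numbers are assumptions for illustration. Because the formula is a straight line in ptOld, one bias and one scale per axis can be fitted with ordinary least squares.

```python
def fit_axis(raw, target):
    """Least-squares fit of target ~= (raw + bias) * scale for one axis."""
    n = len(raw)
    mean_r = sum(raw) / n
    mean_t = sum(target) / n
    var_r = sum((r - mean_r) ** 2 for r in raw)
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(raw, target))
    scale = cov / var_r                  # slope of the fitted line
    intercept = mean_t - scale * mean_r  # equals scale * bias
    bias = intercept / scale
    return bias, scale


def apply_calibration(pt_old, bias_vec, scale_vec):
    """ptNew = (ptOld + biasVec) * scale, applied per axis."""
    return tuple((p + b) * s for p, b, s in zip(pt_old, bias_vec, scale_vec))


# Hypothetical calibration samples: raw x estimates and where the user really looked.
raw_x = [100.0, 200.0, 300.0]
true_x = [165.0, 315.0, 465.0]
bias_x, scale_x = fit_axis(raw_x, true_x)
corrected = apply_calibration((200.0,), (bias_x,), (scale_x,))
```

Fitting each axis separately keeps the sketch simple; it assumes the x and y errors are independent of each other.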
I will implement a machine learning technique for calibration now, something like support vector regression. :3
I think I will use an NN for it too. An NN is the simplest thing to use if I have a lot of data and a nice machine. A simple multi-layer perceptron is easy.
In my koi17 project, I am using a convolutional neural network regression model.
My model is based on the standard VGG16 architecture.
But the performance of the NN is not enough for my goal, and it also seems underfitted.
So I decided to stack more layers and change the architecture to ResNet. I stacked 4 residual blocks.
In a usual classification problem, ResNet should work much better than VGG… But in my case, it didn’t.
ResNet worked worse than VGG. Then I tested a shallow DenseNet too, but it showed higher bias, and the test accuracy showed overfitting too…
I think a deeper CNN is not a good solution for a regression problem, because a ConvNet ignores geometric placement.
I have no solution to this problem yet…
I found the problem with my model. I made the spatial size too narrow for the CNN. I passed a 4×4 image through the last 5~6 layers, and that caused overfitting: the filter size was 3×3, so each filter covered 9 of the 16 nodes and the layers were close to fully connected. That raised the fully connected ratio in my network, so it overfitted.
I removed some pooling layers near the top of the model so the last layers got an 8×8 image to compute. That solved the problem.
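The arithmetic behind this can be checked quickly. The sketch below is only an illustration: the 64-pixel input size and the pooling counts are assumptions (the post does not give them), and the convolutions are assumed to keep the spatial size unchanged.

```python
def feature_map_size(input_size, num_pools, pool=2):
    """Spatial size after `num_pools` pooling layers with stride `pool`."""
    size = input_size
    for _ in range(num_pools):
        size //= pool
    return size


def kernel_coverage(map_size, kernel=3):
    """Fraction of the feature map that one kernel position touches."""
    return (kernel * kernel) / (map_size * map_size)


small = feature_map_size(64, num_pools=4)  # 4x4 map (hypothetical 64x64 input)
large = feature_map_size(64, num_pools=3)  # 8x8 map after removing one pooling layer
print(small, kernel_coverage(small))       # 9 of 16 nodes
print(large, kernel_coverage(large))       # 9 of 64 nodes
```

On the 4×4 map a single 3×3 filter touches more than half of the nodes, while on the 8×8 map it touches about 14%, which is much closer to a normal locally connected convolution.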
I’m going to try to summarize my work on Neural Action over the past few months.
I changed the model structure to use the ROI image of the face too.
It made my NN result robust to face rotations. I will post model structure images later too.
While changing the model structure, I did really A LOT of parameter optimization too.
I added SELU (scaled exponential linear units). It is an amazing brand-new NN normalizing technique. I love it a lot. I mean ALOT.
When I use BN (batch normalization) for NN training, my training error is ~8.5 degrees. When I apply the magical SELU instead, the training error goes down to ~6 degrees. That is about a 30% improvement… 🙂
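For reference, the SELU activation itself is tiny. This is a minimal scalar sketch using the constants from the SELU paper; a real network would use the framework's built-in version on tensors, and the function name `selu` here is just for illustration.

```python
import math

# Constants from the SELU paper (self-normalizing neural networks).
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805


def selu(x):
    """Scaled exponential linear unit for a single float."""
    if x > 0:
        return SCALE * x
    return SCALE * ALPHA * (math.exp(x) - 1.0)


print([round(selu(v), 4) for v in (-2.0, -0.5, 0.0, 0.5, 2.0)])
```

Positive inputs are scaled slightly above identity, and negative inputs saturate toward -SCALE * ALPHA, which is what pushes activations toward zero mean and unit variance without a separate normalization layer.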
I added weight decay to the training loss. Weight decay is a weight regularization technique. It keeps the sum of the L2 loss of the weights small (preventing the weights from exploding).
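As a rough sketch of what that means for the loss, assuming the common sum(w²)/2 definition of L2 loss (the decay rate of 1e-4 and the function names are assumptions, not values from my training code):

```python
def l2_penalty(weights):
    """Half the sum of squared weights (the usual 'L2 loss' definition)."""
    return 0.5 * sum(w * w for w in weights)


def total_loss(data_loss, weights, decay_rate=1e-4):
    """Training loss = task loss + decay_rate * L2 penalty on the weights."""
    return data_loss + decay_rate * l2_penalty(weights)


# Hypothetical numbers: large weights make the same data loss more expensive.
print(total_loss(1.0, [0.1, -0.2, 0.3]))
print(total_loss(1.0, [10.0, -20.0, 30.0]))
```

Because the penalty grows with the squared magnitude of every weight, the optimizer is pulled toward smaller weights whenever that does not cost much data loss.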
I added learning rate decay. Learning rate decay lowers the learning rate during training, which helps the NN settle into a lower loss. I applied exponential learning rate decay every 12 epochs.
I found that this decreased my NN error by ~2 degrees.
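A stepwise exponential decay like this can be sketched as follows. Only the 12-epoch step comes from the post; the base learning rate and the decay rate are placeholder assumptions.

```python
def decayed_learning_rate(epoch, base_lr=1e-3, decay_rate=0.5, decay_epochs=12):
    """Multiply the learning rate by `decay_rate` once every `decay_epochs` epochs."""
    return base_lr * decay_rate ** (epoch // decay_epochs)


for epoch in (0, 11, 12, 24, 36):
    print(epoch, decayed_learning_rate(epoch))
```

The integer division makes the rate drop in steps (a "staircase"), so it stays constant within each 12-epoch block.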
Hmm, I did parameter optimization too, but I am not sure what I actually did in this part. Parameters in an NN are really chaos. We can’t manage them all by brute force with just 1 machine.
Ex) Loss functions, learning rate, conv channels, filter size, input size, output functions, hidden node counts, batch size, weight decay rate, learning rate decay rate, decay steps, optimizer algorithm, layer depths, feature sharing. And there are many more parameters if you add more NN optimization techniques.
Mean error is the error between the estimated gaze vector and the ground truth. You can convert it to degrees like this.
$e_{\mathrm{degree}} = \tan^{-1}(e_{\mathrm{error}})$
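In code, the conversion might look like the sketch below. It assumes the error is measured so that its arctangent is the angular error, and since `math.atan` returns radians, the result is converted to degrees explicitly. The function name is hypothetical.

```python
import math


def error_to_degrees(e_error):
    """Convert a gaze-vector error to an angle in degrees via the arctangent."""
    return math.degrees(math.atan(e_error))


print(error_to_degrees(0.1))  # a small error maps to a few degrees
```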
Both graphs are at epoch 103. I couldn’t get past epoch 103 because bugs occurred while training my ReLU NN.
But the SELU one kept training until epoch 250.
This is the final graph of the SELU training run.
I wrote about the C# wrapper for it in my previous post. It is the latest facial landmark detector.
Compared to the previous backend, “flandmark”, its accuracy is really high. Actually, compared to OpenFace, flandmark’s performance and accuracy are almost useless 🙁
You can see that flandmark has a lot of detection errors, so it must have affected the training dataset generator too.
So by changing the landmark library to OpenFace, I could improve landmark/dataset accuracy.
Yeap, I finally got a result from porting OpenFace to my C# Vision library ;3
The ported library is called SharpFace (Sharp for C# hehe)
OpenFace is an open source face landmark detector made by Tadas.
I ported it to C# to use as the face landmark detector in my competition project.
I used the Flandmark lib previously, and I got a huge accuracy advantage after changing the library to OpenFace.
See those graphs.
This is the graph of the translation vector and rotation vector while using Flandmark + a Kalman filter.
You can see huge detection errors in the values.
And these are the current detection results! See the clean graph… And that is even without Kalman filters.
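For context, this is roughly what a Kalman filter does to one component of such a vector. It is only a minimal scalar sketch with a constant-value model; the noise variances and the function name are assumptions, and it is not the filter configuration used in the project.

```python
def kalman_1d(measurements, process_var=1e-3, meas_var=1e-1):
    """Smooth a noisy scalar signal with a constant-value Kalman filter."""
    if not measurements:
        return []
    estimate = measurements[0]
    error_var = 1.0
    smoothed = [estimate]
    for z in measurements[1:]:
        error_var += process_var                   # predict: uncertainty grows
        gain = error_var / (error_var + meas_var)  # how much to trust the new value
        estimate += gain * (z - estimate)          # update toward the measurement
        error_var *= 1.0 - gain                    # uncertainty shrinks after update
        smoothed.append(estimate)
    return smoothed


# Hypothetical signal with one detection glitch at index 2.
print(kalman_1d([0.0, 0.0, 10.0, 0.0, 0.0]))
```

The filter damps isolated spikes but cannot remove them, which is why a detector that produces fewer spikes in the first place gives a much cleaner graph.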
Hehe, now my next task is to re-generate my whole dataset with it and then re-train the NN.
CAC sent an email saying that my CAC email account will be removed on September 1st.
I am downloading my account data now… But Google said it will take a long time to download. ;-;
I also changed the email account for Codex to my personal one, so I can keep in contact via Codex 🙂
P.S. Thanks, Mr. Miller, for telling me about downloading the data 🙂
What is BF?
BF is one of the most famous esoteric programming languages.
It has 8 kinds of operators:
< and > move the memory pointer,
[ and ] are a loop,
+ and - add and subtract the value in memory, and
. and , are I/O functions.
The program has a little editor for scripts.
Write a script and press the run button.
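To show how little is needed to run such a script, here is a minimal interpreter sketch in Python. It is not the code of my program: the 30000-cell tape, the 8-bit wrapping cells, and returning 0 at end of input are common conventions that I am assuming here, and the script is assumed to have balanced brackets.

```python
def run_bf(code, input_text=""):
    """Run a BF script and return everything it printed as a string."""
    # Pre-compute matching bracket positions (assumes balanced brackets).
    jump, stack = {}, []
    for i, ch in enumerate(code):
        if ch == "[":
            stack.append(i)
        elif ch == "]":
            j = stack.pop()
            jump[i], jump[j] = j, i

    tape = [0] * 30000
    ptr = pc = 0
    inp = iter(input_text)
    out = []
    while pc < len(code):
        ch = code[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(chr(tape[ptr]))
        elif ch == ",":
            tape[ptr] = ord(next(inp, "\0")) % 256  # 0 at end of input
        elif ch == "[" and tape[ptr] == 0:
            pc = jump[pc]  # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jump[pc]  # go back to the loop start
        pc += 1
    return "".join(out)


# 8 * 8 + 1 = 65, the character code of "A".
print(run_bf("++++++++[>++++++++<-]>+."))
```

Any character that is not one of the eight operators is simply skipped, so comments can be written freely inside a script.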
For “Neural Action”, my Koi project, I needed OpenCV and TensorFlow with cross-platform and C# language support.
So I created an OpenCV wrapper for Xamarin, the C# cross-platform framework, and I ported the legendary migueldeicaza’s project to Xamarin.
The result of this combination is a computer vision library with TensorFlow!
In my project, I will use both, for eye gaze estimation with a CNN and word suggestion with an RNN (seq2seq).
I am really worried about whether this method will work properly, because there are no successful examples yet…
Tensorflow for Xamarin: https://github.com/gmlwns2000/TensorFlowSharp
p.s. There are no examples in the TensorFlowSharp repository. You can find some application implementations in the Vision repository.
p.s. Currently, TFSharp for Xamarin only supports Windows, Android, and UWP, because I don’t have a Mac 🙁