Training the AI brain / Python is SLOW

I got up and running with Keras quite quickly – a pipenv install of tensorflow==2.0.0rc2, keras, and numpy==1.16.4 was the right combination.

Fairly quickly I threw together something that uses small half-second snippets of samples across 26 classes (“people”!) and some LSTM neural net layers, and somehow managed to build a brain with almost 80 percent accuracy, without any tuning, in about 30 seconds flat:

[Screenshot: Keras training output showing almost 80% accuracy]
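For the curious, this is roughly the shape of the thing. A minimal sketch rather than the actual code: the snippet length, feature count and layer sizes here are all assumptions, and the dummy data just stands in for the real telemetry.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

TIMESTEPS = 45    # assumption: frames per half-second snippet
FEATURES = 7      # assumption: telemetry channels per frame
NUM_CLASSES = 26  # one class per person

model = keras.Sequential([
    layers.LSTM(64, return_sequences=True, input_shape=(TIMESTEPS, FEATURES)),
    layers.LSTM(32),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Dummy data stands in for the real half-second snippets
x = np.random.rand(1000, TIMESTEPS, FEATURES).astype("float32")
y = np.random.randint(0, NUM_CLASSES, size=(1000,))
model.fit(x, y, epochs=5, validation_split=0.2)
```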

Given it’s the first time I’ve used any ML library, I was astonished at how the pieces had fallen into place to give this result so early.

Of course, all I read about are models which achieve well over 90/95% accuracy, and this is only across 26 classes. But I’m very happy already 🙂

Python, you’re killing me

About an hour ago I sat down to update the blog, and this is now the third post in a row I’ve put together, all to give Python time to finish a fairly straightforward data processing operation. It’s still not done, and I’ve no idea how much longer it will take.

I get it: there’s a fair amount of data here, Python isn’t JITted, there’s the GIL, and threading is tricky with it. But given that Python is many people’s introduction to development, and so popular in the data science field, a whole load of people must get immensely frustrated with this.
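For CPU-bound crunching like this, the standard way around the GIL is to use processes rather than threads. A generic sketch; process_chunk and the dummy data here are placeholders of my own invention, not the project’s actual preprocessing code:

```python
from multiprocessing import Pool

def process_chunk(chunk):
    # stand-in for some CPU-bound per-chunk preprocessing
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    chunks = [list(range(10_000)) for _ in range(100)]
    with Pool() as pool:  # one worker process per CPU core by default
        results = pool.map(process_chunk, chunks)
    print(len(results), "chunks processed")
```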

Update 10 mins later: “ValueError: Can’t convert non-rectangular Python sequence to Tensor.” Great, guess I’ll take a look and run it again!
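For anyone hitting the same thing: it usually means the nested lists aren’t all the same length. One common fix is to pad everything rectangular first; the tiny lists below are just for illustration.

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences

ragged = [[1, 2, 3], [4, 5], [6]]             # non-rectangular input
rect = pad_sequences(ragged, padding="post")  # zero-pad to the longest length
print(rect)
# [[1 2 3]
#  [4 5 0]
#  [6 0 0]]
```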

Makes me love C# more and more…

The Secret AI Master Plan (just between you and me)

First, hyperparameters! They sound more awesome than they are, so I’ll explain. When putting together a Machine Learning model, there are many decisions to make. For a neural network: how many layers? How many units in each layer? What learning rate to use? And so on. Each of these values is known as a hyperparameter.

The big surprise here is that while there’s some guidance and math around choosing these values, it’s generally a case of trial-and-error. The idea is to experiment: tweak each of these values, individually and in combination, until the best results are observed.

So at some point, I’ll parameterise the model-building test program so it can be automated. I could run all of this on a normal laptop/desktop computer, and may do, but I’m thinking I’ll likely Dockerise it and run some Kubernetes jobs to try combinations out more quickly and in parallel. Something like the sketch below.
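A rough sketch of what that parameterised sweep could look like. build_model, and all the shapes and ranges in it, are made up for illustration; in the real version, each combination could become its own container job.

```python
import itertools
from tensorflow import keras
from tensorflow.keras import layers

def build_model(n_layers, units, learning_rate,
                timesteps=45, features=7, n_classes=26):
    """Hypothetical parameterised builder; all shapes are placeholders."""
    model = keras.Sequential()
    model.add(layers.InputLayer(input_shape=(timesteps, features)))
    for i in range(n_layers):
        # every LSTM except the last must return the full sequence
        model.add(layers.LSTM(units, return_sequences=(i < n_layers - 1)))
    model.add(layers.Dense(n_classes, activation="softmax"))
    model.compile(optimizer=keras.optimizers.Adam(learning_rate),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Try every combination of the candidate hyperparameter values
for n_layers, units, lr in itertools.product([1, 2, 3], [32, 64, 128], [1e-2, 1e-3]):
    model = build_model(n_layers, units, lr)
    print(f"layers={n_layers} units={units} lr={lr}: {model.count_params()} params")
    # model.fit(...) would go here, recording validation accuracy per combo
```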

My exact to-do list for the AI aspect is currently:

  1. Convert the rotation data, currently in quaternion form, to two orthogonal unit vectors rotated by the quaternion. If you imagine something spinning around, at some point the angle will jump from 359 to 0 degrees. Despite being a continuous movement, the data going into the model would “jerk” suddenly when this happens. I’ll try this with my initial prototype version to see if things improve much (first sketch after this list).
  2. Run the model over a single-person test set. I’d love to see for myself, over the 1,000 samples of my own data, how it guesses each sample (second sketch below)!
  3. Have the model-building program persist the model to disk (also in the second sketch)
  4. Create a separate app with an HTTP API (maybe I’ll use that Flask thingy) which either takes raw telemetry data, or maybe a chunk ID and goes and fetches the data itself, and returns the model’s classification output (third sketch below)
  5. Add another main BeatBrain API endpoint to analyse/check a session’s chunks, and return some cool stats!
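For item 1, here’s roughly what I have in mind. A minimal sketch assuming quaternions stored as (w, x, y, z); both the storage convention and the choice of axes are assumptions:

```python
import numpy as np

def rotate_vector(q, v):
    """Rotate 3-vector v by unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    u = np.array([x, y, z])
    # standard identity: v' = v + 2 * u x (u x v + w * v)
    return v + 2.0 * np.cross(u, np.cross(u, v) + w * v)

def quaternion_to_features(q):
    """Two rotated orthogonal unit vectors -> six smoothly-varying values."""
    forward = rotate_vector(q, np.array([0.0, 0.0, 1.0]))
    up = rotate_vector(q, np.array([0.0, 1.0, 0.0]))
    return np.concatenate([forward, up])

# Quick check: 90 degrees about the y axis
q = np.array([np.cos(np.pi / 4), 0.0, np.sin(np.pi / 4), 0.0])
print(quaternion_to_features(q))  # forward -> ~(1, 0, 0), up unchanged
```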
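Items 2 and 3 are straightforward Keras calls. In this sketch the model and the “single person” data are stand-ins so it runs end to end; none of it is the project’s real data:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in model; the real one would come out of the training program
model = keras.Sequential([
    layers.LSTM(32, input_shape=(45, 7)),
    layers.Dense(26, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Stand-in for the ~1,000 samples of one person's data
x_single = np.random.rand(1000, 45, 7).astype("float32")
y_single = np.zeros(1000, dtype="int64")  # all samples are the same person

preds = model.predict(x_single)       # shape: (1000, 26)
guesses = np.argmax(preds, axis=1)    # the model's guess per sample
print("per-sample guesses:", guesses[:10])
print("accuracy:", np.mean(guesses == y_single))

model.save("model.h5")  # item 3: persist the model (Keras HDF5 format)
```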
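And for item 4, a bare-bones version of the classification API. The endpoint name, payload shape and model.h5 filename are all placeholders, not settled design:

```python
import numpy as np
from flask import Flask, jsonify, request
from tensorflow import keras

app = Flask(__name__)
model = keras.models.load_model("model.h5")  # persisted by the builder

@app.route("/classify", methods=["POST"])
def classify():
    # Expect raw telemetry as a (timesteps x features) nested list
    snippet = np.array(request.get_json()["telemetry"], dtype="float32")
    probs = model.predict(snippet[np.newaxis, ...])[0]
    return jsonify({
        "class": int(np.argmax(probs)),
        "confidence": float(np.max(probs)),
    })

if __name__ == "__main__":
    app.run(port=5000)
```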

As above, at some point I’ll want to tune and improve the model a whole load. Getting everything hooked up is more important though, I think.
