Introduction
In my previous blog (experiment 4), I stated that my next step would be to work out a methodology that would give me consistent gesture-recognition results. I also indicated that, before jumping on the machine-learning bandwagon, I would spend a bit more time researching gesture recognition algorithms.
Well, it did not take long to find this website https://nickgillian.com/, which provided some interesting details about a machine learning gesture recognition toolkit. This looked quite promising, but after skimming through the GitHub pages I decided to shelve the idea. At face value, the toolkit offered more than I required, but I did spot a swipe gesture class, so I might revisit it at a later stage.
But first I had to go back to the original gesture library to figure out what data would be relevant for a learning model, as there's little value in adding machine-learning complexity to any system if it's potentially a GIGO (garbage in, garbage out) system.
Centre of Mass calculated values
Updating my code base
As demonstrated in my previous blog, the noise filtering algorithm and the creation of normalised values using background/foreground filtering appeared to be working really well.
So I decided to revise my MbedOS 6.16 firmware for the MAX32620FTHR board so that the serial output included the background/foreground filtered data rather than just raw data.
To implement this I created two C++ classes. The first class, as before, handles the driver interface with the MAX25405 chip, offering either a SPI or an I2C bus interface, while the second class handles the gesture algorithms based on the pixel data received. In the main routine, I simply call the function "processGesture" to get the new data.
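To give a feel for the structure (without copying the repository code verbatim), here is a minimal sketch of the main routine. Apart from "processGesture", the class names, constructors and helper methods are placeholders I've assumed for illustration; the actual implementations are in the GitHub repository linked below.

```cpp
#include "mbed.h"

// Placeholder outline of the two classes described above -- illustrative only,
// the real implementations live in the Max25x05_MbedOS6 repository.
class MAX25x05 {
public:
    explicit MAX25x05(SPI &bus) : _bus(bus) {}
    void readPixels(uint16_t *buf) { /* read the 10 x 6 pixel frame over SPI/I2C */ }
private:
    SPI &_bus;
};

class GestureAlgo {
public:
    void processGesture(const uint16_t *pixels) { /* filtering, interpolation, Centre of Mass */ }
    int comX() const { return _comX; }
    int comY() const { return _comY; }
private:
    int _comX = 0, _comY = 0;
};

int main()
{
    SPI spi(SPI_MOSI, SPI_MISO, SPI_SCK);   // pin names depend on the Mbed target board
    MAX25x05 sensor(spi);
    GestureAlgo gesture;
    uint16_t pixels[60];                    // 10 x 6 pixel frame from the sensor

    while (true) {
        sensor.readPixels(pixels);          // get the latest frame
        gesture.processGesture(pixels);     // run the gesture algorithms on the new data
        printf("%d,%d\r\n", gesture.comX(), gesture.comY());  // stream results over serial
        ThisThread::sleep_for(100ms);       // interval between gesture measurements
    }
}
```

The point of the split is simply that the bus-specific code stays in the driver class while the gesture maths stays in its own class.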
I also decided to reduce the sleep interval between gesture measurements from 200 msec down to 100 msec to get better-defined gesture data. This is done by modifying the MAX25x05 configuration register:
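I have not reproduced the actual register write here. As a hedged sketch of the principle only, the snippet below shows a read-modify-write of a delay field; the register address, mask and encoding are made-up placeholders, so check the MAX25x05 datasheet and the repository driver for the real definitions.

```cpp
#include <cstdint>

// Hedged sketch only: the address, mask and delay encoding below are
// placeholders, not the real MAX25x05 register map. The principle is a
// read-modify-write of the delay field that sets the time between frames.
constexpr uint8_t REG_MAIN_CFG = 0x01;   // placeholder register address
constexpr uint8_t DELAY_MASK   = 0x07;   // placeholder delay-field mask
constexpr uint8_t DELAY_100MS  = 0x04;   // placeholder encoding for ~100 ms

// Pure helper showing the bit manipulation; the driver class would read the
// current register value over SPI/I2C and write the updated value back.
uint8_t updateDelayField(uint8_t currentCfg)
{
    return static_cast<uint8_t>((currentCfg & ~DELAY_MASK) | DELAY_100MS);
}
```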
I have updated my GitHub repository with the new firmware code: https://github.com/Gerriko/Max25x05_MbedOS6
What I still had to prove was whether the pixel interpolation algorithm followed by the centre of mass calculation were all that was needed to enable consistent gesture detection, or whether I needed additional data.
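For reference, the Centre of Mass is just an intensity-weighted average of the pixel coordinates. The sketch below shows that calculation over the 10 x 6 frame; it is not the library's exact implementation, which also applies the pixel interpolation step beforehand.

```cpp
#include <cstdint>

// Minimal sketch of a Centre of Mass calculation over a 10 x 6 pixel frame,
// using the standard intensity-weighted average of the pixel coordinates.
constexpr int NUM_COLS = 10;
constexpr int NUM_ROWS = 6;

struct CentreOfMass { float x; float y; };

CentreOfMass computeCoM(const float pixels[NUM_ROWS][NUM_COLS])
{
    float sum = 0.0f, sumX = 0.0f, sumY = 0.0f;

    for (int row = 0; row < NUM_ROWS; row++) {
        for (int col = 0; col < NUM_COLS; col++) {
            float v = pixels[row][col];   // background/foreground filtered pixel value
            if (v < 0.0f) v = 0.0f;       // ignore below-background values
            sum  += v;
            sumX += v * col;
            sumY += v * row;
        }
    }

    if (sum <= 0.0f) return {0.0f, 0.0f};  // no foreground detected
    return { sumX / sum, sumY / sum };     // weighted average = Centre of Mass
}
```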
As demonstrated in my previous blog, this is where I've used the Processing IDE to create a desktop application that displays the data in real time via a graphical user interface.
By way of update, I removed the redundant noise filtering and normalisation functions and I added in data logging functionality so that I could capture the Centre of Mass x and y values. This code is also available in the GitHub repository.
I also made significant changes to the UI, adding separate outputs for the heat map and for the Centre of Mass gesture movements. The normalised pixel values on the left were then duplicated, with the horizontal line representing the pixels ordered left to right and the vertical line representing the pixels ordered top to bottom. This output, while useful to begin with, has proven less helpful than the heat map and the Centre of Mass gesture movements. Still, it helps confirm which pixels are detecting the movement.
I also added a new pixel object class, called PixelProperty, which handles the graphical pixel and heat map displays, using different colours for different pixels depending on location.
I then also added in some data logging functionality, as I needed this to help with my analysis of the Centre of Mass values. This can be turned on or off with a mouse click. When data logging is on and a gesture is in progress, the screen background changes to green.
There is also a results screen to display gesture detection. I have included some rudimentary code for left, right, up and down movements. More work will be done once I have evaluated my test data.
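To give a rough idea of what that rudimentary detection looks like, here is a sketch of a threshold-based classifier that compares the Centre of Mass position at the start and end of a gesture. The threshold value, the names and the C++ form are my own illustrative choices (the actual detection code lives in the Processing app), and it assumes the y coordinate increases from the top of the frame to the bottom.

```cpp
#include <cmath>

// Rough sketch of a threshold-based swipe classifier using the Centre of Mass
// position at the start and end of a gesture. Threshold and names are
// illustrative only.
enum class Gesture { None, Left, Right, Up, Down };

Gesture classifySwipe(float startX, float startY, float endX, float endY)
{
    const float threshold = 2.0f;          // minimum CoM travel, in pixel units (illustrative)
    const float dx = endX - startX;
    const float dy = endY - startY;

    if (std::fabs(dx) < threshold && std::fabs(dy) < threshold) {
        return Gesture::None;              // movement too small / too fuzzy to call
    }
    if (std::fabs(dx) >= std::fabs(dy)) {
        return (dx > 0) ? Gesture::Right : Gesture::Left;   // dominant horizontal travel
    }
    return (dy > 0) ? Gesture::Down : Gesture::Up;          // y increases top to bottom
}
```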
{gallery} Processing App Gesture Graphical User Interface
Testing Centre of Mass behaviour
The first test I did was to determine the degree of movement of the Centre of Mass x and y coordinates. Did it really provide a proper grid reference, for example? It was hard to tell when watching in real time.
Well, the good news is that it does detect the full range of movement. By using a scaling factor (0 to 100) I was able to capture Centre of Mass movement across the full measurement range.
Of course, some of you may be wondering why this is not in a 10 x 6 aspect ratio.
Well, the reason it's not is that the "y" pixel values are adjusted using a scaling factor inside the firmware (note that I changed this, as in the original firmware framework it was set to a 10x7 ratio):
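I have not copied that firmware code here, but as an illustration of the idea, the sketch below maps the Centre of Mass coordinates onto the 0 to 100 plotting range with a separate "y" scale factor. The Y_SCALE value is a placeholder, not the factor actually used in the firmware.

```cpp
// Illustrative only: maps the Centre of Mass coordinates onto a 0 to 100 scale
// for plotting/logging. NUM_COLS/NUM_ROWS match the 10 x 6 frame; Y_SCALE is a
// placeholder for the firmware's "y" adjustment factor, not the real value.
constexpr float NUM_COLS = 10.0f;
constexpr float NUM_ROWS = 6.0f;
constexpr float Y_SCALE  = 1.0f;     // placeholder for the firmware scaling factor

struct ScaledCoM { float x; float y; };

ScaledCoM scaleCoM(float comX, float comY)
{
    ScaledCoM s;
    s.x = 100.0f * comX / (NUM_COLS - 1.0f);               // 0..9 -> 0..100
    s.y = 100.0f * (comY * Y_SCALE) / (NUM_ROWS - 1.0f);   // 0..5 -> 0..100 (with y adjustment)
    return s;
}
```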
Then for the second test I wanted to determine how stable or predictable the calculated Centre of Mass position was when making specific gesture movements. If I could demonstrate a consistent pattern, this would help determine the degree of fuzziness required to detect the gesture movement with reasonable accuracy and without false positives etc.
So I set about capturing Centre of Mass data for each known gesture (as used on the demo GUI software). Here are the results:
{gallery} Centre of Mass values for different gestures
Distribution of Centre of Mass values for Left to Right gestures
Distribution of Centre of Mass values for Right to Left gestures
Distribution of Centre of Mass values for Upward hand gestures
Distribution of Centre of Mass values for Downward hand gestures
Distribution of Centre of Mass values for Click (hand flick) gestures
Distribution of Centre of Mass values for CW (rotational) gestures
Distribution of Centre of Mass values for CCW (rotational) gestures
Overall I am very pleased with the data. It demonstrates some nice distinctive patterns, and I have already created some basic gesture recognition routines. Here is a very short video demo (more videos will follow):
Conclusions and Next Steps
Based on the results shown above, it's clear that different gestures can be detected using Centre of Mass values; however, there is a large degree of fuzziness that needs to be handled through some clever coding. As such, my next step will be to continue improving my algorithms to handle this fuzziness. This may involve machine learning if it proves difficult to hard-code.
At least I have a toolkit that may help me, if required.