CatDogFoxBot #6 : Trying out an Artificial Neural Network

10 Aug 2019

As I am having trouble processing and interpreting the data from the GridEye sensor when trying to detect a cat, I decided I needed some artificial intelligence to help me, as all my own intelligence is worn out. There is a programme for the Arduino that implements an Artificial Neural Network (ANN), hopefully still available, with the header shown below.

/******************************************************************
* ArduinoANN - An artificial neural network for the Arduino
* All basic settings can be controlled via the Network Configuration
* section.
* See robotics.hobbizine.com/arduinoann.html for details.
******************************************************************/

/* Adapted 10th Aug'19 to use data from the GridEye sensor
* to recognise a cat walking by at night, for Project14.
* Dubbie Dubbie
*/

Describing and implementing an ANN is a bit complicated so I'll leave that for someone else, but using one is quite easy. They operate by 'learning' from data sets provided by the user, which are ideally 'correct'. This particular ANN uses Back Propagation for the learning process. The training data set is presented to the ANN, which then makes small adjustments to its internal weights to reduce the error between the required outputs and the current outputs. This is an iterative process that continues until the error falls below a specified value. Once trained, real data can be presented to the network, which will indicate what it contains. It is essentially a statistical process and the outputs are effectively given as probabilities, so there are no certainties and there is no guarantee that the ANN will have learned correctly.
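As a rough illustration of the iterative process described above, here is a minimal sketch that trains a single sigmoid neuron with one input until the error drops below a threshold. It is written in standard C++ rather than as an Arduino sketch, and the function name, the struct and the starting weight and bias are made up for the example. The update rule mirrors the one in the ArduinoANN code but without the momentum term.

```cpp
#include <cassert>
#include <cmath>

// Result of a training run: cycles is -1 if the limit was reached first.
struct TrainResult {
    int cycles;
    float error;
    float output;
};

// Train one sigmoid neuron (one input, one weight, one bias) by gradient
// descent until 0.5 * (target - output)^2 falls below `success`.
TrainResult trainSingleNeuron(float input, float target, float learningRate,
                              float success, int maxCycles) {
    assert(maxCycles > 0);
    float weight = 0.2f;   // arbitrary starting values for illustration;
    float bias = -0.1f;    // the real programme randomises these
    TrainResult result = { -1, 0.0f, 0.0f };
    for (int cycle = 1; cycle <= maxCycles; cycle++) {
        // Forward pass: weighted sum through the sigmoid
        float output = 1.0f / (1.0f + std::exp(-(weight * input + bias)));
        float diff = target - output;
        result.error = 0.5f * diff * diff;
        result.output = output;
        if (result.error < success) {
            result.cycles = cycle;
            break;
        }
        // Backward pass: small adjustment in the direction that reduces the error
        float delta = diff * output * (1.0f - output);
        weight += learningRate * input * delta;
        bias += learningRate * delta;
    }
    return result;
}
```

Calling trainSingleNeuron(0.7f, 0.9f, 0.3f, 0.0004f, 100000) uses the same learning rate and success threshold as the programme, and the returned cycle count shows how many small adjustments were needed.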

So for me to use this I need to provide some GridEye style training data sets, some of which will contain a cat and others that will not. It is a memory hungry process and the programme would not compile for a Nano. Fortunately I have a MKRZERO that Element14 kindly provided as a reward for entering the 10 years of Element14 competition, so I tried that. There are some issues. I can only use 7 x 7 data arrays, as when I use 8 x 8 arrays the compiler fails but doesn't say why. It might be something to do with insufficient memory, or maybe a compiler fault, or maybe something I did. At present I do not know. Also, the programme will not output any text to the Serial Monitor until later on in the programme, a problem I have not been able to work out. But it shows the results, so I have left solving that problem for another time.
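One way to check the insufficient memory theory is to add up the RAM taken by the float arrays that the programme declares. The sketch below does this for the array shapes used in the programme, assuming 4-byte floats. The function name is made up for the example, and the const Input and Target arrays are left out on the assumption that they stay in flash.

```cpp
#include <cstddef>

constexpr std::size_t kBytesPerFloat = 4;  // assumed size of a float

// Bytes of RAM used by the non-const float arrays in the programme.
constexpr std::size_t annRamBytes(std::size_t inputNodes,
                                  std::size_t hiddenNodes,
                                  std::size_t outputNodes) {
    return kBytesPerFloat * (
        hiddenNodes                            // Hidden
        + outputNodes                          // Output
        + 2 * (inputNodes + 1) * hiddenNodes   // HiddenWeights + ChangeHiddenWeights
        + 2 * (hiddenNodes + 1) * outputNodes  // OutputWeights + ChangeOutputWeights
        + hiddenNodes                          // HiddenDelta
        + outputNodes);                        // OutputDelta
}
```

With 60 hidden nodes and one output, annRamBytes(49, 60, 1) comes to 24976 bytes for the 7 x 7 case and annRamBytes(64, 60, 1) comes to 32176 bytes for the 8 x 8 case. The SAMD21 on the MKRZERO has 32 KB (32768 bytes) of SRAM, so the 8 x 8 arrays would leave under 600 bytes for the stack and everything else. That is consistent with a memory problem, although it does not prove that this is the cause of the failure.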

I have provided two training data sets, which is about the minimum that you can have as there are two outcomes, cat detected and no cat detected, so I need to provide an example of each. Neural networks work best with normalised data between the values of 0.0 and 1.0, although they will still work with other numbers. Floating point numbers are used as the neural network needs them to perform its calculations. It is not a good idea to use 0.0 as it can lead to calculation problems, and 1.0 isn't used either as this would indicate ideal data. Sensors do not provide ideal data so I have decided not to use that value. Therefore, for no cat, the first 7 x 7 array has all its values set to 0.1. For the 7 x 7 array containing a cat I have put in a pair of 0.7 values with the immediate neighbours being 0.2 or 0.4. I could have made this data a bit more ideal, but the ANN will work better when the training data more closely resembles the actual data.

const float Input[PatternCount][InputNodes] = {
{ 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, // This is not cat detected
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1 },
{ 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, // This is a cat detected
    0.1, 0.1, 0.1, 0.2, 0.2, 0.1, 0.1,
    0.1, 0.1, 0.2, 0.7, 0.7, 0.2, 0.1,
    0.1, 0.1, 0.2, 0.4, 0.4, 0.2, 0.1,
    0.1, 0.1, 0.1, 0.2, 0.2, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1 }
};
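When live sensor data is eventually used, the readings will need to be scaled into the same range as the training values above. A minimal sketch of one way to do that follows, assuming the readings arrive as temperatures in degrees Celsius. The function names and the minTempC and maxTempC limits are hypothetical and would need to be chosen to suit the scene being watched.

```cpp
// Clamp a value to the range 0.0 .. 1.0.
constexpr float clamp01(float x) {
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

// Map a temperature onto 0.1 .. 0.9, avoiding the 0.0 and 1.0 extremes.
// minTempC and maxTempC are assumed calibration limits (maxTempC > minTempC),
// not values taken from the sensor.
constexpr float normaliseReading(float tempC, float minTempC, float maxTempC) {
    return 0.1f + 0.8f * clamp01((tempC - minTempC) / (maxTempC - minTempC));
}
```

For example, with limits of 20 and 40 degrees a reading of 30 maps to about 0.5, and anything at or below 20 maps to 0.1, which matches the background value used in the training arrays.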

There are obviously many more possible data sets that could be used in the training process which would result in a more accurate ANN, but it does increase the training time as well as leading to a greater possibility that the system will not train at all. I thought I would start with the simplest data and see how it goes.

The Arduino programme doesn't save the trained ANN, so it retrains every time it is started. This is a bit of a shame, but you need to save the values from the trained ANN somewhere and the Arduino doesn't have an easily accessible storage area. It would not be that difficult to add a memory card or similar to implement a storage process. The drawback of not saving the trained ANN is that, as the training starts with randomised values each time, every time you restart the programme and train with the same data you get a different trained ANN. You can see this in this implementation, as I get a slightly different result every time I restart the programme.
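Short of saving the weights, one possible way to make runs repeatable is to seed the random number generator with a fixed value, so that the starting weights are the same each time. The sketch below illustrates the idea in standard C++, using std::mt19937 as a stand-in for the Arduino random() function. The function name is made up for the example. On the Arduino the equivalent step would presumably be passing a constant to randomSeed() instead of analogRead(3).

```cpp
#include <cassert>
#include <random>
#include <vector>

// Generate starting weights in the same way as the programme:
// a random value in hundredths, rescaled to +/- initialWeightMax.
std::vector<float> initialWeights(unsigned seed, int count, float initialWeightMax) {
    assert(count >= 0);
    std::mt19937 rng(seed);  // fixed seed gives the same sequence every run
    std::vector<float> weights(count);
    for (int i = 0; i < count; i++) {
        float rando = float(rng() % 100) / 100.0f;  // like float(random(100))/100
        weights[i] = 2.0f * (rando - 0.5f) * initialWeightMax;
    }
    return weights;
}
```

Two calls with the same seed return identical weights. This only makes the training repeatable; the network still has to be retrained at every start.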

Once the ANN is trained, you need to provide it with some real data for it to make decisions on. As I have moved from the Nano to the MKRZERO the GridEye is not currently connected, so I have made up a 7 x 7 array of test input data, see below.

float Inputs[InputNodes] =
{ 0.1, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, // Simulating a real cat
    0.1, 0.1, 0.2, 0.2, 0.2, 0.1, 0.1,
    0.1, 0.3, 0.8, 0.5, 0.1, 0.1, 0.1,
    0.1, 0.3, 0.7, 0.4, 0.2, 0.1, 0.1,
    0.1, 0.1, 0.2, 0.3, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.3, 0.1, 0.2, 0.1, 0.1, 0.1 };

I've put some higher values nearish to the middle, surrounded by some lower values with the rest being mainly low values of 0.1. The results of using the trained ANN to 'recognise' the 'cat' are shown below:

CatDogFox Detector for Project14
Dubbie Dubbie
August 2019

Inputs
0.100 0.200 0.100 0.100 0.100 0.100 0.100
0.100 0.100 0.200 0.200 0.200 0.100 0.100
0.100 0.300 0.800 0.500 0.100 0.100 0.100
0.100 0.300 0.700 0.400 0.200 0.100 0.100
0.100 0.100 0.200 0.300 0.100 0.100 0.100
0.100 0.100 0.100 0.100 0.100 0.100 0.100
0.100 0.300 0.100 0.200 0.100 0.100 0.100

Output is 0.64

The output is given as 0.64. An output value of 0.1 would mean no cat and a value of 0.9 would mean definitely a cat, so a value of 0.64 indicates that it is more likely to be a cat than not a cat. This isn't particularly specific but then again, that is real life. A better outcome might be achieved (or it might not) by using more 7 x 7 training arrays illustrating other correct cat detected values, as well as non cat values. So once I have sorted out the problem with the 8 x 8 arrays I can start trying them.
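To turn the output into something easier to read, it can be rescaled from the 0.1 to 0.9 target range onto 0.0 to 1.0 and compared against a threshold. This is only a sketch: the function names are made up, and the 0.5 threshold is simply the midpoint between the two targets rather than a value from the programme.

```cpp
// Rescale an output from the 0.1 (no cat) .. 0.9 (cat) target range
// onto 0.0 .. 1.0, clamping anything that falls outside the range.
constexpr float catScore(float output) {
    return output < 0.1f ? 0.0f
         : (output > 0.9f ? 1.0f : (output - 0.1f) / 0.8f);
}

// Assumed decision rule: anything above the midpoint counts as a cat.
constexpr bool looksLikeCat(float output) {
    return output > 0.5f;
}
```

For the output of 0.64 above, catScore gives about 0.675 and looksLikeCat returns true.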

Once the programme seems to be working correctly then I can connect the GridEye sensor directly to the MKRZERO and use live data. I'm not sure if I'll be able to get to that point as this could be a lot of work and there isn't much time left before this Project14 competition ends.

I have attached the programme to this email and I would welcome any suggestions on why it will not work with 8 x 8 arrays, as well as why the Serial.print commands do not work near the beginning of the programme.

Dubbie

PS For some reason it would not let me attach the Arduino programme file so I have stuffed it in here. Hopefully this is OK.

/* Adapted 10th Aug'19 to use data from the GridEye sensor
* to recognise a cat walking by at night, for Project14.
* Dubbie Dubbie
*/

#include <math.h>

/******************************************************************
* PIN CONFIGURATION OF INPUT SENSORS(IR SENSORS)
*/

/********************
* PIN CONFIGURATION OF LP293D PINS MOTOR DRIVER TO BE DONE
*/

#define sigfig 3

/******************************************************************
* Network Configuration - customized per network
******************************************************************/

const int PatternCount = 2;
const int InputNodes = 49;
const int HiddenNodes = 60;
const int OutputNodes = 1;
const int NumbInRow = 7;
const float LearningRate = 0.3;
const float Momentum = 0.9;
const float InitialWeightMax = 0.5;
const float Success = 0.0004;

//const byte Input[PatternCount][InputNodes] = {
const float Input[PatternCount][InputNodes] = {
{ 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, // This is not cat detected
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1 },
{ 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, // This is a cat detected
    0.1, 0.1, 0.1, 0.2, 0.2, 0.1, 0.1,
    0.1, 0.1, 0.2, 0.7, 0.7, 0.2, 0.1,
    0.1, 0.1, 0.2, 0.4, 0.4, 0.2, 0.1,
    0.1, 0.1, 0.1, 0.2, 0.2, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1 }
};

// const byte Target[PatternCount][OutputNodes] = {
const float Target[PatternCount][OutputNodes] = {
{ 0.1 },
{ 0.9 },
};

/******************************************************************
* End Network Configuration
******************************************************************/

int i, j, p, q, r;
int ReportEvery1000;
int RandomizedIndex[PatternCount];
long TrainingCycle;
float Rando;
float Error;
float Accum;
int front;

float Hidden[HiddenNodes];
float Output[OutputNodes];
float HiddenWeights[InputNodes+1][HiddenNodes];
float OutputWeights[HiddenNodes+1][OutputNodes];
float HiddenDelta[HiddenNodes];
float OutputDelta[OutputNodes];
float ChangeHiddenWeights[InputNodes+1][HiddenNodes];
float ChangeOutputWeights[HiddenNodes+1][OutputNodes];

void setup(){
Serial.begin(9600);
randomSeed(analogRead(3));
ReportEvery1000 = 1;
for( p = 0 ; p < PatternCount ; p++ ) {
RandomizedIndex[p] = p ;
}
delay(250);
} /* setup */

void loop (){
front = 0;

Serial.println(" CatDogFox Detector for Project14 ");
Serial.println(" Dubbie Dubbie ");
Serial.println(" August 2019 ");
Serial.println(" ");
// delay(5000);
// toTerminal();

/******************************************************************
* Initialize HiddenWeights and ChangeHiddenWeights
******************************************************************/

for( i = 0 ; i < HiddenNodes ; i++ ) {
    for( j = 0 ; j <= InputNodes ; j++ ) {
      ChangeHiddenWeights[j][i] = 0.0 ;
      Rando = float(random(100))/100;
      HiddenWeights[j][i] = 2.0 * ( Rando - 0.5 ) * InitialWeightMax ;
    }
}
/******************************************************************
* Initialize OutputWeights and ChangeOutputWeights
******************************************************************/

for( i = 0 ; i < OutputNodes ; i ++ ) {
    for( j = 0 ; j <= HiddenNodes ; j++ ) {
      ChangeOutputWeights[j][i] = 0.0 ;
      Rando = float(random(100))/100;
      OutputWeights[j][i] = 2.0 * ( Rando - 0.5 ) * InitialWeightMax ;
    }
}
Serial.println("Initial/Untrained Outputs: ");
toTerminal();
/******************************************************************
* Begin training
******************************************************************/

for( TrainingCycle = 1 ; TrainingCycle < 2147483647 ; TrainingCycle++) {

/******************************************************************
* Randomize order of training patterns
******************************************************************/

    for( p = 0 ; p < PatternCount ; p++) {
      q = random(PatternCount);
      r = RandomizedIndex[p] ;
      RandomizedIndex[p] = RandomizedIndex[q] ;
      RandomizedIndex[q] = r ;
    }
    Error = 0.0 ;
/******************************************************************
* Cycle through each training pattern in the randomized order
******************************************************************/
    for( q = 0 ; q < PatternCount ; q++ ) {
      p = RandomizedIndex[q];

/******************************************************************
* Compute hidden layer activations
******************************************************************/

      for( i = 0 ; i < HiddenNodes ; i++ ) {
        Accum = HiddenWeights[InputNodes][i] ;
        for( j = 0 ; j < InputNodes ; j++ ) {
          Accum += Input[p][j] * HiddenWeights[j][i] ;
        }
        Hidden[i] = 1.0/(1.0 + exp(-Accum)) ;
      }

/******************************************************************
* Compute output layer activations and calculate errors
******************************************************************/

      for( i = 0 ; i < OutputNodes ; i++ ) {
        Accum = OutputWeights[HiddenNodes][i] ;
        for( j = 0 ; j < HiddenNodes ; j++ ) {
          Accum += Hidden[j] * OutputWeights[j][i] ;
        }
        Output[i] = 1.0/(1.0 + exp(-Accum)) ;
        OutputDelta[i] = (Target[p][i] - Output[i]) * Output[i] * (1.0 - Output[i]) ;
        Error += 0.5 * (Target[p][i] - Output[i]) * (Target[p][i] - Output[i]) ;
      }

/******************************************************************
* Backpropagate errors to hidden layer
******************************************************************/

      for( i = 0 ; i < HiddenNodes ; i++ ) {
        Accum = 0.0 ;
        for( j = 0 ; j < OutputNodes ; j++ ) {
          Accum += OutputWeights[i][j] * OutputDelta[j] ;
        }
        HiddenDelta[i] = Accum * Hidden[i] * (1.0 - Hidden[i]) ;
      }

/******************************************************************
* Update Input-->Hidden Weights
******************************************************************/

      for( i = 0 ; i < HiddenNodes ; i++ ) {
        ChangeHiddenWeights[InputNodes][i] = LearningRate * HiddenDelta[i] + Momentum * ChangeHiddenWeights[InputNodes][i] ;
        HiddenWeights[InputNodes][i] += ChangeHiddenWeights[InputNodes][i] ;
        for( j = 0 ; j < InputNodes ; j++ ) {
          ChangeHiddenWeights[j][i] = LearningRate * Input[p][j] * HiddenDelta[i] + Momentum * ChangeHiddenWeights[j][i];
          HiddenWeights[j][i] += ChangeHiddenWeights[j][i] ;
        }
      }

/******************************************************************
* Update Hidden-->Output Weights
******************************************************************/

      for( i = 0 ; i < OutputNodes ; i ++ ) {
        ChangeOutputWeights[HiddenNodes][i] = LearningRate * OutputDelta[i] + Momentum * ChangeOutputWeights[HiddenNodes][i] ;
        OutputWeights[HiddenNodes][i] += ChangeOutputWeights[HiddenNodes][i] ;
        for( j = 0 ; j < HiddenNodes ; j++ ) {
          ChangeOutputWeights[j][i] = LearningRate * Hidden[j] * OutputDelta[i] + Momentum * ChangeOutputWeights[j][i] ;
          OutputWeights[j][i] += ChangeOutputWeights[j][i] ;
        }
      }
    }

/******************************************************************
* Every 1000 cycles send data to terminal for display
******************************************************************/
    ReportEvery1000 = ReportEvery1000 - 1;
    if (ReportEvery1000 == 0)
    {
      Serial.println();
      Serial.println();
      Serial.print ("TrainingCycle: ");
      Serial.print (TrainingCycle);
      Serial.print (" Error = ");
      Serial.println (Error, 5);

toTerminal();

      if (TrainingCycle==1)
      {
        ReportEvery1000 = 999;
      }
      else
      {
        ReportEvery1000 = 1000;
      }
    }

/******************************************************************
* If error rate is less than pre-determined threshold then end
******************************************************************/

if( Error < Success ) break ;
}
Serial.println ();
Serial.println();
Serial.print ("TrainingCycle: ");
Serial.print (TrainingCycle);
Serial.print (" Error = ");
Serial.println (Error, 5);

toTerminal();

Serial.println ();
Serial.println ();
Serial.println ("Training Set Solved! ");
Serial.println ("--------");
Serial.println ();
Serial.println ();
ReportEvery1000 = 1;

/* TRAINING SET SOLVED, error less than 0.0004
   * adding the part where you run the PROGRAM
   * STEP 1: READ THE SENSOR VALUES
   * STEP 2: RUN THE INPUT TO NEURAL NETWORK -> OUTPUT CALCULATION
   * STEP 3 ALPHA: PRINT EXE COMMAND ON SCREEN
   * STEP 4 FINAL: RUN MOTORS
   */
Serial.println(" CatDogFox Detector for Project14 ");
Serial.println(" Dubbie Dubbie ");
Serial.println(" August 2019 ");
Serial.println(" ");

while(1){

// STEP 1

float Inputs[InputNodes] =
{ 0.1, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, // Simulating a real cat
    0.1, 0.1, 0.2, 0.2, 0.2, 0.1, 0.1,
    0.1, 0.3, 0.8, 0.5, 0.1, 0.1, 0.1,
    0.1, 0.3, 0.7, 0.4, 0.2, 0.1, 0.1,
    0.1, 0.1, 0.2, 0.3, 0.1, 0.1, 0.1,
    0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
    0.1, 0.3, 0.1, 0.2, 0.1, 0.1, 0.1 };

Serial.print ("Inputs "); Serial.println(" ");

    for( i = 0 ; i < InputNodes ; i++ )
      {
        Serial.print (Inputs[i], 3);
        Serial.print (" ");
        if (((i+1) % NumbInRow) == 0) Serial.println(" ");
      } /* for */
    Serial.println(" ");
// STEP 2
/******************************************************************
* Compute hidden layer activations
******************************************************************/

    for( i = 0 ; i < HiddenNodes ; i++ ) {
      Accum = HiddenWeights[InputNodes][i] ;
      for( j = 0 ; j < InputNodes ; j++ ) {
        Accum += Inputs[j] * HiddenWeights[j][i] ;
      }
      Hidden[i] = 1.0/(1.0 + exp(-Accum)) ;
    }

/******************************************************************
* Compute output layer activations
******************************************************************/

    for( i = 0 ; i < OutputNodes ; i++ ) {
      Accum = OutputWeights[HiddenNodes][i] ;
      for( j = 0 ; j < HiddenNodes ; j++ ) {
        Accum += Hidden[j] * OutputWeights[j][i] ;
      }
      Output[i] = 1.0/(1.0 + exp(-Accum)) ;
    }

    // STEP 4 Display the Results
    Serial.print("Output is ");
    Serial.print(Output[0]);
    Serial.println(" ");
    Serial.println(" ");

// while(1){ /* Do nothing */ } /* while */

delay(3000);
}
}

void toTerminal()
{

for( p = 0 ; p < PatternCount ; p++ ) { // compute output for each input values of training samples
    Serial.println();
    Serial.print (" Training Pattern: ");
    Serial.println (p);
    Serial.print (" Input ");
    for( i = 0 ; i < InputNodes ; i++ ) {
      Serial.print (Input[p][i], sigfig);
      Serial.print (" ");
    }
    Serial.print (" Target ");
    for( i = 0 ; i < OutputNodes ; i++ ) {
      Serial.print (Target[p][i], sigfig);
      Serial.print (" ");
    }
/******************************************************************
* Compute hidden layer activations
******************************************************************/

    for( i = 0 ; i < HiddenNodes ; i++ ) {
      Accum = HiddenWeights[InputNodes][i] ;
      for( j = 0 ; j < InputNodes ; j++ ) {
        Accum += Input[p][j] * HiddenWeights[j][i] ;
      }
      Hidden[i] = 1.0/(1.0 + exp(-Accum)) ;
    }

/******************************************************************
* Compute output layer activations
******************************************************************/

    for( i = 0 ; i < OutputNodes ; i++ ) {
      Accum = OutputWeights[HiddenNodes][i] ;
      for( j = 0 ; j < HiddenNodes ; j++ ) {
        Accum += Hidden[j] * OutputWeights[j][i] ;
      }
      Output[i] = 1.0/(1.0 + exp(-Accum)) ;
    }
    Serial.print (" Output ");
    for( i = 0 ; i < OutputNodes ; i++ ) {
      Serial.print (Output[i], sigfig);
      Serial.print (" ");
    }
}

}