This is the second in a series exploring TensorFlow. Since I have no training in machine learning, it does not consist of tutorials but links to the material I am using. I use my Windows PC to train models and plan to use a Raspberry Pi to do object classification. The main purpose is to document the things I learn along the way and perhaps interest you in a similar journey.
Recap
TensorFlow can do two types of deep learning: Regression and Classification. The last post described Regression. In the model diagrammed below (taken from Udacity training material), input data feeds one or more dense hidden layers and produces a single output number.
In this post the model is extended to recognize classifications of images, in this case clothing. The output for an image after training is a probability for each of the classes that were in the training dataset.
Flattening the Image
The model takes a vector as input. Accordingly, the pixels in the two-dimensional image are converted to a one-dimensional array, as shown in the very simple example below.
The training dataset used here has 28 x 28 pixel grey scale images. The grey scale value for each pixel will be normalized to a floating point number between 0 and 1.
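As a minimal sketch of those two steps (using NumPy, with a randomly generated image standing in for a real one), flattening and normalization look like this:

```python
import numpy as np

# A stand-in for one 28 x 28 grey scale image with pixel values 0-255
image = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)

# Normalize the pixel values to floating point numbers between 0 and 1
normalized = image.astype(np.float32) / 255.0

# Flatten the two-dimensional image into a one-dimensional vector
flattened = normalized.flatten()

print(flattened.shape)  # (784,)
```

In the Keras model later in this post, the flattening step is handled by a `Flatten` layer rather than done by hand.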
MNIST Fashion Dataset
The MNIST Fashion Dataset is easily imported into Colab, the cloud-hosted Jupyter notebook used in conjunction with the Udacity training. It consists of low resolution 28 x 28 pixel grey scale images of 10 different types of clothing. There are 70,000 images in total, with a sample shown below.
As noted above the resolution of the images is quite low.
One of my observations from working with facial recognition was that it, too, was able to do classification with low resolution images, which surprised me.
Developing a Model
The dataset is split into two parts:
- Training Dataset: 60,000 representative images that will be used to train the model
- Test Dataset: 10,000 images used to test the model after training - must be images the model has not seen before
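If you want to follow along outside the course notebook, the train/test split comes straight from Keras. A sketch (the data is downloaded on first use):

```python
import tensorflow as tf

# Keras returns the 60,000/10,000 train/test split directly
(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.fashion_mnist.load_data()

print(train_images.shape)  # (60000, 28, 28)
print(test_images.shape)   # (10000, 28, 28)
```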
The images are flattened as described above and pixels normalized to values between 0 and 1. The model itself is very similar to a Regression model with exceptions as outlined in the diagram below taken from the Udacity training material.
- A single dense hidden layer of 128 units is selected; the number of units is an arbitrary choice
- The ReLU activation function is added to the layer, which allows the network to model non-linear relationships
- The output layer has 10 units, corresponding to the types of clothing in the input dataset, and uses the softmax activation function so the outputs form a probability distribution
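A sketch of that model in Keras, with the layer sizes as described above:

```python
import tensorflow as tf

# Flatten the 28 x 28 image, feed one dense hidden layer of 128
# ReLU units, and output 10 softmax probabilities, one per class
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])

model.summary()
```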
The model is compiled using the optimizer 'adam' as before, but a new loss function, 'sparse_categorical_crossentropy', is introduced without much discussion; it is said to be the usual choice for models of this type. Training is done for 5 epochs.
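For the curious: sparse categorical crossentropy is just the negative log of the probability the model assigns to the true class, with the label given as an integer index rather than a one-hot vector. A small NumPy illustration with made-up probabilities:

```python
import numpy as np

# A hypothetical softmax output over the 10 clothing classes
probs = np.array([0.05, 0.05, 0.70, 0.05, 0.05,
                  0.02, 0.02, 0.02, 0.02, 0.02])
label = 2  # true class as an integer index, not one-hot

# Sparse categorical crossentropy: -log(probability of true class)
loss = -np.log(probs[label])
print(round(float(loss), 4))  # 0.3567
```

The loss is small when the model puts high probability on the correct class and grows rapidly as that probability approaches zero.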
In Colab, which runs in the cloud, the training completes in about 2 minutes with 89% accuracy.
Accuracy on the test dataset is 87%, and all 10,000 images are analyzed in about 3 seconds.
Conclusion and Looking Ahead
The 87% accuracy may or may not be acceptable for a given application, but I have looked ahead to the next lesson and know that it provides information that will improve accuracy. Accordingly, I will wait until then to develop my own dataset for training.
Useful Links
A Beginning Journey in TensorFlow #1: Regression
A Beginning Journey in TensorFlow #3: ReLU Activation
A Beginning Journey in TensorFlow #4: Convolutional Neural Networks
A Beginning Journey in TensorFlow #5: Color Images
A Beginning Journey in TensorFlow #6: Image Augmentation and Dropout