This is the second in a series exploring TensorFlow. Since I have no training in machine learning, it does not consist of tutorials but links to the material I am using. I use my Windows PC to train models and plan to use a Raspberry Pi to do object classification. The main purpose is to document the things I learn along the way and perhaps interest you in a similar journey.
Recap
TensorFlow can do two types of deep learning: Regression and Classification. The last post described Regression. In the model diagrammed below (taken from Udacity training material), input data feeds one or more dense hidden layers and produces a single output number.
In this post the model is extended to recognize classifications of images, in this case clothing. The output for an image after training is a probability for each of the classes that were in the training dataset.
Flattening the Image
The model takes a vector as input. Accordingly, the pixels in the two-dimensional image are converted to a one-dimensional array, as shown in the very simple example below.
The training dataset used here has 28 x 28 pixel grey scale images. The grey scale value for each pixel will be normalized to a floating point number between 0 and 1.
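As a minimal sketch of those two steps (using NumPy, with a randomly generated image standing in for a real one), flattening and normalization look like this:

```python
import numpy as np

# A stand-in for one 28 x 28 grey scale image with pixel values 0-255
image = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)

# Normalize the pixel values to floating point numbers between 0 and 1
normalized = image.astype(np.float32) / 255.0

# Flatten the two-dimensional image into a one-dimensional vector
flattened = normalized.flatten()

print(flattened.shape)  # (784,)
```

In the Keras model later in this post, the flattening step is handled by a `Flatten` layer rather than done by hand.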
MNIST Fashion Dataset
The MNIST Fashion Dataset is easily imported into Colab, the cloud-hosted Jupyter notebook used in conjunction with the Udacity training. It consists of low resolution 28 x 28 pixel grey scale images of 10 different types of clothing. There are 70,000 images in total, with a sample shown below.
As noted above the resolution of the images is quite low.
One of my observations from working with facial recognition was that it, too, was able to do classification with low resolution images, which surprised me.
Developing a Model
The dataset is split into two parts:
- Training Dataset: 60,000 representative images that will be used to train the model
- Test Dataset: 10,000 images used to test the model after training - must be images the model has not seen before
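If you want to follow along outside the course notebook, the train/test split comes straight from Keras. A sketch (the data is downloaded on first use):

```python
import tensorflow as tf

# Keras returns the 60,000/10,000 train/test split directly
(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.fashion_mnist.load_data()

print(train_images.shape)  # (60000, 28, 28)
print(test_images.shape)   # (10000, 28, 28)
```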
The images are flattened as described above and pixels normalized to values between 0 and 1. The model itself is very similar to a Regression model with exceptions as outlined in the diagram below taken from the Udacity training material.
- A single dense hidden layer of 128 units is selected; the number of units is an arbitrary choice
- The ReLU activation function is added to the layer, which allows the network to model non-linear relationships
- The output layer has 10 units, corresponding to the types of clothing in the input dataset, and uses the softmax activation function so the outputs form a probability distribution
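A sketch of that model in Keras, with the layer sizes as described above:

```python
import tensorflow as tf

# Flatten the 28 x 28 image, feed one dense hidden layer of 128
# ReLU units, and output 10 softmax probabilities, one per class
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])

model.summary()
```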
The model is compiled using the optimizer 'adam' as before, but a new loss function, 'sparse_categorical_crossentropy', is introduced without much discussion; it is said to be the usual choice for models of this type. Training is done for 5 epochs.
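For the curious: sparse categorical crossentropy is just the negative log of the probability the model assigns to the true class, with the label given as an integer index rather than a one-hot vector. A small NumPy illustration with made-up probabilities:

```python
import numpy as np

# A hypothetical softmax output over the 10 clothing classes
probs = np.array([0.05, 0.05, 0.70, 0.05, 0.05,
                  0.02, 0.02, 0.02, 0.02, 0.02])
label = 2  # true class as an integer index, not one-hot

# Sparse categorical crossentropy: -log(probability of true class)
loss = -np.log(probs[label])
print(round(float(loss), 4))  # 0.3567
```

The loss is small when the model puts high probability on the correct class and grows rapidly as that probability approaches zero.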
In Colab, which runs in the cloud, the training completes in about 2 minutes with 89% accuracy.
Accuracy on the test dataset is 87%, and all 10,000 images are analyzed in about 3 seconds.
Conclusion and Looking Ahead
The 87% accuracy may or may not be acceptable for a given application, but I have looked ahead to the next lesson and know that it provides information that will improve accuracy. Accordingly, I will wait until then to develop my own dataset for training.
Useful Links
A Beginning Journey in TensorFlow #1: Regression
A Beginning Journey in TensorFlow #3: ReLU Activation
A Beginning Journey in TensorFlow #4: Convolutional Neural Networks
A Beginning Journey in TensorFlow #5: Color Images
A Beginning Journey in TensorFlow #6: Image Augmentation and Dropout