A Beginning Journey in TensorFlow #3: ReLU Activation

fmilburn
11 Oct 2019

This is the third in a series exploring TensorFlow. The primary source of material is the Udacity course "Intro to TensorFlow for Deep Learning" by TensorFlow. My objective is to document the things I learn along the way and perhaps interest you in a similar journey. I have had very little time to work on this project recently but hope to be more productive in the coming weeks. If I get a project working in time, I will make an entry in the Project14 Vision Thing Competition.

 

Recap

 

In the first post it was explained that TensorFlow can do two types of deep learning: regression and classification. That post focused on regression, and in particular on linear regression. We learned that a neural network is composed of layers of connected nodes. The inputs to a node are multiplied by weights and a bias is added. An optimizing approach called "Adam" was used to guide the gradient descent (or, as I think of it, the iteration that tries to minimize error) and select the weights.
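As a small refresher, here is a minimal NumPy sketch of that weighted sum at a single node; the input, weight, and bias values are made up purely for illustration:

import numpy as np

# Hypothetical inputs, weights, and bias for a single node
x = np.array([0.5, -1.2, 3.0])   # inputs to the node
w = np.array([0.8, 0.1, -0.4])   # one weight per input
b = 0.25                         # bias

# The node forms a weighted sum of its inputs and adds the bias
z = np.dot(w, x) + b
print(z)   # -0.67, before any activation function is applied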

Figure 1: Neural Network model from Udacity Intro to TensorFlow for Deep Learning

 

In the second post the model was extended to recognize and classify images. In regression, the model gives a single output. In classification, the model assesses the probability that an input, such as an image, belongs to each of the classes in the model. The model used determined the probability of an input being one of ten different articles of clothing. Several new concepts were introduced, including flattening of layers and activation functions.

 

In this post I will back up a bit and cover the activation function. This brief discussion is added because the topic was barely covered in the Udacity training; it is based on the article by Jason Brownlee, "A Gentle Introduction to the Rectified Linear Unit (ReLU)". In general I have found Jason's posts and material to be good.


Activation Functions and ReLU

 

The activation function transforms the summed and weighted input to a node, as shown in Figure 1 above, into the output for the node. There are a number of different activation functions available. In the past the hyperbolic tangent and sigmoid were popular. Recall, for example, that the hyperbolic tangent looks like this:

Figure 2: Hyperbolic Tangent

 

The hyperbolic tangent introduces nonlinearity and has a readily calculated derivative, but it has been shown to be inferior in many instances to the simpler rectified linear activation, or ReLU. The simplicity of ReLU allows it to work faster, and it is less likely to suffer from the vanishing gradient problem. Vanishing gradients occur when there are many layers and the derivative of the loss function approaches zero as it is propagated back through the network, so the early layers learn very slowly.
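To make the contrast concrete, here is a quick numerical sketch of my own (not from the course) comparing the two derivatives; the derivative of tanh collapses toward zero as the input grows, while the ReLU derivative stays at 1 for any positive input:

import numpy as np

def dtanh(x):
    return 1.0 - np.tanh(x) ** 2     # derivative of the hyperbolic tangent

def drelu(x):
    return 1.0 if x > 0 else 0.0     # derivative of ReLU, taken as 0 at zero

for x in [0.5, 2.0, 5.0]:
    print(f"x = {x}: tanh' = {dtanh(x):.4f}, relu' = {drelu(x):.0f}")

# x = 0.5: tanh' = 0.7864, relu' = 1
# x = 2.0: tanh' = 0.0707, relu' = 1
# x = 5.0: tanh' = 0.0002, relu' = 1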

 

But why use an activation function at all? It was seen in the first post that the basic model finds weights and biases for linear equations. Many problems are nonlinear, and among other things ReLU gives the model a way to address nonlinearity. In its simplest form, ReLU returns 0 if the weighted sum is negative or zero, and returns the weighted sum itself if it is greater than zero. In graphical form it looks like this:

Figure 3: ReLU Output

 

Or, if you are more into code, ReLU can be written as a small Python function:

def relu(x):
    if x > 0:
        return x
    else:
        return 0

 

So the derivative of the function is 1 for positive values and 0 for negative values; it is conventionally taken to be 0 when the input is exactly zero.

 

ReLU introduces nonlinearity, but it might seem that its simple form would lose information. Apparently selecting the correct number of layers and nodes allows the model to overcome that. ReLU has become the default activation function for neural networks. It does have limitations, but it is said to almost always give improved results over other methods.

 

Applying ReLU to Regression

 

In the training material ReLU is introduced as a method for classification of images, particularly where there are many layers. As an exercise I thought it would be good to also look back at regression, and at the Life Expectancy as a function of Age regression done in the first blog post. Remember that this problem is not a particularly good application for machine learning; we are using it as a learning exercise. The resulting straight-line regression looked like this:

Figure 4: Linear Regression Model of Life Expectancy Data

 

ReLU can be applied to the original linear regression model by modifying the layer description as follows:

 

layer0 = tf.keras.layers.Dense(units=1, input_shape=[1], activation=tf.nn.relu)
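For context, here is a minimal end-to-end sketch of the modified model. The training arrays below are hypothetical stand-ins for the age and life expectancy data from the first post, and the optimizer settings are assumptions rather than the exact values used there:

import numpy as np
import tensorflow as tf

# Hypothetical stand-ins for the age / life expectancy data from the first post
ages = np.array([0, 10, 20, 30, 40, 50, 60, 70, 80], dtype=float)
life_expectancy = np.array([79, 70, 61, 52, 43, 35, 26, 18, 11], dtype=float)

# Single dense layer with ReLU activation, as shown above
layer0 = tf.keras.layers.Dense(units=1, input_shape=[1], activation=tf.nn.relu)
model = tf.keras.Sequential([layer0])

# Mean squared error loss with the Adam optimizer, as in the first post
model.compile(loss='mean_squared_error', optimizer=tf.keras.optimizers.Adam(0.1))
model.fit(ages, life_expectancy, epochs=500, verbose=False)

print(model.predict(np.array([[25.0]])))   # predicted life expectancy at age 25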

 

 

The resulting plot with ReLU activation looks like this:

Figure 5: Regression Model of Life Expectancy Data using ReLU Activation

 

The nonlinear impact is apparent, the fit is better, and the predicted life expectancy is no longer as negative at higher ages. However, the resulting curve is not intuitive to me, and it is clear that I need more experience (or experimentation) with activation functions, especially with complicated models.

 

Applying ReLU to Image Categorization

 

In the second post, with ReLU as the activation function, the training dataset took 2 minutes to train and reached 89% accuracy, and the test dataset took 3 seconds to run with 87% accuracy. When ReLU was removed, training still took about 2 minutes, but training accuracy dropped to 85% and test accuracy dropped to 83%. ReLU clearly improved the model with no adverse impact on speed.
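For reference, here is a minimal sketch of that kind of comparison using the Fashion MNIST clothing dataset from the second post. The hidden layer size, number of epochs, and exact layer stack are assumptions rather than the precise setup from that post; the only difference between the two runs is whether the hidden layer uses ReLU:

import tensorflow as tf

# Fashion MNIST: 28x28 grayscale images of clothing in 10 classes
(train_x, train_y), (test_x, test_y) = tf.keras.datasets.fashion_mnist.load_data()
train_x, test_x = train_x / 255.0, test_x / 255.0   # scale pixel values to 0..1

def build_model(hidden_activation):
    # Flatten the image, one hidden layer, softmax output over the 10 classes
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation=hidden_activation),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])

for activation in ['relu', None]:   # with and without ReLU on the hidden layer
    model = build_model(activation)
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(train_x, train_y, epochs=5, verbose=0)
    loss, acc = model.evaluate(test_x, test_y, verbose=0)
    print(f"hidden activation = {activation}: test accuracy = {acc:.3f}")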

 

Conclusion

 

A simplified discussion of ReLU was followed by example applications. The regression example was contrived and not representative of a real application, but the image categorization example used a real dataset and demonstrated the improvement that can result: in this case the test accuracy with ReLU improved by 4 percentage points, from 83% to 87%. In general, ReLU is recommended for categorization using neural networks.

 

Please check out the free Udacity training if you are interested in learning from the experts.  As always, comments and corrections are appreciated.

 

Useful Links

 

A Beginning Journey in TensorFlow #1: Regression

A Beginning Journey in TensorFlow #2: Simple Image Recognition

A Beginning Journey in TensorFlow #4: Convolutional Neural Networks

A Beginning Journey in TensorFlow #5: Color Images

A Beginning Journey in TensorFlow #6: Image Augmentation and Dropout

RoadTest of Raspberry Pi 4 doing Facial Detection

Picasso Art Deluxe OpenCV Face Detection

Udacity Intro to TensorFlow for Deep Learning


Comments

  • fmilburn over 5 years ago in reply to genebren

    Thanks Gene,

    Convolutional neural networks are next. That is one of the more interesting recent development areas for image recognition.

    Frank

  • genebren over 5 years ago

    Frank,

    Thanks for another great blog on TensorFlow. This is something that I might need to look into, so thanks for the introduction/tutorial.

    Gene

  • fmilburn over 5 years ago in reply to clem57

    Thanks!

  • clem57 over 5 years ago

    Thanks fmilburn, these blogs on TensorFlow explain much that I did not know/understand. Keep up the good work!