element14 Community
Design for a Cause - Design Challenge
Audio4Vision #2 - Initial research: ResNet as a starting point

pranjalranjan299
19 Jul 2018, 10:58 AM

Welcome to our second blog post! In this post, we share our progress and findings so far.

Over the years, Convolutional Neural Networks (ConvNets, or CNNs) have been the top choice for image processing and recognition applications. Their main advantage over traditional image classification algorithms is that the filters, which had to be hand-engineered in those algorithms, are learnt by the CNN itself, saving a lot of effort and time.

Since models generally benefited from more layers, and large amounts of computational power are readily accessible today, it's no wonder that people started to build much deeper, more complex neural networks. This, however, led to a problem: the deeper a ConvNet is, the harder it is to train, often in return for only marginal improvements in accuracy. In some cases, making a ConvNet deeper can even reduce its accuracy.

Deep residual networks (ResNets) were released by Microsoft for the ImageNet and COCO 2015 competitions, which included object detection, image classification, and semantic segmentation tasks. They came 1st in all the main events of those competitions, and one of the reasons is a special property: shortcut connections, i.e. feeding the input of the nth layer directly to an (n+x)th layer. This has been shown to make the network easier to train as well as more accurate.
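To make the shortcut idea concrete, here is a minimal sketch in plain Python. It is not the actual ResNet code: real ResNet blocks use convolutions and batch normalisation, whereas this toy version assumes two small linear layers (the names `linear`, `residual_block`, `w1` and `w2` are ours, for illustration only). It only shows the core step, adding the block's input back onto its output.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def linear(weights, v):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two linear layers with a ReLU between."""
    f = linear(w2, relu(linear(w1, x)))             # residual branch F(x)
    return relu([fi + xi for fi, xi in zip(f, x)])  # shortcut adds the input back


# With all-zero weights F(x) is zero, so the block simply passes x through.
zeros = [[0.0, 0.0], [0.0, 0.0]]
print(residual_block([1.0, 2.0], zeros, zeros))  # [1.0, 2.0]
```

Because the input is added back, a block whose weights contribute nothing still behaves like the identity, which is one intuition for why stacking many such blocks is easier to train than stacking plain layers.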

[Image]

So, naturally, for our requirement, we turned to one of the most famous ResNets available: ResNet-50. We used the model to predict a few outdoor objects:

[Images: ResNet-50 predictions on outdoor objects]

As well as some indoor objects:

[Images: ResNet-50 predictions on indoor objects]

The model classified these objects as expected.
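For readers wondering how a prediction is read off such a classifier: the network outputs one raw score per class, and the usual last step is to turn those scores into probabilities and pick the highest ones. The sketch below shows that step in plain Python. The labels and scores are made up for illustration and are not our results, and the helper names `softmax` and `top_k` are our own.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(scores, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(scores)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical labels and scores, not results from this post.
labels = ["park bench", "street sign", "bicycle", "table lamp"]
scores = [2.0, 0.5, 4.0, -1.0]
for label, prob in top_k(scores, labels, k=2):
    print(f"{label}: {prob:.2f}")
```

An ImageNet-trained ResNet-50 does the same thing over 1000 classes instead of four.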

Once we're done with the simple image classification problem, the next step will be to generate full sentences from those objects and their relations. The ultimate aim is to take an image of the user's surroundings and give it a caption that describes the image sufficiently. This residual network could serve as the first stage of our captioning system: it takes an image as input and passes its outputs to a long short-term memory (LSTM) network, which will establish the relationships between them.
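As a rough illustration of what the LSTM stage does with whatever it receives, here is the standard LSTM cell update for a single time step, written in plain Python. This is only a sketch under assumptions: exactly how the image features will be fed in is not decided yet, and the sizes, the zero weights and the `features` values below are placeholders rather than anything from our system.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h, c, params):
    """One LSTM time step on plain lists.

    params maps each gate name ("i", "f", "o", "g") to (weights, bias),
    where weights has one row per hidden unit and len(h) + len(x) columns.
    """
    hx = h + x  # concatenate previous hidden state and current input

    def gate(name, activation):
        weights, bias = params[name]
        return [activation(sum(w * v for w, v in zip(row, hx)) + b)
                for row, b in zip(weights, bias)]

    i = gate("i", sigmoid)    # input gate
    f = gate("f", sigmoid)    # forget gate
    o = gate("o", sigmoid)    # output gate
    g = gate("g", math.tanh)  # candidate cell values
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


# Hypothetical sizes: 2 input features, 1 hidden unit; zero weights for illustration.
zero_gate = ([[0.0, 0.0, 0.0]], [0.0])
params = {name: zero_gate for name in ("i", "f", "o", "g")}
features = [0.3, 0.7]  # stand-in for features coming from the image classifier
h, c = lstm_step(features, [0.0], [1.0], params)
print(h, c)
```

The cell state `c` is what lets the network carry information across steps, which is why LSTMs are a common choice for producing a sentence one word at a time.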

 

Thank you for reading this blog. The next blog will be up soon!


Top Comments

  • aspork42
    aspork42 over 7 years ago

    Cool - Looking forward to learning more about Neural networks!

  • pranjalranjan299
    pranjalranjan299 over 7 years ago in reply to genebren

    Thanks, Gene! Will be providing further updates very soon.

  • genebren
    genebren over 7 years ago

    Nice update to your design challenge project.  Looks like some interesting image processing.  Good luck getting the results that you desire as you do more testing and adding further processing.

    Gene
