element14 Community
  • Author: ralphjy
  • Date Created: 17 Aug 2019 4:33 AM
  • RoadTest
  • pynqworkshpch
  • pynq-z2
  • imagenet
  • machine learning

PYNQ-Z2 Dev Kit - ImageNet Classification

ralphjy
17 Aug 2019

Before I move on to object detection, I thought I would try one more example of object classification, this time using a more complex neural network based on the Multi-layer offload architecture.  The network is a variant of DoReFa-Net and is trained on the large ImageNet dataset (http://www.image-net.org/).  DoReFa-Net (https://arxiv.org/pdf/1606.06160) is a low-bitwidth convolutional neural network trained with low-bitwidth gradients, which makes it well suited to implementation on hardware like FPGAs.

 

ImageNet Classifier:

The network topology is shown below.  The pink layers are executed in the Programmable Logic at reduced precision (1 bit for weights, 2 bits for activations), while the other layers are executed in Python.

image
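
To make the "1 bit for weights, 2 bits for activations" idea concrete, here is a minimal NumPy sketch of DoReFa-style forward-pass quantization. It follows the formulas in the DoReFa-Net paper as I understand them and is not the code that runs in the qnn package or in the Programmable Logic; the function names are hypothetical.

```python
import numpy as np

def quantize_k(x, k):
    """Quantize values in [0, 1] to k bits (2**k evenly spaced levels)."""
    n = 2 ** k - 1
    return np.round(x * n) / n

def quantize_activation(x, k=2):
    """Clip activations to [0, 1], then quantize them to k bits."""
    return quantize_k(np.clip(x, 0.0, 1.0), k)

def binarize_weights(w):
    """1-bit weights: keep only the sign, scaled by the mean magnitude."""
    return np.sign(w) * np.mean(np.abs(w))

acts = np.array([-0.2, 0.1, 0.4, 0.9, 1.7])
print(quantize_activation(acts))   # values drawn from {0, 1/3, 2/3, 1}

weights = np.array([0.5, -0.25, 0.75, -1.0])
print(binarize_weights(weights))   # every entry is +0.625 or -0.625
```

With only four activation levels and two weight values, multiply-accumulate operations reduce to very cheap logic, which is why these layers are the ones moved into the FPGA fabric.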

 

Initialize the network

  1. Import libraries
  2. Instantiate classifier
  3. Load labels and synsets of the 1000 ImageNet classes into dictionaries

 

Code for initialization:

import os, pickle, random
from datetime import datetime
from matplotlib import pyplot as plt
from PIL import Image
%matplotlib inline

import numpy as np
import cv2

import qnn
from qnn import Dorefanet
from qnn import utils

# Instantiate a classifier
classifier = Dorefanet()
classifier.init_accelerator()
net = classifier.load_network(json_layer="/usr/local/lib/python3.6/dist-packages/qnn/params/dorefanet-layers.json")

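# The .npy files hold pickled dictionaries; newer NumPy versions require allow_pickle=True to load them
conv0_weights = np.load('/usr/local/lib/python3.6/dist-packages/qnn/params/dorefanet-conv0.npy', encoding='latin1', allow_pickle=True).item()
fc_weights = np.load('/usr/local/lib/python3.6/dist-packages/qnn/params/dorefanet-fc-normalized.npy', encoding='latin1', allow_pickle=True).item()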

# Get ImageNet Classes information
with open("/home/xilinx/jupyter_notebooks/qnn/imagenet-classes.pkl", 'rb') as f:
    classes = pickle.load(f)
    names = dict((k, classes[k][1].split(',')[0]) for k in classes.keys())
    synsets = dict((classes[k][0], classes[k][1].split(',')[0]) for k in classes.keys())
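
The two dictionaries built above are easier to picture with a toy example. The sketch below assumes, based only on how the code indexes it, that each entry of the pickle maps a class index to a (synset ID, comma-separated names) pair. The three entries are illustrative stand-ins, not the contents of imagenet-classes.pkl.

```python
# Toy stand-in for the pickled class table (the real one has 1000 entries).
# Assumed layout: class index -> (synset ID, "primary name, alias, ...")
classes = {
    0: ("n01440764", "tench, Tinca tinca"),
    1: ("n01443537", "goldfish, Carassius auratus"),
    2: ("n01484850", "great white shark, white shark, man-eater"),
}

# Class index -> primary name (the text before the first comma)
names = {k: v[1].split(',')[0] for k, v in classes.items()}
# Synset ID -> primary name
synsets = {v[0]: v[1].split(',')[0] for v in classes.values()}

print(names[2])               # great white shark
print(synsets["n01440764"])   # tench
```

Here names translates a network output index into a readable label, while synsets does the same for a synset ID.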

 

Classify image

  1. Open image to be classified
  2. Execute the first convolutional layer in Python
  3. Offload the quantized layers to the HW accelerator
  4. Normalize the results and run the fully connected layers in Python

 

Code for classification:

# Open image
img_folder = "/home/xilinx/jupyter_notebooks/qnn/images/"
img_file = os.path.join(img_folder, max(os.listdir(img_folder), key=lambda f: os.path.getctime(os.path.join(img_folder, f))))
img, img_class = classifier.load_image(img_file)
im = Image.open(img_file)
im

# Execute first layer
conv0_W = conv0_weights['conv0/W']
conv0_T = conv0_weights['conv0/T']

start = datetime.now()
# 1st convolutional layer execution, having as input the image and the trained parameters (weights)
conv0 = utils.conv_layer(img, conv0_W, stride=4)
# The result is then quantized to a 2-bit representation for the subsequent HW offload
conv0 = utils.threshold(conv0, conv0_T)

end = datetime.now()
micros = int((end - start).total_seconds() * 1000000)
print("First layer SW implementation took {} microseconds".format(micros))
# Start a fresh timestamp log and close the file once the value is written
with open('timestamp.txt', 'w') as f:
    print(micros, file=f)

# Compute offloaded convolutional layers
out_dim = net['merge4']['output_dim']
out_ch = net['merge4']['output_channels']

# Allocate the accelerator output buffer and prepare the input buffer
conv_output = classifier.get_accel_buffer(out_ch, out_dim)
conv_input = classifier.prepare_buffer(conv0)

start = datetime.now()
classifier.inference(conv_input, conv_output)
end = datetime.now()

micros = int((end - start).total_seconds() * 1000000)
print("HW implementation took {} microseconds".format(micros))
# Append to the timestamp log and close the file afterwards
with open('timestamp.txt', 'a') as f:
    print(micros, file=f)

conv_output = classifier.postprocess_buffer(conv_output)

# Normalize results
fc_input = conv_output / np.max(conv_output)

start = datetime.now()

# FC Layer 0
fc0_W = fc_weights['fc0/Wn']
fc0_b = fc_weights['fc0/bn']

fc0_out = utils.fully_connected(fc_input, fc0_W, fc0_b)
fc0_out = utils.qrelu(fc0_out)
fc0_out = utils.quantize(fc0_out, 2)

# FC Layer 1
fc1_W = fc_weights['fc1/Wn']
fc1_b = fc_weights['fc1/bn']

fc1_out = utils.fully_connected(fc0_out, fc1_W, fc1_b)
fc1_out = utils.qrelu(fc1_out)

# FC Layer 2
fct_W = fc_weights['fct/W']
fct_b = np.zeros((fct_W.shape[1], ))

fct_out = utils.fully_connected(fc1_out, fct_W, fct_b)
end = datetime.now()
micros = int((end - start).total_seconds() * 1000000)
print("Fully-connected layers took {} microseconds".format(micros))
# Append to the timestamp log and close the file afterwards
with open('timestamp.txt', 'a') as f:
    print(micros, file=f)
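
The code above stops at the raw output of the last fully connected layer. As a rough sketch of how such scores could be turned into a ranked result, the snippet below applies a softmax and picks the top entries. It assumes the output holds one raw score per class and that the label dictionary is keyed by integer class index. The helper top_k and the toy data are hypothetical and are not taken from the notebook.

```python
import numpy as np

def top_k(scores, names, k=5):
    """Return the k most likely (label, probability) pairs."""
    scores = np.asarray(scores, dtype=np.float64).ravel()
    # Numerically stable softmax: subtract the maximum before exponentiating
    exp = np.exp(scores - scores.max())
    probs = exp / exp.sum()
    best = np.argsort(probs)[::-1][:k]   # indices sorted by descending probability
    return [(names[int(i)], float(probs[i])) for i in best]

# Toy scores for four classes, standing in for the 1000-class network output
toy_names = {0: "tench", 1: "goldfish", 2: "great white shark", 3: "tiger shark"}
toy_scores = np.array([0.5, 0.1, 4.0, 2.0])

for label, p in top_k(toy_scores, toy_names, k=3):
    print("{:<20s} {:.3f}".format(label, p))
```

In the notebook this would presumably be called as top_k(fct_out, names), provided the keys of names line up with the output positions.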

 

I tested the network with five images: the shark image that was included with the notebook, plus the two dog and two puppy images downloaded from the Internet that I had used earlier to test the CIFAR-10 binary network.  Here are the results.

 

Shark

image

Classification Result:

image

Execution time:

  • First layer SW implementation took 654851 microseconds
  • HW implementation took 79813 microseconds
  • Fully-connected layers took 569449 microseconds

image

     Total execution time: 1304113 microseconds
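
As a quick check on these numbers, the short script below sums the three measured stages and prints each stage's share of the total. Only the three timings come from the run above; the stage labels are mine.

```python
# Measured stage times for the shark image, in microseconds
stages_us = {
    "first layer (SW)": 654851,
    "offloaded layers (HW)": 79813,
    "fully connected (SW)": 569449,
}

total_us = sum(stages_us.values())
print("total: {} us ({:.2f} s)".format(total_us, total_us / 1e6))

# Share of the total run time spent in each stage
for stage, us in stages_us.items():
    print("{:<24s} {:5.1f} %".format(stage, 100.0 * us / total_us))
```

The sum matches the reported total, and the HW stage accounts for only about 6% of it, so the two Python stages now dominate the run time.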

 

Full SW implementation execution time:

I also tested the network with the middle, HW-offloaded layers implemented in SW to measure the impact of the HW implementation.

 

     Total execution time: 397517703 microseconds

 

With the middle layers running in HW, the network is about 300x faster!
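
The 300x figure can be reproduced from the two totals, taking both to be in microseconds. The second part of the sketch goes one step further and estimates the speedup of the offloaded layers alone. That estimate rests on an assumption I did not measure: that the first layer and the fully connected layers take the same time in both runs.

```python
# Totals reported above (taken to be in microseconds)
hybrid_total_us = 1304113       # middle layers in HW
full_sw_total_us = 397517703    # everything in SW

speedup = full_sw_total_us / hybrid_total_us
print("end-to-end speedup: {:.0f}x".format(speedup))

# Assumption: the Python stages cost the same in both runs, so the
# remainder of the full-SW total is the SW time of the middle layers.
first_layer_us, fc_us, hw_us = 654851, 569449, 79813
middle_sw_us = full_sw_total_us - first_layer_us - fc_us
middle_speedup = middle_sw_us / hw_us
print("offloaded layers alone: roughly {:.0f}x".format(middle_speedup))
```

Under that assumption the offloaded layers themselves run several thousand times faster in HW. The end-to-end gain is smaller because the Python stages are unchanged.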

 

The execution time profile is approximately the same for all the images, so I'll only provide the classification results for the rest of the images.

 

Dog1

image

Classification Result:

image

Dog2

image

Classification Result:

image

Puppy1

image

Classification Result:

image

Puppy2

image

Classification Result:

image

 

The classifier struggled a bit with the puppies, but I'll admit that I'm not sure what breeds they are either (they could be mixed).  I was impressed by how well it did with the other images.

 

Time to move on to what I really want to do: object detection and identification within an image.

  • genebren over 5 years ago

    Ralph,

     

    Interesting results (accuracy and speed).  I am no dog expert, and I would have difficulty with the puppies too.

     

    Gene
