Avnet Ultra96 Dev Board - Review

RoadTest: Avnet Ultra96 Dev Board

Author: nixiefairy

Creation date:

Evaluation Type: Semiconductors

Did you receive all parts the manufacturer stated would be included in the package?: True

What other parts do you consider comparable to this product?:

What were the biggest problems encountered?: NO SUPPORT FOR HDMI. ONLY SUPPORT FOR DISPLAY PORT :<. Xilinx, I know FPGAs are expensive, but have a little mercy on our student wallets.

Detailed Review:

Hey everyone,



After a really long delay (about a month), I am finally putting up this review. I had to write a 70-page report on my internship! I cry at times thinking about the pain I am putting myself through to get my college degree. It's just life, I guess.

But now things have settled, and I am working on this in the little time I get away from my master's work. I still don't think I have covered enough, and I will be updating the DNNDK part soon. Hopefully this review covers everything!!



Firstly, a big thanks to the e14 team for giving me the extra time and the opportunity to tackle this review.

I will briefly talk about the board's looks and what I actually want to do with this board.


Board's looks and my "expectations"


First view of the Ultra96 board

A picture says a thousand words...

Comes with a charger (UK plug) and a JTAG debugger.



The biggest problem I have found with the Zynq boards is that it takes a great amount of time to develop code in the Xilinx tools and a lot of effort to get something running on them. You have to write the complete stack: everything from the bitstream that configures the FPGA to the embedded software stack that runs on the FPGA.

I see that a lot of the other reviews cover stacks which have already been built, and some of them just cover the FPGA side of the Zynq.

I plan to cover everything available on the internet for this board. If I have failed to cover something, do mention it in the comments section.

  1. Linaro [Stock OS]
  2. OpenSUSE [OS]
  3. Running a few of my own DL architectures and comparing them with my PC's performance [Software]
  4. Debugging a few of the drivers written for the Ultra96 [Software]
  5. Using my own custom driver on the 96board [Software - Hardware]
  6. Loading the driver into PYNQ [OS]
  7. Running DNNDK and repeating item 3 with it


Even though I said I would try to cover everything - I have no plans to cover all the ready-made examples that come with this board. To me that defeats the purpose of a roadtest review and it's not something I look forward to reviewing. Also note I am trying to stay away from Vivado for this review - I don't have the time or the patience to go through an OS change - followed by a Vivado installation.



One of the bad points about this board is the fan noise. It is extremely loud, so for a very long time I could not sneak this board into my office without drawing some curious looks at my computer screen. I also do not have a monitor with a DisplayPort (let alone an HDMI one), so I could not test a few of the things in the OS distributions that physically require DP.


Another annoying thing about this board is the forced shutdown. I have to hold SW3 (the power-on button) for 10 seconds before the board powers off. On the bright side, we can be sure that nobody shuts down the board by accident.



Powering on the board with the stock OS

One of the first things I did after receiving the board was the easy stuff: plug in the SD card and get a feel for the OS the board actually comes with, called Linaro. Linaro is an OS released for all ARM-based 96Boards (the Ultra96 belongs to that family).

So after attaching the peripherals , I was pretty surprised by the layout :-

Realizing that the resolution was incorrect (and after changing to 800x600) :-



VIM :><:

Running VIM


Some things were not running, though, and I could not load the modules given in the module list. For instance, the temperature module did not seem to have been loaded:



A lot of basic GUI functions are not working: left and right click open the same menu, and copy-paste doesn't work. I believe this started off as a simple desktop utility (xfce) and a lot of bloat was added to it, leading to a lot of incompatible software.

Still better to have a stock OS than having none at all !


OpenSUSE - Tumbleweed

The next OS on the list was OpenSUSE. I started off by following the instructions given here: https://en.opensuse.org/HCL:Ultra96 . If you read that page carefully, OpenSUSE does not support the mDP or use the microUSB serial connection. Instead it relies on a UART connection, which can be made using the JTAG/UART Pod that came with the Ultra96 board.


Above is the GRUB menu for Tumbleweed. When I saw it booting up, I was kind of impressed, as the image is NOT built using Petalinux. I can tell because an image made by Petalinux shows the dmesg console until the shell prompt loads up.

These are the boot messages shown when we exit from GRUB and enter Tumbleweed itself.

And after we boot up :


It worked beautifully while I was using it to set up the Wi-Fi. Then disaster struck.

During one of my "hard" restarts (I usually keep the board running for a few hours and then restart it), the Pod stopped working. The LED still blinks when the connection is made, but the Pod heats up quickly when attached to the computer's USB port. I assume that some part (a cap or a resistor) has short-circuited, but I don't really know which one, since I don't have that temperature-measuring thing. As a result I had to order a new USB-UART bridge interface, which is going to take 45 days!

I don't have the patience to wait 4.5 hours, let alone 45 days. So I planned something "smart".


Using another FPGA as a USB-UART bridge

And here I was trying to keep my hands off Vivado... heh. Call me crazy, but this was probably the quickest plan I could come up with. I had two options: either the onboard FTDI chip (a serial USB-UART bridge, have a look at it online!) on an Arduino, or the Basys 3. The main reason I went ahead with the Basys 3 was that I could control the voltage. The Arduino gives out +5 V DC, which is something we do not want: the 96board connectors need a low 1.8 V input. Also, if my beloved Basys 3 board dies on me, it would be an honorable death.

I won't really discuss much about my design here. I just used a couple of IBUFs and OBUFs, with the IOSTANDARD set to LVCMOS18 of course.


But luckily a colleague of mine at the office had a 96board shield. Talk about luck, eh?


Setting up the internet

It took me a week to connect to a WPA2-PSK network. The problem here is that OpenSUSE does NOT ship a DHCP client package in their distro. As a result, this was a completely new experience for me. I have added the commands below for everyone's reference (and mine):

ifconfig wlan0 up
hostnamectl set-hostname kogs
wpa_supplicant -B -i wlan0 -c /etc/wpa_supplicant/wpa_supplicant.conf
wpa_cli -i wlan0
ip addr add broadcast + dev wlan0
echo 'nameserver 
nameserver' >> /etc/resolv.conf
route add default gw    #Check gateway in host system
route -n
networkctl status 
ip addr flush dev wlan0

I really wish a little instruction manual came with these things. But bah, it is Linux.
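
Without a DHCP client, the address, broadcast and gateway all have to be worked out by hand. Here is a minimal Python sketch of that arithmetic, using only the standard `ipaddress` module. Keep in mind that 192.0.2.10/24 and 192.0.2.1 are placeholder values from the documentation address range (NOT my actual network), and `static_plan` is just a hypothetical helper name I made up for this example.

```python
import ipaddress

def static_plan(cidr, gateway):
    """Work out the values a DHCP client would normally hand over."""
    iface = ipaddress.ip_interface(cidr)   # address plus prefix, e.g. "192.0.2.10/24"
    net = iface.network
    gw = ipaddress.ip_address(gateway)
    if gw not in net:
        # A default gateway outside the local subnet is not directly reachable.
        raise ValueError(f"gateway {gw} is not inside {net}")
    return {
        "address": str(iface.ip),
        "prefix": net.prefixlen,
        # "broadcast +" in "ip addr add" derives this by setting all host bits.
        "broadcast": str(net.broadcast_address),
        "gateway": str(gw),
    }

# Placeholder values only, replace with your own network's numbers.
plan = static_plan("192.0.2.10/24", "192.0.2.1")
print(plan)
```

The gateway check is there because a wrong gateway is the kind of mistake that is easy to make when typing the route by hand and hard to spot afterwards.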


Getting a tensorflow stack working on OpenSUSE

While installing all the required packages to get a few examples running (I am not using anaconda because I don't think this poor guy can handle a virtualenv), I could not install the latest tensorflow package through pip. The latest I could find was tensorflow 1.13.1, but 2.0.0-alpha had already been released at the time of writing this review.

And well, I could not install tensorflow-cpu or tensorflow-gpu in embedded OpenSUSE for a valid reason: it is not supported. An older version, 1.12, gave no such incompatibility issues. But because I could not find older Python packages for its dependencies, the installer kept erroring out.

The only publicly released build for any board is for the Raspberry Pi, and that too is a LITE version.
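
To make the pip failure above a bit less mysterious: pip only installs a prebuilt wheel whose platform tag matches the machine it runs on, and the 64-bit ARM cores on this board report themselves as aarch64. Below is a rough Python sketch of that check. It is a big simplification of what pip really does, the helper names are hypothetical, and the wheel file name is only an illustration of the naming scheme.

```python
import platform

def wheel_platform_tag(wheel_filename):
    """Return the platform tag, i.e. the last dash-separated field of a wheel name."""
    stem = wheel_filename
    if stem.endswith(".whl"):
        stem = stem[: -len(".whl")]
    return stem.split("-")[-1]

def wheel_fits_machine(wheel_filename, machine):
    """Very rough stand-in for pip's tag matching (assumption: suffix match only)."""
    tag = wheel_platform_tag(wheel_filename)
    return tag == "any" or tag.endswith("_" + machine)

# Illustrative file name following the standard wheel naming scheme.
example = "tensorflow-1.13.1-cp36-cp36m-manylinux1_x86_64.whl"

print("this machine reports:", platform.machine())
print("fits x86_64 :", wheel_fits_machine(example, "x86_64"))
print("fits aarch64:", wheel_fits_machine(example, "aarch64"))
```

If no wheel fits, pip falls back to building from source, which is where the missing dependencies start to hurt.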

So now I think we have to use a legit Xilinx-released tool to run any networks on the board --> DNNDK.



Drivers running in Petalinux


One of the best ways to test this, I'd say, is by running an extremely new IP on the board. This driver is a much-needed utility for the Ultra96 board.

Since there are a lot of steps involved in building the image, I'll just show the ones with substantial output. If you want a more in-depth understanding, have a look at the little experiment I tried to perform here :


Step 1 : Original Block Diagram -- supplied by Avnet (ultra96_v1_petalinux.tcl)

Step 2 : Adding the custom-made Fan IP and System Monitor IP



Step 3 : Creating package using the ELF file and the bitstream - generates the necessary BOOT files


I did the above steps and then got stuck at one that is going to take me some time: writing a device driver for the Ultra96 board fan.

For now, the driver is still full of bugs, crashes the kernel (in qemu) and is in an extremely unstable / unusable state. It will take me some time. This is my first driver, after all.





So this is the software from Deephi, a company Xilinx recently acquired (in 2018). It is said to create a DPU on top of the existing MPSoC, on which a user can run their DL programs. I have installed two publicly available versions of this software. Keep in mind that Deephi's website had stopped working at the time of writing this review. The link for the downloads is here: https://www.xilinx.com/products/design-tools/ai-inference/ai-developer-hub.html#edge


So I will now take the time to explain a little about how Deephi's Deep-learning Processor Unit works. The Deep Neural Network Development Kit (DNNDK) is a full-stack deep learning SDK for the Deep-learning Processor Unit (DPU). It provides a unified solution for deep neural network inference applications by providing pruning, quantization, compilation, optimization, and runtime support.


The DNNDK is composed of the Deep Compression Tool (DECENT), Deep Neural Network Compiler (DNNC), Deep Neural Network Assembler (DNNAS), Neural Network Runtime (N2Cube), DPU Simulator, and Profiler.



The process of inference is computation intensive and requires a high memory bandwidth to satisfy the low-latency and high-throughput requirements of edge applications. The Deep Compression Tool, DECENT™, employs coarse-grained pruning, trained quantization and weight sharing to address these issues while achieving high performance and high energy efficiency with very small accuracy degradation.
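
To give a feel for what pruning and quantization do to a set of weights, here is a tiny pure-Python sketch. It is NOT how DECENT works internally (that tool is proprietary). I have swapped in plain per-weight magnitude pruning and simple after-the-fact 8-bit quantization, because they are much shorter to show than coarse-grained pruning and trained quantization. All function names and the weight values are made up for the example.

```python
def prune_by_magnitude(weights, keep_ratio):
    """Zero out the smallest-magnitude weights, keeping roughly keep_ratio of them."""
    keep = max(1, round(len(weights) * keep_ratio))
    threshold = sorted((abs(w) for w in weights), reverse=True)[keep - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]

def quantize_int8(weights):
    """Symmetric linear quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights) or 1.0   # avoid dividing by zero
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the 8-bit integers back to approximate float values."""
    return [v * scale for v in q]

weights = [0.50, -0.02, 0.25, 0.01, -1.00, 0.03]    # made-up example weights
pruned = prune_by_magnitude(weights, keep_ratio=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)
worst_error = max(abs(a - b) for a, b in zip(pruned, restored))

print("pruned     :", pruned)
print("int8 values:", q)
print("worst error:", worst_error)
```

The point is that half the weights become zeros and the rest fit into one byte each, while the worst rounding error stays below one quantization step. That is roughly the trade between memory bandwidth and accuracy that the paragraph above describes.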


DNNC (Deep Neural Network Compiler)

It is the dedicated proprietary compiler designed for the DPU. It maps the neural network algorithm to DPU instructions to achieve maximum utilization of DPU resources by balancing computing workload and memory access.



The Cube of Neural Networks (N2Cube) is the DPU runtime engine. It acts as the loader for the DNNDK applications and handles resource allocation and DPU scheduling. It provides a lightweight set of programming interfaces through a library which abstracts away the details of the underlying hardware implementation. The DPU driver runs in the kernel space of the Linux OS and includes DPU functions such as task scheduling and efficient memory management to avoid memory copy overhead between the DPU and the CPU.



The Deep Neural Network Assembler (DNNAS) is responsible for assembling DPU instructions into ELF binary code. It is a part of the DNNC code-generating backend, and cannot be invoked alone.



The DPU profiler is composed of two components: DPU tracer and DSight. DPU tracer is implemented in the DNNDK runtime N2Cube, and it is responsible for gathering the raw profiling data while running neural networks on the DPU. From that raw profiling data, DSight can generate visualized charts for performance analysis.