This is my second blog for the Eye On Intelligence Challenge sponsored by AMD. In my first blog, I unboxed the challenger kit and prepared the Arty Z7 board for developing applications with Python. In this blog, I will share my experience with the ZYNQ SoC and the PYNQ platform. I am not an advanced user of ZYNQ and PYNQ: my only previous experience was with the KR260 Robotics Starter Kit, which is where I first used PYNQ. This is the second project in which I am using PYNQ, so I am still fairly new to the world of ZYNQ and PYNQ.
The CPU, FPGA, ZYNQ, and PYNQ
The CPU (Central Processing Unit) is the main unit of every computing system and is responsible for every arithmetic calculation and logical decision. However, the CPU alone cannot do any processing without the help of RAM. RAM temporarily stores the data on which an operation is performed and the result of that operation. Suppose we want to add two single-digit numbers, 5 and 7. A CPU will take several instruction cycles (fetch cycle, decode cycle, execute cycle, and store cycle) to perform this addition, and every cycle consumes some time.
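To make the cycle idea concrete, here is a toy model I put together. It is only a sketch: the three instructions (LOAD, ADD, STORE), the two registers, and the assumption that every instruction passes through exactly four phases are all invented for illustration and do not describe how a real ARM core works.

```python
# Toy model only: the instruction set and the four-phase count are assumptions.
PHASES = ("fetch", "decode", "execute", "store")


def run(program, memory):
    """Run a tiny program against `memory` and return the number of phases used."""
    registers = {"A": 0, "B": 0}
    phases = 0
    for op, dst, src in program:
        if op == "LOAD":        # copy a value from RAM into a register
            registers[dst] = memory[src]
        elif op == "ADD":       # add one register to another
            registers[dst] = registers[dst] + registers[src]
        elif op == "STORE":     # copy a register back into RAM
            memory[dst] = registers[src]
        else:
            raise ValueError(f"unknown instruction: {op}")
        phases += len(PHASES)   # every instruction goes through all phases
    return phases


memory = [5, 7, 0]              # the two operands and a slot for the result
program = [
    ("LOAD", "A", 0),
    ("LOAD", "B", 1),
    ("ADD", "A", "B"),
    ("STORE", 2, "A"),
]
phases = run(program, memory)
print(memory[2], phases)        # 12 16
```

Even in this simplified model, adding two numbers costs four instructions and sixteen phases, plus the RAM accesses around them.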
On the other hand, we can design a 4-bit adder circuit, as shown in the following images (image source: https://www.instructables.com/4-Bit-Binary-Adder-1), using a few basic gates (AND, OR, XOR) that can add two single-digit numbers almost instantly. No memory or instruction cycle is involved in this operation.
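The same adder can be sketched in Python using only the three gate operations. This is a software simulation, so the loop runs one bit at a time; in the real circuit all four full adders exist side by side and the result settles as soon as the carry ripples through. The function names are my own.

```python
def full_adder(a, b, carry_in):
    """Add three bits using only XOR, AND, and OR gates."""
    partial = a ^ b
    total = partial ^ carry_in
    carry_out = (a & b) | (partial & carry_in)
    return total, carry_out


def add_4bit(x, y):
    """Ripple-carry addition of two 4-bit numbers (0 to 15).

    Returns (sum, carry_out), where sum holds the low 4 bits of x + y.
    """
    carry = 0
    result = 0
    for i in range(4):
        bit_x = (x >> i) & 1
        bit_y = (y >> i) & 1
        s, carry = full_adder(bit_x, bit_y, carry)
        result |= s << i
    return result, carry


print(add_4bit(5, 7))  # (12, 0)
```

When the sum does not fit in 4 bits, the extra bit shows up as the carry output, for example add_4bit(9, 9) gives (2, 1), which is 16 + 2 = 18.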
So, if we can design a specific circuit for a simple or complex operation, it can give us faster results than the CPU. This is where the FPGA comes into the picture. An FPGA is an ocean of thousands of logic gates. We can choose as many as we need for designing a circuit. But wait a minute, these are just the gates. To design a circuit, we need to connect them in a specific manner using wires. A hardware description language (HDL) is our set of wires for connecting the logic gates. Designing for an FPGA with HDL is as complex as making a circuit with thousands of connecting wires! No worries! We have easier ways.
An FPGA performs well where lots of parallel processing is required, or where we need to perform the same operation on a stream of data, as in video processing, FFT, etc. An FPGA is also good for high-I/O applications. But nothing is perfect. An FPGA performs poorly where we need to make lots of decisions or perform diverse operations. Suppose you have data coming from a sensor, and based on the sensor data you need to make different decisions and perform different operations. A CPU will perform better than an FPGA in such a scenario.
FPGAs are extremely useful when your problem:
- requires low latency
- is highly parallelizable
- interfaces with other hardware over a variety of communication protocols
They are also useful when you want to model hardware before fabricating it.
FPGAs aren't as useful if:
- the limiting factor of the problem is inherently sequential
- the problem involves lots of branching (decision making)
For example, making a Chess game is difficult on an FPGA because it involves a lot of decision-making logic.
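Here is a small sketch of the two kinds of work, written in plain Python. Both functions are my own illustrations and are not taken from any FPGA toolchain. In the first, every output depends only on its own input, so in hardware all elements could be processed at once. In the second, each step needs the previous result and branches on it, so the steps cannot be spread out side by side.

```python
def threshold(pixels, level):
    """Parallel-friendly: each output depends only on its own input pixel."""
    return [1 if p >= level else 0 for p in pixels]


def collatz_steps(n):
    """Sequential and branchy: each step needs the previous value and a decision."""
    steps = 0
    while n != 1:
        if n % 2 == 0:
            n = n // 2
        else:
            n = 3 * n + 1
        steps += 1
    return steps


print(threshold([10, 200, 128, 90], 128))  # [0, 1, 1, 0]
print(collatz_steps(6))                    # 8
```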
Modern applications are complex and include both decision-intensive and parallel, computation-intensive work. Neither an FPGA nor a CPU alone is a good player in such a game. But if we use both, the FPGA can help the CPU by accelerating some of the work, and the overall performance of the system improves. So, for better performance, we need a hybrid system with an FPGA and a CPU. We can place a separate FPGA chip and CPU chip on a single PCB. However, in a system where two separate chips exchange large amounts of data, latency, noise, and power consumption become barriers. But every problem has a solution, and Xilinx ZYNQ is a good solution to this one. Zynq is an SoC that includes an ARM CPU and an FPGA in a single chip. Let's explain Zynq a bit more.
What is Zynq?
Zynq is a special type of SoC developed by Xilinx also called All Programmable System-on-Chip or APSoC. While a motherboard-based PC architecture separates components based on function and connects them all via a circuit board, an SoC combines most, if not all, components of a computer in a single chip. Both have their advantages and disadvantages and can be used to accomplish the same goal.
Zynq is an APSoC – meaning that in addition to integrating most, if not all, components of a computer into a single chip, developers can also take advantage of the FPGA, or field-programmable gate array, technology present within it. Typically, FPGAs are standalone components that are used to prototype custom system chips, or design the hardware that will later be developed into application-specific integrated circuits (ASICs). Therefore the “system” in APSoC as it relates to Zynq, refers to the system of dual dedicated processors (Dual-core ARM Cortex-A9 Processors) and the FPGA technology. With access to both processor and FPGA functions, developers can leverage the best of both worlds.
Before the invention of the Zynq, processors were coupled with a separate Field Programmable Gate Array (FPGA), which made communication between the Programmable Logic (PL) and the Processing System (PS) complicated. The Zynq architecture, as the latest generation of Xilinx's all-programmable System-on-Chip (SoC) families, combines a dual-core ARM Cortex-A9 with a traditional FPGA. The interface between the different elements within the Zynq architecture is based on the Advanced eXtensible Interface (AXI) standard, which provides high-bandwidth, low-latency connections.
The following figure shows the basic architecture of a Zynq SoC.
Why should we use Zynq?
Traditional solutions typically utilize FPGA, ASIC, ASSP (application-specific standard product), or any combination of these devices to achieve the desired function. Though these technologies are capable of fulfilling most of our desires, there are numerous drawbacks associated with them.
For example, ASICs can offer suitable performance and power at a decent price but are less than ideal because of the lack of flexibility provided to the designer once the system is completed – which can lead to a longer time to market. Additionally, an ASIC lacks any sort of scalability – meaning for each new project, new ASICs must be developed.
On the other hand, ASSPs are less risky than ASICs and can have a faster time to market. But because it is a standard product, it lacks any flexibility in design. For that reason, developers often opt to use a 2-chip solution consisting of an FPGA coupled with an ASIC or ASSP to achieve a balanced trade-off. Using two chips, however, creates a whole new list of challenges for developers.
Zynq is unique in that it provides a solution for each of the challenges described above and effectively does so in a single chip – the first of its kind. It allows programmers of FPGA hardware access to the same resources software programmers usually have (i.e. programming languages like Python, operating systems, drivers, etc.).
Software programmers can use Zynq to modify and extend the functionality of their programs onto their hardware without the need to redesign the architecture of their programs. That same hardware design can be used repeatedly due to Zynq’s FPGA capability to extend the peripheral functions of the dual ARM A9 processors. Developers simply need to modify what must be different in each iteration of their design.
Today, electronic components that can offer higher performance and a higher level of integration into a single end-user device while retaining affordability and/or reducing power requirements are frequently expected by seemingly every industry that uses electronic systems. More customers desire flexible and scalable systems that can implement their needs in a timely fashion to beat their competitors to market. This is where Zynq shines.
The PYNQ
PYNQ is an open-source project from Xilinx that makes it easy to design embedded systems with Zynq All Programmable Systems on Chips (APSoCs). Using the Python language and a wide set of libraries, designers can leverage the benefits of FPGA and microprocessors in Zynq to build more capable and exciting embedded systems. PYNQ users can now create high-performance embedded applications without having to use ASIC-style design tools (e.g. HDL language) to design hardware. The PYNQ project includes a Python-based Jupyter framework and Python APIs for using Xilinx Adaptive Computing platforms. It aims to simplify and improve Adaptive Computing system design by providing a high-level productivity language (Python), FPGA overlays with extensive APIs exposed as Python libraries, a web-based architecture served from the embedded processors, and the Jupyter Notebook framework deployed in an embedded context.
SoC design is challenging as it combines both hardware and software elements. A variety of tools must therefore be utilized to design for these components, each requiring different knowledge. The hardware accelerated portion of the system, targeting the programmable logic (PL), requires experience in writing HDL or tools that generate HDL such as Xilinx System Generator, or Vivado High-Level Synthesis (HLS). Furthermore, this design must then be integrated with the platform, where interfacing with the processing system (PS) and any on-board peripherals must be considered in addition to clocks, resets, and interrupts. This is usually achieved in Vivado IP Integrator (IPI). Targeting the PS, software can be designed as a bare metal application, written in C, or written in Python when using a PYNQ enabled platform.
PYNQ provides a range of IP and drivers that make it easier to develop software to control the IP within the programmable logic (called overlays). PYNQ also includes several IP blocks that ease interaction with the PYNQ drivers within the operating system.
PYNQ Overlays
The AMD-Xilinx® Zynq® All Programmable device is an SoC based on a dual-core ARM® Cortex®-A9 processor (referred to as the Processing System or PS), integrated with FPGA fabric (referred to as Programmable Logic or PL). The PS subsystem includes a number of dedicated peripherals (memory controllers, USB, UART, IIC, SPI, etc.) and can be extended with additional hardware IP in a PL overlay.
Overlays, or hardware libraries, are programmable/configurable FPGA designs that extend the user application from the Processing System of the Zynq into the Programmable Logic. These overlays are analogous to software libraries. A software engineer can select the overlay that best matches their application. The overlay can be accessed through a Python API. Overlays can be used to accelerate a software application, or to customize the hardware platform for a particular application.
For example, image processing is a typical application where the FPGAs can provide acceleration. A software programmer can use an overlay in a similar way to a software library to run some of the image processing functions (e.g. edge detection, thresholding, etc.) on the FPGA fabric. Overlays can be loaded to the FPGA dynamically, as required, just like a software library. In this example, separate image processing functions could be implemented in different overlays and loaded from Python on demand. Creating a new overlay still requires engineers with expertise in designing programmable logic circuits. PYNQ overlays are created by hardware designers, and wrapped with this PYNQ Python API. Software developers can then use the Python interface to program and control specialized hardware overlays without needing to design an overlay themselves. The key difference however, is the build once, re-use many times paradigm. Overlays, like software libraries, are designed to be configurable and re-used as often as possible in many different applications.
Loading an Overlay
By default, an overlay (bitstream) called base is downloaded into the PL at boot time. The base overlay can be considered like a reference design for a board. New overlays can be installed or copied to the board and can be loaded into the PL as the system is running.
An overlay usually includes:
- A bitstream to configure the FPGA fabric
- A Vivado design HWH file to determine the available IP
- Python API that exposes the IPs as attributes
The PYNQ Overlay class can be used to load an overlay. An overlay is instantiated by specifying the name of the bitstream file. Instantiating the Overlay also downloads the bitstream by default and parses the HWH file.
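As a minimal sketch of this, assuming the code runs on a board with the PYNQ image installed: the two helper names below are my own, while Overlay and its ip_dict attribute come from the pynq package. The import sits inside the function because pynq is only available on the board.

```python
def load_overlay(bitfile="base.bit"):
    """Load a bitstream into the PL and return the Overlay object.

    The import is done here because pynq is only installed on the board.
    """
    from pynq import Overlay

    # Instantiating downloads the bitstream by default and parses the HWH file.
    return Overlay(bitfile)


def list_ip_names(overlay):
    """Return the names of the IP blocks the overlay exposes, sorted."""
    return sorted(overlay.ip_dict)
```

On the board, list_ip_names(load_overlay()) should return the IP names parsed from the HWH file of the base overlay.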
PYNQ Libraries
Typical embedded systems support a fixed combination of peripherals (e.g. SPI, IIC, UART, Video, USB). There may also be some GPIO (General Purpose Input/Output) pins available. The number of GPIO pins available in a CPU-based embedded system is typically limited, and the GPIO pins are controlled by the main CPU. Because the main CPU is also managing the rest of the system, GPIO performance is usually limited.
AMD-Xilinx platforms usually have many more IO pins available than a typical embedded system. Dedicated hardware controllers and additional soft (real-time) processors can be built in Programmable Logic. This means performance on these interfaces can be much higher than other embedded systems.
PYNQ runs on Linux. For Zynq and Zynq UltraScale+, the following PS peripherals are used by default: the SD card to boot the system and host the Linux file system, UART for Linux terminal access, and USB. Ethernet can be used to connect to Jupyter Notebook or, on Zynq UltraScale+, Ethernet over USB Gadget.
The USB port and other standard interfaces can be used to connect off-the-shelf USB and other peripherals to the Zynq PS where they can be controlled from Python/Linux. The PYNQ image currently includes drivers for the most commonly used USB webcams, WiFi peripherals, and other standard USB devices.
Other peripherals can be connected to and accessed from the Zynq PL. E.g. HDMI, Audio, Buttons, Switches, LEDs, and general-purpose interfaces including Pmods, and Arduino. As the PL is programmable, an overlay which provides controllers for these peripherals or interfaces must be loaded before they can be used.
A library of hardware IP is included in Vivado which can be used to connect to a wide range of interface standards and protocols. PYNQ provides a Python API for a number of common peripherals including Video (HDMI in and Out), GPIO devices (Buttons, Switches, LEDs), and sensors and actuators. The PYNQ API can also be extended to support additional IP.
Zynq platforms usually have one or more headers or interfaces that allow connection of external peripherals, or to connect directly to the Zynq PL pins. A range of off-the-shelf peripherals can be connected to Pmod and Arduino interfaces. Other peripherals can be connected to these ports via adapters, or with a breadboard. Note that while a peripheral can be physically connected to the Zynq PL pins, a controller must be built into the overlay, and a software driver provided before the peripheral can be used.
The PYNQ libraries provide support for the PynqMicroBlaze subsystem, allowing pre-compiled applications to be loaded, and new applications to be created and compiled from Jupyter.
PYNQ also provides support for low-level control of an overlay including memory-mapped IO read/write, memory allocation (for example, for use by a PL master), control and management of an overlay (downloading an overlay, reading IP in an overlay), and low level control of the PL (downloading a bitstream).
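The memory-mapped read/write part can be sketched as follows. This is an illustration under assumptions, not code I ran for this blog: it assumes an AXI GPIO IP with its channel 1 data register at offset 0x0 and its direction register at offset 0x4, a 4-bit channel, and that the overlay's ip_dict entry carries phys_addr and addr_range keys. The helper names are my own; MMIO with read() and write() comes from the pynq package.

```python
GPIO_DATA = 0x0  # AXI GPIO channel 1 data register
GPIO_TRI = 0x4   # AXI GPIO channel 1 direction register (0 = output, 1 = input)


def open_ip(overlay, ip_name):
    """Create an MMIO window over one IP block listed in the overlay."""
    from pynq import MMIO  # only available on the board

    entry = overlay.ip_dict[ip_name]
    return MMIO(entry["phys_addr"], entry["addr_range"])


def write_gpio_outputs(mmio, pattern, width=4):
    """Drive the low `width` bits of the GPIO channel and read them back."""
    mask = (1 << width) - 1
    mmio.write(GPIO_TRI, 0x0)              # configure every pin as an output
    mmio.write(GPIO_DATA, pattern & mask)  # set the pin levels
    return mmio.read(GPIO_DATA) & mask
```

On the board, something like write_gpio_outputs(open_ip(overlay, "leds_gpio"), 0b0101) would light every other LED, assuming the LED controller is named leds_gpio in the loaded overlay.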
Okay, we have enough theory. Now let's do some experiments with Zynq and PYNQ.
I powered the board over USB and opened http://pynq:9090 in a web browser. The Jupyter Notebook environment opened with a few examples and documentation. Using the browser, we can view and run the notebooks interactively. PYNQ applications are developed on a Jupyter notebook server that runs on the PS processor cores.
After exploring the Getting Started guides, I wanted to run a few example projects on my Arty Z7 board. For the first try, I was looking for an example that could be run without connecting any external hardware. From the base/board directory, I opened the board_btns_leds.ipynb notebook.
This buttons and LEDs demonstration project shows how to use push buttons (BTN0-3), LEDs (LD0-3), and RGB LEDs (LD4-5) on the board.
After running the Python code, the buttons control the LEDs and RGB LEDs as follows:
- Button 0 pressed: RGB LEDs change color.
- Button 1 pressed: LEDs shift from right to left (LD0 -> LD3).
- Button 2 pressed: LEDs shift from left to right (LD3 -> LD0).
- Button 3 pressed: Turns off all the LEDs and ends this demo.
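The shifting behaviour of buttons 1 and 2 can be sketched without any hardware. This is my own illustration, not the notebook's code: I assume the lit position wraps around at the ends, and apply_states only relies on the on() and off() methods that PYNQ LED objects provide.

```python
def shift_leds(states, direction):
    """Rotate a list of LED states, where index 0 is LD0.

    direction=+1 moves the lit position from LD0 towards LD3,
    direction=-1 moves it from LD3 towards LD0. Wrapping is an assumption.
    """
    n = len(states)
    return [states[(i - direction) % n] for i in range(n)]


def apply_states(leds, states):
    """Switch each LED object on or off to match the state list."""
    for led, lit in zip(leds, states):
        if lit:
            led.on()
        else:
            led.off()


print(shift_leds([1, 0, 0, 0], +1))  # [0, 1, 0, 0]
print(shift_leds([1, 0, 0, 0], -1))  # [0, 0, 0, 1]
```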
The base overlay is used in this project. The purpose of the base overlay design is to allow PYNQ to use peripherals on a board out-of-the-box. The design includes hardware IP to control peripherals on the target board, and connects these IP blocks to the Zynq PS. If a base overlay is available for a board, peripherals can be used from the Python environment immediately after the system boots.
Board peripherals typically include GPIO devices (LEDs, Switches, Buttons), Video, Audio, and other custom interfaces. As the base overlay includes IP for the peripherals on a board, it can also be used as a reference design for creating new customized overlays.
To load the base overlay, we use the existing BaseOverlay class, which exposes the IPs available in the bitstream as attributes. The following code snippet instantiates the overlay.
from pynq.overlays.base import BaseOverlay
base = BaseOverlay("base.bit")
Once an overlay has been instantiated, the help() function can be used to discover what the overlay contains. The help information can be used to interact with the overlay. I ran help() and got the following result on my Arty Z7 board.
Running help() on the leds object provides more information about the object, including details of its API. The following command shows the API details of the LED GPIO.
help(base.leds_gpio)
Help on AxiGPIO in module pynq.lib.axigpio object:

class AxiGPIO(pynq.overlay.DefaultIP)
 |  AxiGPIO(description)
 |
 |  Class for interacting with the AXI GPIO IP block.
 |
 |  This class exposes the two banks of GPIO as the `channel1` and
 |  `channel2` attributes. Each channel can have the direction and
 |  the number of wires specified.
 |
 |  The wires in the channel can be accessed from the channel using
 |  slice notation - all slices must have a stride of 1. Input wires
 |  can be `read` and output wires can be written to, toggled, or
 |  turned off or on. InOut channels combine the functionality of
 |  input and output channels. The tristate of the pin is determined
 |  by whether the pin was last read or written.
 |
 |  Method resolution order:
 |      AxiGPIO
 |      pynq.overlay.DefaultIP
 |      builtins.object
 |
 |  Methods defined here:
 |
 |  __getitem__(self, idx)
 |
 |  __init__(self, description)
 |      Initialize self. See help(type(self)) for accurate signature.
 |
 |  setdirection(self, direction, channel=1)
 |      Sets the direction of a channel in the controller
 |
 |      Must be one of AxiGPIO.{Input, Output, InOut} or the string
 |      'in', 'out' or 'inout'
 |
 |  setlength(self, length, channel=1)
 |      Sets the length of a channel in the controller
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |
 |  Channel = <class 'pynq.lib.axigpio.AxiGPIO.Channel'>
 |      Class representing a single channel of the GPIO controller.
 |
 |      Wires are and bundles of wires can be accessed using array notation
 |      with the methods on the wires determined by the type of the channel::
 |
 |          input_channel[0].read()
 |          output_channel[1:3].on()
 |
 |      This class instantiated not used directly, instead accessed through
 |      the `AxiGPIO` classes attributes. This class exposes the wires
 |      connected to the channel as an array or elements. Slices of the
 |      array can be assigned simultaneously.
 |
 |
 |  InOut = <class 'pynq.lib.axigpio.AxiGPIO.InOut'>
 |      Class representing wires in an inout channel.
 |
 |      This class should be passed to `setdirection` to indicate the
 |      channel should be used for both input and output. It should not
 |      be used directly.
 |
 |
 |  Input = <class 'pynq.lib.axigpio.AxiGPIO.Input'>
 |      Class representing wires in an input channel.
 |
 |      This class should be passed to `setdirection` to indicate the
 |      channel should be used for input only. It should not be used
 |      directly.
 |
 |
 |  Output = <class 'pynq.lib.axigpio.AxiGPIO.Output'>
 |      Class representing wires in an output channel.
 |
 |      This class should be passed to `setdirection` to indicate the
 |      channel should be used for output only. It should not be used
 |      directly.
 |
 |
 |  bindto = ['xilinx.com:ip:axi_gpio:2.0']
 |
 |  ----------------------------------------------------------------------
 |  Methods inherited from pynq.overlay.DefaultIP:
 |
 |  read(self, offset=0)
 |      Read from the MMIO device
 |
 |      Parameters
 |      ----------
 |      offset : int
 |          Address to read
 |
 |  write(self, offset, value)
 |      Write to the MMIO device
 |
 |      Parameters
 |      ----------
 |      offset : int
 |          Address to write to
 |      value : int or bytes
 |          Data to write
 |
 |  ----------------------------------------------------------------------
 |  Readonly properties inherited from pynq.overlay.DefaultIP:
 |
 |  register_map
 |
 |  signature
 |      The signature of the `call` method
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from pynq.overlay.DefaultIP:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
The API can be used to control the object. For example, the following cell toggles LD0 on the board, turning it on if it was off.
base.leds[0].toggle()
The next example I ran was the Arduino Analog example. I chose it because analog input can be tested without connecting any hardware or sensors. Running the example, I got the following output without connecting anything.
After running the example code, I wanted to write some code myself: controlling LED0 and LED1 with SW0 and SW1. To do this, I opened a new Python notebook. We first import sleep, which generates a delay; call sleep(time_in_seconds) wherever a pause is needed. I then import BaseOverlay, which gives access to the switches and LEDs.
from time import sleep
from pynq.overlays.base import BaseOverlay
base_overlay = BaseOverlay("base.bit")
Now we need references to the LEDs. These are provided by the overlay loaded from the bit file. We can access each LED using base_overlay.leds[index] and store it in a variable. Essentially, we are just getting a reference to each output.
led0 = base_overlay.leds[0]
led1 = base_overlay.leds[1]
Similarly, we can get references to the switches using base_overlay.switches[index].
sw0 = base_overlay.switches[0]
sw1 = base_overlay.switches[1]
Now I will set the logic in my code. I want to turn on LED0 when SW0 is ON and turn on LED1 when SW1 is ON. Similarly, I want to turn off each LED when its corresponding switch is OFF. When both switches are ON, both LEDs will blink.
We can use the on() and off() functions to turn an LED on or off. Alongside on() and off(), there is a toggle() function: if the LED is currently on, toggle() turns it off, and vice versa.
Reading the switches is extremely easy: we call the read() function, which returns 1 or 0 depending on the state of the switch (Python treats these as equal to True and False). So if SW0 reads 1, we turn on LED0; if SW0 reads 0, we turn off LED0.
Here is my code.
while True:
    # Read each switch once per pass so every branch sees the same values
    s0 = sw0.read()
    s1 = sw1.read()
    if s0 and s1:
        # Both switches ON: blink both LEDs once per second
        led0.toggle()
        led1.toggle()
        sleep(1)
    else:
        # Otherwise each LED follows its own switch
        if s0:
            led0.on()
        else:
            led0.off()
        if s1:
            led1.on()
        else:
            led1.off()
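The switch-to-LED rules described above can also be separated from the hardware calls, which makes them easy to check on a PC before running on the board. This is only a sketch with names of my own choosing: led_actions returns the name of the method to call on each LED, and apply_actions calls it on any object that has on(), off(), and toggle(), as PYNQ LED objects do.

```python
def led_actions(sw0_on, sw1_on):
    """Map the two switch readings to the action for (LED0, LED1)."""
    if sw0_on and sw1_on:
        return ("toggle", "toggle")          # both ON: blink
    return (
        "on" if sw0_on else "off",           # LED0 follows SW0
        "on" if sw1_on else "off",           # LED1 follows SW1
    )


def apply_actions(leds, actions):
    """Call the named method (on, off, or toggle) on each LED object."""
    for led, action in zip(leds, actions):
        getattr(led, action)()


print(led_actions(1, 0))  # ('on', 'off')
print(led_actions(1, 1))  # ('toggle', 'toggle')
```

On the board, the loop body would then reduce to apply_actions((led0, led1), led_actions(sw0.read(), sw1.read())), followed by a sleep when blinking.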
Here is the screenshot of the notebook.
This is the demo in action:
In my next blog, I will extend my experiments to external hardware such as a display and a webcam.
My first blog: Blog #1: Third Eye for Blind - Overview
References:
[1]. PYNQ Documentation: https://pynq.readthedocs.io/en/latest/getting_started.html
[2]. PYNQ GitHub: https://github.com/Xilinx/PYNQ
[3]. Arty Z7 Reference Manual: https://digilent.com/reference/programmable-logic/arty-z7/reference-manual