Benchmarking the Zynq UltraScale+ MPSoC with a Custom-Built AES Core


RoadTest: Sign Up to Review the Avnet ZUBoard 1CG Development Kit

Author: yesha98

Creation date:

Evaluation Type: Development Boards & Tools

Did you receive all parts the manufacturer stated would be included in the package?: True

What other parts do you consider comparable to this product?: Ultra96 v2

What were the biggest problems encountered?: Lack of board file support for older versions of Vivado

Detailed Review:

ZUBoard 1CG Development board review


I had previously reviewed the Avnet MicroZed SoM Z7010 board with the Arduino carrier card, which gave me an opportunity to venture into the Zynq 7000 SoC platform. I wanted to explore the UltraScale+ MPSoC platform next, and that is when I saw this RoadTest. This board is intended for applications such as machine learning and embedded vision, so I decided to do something less conventional with it. Read on to find out what I did.

Unboxing Video


Getting Started

Board powered up:


The board has a sample application that runs out of the box. All we need is a USB port and an Ethernet port. Connect to the COM port with any serial terminal and reset the device to see the output shown in the image below:

Gallery: Out of box demo


Terminal: Output from the UART terminal once we reset the board


Webpage rendered: Just powered on


Webpage rendered: After some time - check the temperature

Click on the link to open the webpage hosted by the device.

The webpage is very interactive, with options for controlling the LEDs (you can play with the RGB LEDs :)), reading the push-button states, and getting the temperature and pressure readings from the sensor. Overall, this is a very good FreeRTOS example to start with.

One thing I noticed, though, is that the board gets hot quite quickly. The temperature sensor shows a rise of 3 degrees Celsius within 9 minutes of operation (slide through the images to see the difference). The heatsink does its job, yet the board heats up irrespective of the application we run. I suspect the application core running at 1.2GHz and the RT core running at 500MHz might be the cause of the heat.

The board-related documents are readily available online on the Avnet website, from getting-started guides and the HW user manual to the board schematics. The only thing I couldn't find is the master constraints file for this FPGA SoC. That is manageable, though, with the board part selected in Vivado and with the HW schematics at hand.

Overview and Usage

The Avnet ZUBoard 1CG offers these exclusive features:

  • 81K programmable logic cells, PL fabric max clock up to 1600MHz
  • Dual-core Arm Cortex-A53 MPCore, max clock up to 1.2GHz
  • Dual-core Arm Cortex-R5F MPCore, max clock up to 500MHz
  • 3 SYZYGY ports
  • Type C input power, Ethernet, MicroUSB JTAG / UART, USB 2.0 connector with PHY
  • Onboard user LEDs, Pushbuttons, switches, Pressure and temperature sensors

Given below are a couple of SYZYGY modules that can be used with this board:

Avnet Dual Camera High Speed I/O Module:
Display port with bootable EMMC module:

These two SYZYGY modules are ideal for vision applications that need stereo vision, especially since the Zynq U+ device supports the DisplayPort interface. Given below is the list of peripherals supported by Zynq U+:

This differentiates it from the regular Zynq 7000 series based boards, giving the highest possible performance for its price point. It also gives the board the ability to run embedded vision applications on PetaLinux with a high-speed PL fabric.

Development Environment

Since this is a new board, I had no option but to go with the latest version of Vivado, 2023.1. I previously had Vivado 2018.2, which did not recognise the board part. I then installed Vivado 2019.1, but that version did not recognise the board part either. That is when I figured out that only the latest versions support it. I also built a custom desktop to run Vivado faster, with an Intel i7 13700K, 32GB of DDR5 memory and a 1TB SSD.


Vivado 2023.1 recognises the board part instantly and has features like auto-fetching the latest board repository online. Creating a hello world application in Vitis is simple. Create a block design in Vivado with the Zynq MPSoC system and generate the bit-stream, export the hardware in .xsa format with the bit-stream. Then launch Vitis and create an application with the .xsa file and select the respective domain you want to work with – the application core or the real-time core. This would help you get started with a simple hello-world application.


After adding the AXI GPIOs for the on-board RGB LEDs and push button, along with the I2C temperature sensor, we can run the sample bare-metal application to acquire the temperature and pressure from the onboard sensors. I made use of the application already provided by another roadtester for this board, whose review can be found here: 


The main goal of this RoadTest was to benchmark the Zynq UltraScale+ MPSoC with an AES128 engine. I searched for an existing core online, but the AMD Xilinx IP core is licensed, so I could not use it. Instead, I built my own simple AES128 core with encryption, decryption and round-key generation. I also implemented AES128 in software, to compare its encryption and decryption time against the HW implemented in the PL.
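For readers who want a software reference to check any AES128 implementation against, here is a minimal sketch of AES128 block encryption in Python, following the FIPS-197 structure (key expansion, then 10 rounds). It is an illustration only and not the software used in this benchmark; all function names here (`xtime`, `gmul`, `expand_key`, `mix_columns`, `encrypt_block`) are my own, and it is written for clarity rather than speed.

```python
def xtime(a):
    """Multiply a byte by 2 in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a


def gmul(a, b):
    """Multiply two bytes in GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = xtime(a)
        b >>= 1
    return r


def _sbox_entry(x):
    inv = 1
    for _ in range(254):  # x^254 is the multiplicative inverse; 0 maps to 0
        inv = gmul(inv, x)
    rot = lambda v, n: ((v << n) | (v >> (8 - n))) & 0xFF
    return inv ^ rot(inv, 1) ^ rot(inv, 2) ^ rot(inv, 3) ^ rot(inv, 4) ^ 0x63


SBOX = [_sbox_entry(x) for x in range(256)]


def expand_key(key):
    """Expand a 16-byte key into 11 round keys of 16 bytes each."""
    w = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        t = list(w[i - 1])
        if i % 4 == 0:
            t = [SBOX[b] for b in t[1:] + t[:1]]  # RotWord, then SubWord
            t[0] ^= rcon
            rcon = xtime(rcon)
        w.append([a ^ b for a, b in zip(w[i - 4], t)])
    return [sum(w[4 * r:4 * r + 4], []) for r in range(11)]


def mix_columns(s):
    out = []
    for c in range(4):
        a = s[4 * c:4 * c + 4]
        for r in range(4):
            out.append(gmul(a[r], 2) ^ gmul(a[(r + 1) % 4], 3)
                       ^ a[(r + 2) % 4] ^ a[(r + 3) % 4])
    return out


def encrypt_block(plaintext, key):
    """Encrypt one 16-byte block; the state is stored column by column."""
    rks = expand_key(key)
    s = [p ^ k for p, k in zip(plaintext, rks[0])]
    for rnd in range(1, 11):
        s = [SBOX[b] for b in s]  # SubBytes
        s = [s[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]  # ShiftRows
        if rnd != 10:  # the final round skips MixColumns
            s = mix_columns(s)
        s = [a ^ k for a, k in zip(s, rks[rnd])]  # AddRoundKey
    return bytes(s)


key = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
pt = bytes.fromhex("00112233445566778899aabbccddeeff")
print(encrypt_block(pt, key).hex())
```

The key and plain text used here are the AES-128 example from FIPS-197 Appendix C.1, which lists 69c4e0d86a7b0430d8cdb78070b4c55a as the cipher text, so the printed value can be compared against it.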

The system configuration to run the benchmark:

  • Application Core: (Cortex A53 MPCore) running at 1.2GHz (1200MHz)
  • Real-Time Core: (Cortex R5 MPCore) running at 500MHz
  • PL Fabric: Running at 100MHz (to avoid timing issues in my design)

The AES128 core is implemented in Verilog. The design and the Vivado and Vitis project files can be found here. The documentation is not yet complete; I will post it in the comments once it is done (update: it is here). The core takes a 128-bit plain text and a 128-bit cipher key as inputs and outputs the cipher text once encryption is complete; the same applies to decryption. AES128 has 10 rounds, so I designed the core to complete an encryption in just 10 clock cycles.

The block design for the AES128 based benchmark for Zynq U+ is given below:


The control lines are used to synchronise the loading of data into the AES core and to specify encryption or decryption. The load_key line is used to load the key into the AES128 core. The load_data line is used to load the plain text or the cipher text into the AES core. The encrypt_or_decrypt line is used to specify whether encryption or decryption is taking place inside the core.


The status lines are used to indicate the completion of a particular operation to the SW. For example, 10 cycles after the key is loaded into the core, the key generation is complete and the key_ready line goes high. The SW can then read this line and load the data for encryption or decryption. After loading the data, the SW has to poll the cipher_ready line until it is set, which happens exactly 10 clock cycles after the data is loaded. The results of the simulations will be part of the documentation repository.
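The handshake described above can be sketched as a small cycle-based model in Python. This is only a behavioural illustration of the control and status lines, under the assumption that each ready flag rises exactly 10 cycles after the corresponding load. It does not model the cipher itself, and the class and method names (`AesCoreModel`, `tick`, `poll`) are hypothetical, not taken from the Verilog design.

```python
class AesCoreModel:
    """Behavioural model of the load/ready handshake only (no cipher inside)."""

    LATENCY = 10  # cycles from a load until the matching ready flag is set

    def __init__(self):
        self.key_ready = False
        self.cipher_ready = False
        self._key_timer = None
        self._data_timer = None

    def load_key(self):
        """Pulse load_key: round-key generation starts."""
        self.key_ready = False
        self._key_timer = self.LATENCY

    def load_data(self, encrypt=True):
        """Pulse load_data; 'encrypt' stands in for the encrypt_or_decrypt line."""
        if not self.key_ready:
            raise RuntimeError("load the key and wait for key_ready first")
        self.cipher_ready = False
        self._data_timer = self.LATENCY

    def tick(self):
        """Advance the model by one PL clock cycle."""
        if self._key_timer is not None:
            self._key_timer -= 1
            if self._key_timer == 0:
                self.key_ready, self._key_timer = True, None
        if self._data_timer is not None:
            self._data_timer -= 1
            if self._data_timer == 0:
                self.cipher_ready, self._data_timer = True, None


def poll(core, flag):
    """Clock the core until the named status flag is set; return cycles waited."""
    cycles = 0
    while not getattr(core, flag):
        core.tick()
        cycles += 1
    return cycles


core = AesCoreModel()
core.load_key()
key_cycles = poll(core, "key_ready")
core.load_data(encrypt=True)
cipher_cycles = poll(core, "cipher_ready")
print(key_cycles, cipher_cycles)
```

In this model the software side mirrors the sequence in the text: load the key, wait for key_ready, load the data, then poll cipher_ready.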

Given below are the maximum clocks of the SoC:


Even though the PL fabric can be clocked at up to 1.6GHz, I run the PL at only 100MHz, for timing-related reasons with the AXI interface and the design as a whole.

The sample AES128 application (can be found here), running on the A53 MPCore, gives the following results:


Clock speed of the A53 MPCore: 1.2GHz, R5 MPCore: 500MHz, PL Fabric: 100MHz
The clock period of the PL fabric is 10ns, so the 10 cycles taken by the HW amount to about 100ns, which is the expected time.
The clock period of the A53 MPCore is 0.833ns, and the SW takes around 10k cycles to perform an encryption, so the time taken is around 8300ns, or roughly 8us.
Running the same program (can be found here) on the Cortex-R5 MPCore gives different results, because the core runs at a different clock speed.
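The arithmetic behind these figures can be reproduced with a few lines of Python. The cycle counts and clock frequencies are the ones quoted above; the R5 figure is only the expectation if the cycle count were to stay at around 10k, not a measured result, and the helper name `cycles_to_ns` is my own.

```python
def cycles_to_ns(cycles, freq_hz):
    """Convert a cycle count at a given clock frequency into nanoseconds."""
    return cycles * 1e9 / freq_hz


hw_ns = cycles_to_ns(10, 100e6)              # AES core: 10 cycles at 100MHz
sw_a53_ns = cycles_to_ns(10_000, 1.2e9)      # SW on the A53: ~10k cycles at 1.2GHz
sw_r5_expected_ns = cycles_to_ns(10_000, 500e6)  # expectation only, same count at 500MHz

print(f"HW in PL:        {hw_ns:.0f} ns")
print(f"SW on A53:       {sw_a53_ns:.0f} ns")
print(f"SW on R5 (exp.): {sw_r5_expected_ns:.0f} ns")
print(f"HW vs A53 SW:    {sw_a53_ns / hw_ns:.1f}x faster")
```

On these numbers, the 10-cycle HW core is roughly 83 times faster than the software running on the A53, even though the PL runs at a twelfth of the A53 clock.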


So, the cycles are still being counted at 1.2GHz while the core runs at 500MHz. That is why the number of clock cycles differs between the two runs, even though we run the same software and would expect the number of cycles to stay the same, with only the time taken increasing.

One peculiar thing I noticed is that, for some reason, the time taken by the HW (implemented in the PL fabric) is slightly higher than expected. It should ideally have been around 100ns. I wonder whether this is an inaccuracy in the measurement method or whether the AXI interface is taking extra cycles to perform the operation. A better way to confirm this would be to toggle a GPIO and use a logic analyser to measure the exact time taken to encrypt and decrypt.
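To see how bus accesses alone could inflate the measured figure, here is a simplified model in Python. It assumes the software starts its timer, issues one AXI write to start the core, and then polls the status register with back-to-back reads, each read sampling the flag as it completes. The function name and the 40ns access time in the example are hypothetical placeholders, not measurements from this board.

```python
import math


def observed_latency_ns(core_ns, write_ns, read_ns):
    """Elapsed time seen by SW that writes once, then polls until ready.

    Simplifying assumptions: the core starts when the write completes, and
    each polling read samples the ready flag at the moment it completes.
    """
    if read_ns <= 0:
        raise ValueError("read_ns must be positive")
    polls = max(1, math.ceil(core_ns / read_ns))  # reads needed to see the flag
    return write_ns + polls * read_ns


# Hypothetical access times, for illustration only.
print(observed_latency_ns(core_ns=100, write_ns=40, read_ns=40))
# With free writes and one read per PL cycle, the ideal 100ns is recovered.
print(observed_latency_ns(core_ns=100, write_ns=0, read_ns=10))
```

Under this model the software can never observe less than the true core latency, and any write or polling overhead shows up as extra time, which is consistent with a reading slightly above 100ns. A GPIO toggle inside the PL, observed with a logic analyser, avoids this overhead.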

Overall experience

I was impressed by the performance offered by the A53 MPCore; being able to reach 1.2GHz really matters for PetaLinux-based applications. The PL can also reach up to 1.6GHz, which is great for compute-intensive vision applications. Overall, the development experience with Vivado and Vitis 2023.1 was easy and intuitive, with a block-based depiction of the Zynq U+ MPSoC system. One thing I would like to see is support for older versions of Vivado, which is missing for newer boards like this one. Every new version of Vivado is much larger than the previous one and needs a lot of compute and memory, which is difficult on a personal computer, though fine on server VMs.

The biggest pro of this board is that it is the cheapest UltraScale+ MPSoC board in its family (only $159.00), with support for up to 3 SYZYGY modules. Young startups might find this pretty useful.

Future work

I haven't been able to explore the vision applications on this board yet. I am planning to set up the environment with dual boot on my PC and develop something like this: