The second in the PYNQ-Z2 workshop series was much more involved than the first. In the first we concentrated on using PYNQ, but in the second we fired up Vivado to actually build an overlay to call from PYNQ. I can imagine that it might be quite daunting for anyone who'd never fired up Vivado before. Having had more experience with Vivado than with Python and Jupyter, I felt a little more comfortable - even if the lab itself was significantly more difficult.
The core of Lab 2 was plumbing together various items of IP and calling the resulting pipeline from PYNQ. We didn't delve much into the content of this IP, but as the C++ source files were there, it was just too tempting not to take a look. The pattern generator IP didn't have source code available, but the other two - Color Convert and Pixel Pack - did. I decided to take inspiration from these and create my own Vivado HLS IP that could be plumbed into a PYNQ overlay.
How does the provided IP work?
I started by "reverse engineering" the Color Convert one. It's hardly reverse engineering when you have the source, but you know what I mean. To compile the IP we called build_ip.bat, which just ran script.tcl in each of the IP folders. Each script.tcl simply uses vivado_hls to spin up a project, run C synthesis and export the result.
build_ip.bat
REM %%f is the FOR-loop variable iterating over the IP source folders
vivado_hls -f %%f\script.tcl
script.tcl
# create the HLS project and name the function that forms the top level of the IP
open_project color_convert
set_top color_convert
# add the source file and the C test bench
add_files color_convert/color_convert.cpp
add_files -tb color_convert/color_convert_test.cpp
open_solution "solution1"
# target the Zynq-7020 part fitted to the PYNQ-Z2
set_part {xc7z020clg400-1} -tool vivado
# separate clocks for the data path (7 ns) and the control interface (10 ns)
create_clock -period 7 -name default
create_clock -period 10 -name control
# run C synthesis and package the result for the Vivado IP catalog
csynth_design
export_design -format ip_catalog -description "Color conversion for 24-bit AXI video stream" -display_name "Color Convert"
exit
Obviously I could have just copied this process, but I decided to manually follow along and create a project in Vivado HLS.
Creating some new IP
So I fired up Vivado HLS (version 2019.1 to match everything else). I created a new project for the PYNQ-Z2 board and added a new source file and test bench file.
{gallery} Vivado HLS project
Vivado HLS: Create project
Vivado HLS: Add source
Vivado HLS: Add testbench
Vivado HLS: Solution configuration
Whilst you're at it, you should probably add the IP name and description to Solution Settings. From the top menu select Solution / Solution Settings...
From there I added content to the source and test bench files, obviously heavily inspired by the color_convert contents. Here is my main source file. My simple Posterize IP takes each pixel and converts it to one of eight colours. It does this by taking the red, green and blue channels of each pixel and setting each one to either full brightness or off. I could have extended my example with a configurable threshold or an enable/disable switch, but as I'm trying to work through the principle of creating HLS IP for use with PYNQ, I have kept it as simple as possible.
#include <ap_fixed.h>
#include <ap_int.h>

// one 8-bit colour channel
typedef ap_uint<8> pixel_type;

// one transfer on a 24-bit AXI video stream: three channels plus sideband
struct video_stream {
    struct {
        pixel_type p1;
        pixel_type p2;
        pixel_type p3;
    } data;
    ap_uint<1> user; // start of frame
    ap_uint<1> last; // end of line
};

void posterize(video_stream* stream_in_24, video_stream* stream_out_24) {
#pragma HLS CLOCK domain=default
#pragma HLS INTERFACE ap_ctrl_none port=return
#pragma HLS INTERFACE axis port=stream_in_24
#pragma HLS INTERFACE axis port=stream_out_24
#pragma HLS pipeline II=1

    // pass the AXI Stream sideband signals straight through
    stream_out_24->user = stream_in_24->user;
    stream_out_24->last = stream_in_24->last;

    pixel_type in1, in2, in3, out1, out2, out3;
    in1 = stream_in_24->data.p1;
    in2 = stream_in_24->data.p2;
    in3 = stream_in_24->data.p3;

    // threshold each channel at half brightness: fully on or fully off
    out1 = in1 > 0x7F ? 0xFF : 0x00;
    out2 = in2 > 0x7F ? 0xFF : 0x00;
    out3 = in3 > 0x7F ? 0xFF : 0x00;

    stream_out_24->data.p1 = out1;
    stream_out_24->data.p2 = out2;
    stream_out_24->data.p3 = out3;
}
The next step is to run the C simulation, which allows your C test bench code to call your main code and check that the output is as expected. All my simple test bench checks is that colour values greater than half brightness come out at full brightness, and everything else comes out off.
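The test bench file itself isn't reproduced here, but it's along these lines - a minimal sketch rather than my exact file, with the struct redeclared because the source file has no header, and test values picked to straddle the 0x7F threshold:

#include <cstdio>
#include <ap_int.h>

// redeclare the types and prototype from the source file (no shared header)
typedef ap_uint<8> pixel_type;
struct video_stream {
    struct { pixel_type p1; pixel_type p2; pixel_type p3; } data;
    ap_uint<1> user;
    ap_uint<1> last;
};
void posterize(video_stream* stream_in_24, video_stream* stream_out_24);

int main() {
    video_stream in, out;
    int errors = 0;
    // representative values below, at and above half brightness
    const int test_vals[] = {0x00, 0x7F, 0x80, 0xFF};
    for (int i = 0; i < 4; i++) {
        in.data.p1 = in.data.p2 = in.data.p3 = test_vals[i];
        in.user = 0;
        in.last = 0;
        posterize(&in, &out);
        // anything above 0x7F should come out 0xFF, everything else 0x00
        pixel_type expected = test_vals[i] > 0x7F ? 0xFF : 0x00;
        if (out.data.p1 != expected || out.data.p2 != expected
                || out.data.p3 != expected) {
            printf("Mismatch for input 0x%02X\n", test_vals[i]);
            errors++;
        }
    }
    printf(errors ? "Test FAILED\n" : "Test passed\n");
    return errors; // a non-zero return makes the C simulation report failure
}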
Packaging this IP
From here we need to package the IP so it can be used in Vivado and then PYNQ. From the top menu select Solution / Run C Synthesis / Active Solution. You'll probably get an error that you need to specify a top function. This is done under Project (not Solution) Settings, and in my case I needed to pick the posterize function. Once C synthesis has completed successfully, you need to Export RTL, which packages up your function as IP that can later be imported into Vivado. (These GUI steps appear to correspond to the set_top, csynth_design and export_design commands we saw in script.tcl.)
Using this IP in Vivado
From here it's probably easiest to point you back to the Lab 2 workbook, where we were shown how to add items to the IP repository and then add our newly available IP to our design. I could cut and paste the steps from the workbook, but I don't think I'd be adding any value for anyone who had attended the lab. The process is exactly the same, as this HLS IP has been created (albeit manually) in the same way as our example IP. Here is my new IP added into the Vivado block design and on its way to being compiled into a bitstream.
The final result in PYNQ!
So, once we export the overlay to PYNQ, does it work? YES! This is the output from my camera passed through the posterize IP and displayed in a Jupyter notebook. It's been really sunny here in the UK. Can you tell from my sunburn?
The more observant may notice it's actually added to the Lab 3 design. I'd run through Lab 3 in advance, and posterizing the output of a pattern generator wasn't as visually interesting as posterizing a camera feed! The odd bars above and below are the Raspberry Pi desktop showing, rather than any problem with the setup - I'm using a Pi and Pi camera as my input (see why here: PYNQ-Z2 - Pre-workshop setup). Note that it seems to work quite happily at the default Pi resolution of 1280 x 1024.
Further thoughts and questions
I noticed that another E14 member, yuricts, delved into creating IP and took a look at the Pattern Generator in PYNQ Z2: Getting Up and Running - Tea Storm. Following some links he posted led me to a guide on creating HLS IP. The guide to a video crop IP interested me in particular, as it allows you to work with a whole frame rather than a pixel at a time. However, its inputs and outputs are specified in C++ as AXI_STREAM& rather than video_stream*, and the resultant IP works with s_axi_video rather than our stream_in_24. I'm planning to do some more digging into the differences in how these work, but if anyone wants to jump in with any insights, that would be appreciated.
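For what it's worth, my current (untested) understanding is that AXI_STREAM in that guide is a typedef over hls::stream of the ap_axiu sideband struct from ap_axi_sdata.h, so the same posterize logic might look something like the sketch below. The type widths and names here are my guesses rather than anything taken from the guide, and I suspect the s_axi_... interface on the crop IP appears because it takes frame dimensions as arguments, which HLS exposes over an AXI-Lite control port - corrections welcome.

#include <ap_axi_sdata.h>
#include <ap_int.h>
#include <hls_stream.h>

// ap_axiu<D,U,TI,TD>: D data bits plus user/id/dest sideband (widths guessed)
typedef ap_axiu<24, 1, 1, 1> axi_pixel;
typedef hls::stream<axi_pixel> AXI_STREAM;

// the same per-pixel posterize, expressed in the hls::stream style
void posterize_stream(AXI_STREAM& stream_in, AXI_STREAM& stream_out) {
#pragma HLS INTERFACE axis port=stream_in
#pragma HLS INTERFACE axis port=stream_out
#pragma HLS INTERFACE ap_ctrl_none port=return
#pragma HLS pipeline II=1
    axi_pixel pix = stream_in.read(); // blocking read of one transfer
    for (int c = 0; c < 3; c++) {
        // pull out one 8-bit channel, threshold it and write it back
        ap_uint<8> channel = pix.data.range(8 * c + 7, 8 * c);
        pix.data.range(8 * c + 7, 8 * c) = channel > 0x7F ? 0xFF : 0x00;
    }
    stream_out.write(pix); // user/last travel along inside the struct
}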