ZYNQ Stereo Camera Platform - Part 2: stereolbm with Vitis Vision libraries

dimiterk
9 Dec 2020

This project details how to build a stereo depth camera with AI capabilities on a ZYNQ MPSOC platform.

 

This time we'll see how to use the Vitis Vision L1 libraries and the PYNQ framework to implement a complete stereo depth pipeline.

There are a number of past and existing bugs in the Vitis Vision libraries, so this procedure has not been smooth.

In addition, the PYNQ framework is in a state of flux (transitioning from 2.5.1 to 2.6), so there are some API changes.

 

1. Setting up the environment

First, install Vitis on Ubuntu 18.04. This release is not strictly supported, so you'll need to edit /etc/os-release so that it reports a supported point release (18.04.4):

sudo nano /etc/os-release
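For reference, the field to adjust is VERSION; on a stock 18.04 install the relevant lines look roughly like this (an illustrative excerpt, not the full file; the remaining fields can stay as they are):

NAME="Ubuntu"
VERSION="18.04.4 LTS (Bionic Beaver)"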

Now clone the Vitis libraries from this link:

git clone https://github.com/Xilinx/Vitis_Libraries.git

Install cmake:

sudo apt-get install cmake

 

2. Install OpenCV

Next, install OpenCV 3.4.4.

This is needed in order to compile and simulate the Vitis Vision libraries. Note that this exact version is required.

 

mkdir ~/opencv_build && cd ~/opencv_build
git clone https://github.com/opencv/opencv.git
git clone https://github.com/opencv/opencv_contrib.git

# check out the 3.4.4 release in both repositories
cd ~/opencv_build/opencv && git checkout 3.4.4
cd ~/opencv_build/opencv_contrib && git checkout 3.4.4

# configure the build (with the contrib modules enabled)
cd ~/opencv_build/opencv
mkdir build && cd build
cmake -D CMAKE_BUILD_TYPE=RELEASE \
      -D CMAKE_INSTALL_PREFIX=/usr/local \
      -D OPENCV_EXTRA_MODULES_PATH=$HOME/opencv_build/opencv_contrib/modules ..

make -j4
sudo make install
sudo ldconfig
pkg-config --modversion opencv

# rename the generated Python binding so it imports as cv2
cd /usr/local/python/cv2/python-3.6
sudo mv cv2.cpython-36m-x86_64-linux-gnu.so cv2.so

# (optional) link the binding into a Python virtual environment
cd ~/.virtualenvs/cv/lib/python3.6/site-packages/
ln -s /usr/local/python/cv2/python-3.6/cv2.so cv2.so

 

3. Set up the Vitis Vision IP core

 

 

The Vitis Vision cores use memory-mapped AXI interfaces for data transfer and AXI-Lite for parameter configuration.

To minimize energy consumption and process incoming imaging data on the fly, we'll have to re-write the interfaces in streaming format.

 

Before we do that, test the IP by performing C simulation (CSIM) and co-simulation (COSIM).

 

A few additional variables have to be defined, either in a settings.tcl file or added to the project tcl file as shown below, in order to point to the OpenCV library we installed earlier.

This is needed for co-simulation and synthesis.

 

 

#source settings.tcl

set PROJ "erosion.prj"
set SOLN "sol1"

set XF_PROJ_ROOT "/home/user/Documents/Vitis_Libraries/vision/"

# location of the OpenCV 3.4.4 build installed earlier
set OPENCV_INCLUDE "/usr/local/include/opencv2"
set OPENCV_LIB "/usr/local/lib"

# target device and which flow steps to run
set XPART "xczu9eg-ffvb1156-2-i"
set CSIM "1"
set CSYNTH "1"
set COSIM "1"
set VIVADO_SYN "0"
set VIVADO_IMPL "0"

 

 

One can use either Vitis HLS or Vivado HLS. There are minor differences between the two, so be aware that once you create a project with Vitis HLS you won't be able to open it in Vivado HLS.

 

To generate the core, change into the example directory:

cd Vitis_Libraries/vision/L1/examples/stereolbm

and issue:

vivado_hls -f script.tcl
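If you prefer the newer tool, the same script should (assuming the bundled tcl is compatible with your release) also run under Vitis HLS:

vitis_hls -f script.tcl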

 

Depending on the flags set in the tcl file (CSIM, CSYNTH, COSIM), this will simulate, synthesize and co-simulate the IP core.

 

There are two stereo vision IP cores: a) stereolbm (local block matching) and b) global block matching.

When simulating the global block matching IP I was not able to get a proper output, so that left the stereo local block matching algorithm.

 

 

4. Vitis Vision IP with PYNQ

 

Initially, the IP core from the following repository was tested:

 

git clone --recursive https://github.com/Xilinx/PYNQ-HelloWorld.git

 

The problem is that it uses an old version of Vitis Vision with bugs and implements custom conversion functions from xf::cv::Mat to AXI stream that are not part of the API.

 

So one is left with two choices: a) implement the algorithm on your own, or b) find a way to get the Vitis Vision cores working in streaming mode.

 

Another issue is the set of bugs present in Vitis Vision:

https://github.com/Xilinx/Vitis_Libraries/issues/28

 

The same issues appear when trying to interface these IPs with the PYNQ framework:

https://discuss.pynq.io/t/vitis-vision-core-fails-on-pynq-v2-5-1/1822/17

 

The bottom line is that one has to re-write the interfaces with custom data types.

 

 

// Headers assumed by these templates (HLS arbitrary-precision types,
// hls::stream and the Vitis Vision xf::cv::Mat definition).
#include <ap_int.h>
#include <hls_stream.h>
#include <cassert>
#include "common/xf_common.hpp"

// One 8-bit pixel per AXI4-Stream beat, plus the TLAST side-channel.
template <int W>
struct axis_t {
    ap_uint<W> data;
    ap_int<1> last;
};

/*
 * Unpack an AXI4-Stream into an xf::cv::Mat<> object
 *  input:  AXI_video_strm
 *  output: img
 */
template <int TYPE, int ROWS, int COLS, int NPPC>
int AXIstream2xfMat(hls::stream<axis_t<8> >& AXI_video_strm, xf::cv::Mat<TYPE, ROWS, COLS, NPPC>& img) {
    axis_t<8> pixelpacket;
    int res = 0;

    int rows = img.rows;
    int cols = img.cols;
    int idx = 0;

    assert(img.rows <= ROWS);
    assert(img.cols <= COLS);

loop_row_axi2mat:
    for (int i = 0; i < rows; i++) {
    loop_col_axi2mat:
        for (int j = 0; j < cols; j++) {
            // clang-format off
            #pragma HLS loop_flatten off
            #pragma HLS pipeline II=1
            // clang-format on
            AXI_video_strm >> pixelpacket;
            img.write(idx++, pixelpacket.data);
        }
    }
    return res;
}

/*
 * Pack the data of an xf::cv::Mat<> object into an AXI4-Stream
 *  input:  img
 *  output: AXI_video_strm
 */
template <int TYPE, int ROWS, int COLS, int NPPC>
int xfMat2AXIstream(xf::cv::Mat<TYPE, ROWS, COLS, NPPC>& img, hls::stream<axis_t<8> >& AXI_video_strm) {
    axis_t<8> pixelpacket;
    int res = 0;

    int rows = img.rows;
    int cols = img.cols;
    int idx = 0;

    assert(img.rows <= ROWS);
    assert(img.cols <= COLS);

loop_row_mat2axi:
    for (int i = 0; i < rows; i++) {
    loop_col_mat2axi:
        for (int j = 0; j < cols; j++) {
            // clang-format off
            #pragma HLS loop_flatten off
            #pragma HLS pipeline II=1
            // clang-format on

            // Assert TLAST on the final pixel so the DMA knows where the frame ends.
            ap_uint<1> tmp = 0;
            if ((i == rows - 1) && (j == cols - 1)) {
                tmp = 1;
            }

            pixelpacket.last = tmp;
            pixelpacket.data = img.read(idx++);

            AXI_video_strm << pixelpacket;
        }
    }
    return res;
}

 


5. Simulating the stereo IP core

 

There are a couple of algorithms for stereo depth perception. It's important to note that these algorithms require a lot of resources, so the image resolution needs to be reduced for implementation.

 

We will use the cones images from the Middlebury dataset; however, the images will be downsized to 320x240 pixels and converted to grayscale beforehand, as shown in the sketch below.

This is done in order to preserve resources, as the stereo core uses a lot of fabric logic.
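As a quick illustration, this preprocessing can be done with the Python bindings of the OpenCV build from step 2 (a minimal sketch; the file names are placeholders for the downloaded Middlebury cones pair, not the actual names used in the project):

import cv2

# Hypothetical file names for the Middlebury "cones" stereo pair
for name in ("cones_left.png", "cones_right.png"):
    img = cv2.imread(name)                        # load the original colour image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert to 8-bit grayscale
    small = cv2.resize(gray, (320, 240))          # downsize to 320x240 for the core
    cv2.imwrite("small_" + name, small)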

 

 

 

// stream_t / streamwide_t are hls::stream<> typedefs for the 8-bit input and the
// wider output packets; xfMat2AXIstreamwide is the wide-output counterpart of the
// xfMat2AXIstream template shown above.
void stereolbm_accel(stream_t& stream_inL, stream_t& stream_inR, streamwide_t& stream_out, int height, int width) {
    #pragma HLS INTERFACE s_axilite port=height
    #pragma HLS INTERFACE s_axilite port=width
    #pragma HLS INTERFACE s_axilite port=return
    #pragma HLS INTERFACE axis port=stream_inL
    #pragma HLS INTERFACE axis port=stream_inR
    #pragma HLS INTERFACE axis port=stream_out

    xf::cv::Mat<IN_TYPE, HEIGHT, WIDTH, NPCC> imgInputL(height, width);
    xf::cv::Mat<IN_TYPE, HEIGHT, WIDTH, NPCC> imgInputR(height, width);
    xf::cv::Mat<OUT_TYPE, HEIGHT, WIDTH, NPCC> imgOutput(height, width);
    xf::cv::xFSBMState<SAD_WINDOW_SIZE, NO_OF_DISPARITIES, PARALLEL_UNITS> bmState;

    // Initialize SBM state:
    bmState.preFilterCap = 31;
    bmState.uniquenessRatio = 15;
    bmState.textureThreshold = 20;
    bmState.minDisparity = 0;

    // clang-format off
    #pragma HLS DATAFLOW
    // clang-format on

    // Retrieve xf::cv::Mat objects from the input streams:
    AXIstream2xfMat<IN_TYPE, HEIGHT, WIDTH, NPCC>(stream_inL, imgInputL);
    AXIstream2xfMat<IN_TYPE, HEIGHT, WIDTH, NPCC>(stream_inR, imgInputR);

    // Run the Vitis Vision stereo block-matching kernel:
    xf::cv::StereoBM<SAD_WINDOW_SIZE, NO_OF_DISPARITIES, PARALLEL_UNITS, IN_TYPE, OUT_TYPE, HEIGHT, WIDTH, NPCC, XF_USE_URAM>(imgInputL, imgInputR, imgOutput, bmState);

    // Pack the disparity map into the wide output stream:
    xfMat2AXIstreamwide<OUT_TYPE, HEIGHT, WIDTH, NPCC>(imgOutput, stream_out);
}

 


The output from CSIM and COSIM is the depth disparity map.

[Disparity map output images from CSIM and COSIM]

 

 

6. Testing on the device

 

To test on the device, a stereo camera mezzanine with camera sources is needed. I opted to simulate this instead, using two DMAs to write the stereo pair and one DMA to read the stereo output.

The inputs to the stereo core are 8-bit grayscale images; the output, however, is a 16-bit grayscale image.

PYNQ uses 8-bit and 32-bit DMA data types. Hence, in order to deal with the 16-bit output, one either has to take the convert-bit-depth IP and modify it into streaming mode, or convert the data width of the streaming output.

 

 

The Vivado hardware block design is shown below.

[Vivado hardware block design image]

 

The Python Jupyter notebook together with the IP cores is available at the GitHub repo linked below:

https://github.com/Q-point/StereoIPcores_MPSOC
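For orientation, driving the design from PYNQ looks roughly like the sketch below. This is a minimal sketch only; the bitstream name, DMA instance names and buffer shapes are assumptions, not taken from the actual notebook in the repo.

import numpy as np
import cv2
from pynq import Overlay, allocate

ol = Overlay("stereolbm.bit")           # hypothetical bitstream name
dma_left  = ol.axi_dma_0                # DMA feeding the left image
dma_right = ol.axi_dma_1                # DMA feeding the right image
dma_out   = ol.axi_dma_2                # DMA reading back the disparity stream

# 320x240 grayscale inputs prepared as in the previous section
left  = cv2.imread("small_cones_left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("small_cones_right.png", cv2.IMREAD_GRAYSCALE)

in_left  = allocate(shape=(240, 320), dtype=np.uint8)
in_right = allocate(shape=(240, 320), dtype=np.uint8)
out_disp = allocate(shape=(240, 320), dtype=np.uint32)  # width-converted output

in_left[:]  = left
in_right[:] = right

# The core's height/width AXI-Lite registers also need to be written and the core
# started; the exact register offsets depend on the generated HLS driver.

dma_out.recvchannel.transfer(out_disp)
dma_left.sendchannel.transfer(in_left)
dma_right.sendchannel.transfer(in_right)
dma_left.sendchannel.wait()
dma_right.sendchannel.wait()
dma_out.recvchannel.wait()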

 

 

The next step is to accelerate the RGB-to-grayscale and grayscale resize IPs in hardware following the same procedure as above, and to apply DPU image segmentation to the original color input, in order to implement a depth camera with AI capabilities.

ttaa 6 months ago

    Hello. Thank you for sharing your work; I’m implementing something similar, and it has been very helpful. However, when I tried running the notebook I found in the repository you shared, I couldn’t get past the wait() in the DMA transfer. Have you encountered the same issue?
    Thanks.
