element14 Community
Author: Jan Cumps
Date Created: 27 Nov 2021 11:29 AM
Views: 4108
Likes: 1
Comments: 3
Tags: vitis_hls, zynq, xilinx, fpga, pynq

PYNQ and Zynq: the Vitis HLS Accelerator with DMA training - Part 3: Use the Hardware Accelerated Code in Software

Jan Cumps
27 Nov 2021

I'm following the 3-part "using a HLS stream IP with DMA" training on the PYNQ community. This blog will not repeat the steps; the goal is to document the experience.

Use the Accelerated function in Software

In part 1, we made the hardware accelerated function example(stream &in, stream &out).
In part 2, we created a Vivado hardware design that includes the accelerator IP. This makes the function available to your programs.
In this part, that accelerated function is used in a Python program.

Refresher 1: What does the function do?

The example function accepts a stream of integers, adds the constant 5 to each value in that stream, and outputs the result. This was the accelerated IP made in Vitis HLS in training 1.

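// Excerpt: the body of example(). A is the input stream, B the output stream.
    ap_axis<32,2,5,6> tmp;
    while(1)
    {
        A.read(tmp);                       // take one element from the input stream
        tmp.data = tmp.data.to_int() + 5;  // add the constant 5 to the value
        B.write(tmp);                      // pass it on, side-channel signals included
        if(tmp.last)                       // last marks the final element of the stream
        {
            break;
        }
    }
}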
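
To check results away from the board, the same behaviour can be modelled in plain Python. This is a minimal sketch and not part of the training: `example_model` and `make_stream` are hypothetical names, each stream element is modelled as a `(data, last)` pair, and the sum is wrapped to 32 bits on the assumption that the data field is read back as an unsigned 32-bit value.

```python
from typing import Iterable, Iterator, List, Tuple

Beat = Tuple[int, bool]  # (data, last) - a simplified stand-in for ap_axis


def example_model(stream: Iterable[Beat]) -> Iterator[Beat]:
    """Software model of the HLS loop: add 5 to each element, stop after 'last'."""
    for data, last in stream:
        # Wrap to 32 bits (assumption: data is viewed as an unsigned 32-bit word).
        yield (data + 5) & 0xFFFFFFFF, last
        if last:
            break


def make_stream(values: Iterable[int]) -> List[Beat]:
    """Turn plain integers into (data, last) pairs; only the final one has last set."""
    values = list(values)
    return [(v, i == len(values) - 1) for i, v in enumerate(values)]


result = [data for data, _ in example_model(make_stream(range(100)))]
print(result[:10])  # [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
```

A model like this gives the expected values to compare against once the hardware version runs.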

Refresher 2: What does the resulting hardware design look like?

In training 2, the FPGA design was created to allow DMA data exchange between the function and the ARM part of the Zynq.

image

Next step: Load and activate the accelerated design into the FPGA

For the Zynq, the result looks the same as for other FPGA designs. It's a set of IPs that are synthesized, implemented and written to a bitfile.
We're using a PYNQ board, where the design is loaded into the hardware with the overlay functions.

image

For convenience, an alias is created for the DMA parts and the accelerated IP. These are the parts that we'll interact with from the Python code.
Then, the accelerated IP is enabled.

image
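
In code, these two steps could look roughly like the sketch below. It is not copied from the notebook in the screenshots. The instance names `axi_dma` and `example` depend on how the blocks are named in the Vivado design, and the register write assumes the HLS IP has an AXI4-Lite control interface with the usual Vitis HLS layout (control register at offset 0x00, bit 0 = ap_start, bit 7 = auto_restart). The helper names are hypothetical.

```python
CONTROL_REGISTER = 0x00   # assumed: offset of the HLS AXI4-Lite control register
AP_START = 0x01           # bit 0: start the IP
AUTO_RESTART = 0x80       # bit 7: restart automatically after each run


def attach_accelerator(overlay):
    """Create aliases for the DMA channels and the HLS IP, then enable the IP."""
    dma = overlay.axi_dma      # assumed instance name of the AXI DMA block
    hls_ip = overlay.example   # assumed instance name of the accelerator IP
    hls_ip.write(CONTROL_REGISTER, AP_START | AUTO_RESTART)
    return dma.sendchannel, dma.recvchannel, hls_ip


def load_accelerator(bitfile_path):
    """Load the bitfile as a PYNQ overlay. This only works on the board itself."""
    from pynq import Overlay   # imported here so the sketch can be read off-board
    return attach_accelerator(Overlay(bitfile_path))
```

On the board, a call such as `dma_send, dma_recv, hls_ip = load_accelerator("design.bit")` would be the entry point, where `design.bit` is a placeholder name. PYNQ expects the hardware handoff (.hwh) file from Vivado next to the bitfile, with the same base name.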

Run the Accelerated Function

Like any function you use, you need to declare the variables that hold input and result.
We're using a buffer of 100 unsigned integers here, for both input and output.

image

Initialise the input buffer with test values
We'll send 100 different values to the function as a test. Each position in the buffer holds the value of its index, e.g. element 14 in the buffer will have a value of 14.


image
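
The buffer declaration and initialisation described above might look like this. It is a sketch under assumptions: `pynq.allocate` is used because the DMA needs physically contiguous memory, `uint32` is chosen to match "unsigned integers", and the helper names are made up for illustration.

```python
DATA_SIZE = 100  # the post uses buffers of 100 unsigned integers


def allocate_buffers(data_size=DATA_SIZE):
    """Allocate DMA-capable input and output buffers. Runs on the PYNQ board only."""
    import numpy as np
    from pynq import allocate  # imported here so the sketch can be read off-board

    input_buffer = allocate(shape=(data_size,), dtype=np.uint32)
    output_buffer = allocate(shape=(data_size,), dtype=np.uint32)
    return input_buffer, output_buffer


def fill_with_index(buffer):
    """Give every position the value of its own index: buffer[14] becomes 14."""
    for i in range(len(buffer)):
        buffer[i] = i
    return buffer
```

On the board this would be used as `input_buffer, output_buffer = allocate_buffers()` followed by `fill_with_index(input_buffer)`. The fill helper works on any indexable sequence, so it can be tried on a plain list as well.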

We send the data to the IP by enabling the input DMA. The results are retrieved by enabling the output DMA.

image

That's it. We've now executed the hardware accelerated function one time. It returned the 100 processed elements. We're showing the first 10 for evaluation.
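
The transfer and the check of the result can be sketched as follows, using the `transfer()` and `wait()` methods of the PYNQ DMA channels. `run_once` and `check_result` are hypothetical helpers; `dma_send` and `dma_recv` stand for the aliases of the send and receive channels.

```python
def run_once(dma_send, dma_recv, input_buffer, output_buffer):
    """Stream input_buffer through the IP and collect the result in output_buffer."""
    dma_send.transfer(input_buffer)    # ARM memory -> accelerator
    dma_recv.transfer(output_buffer)   # accelerator -> ARM memory
    dma_send.wait()                    # block until the input has been sent
    dma_recv.wait()                    # block until the output has arrived
    return output_buffer


def check_result(input_buffer, output_buffer):
    """Return True if every output element equals its input element plus 5."""
    return all(int(o) == int(i) + 5 for i, o in zip(input_buffer, output_buffer))
```

After `run_once(...)`, printing `output_buffer[:10]` shows the first 10 processed elements, and `check_result(input_buffer, output_buffer)` confirms all 100 in one go.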

The example functionality (add 5 to a number) is intentionally kept simple. That keeps the focus on the techniques.
Actual speed gain is possible for complex transformations, such as image processing.
Example:
Resizing an image from 3840x2160 to 1920x1080 using the OpenCV resize() function implemented in FPGA on my Zynq runs 4 times faster (250 ms) than the same OpenCV resize() function running as software on the ARM (1 second).

image
image
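
To make such a comparison on your own design, a small timing helper is enough. This is a generic sketch, not the code used for the measurement above; `time_call` and `speedup` are hypothetical names.

```python
import time


def time_call(func, *args, repeats=5):
    """Return the best wall-clock time in seconds over several calls of func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def speedup(software_seconds, hardware_seconds):
    """How many times faster the hardware version is than the software version."""
    return software_seconds / hardware_seconds


# Figures quoted in the post: 1 second in software, 250 ms in the FPGA.
print(speedup(1.0, 0.25))  # 4.0
```

Taking the best of several runs reduces the influence of one-off delays such as caching or other processes on the ARM.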

What I learned by doing this tutorial is that the whole cycle has become more stable and integrated.
Vitis HLS and Vivado, version 2020.2, work well together. And PYNQ's examples with DMA now work reliably.

 

Pynq - Zynq - Vivado series
Add Pynq-Z2 board to Vivado
Learning Xilinx Zynq: port a Spartan 6 PWM example to Pynq
Learning Xilinx Zynq: use AXI with a VHDL example in Pynq
VHDL PWM generator with dead time: the design
Learning Xilinx Zynq: use AXI and MMIO with a VHDL example in Pynq
Learning Xilinx Zynq: port Rotary Decoder from Spartan 6 to Vivado and PYNQ
Learning Xilinx Zynq: FPGA based PWM generator with scroll wheel control
Learning Xilinx Zynq: use RAM design for Altera Cyclone on Vivado and PYNQ
Learning Xilinx Zynq: a Quadrature Oscillator - 2 implementations
Learning Xilinx Zynq: a Quadrature Oscillator - variable frequency
Learning Xilinx Zynq: Hardware Accelerated Software
Automate Repeatable Steps in Vivado
Learning Xilinx Zynq: Try to make my own Accelerated OpenCV Function - 1: Vitis HLS
Learning Xilinx Zynq: Try to make my own Accelerated OpenCV Function - 2: Vivado Block Design
Learning Xilinx Zynq: Logic Gates in Vivado
Learning Xilinx Zynq: Interrupt ARM from FPGA fabric
Learning Xilinx Zynq: reuse and combine components to build a multiplexer
PYNQ version 2.7 (Austin) is released
PYNQ and Zynq: the Vitis HLS Accelerator with DMA training - Part 1: Turn C++ code into an FPGA IP
PYNQ and Zynq: the Vitis HLS Accelerator with DMA training - Part 2: Add the Accelerated IP to a Vivado design
PYNQ and Zynq: the Vitis HLS Accelerator with DMA training - Part 3: Use the Hardware Accelerated Code in Software
PYNQ and Zynq: the Vitis HLS Accelerator with DMA training - Deep Dive: the data streams between Accelerator IP and ARM processors
Use the ZYNQ XADC with DMA part 1: bare metal
Use the ZYNQ XADC with DMA part 2: get and show samples in PYNQ
VHDL: Convert a Fixed Module into a Generic Module for Reuse

Top Comments

  • Jan Cumps
    Jan Cumps over 3 years ago in reply to Jan Cumps

    Progress with the XADC sampling:

    The Vivado design works, and I can retrieve 128 measures at a time.

    image

  • Jan Cumps
    Jan Cumps over 3 years ago in reply to Jan Cumps

    additional resource Using the Zynq-7000 XADC and signal pre-conditioning

  • Jan Cumps
    Jan Cumps over 3 years ago

    What this example doesn't show is that you don't have to use this as an accelerator for software.
    You can also use it directly in an FPGA flow.

    As an example, you could flow the data from the on-board ADC into your FPGA datastream, or to a MicroBlaze with DMA.

    Check this blog: https://www.hackster.io/adam-taylor/signal-processing-with-xadc-and-pynq-3c716c. 
    I'm going to try this one of these days.
