The New DSPFP32 Primitive in Versal FPGAs

fpgaguru
19 Dec 2023

The DSP primitive in the latest Versal FPGA family is called DSP58. It brings a number of improvements over the most recent DSP48 flavors, mainly a wider signed multiplier (27x24 instead of 27x18) and a wider post-adder (58 bits instead of 48). On top of that, the DSP58 has two more operating modes, called DSPCPLX and DSPFP32. The latter, a hardened floating point adder and multiplier, is the subject of this post.
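As a quick sanity check on those widths, here is a small sketch of the arithmetic. A signed 27x24 product needs 27+24 = 51 bits, so a 58-bit post-adder leaves 7 spare bits for accumulation, against 3 spare bits for a 27x18 product in a 48-bit post-adder. The helper name `guard_bits` is hypothetical, used only for illustration, and is not vendor terminology.

```python
def guard_bits(a_width, b_width, acc_width):
    """Spare accumulator bits above a full-width signed product."""
    product_width = a_width + b_width  # bits needed for a signed a x b product
    return acc_width - product_width


dsp48 = guard_bits(27, 18, 48)  # 27x18 multiplier, 48-bit post-adder
dsp58 = guard_bits(27, 24, 58)  # 27x24 multiplier, 58-bit post-adder
print(dsp48, dsp58)  # 3 7
```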

The DSPFP32 includes a single precision floating point adder and multiplier. They can be used either independently or combined as a multiply-accumulate operation.

The following diagram shows the internal architecture of the DSPFP32:

[Figure: internal architecture of the DSPFP32, with the FP32 adder and multiplier used independently]

The DSPFP32 is somewhat similar to the DSP58. Apart from using single precision floating point instead of fixed point, the real differences are that there are now two outputs, FPA and FPM, instead of just the post-adder P port, and that there is no pre-adder. This diagram shows the FP32 adder and multiplier used independently; the color highlighting indicates the minimum amount of pipelining required to achieve the maximum possible speed of 805MHz. You basically get a latency-2 FP32 adder and a latency-3 FP32 multiplier in every DSP58.

The signs of both adder input operands can optionally be inverted, and there is a wide selection of sources for these operands: ZERO, the C, D and PCIN inputs, and the FPA output itself, which can be used to build accumulators. The PCIN/PCOUT cascade chain lets you chain multiple DSPFP32 adders to build sums of more than two terms. If you connect the FPA output externally to the B input using fabric routing, you can compute something like FPM=A*(C+D) with a latency of 5 clocks (2 for the adder plus 3 for the multiplier).
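To make the latency arithmetic for FPM=A*(C+D) concrete, here is a minimal behavioral sketch in Python. It is not a model of the real primitive: the `Pipe` class, the `f32` rounding helper and the extra 2-clock delay on A (to line it up with the adder output) are all assumptions made for illustration. Only the adder and multiplier latencies of 2 and 3 clocks come from the text above.

```python
import struct
from collections import deque


def f32(x):
    """Round a Python float to the nearest IEEE 754 single precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


class Pipe:
    """Fixed-latency pipeline: a value clocked in comes out `latency` clocks later."""

    def __init__(self, latency):
        self.regs = deque([None] * latency, maxlen=latency)

    def clock(self, value):
        out = self.regs[0]       # oldest entry leaves the pipeline
        self.regs.append(value)  # newest entry enters, oldest is dropped
        return out


def a_times_c_plus_d(a, c, d, max_clocks=16):
    """Return (latency_in_clocks, result) for FPM = A*(C+D)."""
    adder = Pipe(2)    # FP32 adder latency quoted in the post
    mult = Pipe(3)     # FP32 multiplier latency quoted in the post
    a_delay = Pipe(2)  # assumption: A is delayed to line up with C+D

    for clk in range(max_clocks):
        first = clk == 0  # operands are presented on clock 0 only
        s = adder.clock(f32(f32(c) + f32(d)) if first else None)
        a_aligned = a_delay.clock(f32(a) if first else None)
        ready = s is not None and a_aligned is not None
        fpm = mult.clock(f32(a_aligned * s) if ready else None)
        if fpm is not None:
            return clk, fpm
    return None, None


print(a_times_c_plus_d(1.5, 2.0, 0.25))  # (5, 3.375)
```

The result appears 5 clocks after the operands are presented, matching the 2+3 latency described above.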

The second image shows the FP32 multiplier and adder connected internally as a MAC, so FPA=C+A*B or FPA=FPA+A*B can be computed with a latency of 4 clocks. The optional extra pipeline registers in the C and FPOPMODE input paths can be used to compensate for the extra latency of the multiplier path so that the entire MAC has a total latency of 4 clocks for all its data inputs.

[Figure: DSPFP32 with the FP32 multiplier and adder connected internally as a MAC]
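The two MAC forms, FPA=C+A*B for the first sample and FPA=FPA+A*B afterwards, can be sketched numerically as follows. This is an arithmetic illustration only. It assumes the product is rounded to single precision before the add; the post does not say whether that is how the hardware rounds. The function names `f32` and `mac` are hypothetical.

```python
import struct


def f32(x):
    """Round a Python float to the nearest IEEE 754 single precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mac(a_vals, b_vals, c=0.0):
    """Model FPA=C+A*B on the first sample, then FPA=FPA+A*B on the rest."""
    fpa = f32(c)
    for a, b in zip(a_vals, b_vals):
        product = f32(f32(a) * f32(b))  # assumption: product rounded to FP32
        fpa = f32(fpa + product)        # assumption: sum rounded to FP32
    return fpa


print(mac([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
print(mac([1.0], [1.0], c=1e8))               # 100000000.0
```

The second call is a reminder that this is single precision: near 1e8 the spacing between representable values is 8, so adding 1.0 does not change the accumulator.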

Although not shown in these diagrams, both FPA and FPM can be routed to the PCOUT port. By using the P cascade to borrow one multiplier from a neighboring DSP, you can also compute FPA=C+A1*B1+A2*B2 with a latency of four clocks. A full complex multiplier plus a complex adder can therefore be built with 4 DSPFP32s and no fabric resources.
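Here is a minimal sketch of how the four products of a complex multiply-add map onto two such cascaded pairs, one for the real part and one for the imaginary part. The subtraction in the real part relies on the optional sign inversion of the adder operands mentioned earlier. The function names and the order of rounding are assumptions for illustration; the internal rounding behavior of the hardware is not stated in this post.

```python
import struct


def f32(x):
    """Round a Python float to the nearest IEEE 754 single precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_of_two_products(c, a1, b1, a2, b2, subtract_second=False):
    """Model FPA = C + A1*B1 +/- A2*B2 for one pair of cascaded DSPFP32s."""
    p1 = f32(f32(a1) * f32(b1))
    p2 = f32(f32(a2) * f32(b2))
    if subtract_second:
        p2 = -p2  # sign inversion of an adder operand
    return f32(f32(f32(c) + p1) + p2)  # assumption: rounding after each add


def complex_mac(a, b, c):
    """Return c + a*b for (re, im) tuples, using two pairs, i.e. 4 DSPFP32s."""
    re = sum_of_two_products(c[0], a[0], b[0], a[1], b[1], subtract_second=True)
    im = sum_of_two_products(c[1], a[0], b[1], a[1], b[0])
    return re, im


print(complex_mac((1.0, 2.0), (3.0, 4.0), (0.5, -0.5)))  # (-4.5, 9.5)
```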

Floating point designs were always possible in earlier FPGA families, and Xilinx has provided fabric-based soft floating point IP for years. The hardened DSPFP32 now offers that option using a single DSP58 primitive and virtually no fabric resources, with much lower latency (3-4 clocks instead of 8-11), lower power consumption and clock speeds up to 805MHz in the fastest two speed grades.

In the third and final post of this short series on the new DSPFP32 I will discuss how this primitive can be instantiated and used efficiently in HDL designs.

Back to the top: The Art of FPGA Design

  • flyingbean
    flyingbean over 1 year ago

    The Xilinx DSP primitive is an essential building block for adaptive computing. I am now starting to read your season 1 and 2 blogs on using Xilinx DSP primitives for AI/HLS design.

  • michaelkellett
    michaelkellett over 1 year ago in reply to fpgaguru

    I can't find the VE2002 listed by a distributor, but the V2102 is (by Mouser) at £336 but no stock.

    It's the second smallest in the AI Edge series (I think).

    It has quite a lot of FP blocks (176).

    But I think we have rather different perspectives on "low cost edge" applications - my current design project has a <£10 FPGA (with 0 FP blocks 🙂)

    MK

  • fpgaguru
    fpgaguru over 1 year ago in reply to michaelkellett

    I do not work in sales so I am definitely not qualified to discuss prices, but historically speaking, all new FPGA families start with very high prices and as they become mainstream the prices become more competitive.

    Another point worth making: there are multiple sub-families within Versal, ranging from very large parts in the Prime, Premium and HBM series to smaller, lower cost parts in the AI RF, AI Core and AI Edge series. In particular, the AI Edge series targets low cost edge applications and contains really small devices like the VE2002, which I expect would be much more accessible.

    The main idea is that, starting with Versal, all FPGAs in all sub-families now have single precision floating point capabilities hardened in every DSP58.

  • javagoza
    javagoza over 1 year ago in reply to michaelkellett

    I go through life as a naïve person. In 2021 I participated in the Adaptive Computing Challenge 2021 - Hackster.io. I did not get a Versal, but I did get a Xilinx Kria KV260 Vision AI Starter Kit. There was the possibility of getting a VCK5000 Versal Development Card (xilinx.com), and I could have competed for one in a field that I have mastered, AI applications for fraud control in financial transactions, since I work for a bank. But I prefer to do more fun things in my free time, so I applied for the Kria.

  • michaelkellett
    michaelkellett over 1 year ago in reply to javagoza

    Wish you luck. The cheapest available part on Digikey is £6600 for one. It has 1596 pins, so it's going to need quite a PCB to work.

    Out of my range (by a long way).

    MK
