The New DSPFP32 Primitive in Versal FPGAs

fpgaguru
19 Dec 2023

The DSP primitive in the latest Versal FPGA family is called DSP58. It already brings a number of improvements over the latest DSP48 flavors, mainly a wider signed multiplier (27x24 instead of 27x18) and a wider post-adder (58 bits instead of 48). On top of that, the DSP58 has two more operating modes, DSPCPLX and DSPFP32. The latter, a hardened floating-point adder and multiplier, is the subject of this post.

The DSPFP32 includes a single precision floating point adder and multiplier. They can be used either independently or combined as a multiply-accumulate operation.

The following diagram shows the internal architecture of the DSPFP32:

[Figure: DSPFP32 internal architecture, with the FP32 adder and multiplier used independently]

The DSPFP32 is somewhat similar to the DSP58. Apart from using single-precision floating point instead of fixed point, the real differences are that there are now two outputs, FPA and FPM, instead of just the post-adder P port, and that there is no pre-adder. This diagram shows the FP32 adder and multiplier used independently; the color highlighting indicates the minimum amount of pipelining required to achieve the maximum possible speed of 805 MHz. You basically get a latency-2 FP32 adder and a latency-3 FP32 multiplier in every DSP58. The signs of both adder input operands can optionally be inverted, and there is a wide selection of sources for these operands: ZERO, the C, D and PCIN inputs, and the FPA output itself, which can be used to build accumulators. The PCIN/PCOUT cascade chain lets you cascade multiple DSPFP32 adders to build sums of more than two terms. If you connect the FPA output externally to the B input using fabric routing, you can compute something like FPM=A*(C+D) with a latency of 5 clocks (2 for the adder plus 3 for the multiplier).
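To make the latency arithmetic concrete, here is a small behavioral sketch in Python. It is not HDL and not a model of the real primitive: the `Pipe` class, the `run_a_times_c_plus_d` function and the choice to delay A by two cycles so that it lines up with the adder result are all assumptions made for illustration. Rounding is modeled as round-to-nearest single precision through the `struct` module.

```python
import struct
from collections import deque


def f32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


class Pipe:
    """Fixed-latency delay line: a value pushed in cycle n comes out in cycle n + latency."""

    def __init__(self, latency: int):
        self.regs = deque([None] * latency, maxlen=latency)

    def step(self, value):
        out = self.regs[0]       # oldest entry leaves the pipeline
        self.regs.append(value)  # newest entry enters, maxlen drops the oldest
        return out


def run_a_times_c_plus_d(samples, adder_latency=2, mult_latency=3):
    """Stream (a, c, d) triples and return (cycle, a * (c + d)) pairs.

    Hypothetical wiring: the adder result feeds the multiplier, and A is
    delayed by the adder latency so both multiplier operands are aligned.
    """
    add_pipe = Pipe(adder_latency)
    a_delay = Pipe(adder_latency)
    mul_pipe = Pipe(mult_latency)
    results = []
    # Append idle cycles so the last sample can drain out of the pipeline.
    stream = list(samples) + [None] * (adder_latency + mult_latency)
    for cycle, item in enumerate(stream):
        if item is None:
            s = add_pipe.step(None)
            a_d = a_delay.step(None)
        else:
            a, c, d = item
            s = add_pipe.step(f32(f32(c) + f32(d)))
            a_d = a_delay.step(f32(a))
        prod = mul_pipe.step(None if s is None else f32(a_d * s))
        if prod is not None:
            results.append((cycle, prod))
    return results


# Samples enter in cycles 0 and 1; results appear 5 cycles later.
print(run_a_times_c_plus_d([(2.0, 1.0, 0.5), (3.0, 4.0, 1.0)]))
# -> [(5, 3.0), (6, 15.0)]
```

The first sample enters in cycle 0 and its result appears in cycle 5, matching the 2 + 3 clock latency quoted above. A new sample can still be accepted every clock.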

The second image shows the FP32 multiplier and adder connected internally as a MAC, so that FPA=C+A*B or FPA=FPA+A*B can be computed with a latency of 4 clocks. The optional extra pipeline registers in the C and FPOPMODE input paths compensate for the extra latency of the multiplier path, so that the entire MAC has a total latency of 4 clocks for all its data inputs.

[Figure: DSPFP32 with the FP32 multiplier and adder connected internally as a MAC]
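The arithmetic of the two MAC forms can be sketched in a few lines of Python. This models values only, not timing. The function name `mac_accumulate` is hypothetical, and the sketch assumes that the product is rounded to single precision before it reaches the adder (a non-fused multiply-add), which should be checked against the device documentation.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mac_accumulate(a_values, b_values, c=0.0):
    """Value-level model of a single-precision multiply-accumulate.

    The first product is added to C (FPA = C + A*B); every later product
    is added to the running result (FPA = FPA + A*B).
    Assumption: the product is rounded to float32 before the addition.
    """
    fpa = f32(c)
    for a, b in zip(a_values, b_values):
        product = f32(f32(a) * f32(b))  # multiplier output (FPM)
        fpa = f32(fpa + product)        # adder output (FPA)
    return fpa


# Dot product of two short vectors, with and without an initial C term.
print(mac_accumulate([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))         # -> 32.0
print(mac_accumulate([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], c=0.5))  # -> 32.5
```

With inputs that are not exactly representable in single precision, the result will differ slightly from a double-precision computation, which is the expected behavior of FP32 arithmetic.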

Although not shown in these diagrams, both FPA and FPM can be routed to the PCOUT port. By using the P cascade to borrow the multiplier of a neighboring DSP, you can also compute FPA=C+A1*B1+A2*B2 with four clocks of latency. A full complex multiplier plus a complex adder can therefore be built with 4 DSPFP32s and no other fabric resources.
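The count of four DSPFP32s follows from writing the complex multiply-add as two sums of two products, one for the real part and one for the imaginary part. The Python sketch below works through that mapping at the value level. The function names and the order in which the three terms are summed are assumptions for illustration; the subtraction in the real part is expressed by negating one operand, standing in for the optional sign inversion on the adder inputs.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_of_two_products(c, a1, b1, a2, b2):
    """Value-level model of FPA = C + A1*B1 + A2*B2 (two multipliers).

    Assumptions: each product is rounded to float32, and the terms are
    added in the order (C + A1*B1) + A2*B2.
    """
    p1 = f32(f32(a1) * f32(b1))
    p2 = f32(f32(a2) * f32(b2))
    return f32(f32(f32(c) + p1) + p2)


def complex_mul_add(x: complex, y: complex, z: complex) -> complex:
    """Compute x*y + z with four real multiplies (two per output part)."""
    # Real part: z.re + x.re*y.re - x.im*y.im
    re = sum_of_two_products(z.real, x.real, y.real, -x.imag, y.imag)
    # Imaginary part: z.im + x.re*y.im + x.im*y.re
    im = sum_of_two_products(z.imag, x.real, y.imag, x.imag, y.real)
    return complex(re, im)


# (1+2j)*(3+4j) = -5+10j, plus (0.5-1j) gives -4.5+9j
print(complex_mul_add(complex(1, 2), complex(3, 4), complex(0.5, -1)))
# -> (-4.5+9j)
```

Each call to `sum_of_two_products` stands for a pair of DSPFP32s, so the two calls together account for the four primitives mentioned above.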

Floating-point designs were always possible in earlier FPGA families; Xilinx has provided fabric-based soft floating-point IP for years. The hardened DSPFP32 now offers the same capability using a single DSP58 primitive and virtually no fabric resources, with much lower latency (3-4 clocks instead of 8-11), lower power consumption, and clock speeds of up to 805 MHz in the two fastest speed grades.

In the third and final post of this short series on the new DSPFP32 I will discuss how this primitive can be instantiated and used efficiently in HDL designs.

Back to the top: The Art of FPGA Design

Comment from flyingbean (over 1 year ago):

The Xilinx DSP primitive is an essential building block for adaptive computing. I am now starting to read your season 1 and 2 blog posts on using Xilinx DSP primitives for AI/HLS design.
