The Art of FPGA Design - Post 28

fpgaguru
27 May 2019

The DSP48 Primitive - Inferring larger multipliers

 

The DSP48E2 primitive contains a signed 27x18 multiplier, so any signed multiplier up to this size can be implemented with just one such primitive. Larger multipliers can be built from multiple DSP48s.

 

Larger multipliers are built using a feature of the DSP48 primitive: the 48-bit dedicated P cascade output of one DSP48 can be right shifted by 17 bits before being added to the partial product calculated in the next DSP48. Using this technique, a 27x35 multiplier can be decomposed into a 27x(18+17) one, where a 27x17 partial product is computed with one DSP48 and the result is right shifted by 17 bits before being added to the second 27x18 partial product.
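The arithmetic behind this decomposition can be checked in simulation. The following is a minimal sketch, not code from this series: it uses plain NUMERIC_STD SIGNED instead of the SFIXED type, and the names MULT_SPLIT_TB, MULT_SPLIT and CHECK are hypothetical. It assumes the low 17 bits of the 35-bit operand are zero extended to 18 bits so they are treated as unsigned, while the high 18 bits keep the sign. It models only the arithmetic, not the DSP48 pipeline registers.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

entity MULT_SPLIT_TB is
end MULT_SPLIT_TB;

architecture TEST of MULT_SPLIT_TB is
  -- hypothetical helper: bit-accurate model of the 35x27 = 27x(18+17) split
  function MULT_SPLIT(A:SIGNED(34 downto 0);B:SIGNED(26 downto 0)) return SIGNED is
    variable AL:SIGNED(17 downto 0);  -- low 17 bits of A, zero extended (unsigned value)
    variable AH:SIGNED(17 downto 0);  -- high 18 bits of A, carries the sign
    variable PLO,PHI,SUM:SIGNED(44 downto 0);
  begin
    AL:='0'&A(16 downto 0);
    AH:=A(34 downto 17);
    PLO:=B*AL;                       -- first partial product, 27x17
    PHI:=B*AH;                       -- second partial product, 27x18
    SUM:=PHI+SHIFT_RIGHT(PLO,17);    -- arithmetic right shift by 17, then add
    return SUM&PLO(16 downto 0);     -- the 17 LSBs come straight from the first product
  end function;
begin
  process
    constant AMIN:SIGNED(34 downto 0):=(34=>'1',others=>'0');
    constant AMAX:SIGNED(34 downto 0):=(34=>'0',others=>'1');
    constant BMIN:SIGNED(26 downto 0):=(26=>'1',others=>'0');
    constant BMAX:SIGNED(26 downto 0):=(26=>'0',others=>'1');
    -- compare the split result against the direct 62-bit product
    procedure CHECK(A:SIGNED(34 downto 0);B:SIGNED(26 downto 0)) is
    begin
      assert MULT_SPLIT(A,B)=A*B report "decomposition mismatch" severity FAILURE;
    end procedure;
  begin
    CHECK(AMIN,BMIN);
    CHECK(AMIN,BMAX);
    CHECK(AMAX,BMIN);
    CHECK(AMAX,BMAX);
    CHECK(TO_SIGNED(-1,35),TO_SIGNED(1,27));
    CHECK(TO_SIGNED(123456789,35),TO_SIGNED(-54321,27));
    report "all decomposition checks passed";
    wait;
  end process;
end TEST;
```

The checks cover the four corner cases of the operand ranges plus two ordinary values. Any VHDL simulator should be able to run this testbench as is, since it has no ports and depends only on the IEEE packages.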

 

Fortunately, Vivado Synthesis is able to infer such a multi-DSP48 multiplier, including the proper pipelining, from behavioral HDL code, so there is no need to use primitive instantiations. The following code example shows a generic behavioral multiplier, where the two operands and the product can be arbitrary precision fixed point numbers of any size. For multipliers up to 27x18 a single DSP48 will be inferred, while multipliers up to 35x27 use two DSP48s. The pipelining of the design is controlled with the LATENCY generic. For full speed performance (for example 891MHz in an UltraScale+ FPGA, fastest speed grade -3), LATENCY should be set to 3 for single DSP48 multipliers and 4 for the higher precision, two DSP48 ones:

 

library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

use work.TYPES_PKG.all;

entity GENMULT is
  generic(LATENCY:INTEGER:=4); -- should be 3 for one DSP48 (up to 27x18) and 4 for two DSP48s (up to 35x27)
  port(CLK:in STD_LOGIC:='0';
       A:in SFIXED(34 downto 0);
       B:in SFIXED(26 downto 0);
       P:out SFIXED(61 downto 0)); -- P can be any size, the result will be resized
end GENMULT;

architecture FAST of GENMULT is
  signal RA:SFIXED(A'range):=TO_SFIXED(0.0,A);
  signal RB:SFIXED(B'range):=TO_SFIXED(0.0,B);
  signal PL:SFIXED_VECTOR(2 to LATENCY)(P'range):=(others=>TO_SFIXED(0.0,P));
begin
  process(CLK)
  begin
    if rising_edge(CLK)then
      --first pipeline level
      RA<=A;
      RB<=B;
      --second pipeline level
      PL(2)<=RESIZE(RA*RB,P);
      --the rest of pipeline levels
      for K in 3 to LATENCY loop
        PL(K)<=PL(K-1);
      end loop;
    end if;
  end process;
  P<=PL(LATENCY);
end FAST;

 

The port sizes are set for the largest multiplier that can be built with two DSP48s, but they could also be left unconstrained, and then any arbitrary fixed point multiplier up to 35x27 with any LATENCY greater than or equal to 2 could be implemented with this single design. It is worth pointing out how the proper pipelining is achieved here. Rather than explicitly describing the DSP48 internal pipeline registers, which gets really tricky when you have two cascaded ones as is the case here, we describe the single multiplication behaviorally as a combinatorial function between two register levels and add one or more pipeline registers at the output. If the synthesis retiming option is turned on, the synthesis tool will take care of both decomposing the single multiplication into two DSP48s and pushing the output registers into them to achieve optimal pipelining.
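As a rough illustration of the unconstrained-port variant, here is a sketch that mirrors the GENMULT structure but uses NUMERIC_STD SIGNED in place of SFIXED, because TYPES_PKG is not shown in this post. The names GENMULT_U, SIGNED_VECTOR and GENMULT_U_TB are hypothetical. Note that NUMERIC_STD RESIZE works on integers and does not align a binary point, so it is not equivalent to the fixed point RESIZE used above. The array of unconstrained elements needs the file type set to VHDL-2008. The small testbench only checks the behavior in simulation for a 27x18 case with LATENCY=3. Whether synthesis maps this variant the same way as GENMULT has not been verified here.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

entity GENMULT_U is
  generic(LATENCY:INTEGER:=4); -- any value greater than or equal to 2
  port(CLK:in STD_LOGIC;
       A:in SIGNED;    -- unconstrained, sized by the connected signal
       B:in SIGNED;
       P:out SIGNED);
end GENMULT_U;

architecture FAST of GENMULT_U is
  type SIGNED_VECTOR is array(INTEGER range <>) of SIGNED; -- VHDL-2008 only
  signal RA:SIGNED(A'range):=(others=>'0');
  signal RB:SIGNED(B'range):=(others=>'0');
  signal PL:SIGNED_VECTOR(2 to LATENCY)(P'range):=(others=>(others=>'0'));
begin
  process(CLK)
  begin
    if rising_edge(CLK) then
      RA<=A;                           -- first pipeline level
      RB<=B;
      PL(2)<=RESIZE(RA*RB,P'length);   -- second pipeline level
      for K in 3 to LATENCY loop       -- the rest of the pipeline levels
        PL(K)<=PL(K-1);
      end loop;
    end if;
  end process;
  P<=PL(LATENCY);
end FAST;

library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

entity GENMULT_U_TB is
end GENMULT_U_TB;

architecture TEST of GENMULT_U_TB is
  signal CLK:STD_LOGIC:='0';
  signal A:SIGNED(26 downto 0):=TO_SIGNED(-1000,27);
  signal B:SIGNED(17 downto 0):=TO_SIGNED(321,18);
  signal P:SIGNED(44 downto 0);
begin
  CLK<=not CLK after 5 ns;

  -- 27x18 case: the port sizes come from the connected signals
  u1:entity work.GENMULT_U
    generic map(LATENCY=>3)
    port map(CLK=>CLK,A=>A,B=>B,P=>P);

  process
  begin
    for K in 1 to 3 loop               -- wait for LATENCY rising edges
      wait until rising_edge(CLK);
    end loop;
    wait for 1 ns;
    assert P=TO_SIGNED(-321000,45) report "unexpected product" severity FAILURE;
    report "product available after 3 clocks";
    wait;
  end process;
end TEST;
```

The same entity can be instantiated with 35-bit and 27-bit operands and LATENCY=>4 without touching the source, which is the point of leaving the ports unconstrained.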

 

The implementation result is very good: two DSP48s and 17 FFs. The FFs are needed because the 17 LSBs of the result arrive one clock earlier from the first DSP48 and must be delayed by one clock in the fabric; no other fabric resources are needed. The highlighted bus between the two DSP48s connects the PCOUT output port of the first to the PCIN input port of the second, and this is where the partial sum is also right shifted by 17 bits:

[image: implemented design, with the cascade bus between the two DSP48s highlighted]

In the next post we will look at efficient implementations of complex multipliers.

 

Back to the top: The Art of FPGA Design


Top Comments

  • Jan Cumps
    Jan Cumps over 4 years ago in reply to Jan Cumps

    I was able to build it.

    For Vivado users: you can't add a VHDL-2008 source to a block design. I created a wrapper around it in a VHDL-93 source, and that can be added.

    Discuss if I did it wrong ...

     

    library IEEE;
    use IEEE.STD_LOGIC_1164.all;
    use IEEE.NUMERIC_STD.all;
    use work.TYPES_PKG.all;
    
    entity genmult_wrapper is
      port(CLK:in STD_LOGIC:='0');
    end genmult_wrapper;
    
    architecture Behavioral of genmult_wrapper is
    
    component genmult is
      generic(LATENCY:INTEGER:=4); -- should be 3 for one DSP48 (up to 27x18) and 4 for two DSP48s (up to 35x27)
      port(CLK:in STD_LOGIC:='0';
           A:in SFIXED(34 downto 0);
           B:in SFIXED(26 downto 0);
           P:out SFIXED(61 downto 0));
    end component;
    
    signal A: SFIXED(34 downto 0);
    signal B: SFIXED(26 downto 0);
    signal P: SFIXED(61 downto 0); -- P can be any size, the result will be resized
    
    begin
      g1: GENMULT
        generic map (LATENCY => 4)  
        port map (
          A => A,
          B => B,
          P => P      
        );
    
    end Behavioral;

  • Jan Cumps
    Jan Cumps over 4 years ago in reply to fpgaguru

    Found it. [image]

    Do you happen to have the code for the function "*"(X,Y:SFIXED)? That's the last missing piece I need to replicate the exercise.

  • fpgaguru
    fpgaguru over 4 years ago in reply to Jan Cumps

    I am pretty sure nothing has changed in recent versions of Vivado. The option to set the file type is still there and the default is still VHDL. In the Vivado GUI, when you select an HDL file source used in the project, in the properties pane you can set its type to VHDL, VHDL-2008, Verilog, SystemVerilog, etc.

  • Jan Cumps
    Jan Cumps over 4 years ago in reply to fpgaguru

    Catalin, I looked at that. In older versions of Vivado there's an option to set 2008 when selecting VHDL.

    In the 2020 version that option is gone. I read on the Xilinx forum that it's always 2008 in recent Vivado versions.

    I'm going to double-check...

  • fpgaguru
    fpgaguru over 4 years ago in reply to Jan Cumps

    Hi Jan,

     

    Please make sure you set the source file types to VHDL-2008. By default Vivado uses file type VHDL, which is VHDL-93. Reading an out port is a syntax error in VHDL-93 but it is legal in VHDL-2008.

     

    Catalin
