Fast VHDL CORDIC Sine and Cosine Component on Lattice XP2 Device Using Diamond 3.12 Part 2

jc2048
16 Feb 2026

Introduction

This is a follow-up to 'Fast VHDL CORDIC Sine and Cosine Component on Lattice XP2 Device Using Diamond 3.12'.

I thought I had actually used the CORDIC component for real, but I was getting confused with other blogs, so now I'm going to try out the CORDIC component that I developed there with a physical FPGA board.

I'm going to utilise the Lattice XP2 Brevia 2 board, along with a Pimoroni audio CODEC, that I previously used in this blog:

Quadrature Sinewave Generator on Lattice XP2 using Brevia 2 Development Board

replacing the Taylor series calculation with the CORDIC one. There's nothing special about the XP2 - the CORDIC component should run on any FPGA that has enough logic elements for the resolution chosen - it's just convenient because it saves me building anything fresh for the moment.

A small complication (for me, not you) is that 2026 is going to be my 'year of Linux on the workbench', so I've moved all this to Xubuntu 22.04 LTS instead of the Win 8.1 laptop that I was previously using. I needed to do something because neither Radiant (Lattice) nor Libero (Microchip), both of which I want to start using, will run on old Windows systems. Curiously, I found Lattice Diamond the most straightforward one to get going on Linux, the others needing a bit of manual assistance to sort out missing libraries. Doing Diamond first also seemed to help with getting the licensing in place for Radiant. Something that I initially struggled with was the USB board-programming side of things, but there are scripts to put the udev rules in place that you run for yourself after the software installs, and then it becomes plug-and-play, just like Windows. (Tip: use the script as it does more than just put the rules file in place, and also do a full reboot of the system afterwards to bring everything into effect.) I now have the machine working happily with the Brevia 2 board, programmed from within Diamond, and a Lattice iCE40UP5K EVB, programmed from within Radiant. I was going to install iCEcube2 as well, but the Linux version is too old to be 64-bit and there are a lot of old libraries that would need to be added, including some not in the standard repositories, so I haven't for the moment.

What I'm going to do

I'm not going to do anything too complicated to start with, just compute full-amplitude, 16-bit, fixed-frequency sine and cosine with the CORDIC component and feed them as left and right samples to an I2S component that outputs to the DAC on the Pimoroni board. That fits comfortably in the 5k logic elements of the XP2 part. I've chosen 440Hz for the frequency and this time I'm working at 48ksps - no particular reason, other than to try something a little different.

Here's the VHDL. Two component files, and a top-level one to tie it all together. On Lattice, you may need to use Synplify for the synthesis, rather than LSE.

------------------------------------------------------------------
--              ***** cordic_top.vhd *****                      --
-- Physical test of fast CORDIC sine and cosine component using --
-- Brevia 2 (XP2) evaluation board connected to I2S codec.      --
-- Nothing too complicated: just the CORDIC component, working  --
-- 16-bit, generating 440Hz sine and cosines, and output over   --
-- an i2s interface running at 48ksps.                          --
------------------------------------------------------------------
-- JC 13th January 2026                                         --
------------------------------------------------------------------
-- Rev    Date         Comments                                 --
-- 01     13-Jan-2026                                           --
------------------------------------------------------------------

library ieee; 
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

--- top level port that connects to the device pins

entity cordic_top is port(
        --- clocks
		clk_in:			in std_logic;								--- system clock in (50 MHz oscillator)
		clk_12_288:	    in std_logic;								--- clock from my 12.288MHz crystal oscillator
		--- I2S connections - these are the connections out to the Pimoroni Pico Audio board
		mute_n:			out std_logic;								--- mute
		i2s_data:		out std_logic;								--- i2s data
		i2s_bck:		out std_logic;								--- i2s bit clock
		i2s_lrck:		inout std_logic;							--- i2s left/right word 'clock'
		--- misc control signals on Brevia 2 evaluation board that it might be good to hold at fixed levels
		spi_csn:		out std_logic;								--- 
		holdn:			out std_logic;								--- 
		sram_cen:		out std_logic;								--- 
		sram_oen:		out std_logic;								--- 
		sram_wen:		out std_logic;								--- 
		uart_tx:		out std_logic);							--- 
	
end cordic_top;

architecture arch_cordic_top of cordic_top is

constant sig_resol: POSITIVE := 16;             --- signal resolution (bits)
constant pha_resol: POSITIVE := 32;             --- phase resolution (bits)
signal theta: SIGNED(pha_resol-1 downto 0) := X"00000000"; 
signal phase_increment: SIGNED(pha_resol-1 downto 0); 
signal sine: SIGNED(sig_resol-1 downto 0); 
signal cosine: SIGNED(sig_resol-1 downto 0); 
signal delay_i: STD_LOGIC := '0'; 
signal delay_o: STD_LOGIC; 
signal i2s_load,i2s_load_del: STD_LOGIC; 

--- declare the cordic component

component cordic is
    generic(
    	input_resol: POSITIVE;          --- input resolution
        output_resol: POSITIVE);        --- output resolution
    port(
        clk_in: in STD_LOGIC;           --- clock in
        delay_in: in STD_LOGIC;         --- delay in
        delay_out: out STD_LOGIC;       --- delay out
        theta: in SIGNED(input_resol-1 downto 0);     --- phase in (width set by generic)
        sine: out SIGNED(output_resol-1 downto 0);    --- sine out (width set by generic)
        cosine: out SIGNED(output_resol-1 downto 0)); --- cosine out (width set by generic)
end component;

--- declare the I2S component

component i2s is
    generic(
        input_resol: POSITIVE);                       --- input resolution
    port(
        clk_in: in STD_LOGIC;                         --- clock in
        i2s_ldata_i: in SIGNED(input_resol-1 downto 0); --- left data in (width set by generic)
        i2s_rdata_i: in SIGNED(input_resol-1 downto 0); --- right data in (width set by generic)
        i2s_bck_o: out STD_LOGIC;                     --- bit clock out
        i2s_load_o: out STD_LOGIC;                          --- load out
        i2s_lrck_o: out STD_LOGIC;                    --- left/right out
        i2s_data_o: out STD_LOGIC);                   --- data out
end component;

begin

    --- instance of cordic component

    cordic_1: component cordic
        generic map(
            input_resol => pha_resol,  --- input resolution
            output_resol => sig_resol) --- output resolution
        port map(
            clk_in => clk_12_288,      --- clock in
            delay_in => delay_i,       --- delay in
            delay_out => delay_o,      --- delay out
            theta => theta,            --- phase in
            sine => sine,              --- sine out
            cosine => cosine);         --- cosine out

    --- instance of i2s component

    i2s_1: component i2s
        generic map(
            input_resol => sig_resol)  --- input resolution
        port map(
            clk_in => clk_12_288,      --- clock in
            i2s_ldata_i => sine,       --- left data in
            i2s_rdata_i => cosine,     --- right data in
            i2s_bck_o => i2s_bck,      --- bit clock out
            i2s_load_o => i2s_load,    --- s/r load
            i2s_lrck_o => i2s_lrck,    --- left/right out
            i2s_data_o => i2s_data);   --- data out


	fpga_sinewave_stuff: process (clk_12_288)
		begin

			if (clk_12_288'event and clk_12_288 = '1') then
			
			    i2s_load_del <= i2s_load;

				--- phase accumulator update (immediately after load of right data)

				if ((i2s_load = '0' and i2s_load_del = '1') and i2s_lrck = '1') then
					theta <= theta + phase_increment;
				end if;

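            end if;

	    end process fpga_sinewave_stuff;

    --- for now, I'm just going to have a constant increment for the phase
    --- at 48ksps, this should result in a 440Hz sine and cosine
    --- (440/48000 * 2^32, rounded, is 39370534 = X"0258BF26")
    --- done as a concurrent assignment because it doesn't depend on the clock

    phase_increment(31 downto 0) <= b"0000_0010_0101_1000_1011_1111_0010_0110";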

	--- Hold these device control pins at a fixed level to stop them flapping around

	spi_csn <= '1';
	holdn <= '1';
	sram_cen <= '1';
	sram_oen <= '1';
	sram_wen <= '1';
	uart_tx <= '1';
	mute_n <= '1';

end arch_cordic_top;

-------------------------------------------------------------------------------
-- cordic.vhd                                                                --
--                                                                           --
-- VHDL component to implement a fast, pipelined CORDIC sine                 --
-- and cosine calculation.                                                   --
--                                                                           --
-- Two generics specify the desired resolutions for input and output.        --
-- A delay chain sits alongside the CORDIC pipeline to relate the output to  --
-- the input.                                                                --
--                                                                           --
-- Developed for XP2 using LSE in Diamond 3.14, but fairly                   --
-- standard VHDL and no Lattice IP components so should work                 --
-- with any FPGA.                                                            --
--                                                                           --
-- Number of CORDIC stages is one more than the output resolution.           --
-- Internal data width is (output resolution * 1.25) + 3 bits.               --
--                                                                           --
-- More information at project page:                                         --
-- https://community.element14.com/technologies/fpga-group/b/blog/posts/fast-vhdl-cordic-sine-and-cosine-component-on-lattice-xp2-device-using-diamond-3-12                                                              --
-------------------------------------------------------------------------------
-- (c)2023 Jon Clift 7th April 2023                                          --
-- Free to use however you want. No warranty as to correctness.              --
-- No guarantee of fitness for any purpose. No obligation to support.        --
-------------------------------------------------------------------------------
-- Rev Date         Comments                                                 --
-- 01  31-Mar-2023  internally overflows with sin or cos close to 1          --
-- 02  07-Apr-2023  added extra bit of headroom                              --
-- 03  05-Nov-2023  added another bit to internal resolution                 --
-------------------------------------------------------------------------------

library ieee; 
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.math_real.all;

entity cordic is
    generic(
        input_resol: POSITIVE;                     --- input resolution
        output_resol: POSITIVE);                   --- output resolution
    port(
        clk_in: in STD_LOGIC;                      --- clock in
        delay_in: in STD_LOGIC;                    --- delay in
        delay_out: out STD_LOGIC;                  --- delay out
        theta: in SIGNED(input_resol-1 downto 0);       --- phase in
        sine: out SIGNED(output_resol-1 downto 0);       --- sine out
        cosine: out SIGNED(output_resol-1 downto 0));    --- cosine out
end entity cordic;

architecture arch_cordic of cordic is

--- declare the addsub component

component addsub is
    generic(
        resol: POSITIVE);                --- resolution (bits)
    port(
        clk_in: in STD_LOGIC;                 --- clock in
        a: in SIGNED(resol-1 downto 0);      --- a in
        b: in SIGNED(resol-1 downto 0);      --- b in
        d: in STD_LOGIC;                      --- d=0 add, d=1 subtract
        s: out SIGNED(resol-1 downto 0));    --- sum out
end component addsub;

constant WORD_SIZE: POSITIVE := output_resol + (output_resol/4) + 3;

type MY_STD_LOGIC_ARRAY_TYPE is array(output_resol downto 0) of STD_LOGIC;
type MY_SIGNED_ARRAY_TYPE is array(output_resol downto 0) of SIGNED(WORD_SIZE-1 downto 0);

signal temp_phase: SIGNED(WORD_SIZE downto 0);
signal start_angle: SIGNED(WORD_SIZE-1 downto 0);
signal del: MY_STD_LOGIC_ARRAY_TYPE;
signal sin, cos, angle: MY_SIGNED_ARRAY_TYPE;
signal angle_coeff: MY_SIGNED_ARRAY_TYPE;
signal sin_start_value, cos_start_value, cos_start_value_p, cos_start_value_n: SIGNED(WORD_SIZE-1 downto 0);
signal initial_dir, not_initial_dir: STD_LOGIC;
signal dir, not_dir: MY_STD_LOGIC_ARRAY_TYPE;
signal shift_cos, shift_sin: MY_SIGNED_ARRAY_TYPE;

-- function to resize fractional binary numbers (note: numeric_std RESIZE doesn't work for this because the assumed binary point is down the other end)

function fractional_resize (arg: SIGNED; new_size: NATURAL) return SIGNED is
    variable result: SIGNED(new_size-1 downto 0) := (others => '0');
    begin
        if (new_size = arg'length) then 
	        result := arg;
        end if;
        if (new_size < arg'length) then
            result(new_size-1 downto 0) := arg(arg'left downto arg'length - result'length);
        end if;
        if (new_size > arg'length) then
            --- left-justify arg and leave the new low-order bits at zero
            result(new_size-1 downto new_size-arg'length) := arg(arg'left downto 0);
        end if;
        return result;
    end fractional_resize;
	
-- now for the component code

begin

 	temp_phase <= fractional_resize(theta,temp_phase'length);
    start_angle(start_angle'length-1 downto 0) <= temp_phase(temp_phase'length-2 downto 0);

    --- process to calculate the inverse-tangent coefficients and the overall gain (which will determine the cos start values)
	--- synthesis will understand to just calculate the values and then hardwire them into the final logic
	--- none of this floating-point calculation stuff will end up as logic in the FPGA

    calc_process: process
	    variable temp: REAL;
	    begin
        coeff_calc: for i in 0 to output_resol loop
            angle_coeff(i) <= to_signed(integer(round((2.0**real(WORD_SIZE-1)) * (arctan(2.0**(-1.0 * real(i))) / math_pi_over_2))),WORD_SIZE);
        end loop coeff_calc;
		temp := 1.0;
        gain_calc: for i in 0 to output_resol loop
            temp := temp * sqrt(1.0 + (2.0**(-2.0 * real(i))));
        end loop gain_calc;
		temp := (0.5 - (2.0**(-1.0 * real(output_resol-1))/2.0)) / temp;   --- adjustment to stop overflow (not very scientific!)
        cos_start_value_p <= to_signed(integer(trunc((2.0**real(WORD_SIZE-1)) * temp)),WORD_SIZE);
        cos_start_value_n <= to_signed(integer(trunc(-1.0 * (2.0**real(WORD_SIZE-1)) * temp)),WORD_SIZE);
		sin_start_value <= (others => '0');
		wait;
    end process calc_process;
	
	--- now generate the logic for the cordic stages

    cordic_stages: for k in 0 to output_resol generate

    begin
        first_stage: if(k = 0) generate
        begin
            first_stage_process: process (clk_in)
            begin
                if (clk_in'event and clk_in='1') then
                    del(0) <= delay_in;
				end if;
			end process;
			cos_start_value <= cos_start_value_p when ((theta(theta'length-1) xor theta(theta'length-2)) = '0') else cos_start_value_n;
			initial_dir <= theta(theta'length-2);
			not_initial_dir <= not theta(theta'length-2);
            addsub_1: component addsub generic map(resol => WORD_SIZE) port map(clk_in => clk_in, a => sin_start_value, b => cos_start_value, d => initial_dir, s => sin(0));
            addsub_2: component addsub generic map(resol => WORD_SIZE) port map(clk_in => clk_in, a => cos_start_value, b => sin_start_value, d => not_initial_dir, s => cos(0));
            addsub_3: component addsub generic map(resol => WORD_SIZE) port map(clk_in => clk_in, a => start_angle, b => angle_coeff(0), d => not_initial_dir, s => angle(0));
		end generate first_stage;

		other_stages: if(k /= 0) generate
		begin
            other_stages_process: process (clk_in)
            begin
                if (clk_in'event and clk_in='1') then
                    del(k) <= del(k-1);
				end if;
			end process;
    		shift_cos(k) <= shift_right(cos(k-1),k);
	    	shift_sin(k) <= shift_right(sin(k-1),k);
			dir(k) <= angle(k-1)(WORD_SIZE-1);
			not_dir(k) <= not angle(k-1)(WORD_SIZE-1);
            addsub_4: component addsub generic map(resol => WORD_SIZE) port map(clk_in => clk_in, a => sin(k-1), b => shift_cos(k), d => dir(k), s => sin(k));
            addsub_5: component addsub generic map(resol => WORD_SIZE) port map(clk_in => clk_in, a => cos(k-1), b => shift_sin(k), d => not_dir(k), s => cos(k));
            addsub_6: component addsub generic map(resol => WORD_SIZE) port map(clk_in => clk_in, a => angle(k-1), b => angle_coeff(k), d => not_dir(k), s => angle(k));
		end generate other_stages;

    end generate cordic_stages;

    --- connect outputs to signals in design
	--- sine and cosine results need resizing (this is crude truncation)
	--- also need to exclude the additional overhead bit
	
    delay_out <= del(output_resol);
    sine(output_resol-1) <= sin(output_resol)(WORD_SIZE-1);
    sine(output_resol-2 downto 0) <= sin(output_resol)(WORD_SIZE-3 downto (WORD_SIZE-output_resol)-1);
    cosine(output_resol-1) <= cos(output_resol)(WORD_SIZE-1);
    cosine(output_resol-2 downto 0) <= cos(output_resol)(WORD_SIZE-3 downto (WORD_SIZE-output_resol)-1);

end arch_cordic;

-------------------------------------------------------------------------------
-- addsub                                                                    --
--                                                                           --
-- VHDL component to implement 2's complement add or subtract                --
-- no output carry                                                           --
-- registered output for the pipeline                                        --
-------------------------------------------------------------------------------
library ieee; 
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity addsub is
    generic(
        resol: POSITIVE);             --- desired resolution (bits)
    port(
        clk_in: in STD_LOGIC;                 --- clock in
        a: in SIGNED(resol-1 downto 0);      --- a in
        b: in SIGNED(resol-1 downto 0);      --- b in
        d: in STD_LOGIC;                      --- d=0 add, d=1 subtract
        s: out SIGNED(resol-1 downto 0));    --- sum out
end entity addsub;

-- this version uses numeric_std addition and subtraction.
-- synthesis seems to build both and place a mux on output to select which result we want.
-- not necessarily good for space
-- but synthesis knows how to use fast carry-chain logic to good advantage

architecture arch_addsub of addsub is

signal result: SIGNED(resol-1 downto 0):= (others => '0');

begin

    add_sub_process: process (clk_in)
    begin
        if (rising_edge(clk_in)) then
	        if(d = '0') then
                result <= a + b;
            else
                result <= a - b;
            end if;
	    end if;
    end process;

    s <= result;

end arch_addsub;
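
To see what calc_process in cordic.vhd hands to synthesis, the same constants can be reproduced off-line. The following is a minimal Python sketch, assuming the 16-bit output resolution used by cordic_top.vhd; the function names are illustrative only and are not part of the design.

```python
import math

OUTPUT_RESOL = 16                                   # sig_resol in cordic_top.vhd
WORD_SIZE = OUTPUT_RESOL + OUTPUT_RESOL // 4 + 3    # internal width, 23 bits here
SCALE = 2.0 ** (WORD_SIZE - 1)                      # full scale of the signed word


def angle_coeffs():
    """arctan(2^-i) as a fraction of a quarter turn, scaled to WORD_SIZE bits."""
    # floor(x + 0.5) gives the same result as VHDL round() for positive values
    return [math.floor(SCALE * math.atan(2.0 ** -i) / (math.pi / 2) + 0.5)
            for i in range(OUTPUT_RESOL + 1)]


def cordic_gain():
    """Accumulated vector growth over all OUTPUT_RESOL + 1 stages."""
    gain = 1.0
    for i in range(OUTPUT_RESOL + 1):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return gain


def cos_start_value():
    """Positive start value, pre-divided by the gain, as in calc_process."""
    target = 0.5 - 2.0 ** -(OUTPUT_RESOL - 1) / 2.0   # just under half scale
    return math.trunc(SCALE * target / cordic_gain())


if __name__ == "__main__":
    print(angle_coeffs())
    print(cordic_gain(), cos_start_value())
```

With these settings the first coefficient comes out as 2^21, because 45 degrees is half of a quarter turn, and the start value sits a little under 0.5 divided by the gain, which appears to be how the component keeps its bit of headroom.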

-------------------------------------------------------------------------------
-- i2s.vhd                                                                   --
--                                                                           --
-- VHDL component to implement an I2S digital sound interface                --
--                                                                           --
-- Developed for XP2 using LSE in Diamond 3.12                               --
-- This is for use with a 12.288MHz input clock and 48ksps sampling          --
--                                                                           --
-------------------------------------------------------------------------------
-- (c)2023 Jon Clift 5th November 2023                                       --
-- Free to use however you want. No warranty as to correctness.              --
-- No guarantee of fitness for any purpose. No obligation to support.        --
-------------------------------------------------------------------------------
-- Rev Date         Comments                                                 --
-- 01  05-Nov-2023                                                           --
-- 02  13-Jan-2026  Added i2s_load_o to port                                 --
-------------------------------------------------------------------------------

library ieee; 
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity i2s is
    generic(
        input_resol: POSITIVE);                            --- input resolution
    port(
        clk_in: in STD_LOGIC;                              --- clock in
        i2s_ldata_i: in SIGNED(input_resol-1 downto 0);   --- left data in
        i2s_rdata_i: in SIGNED(input_resol-1 downto 0);   --- right data in
        i2s_bck_o: out STD_LOGIC;                           --- bit clock out
        i2s_load_o: out STD_LOGIC;                          --- load out
        i2s_lrck_o: out STD_LOGIC;                          --- left/right out
        i2s_data_o: out STD_LOGIC);                        --- data out
end entity i2s;

architecture arch_i2s of i2s is

signal prescale_count: UNSIGNED(1 downto 0) := b"00";
signal i2s_bit_count: UNSIGNED(5 downto 0) := b"000000";
signal i2s_sr: SIGNED(31 downto 0);
signal i2s_bit_en_falling,i2s_load: STD_LOGIC;

begin

	i2s_stuff: process (clk_in)
		begin

			if (clk_in'event and clk_in = '1') then
			
				--- prescaler divides by 4 to give I2S bit rate of 3.072M (64 x sample rate)
				--- prescale_count(1) will be the I2S bck
				
				prescale_count(1 downto 0) <= prescale_count(1 downto 0) + 1;	--- count up

				--- generate an enable that precedes the I2S bit clock falling edge for one clock cycle

				if (prescale_count = "10") then				--- is this count 2?
					i2s_bit_en_falling <= '1';					--- yes: high on next cycle (count 3)
				else
					i2s_bit_en_falling <= '0';
				end if;
			
				--- bit_count now counts off the 64 bit times in the I2S cycle
				--- that's 2 x 32-bit samples for dual-channel 48ksps
				--- i2s_bit_count(5) will be the I2S lrck
				
				if (i2s_bit_en_falling = '1') then					--- qualify with the enable
					i2s_bit_count(5 downto 0) <= i2s_bit_count(5 downto 0) + 1;	--- count up
				end if;

				--- my shift register for the data is 32 bits
				--- this load signal then has to occur twice during the I2S cycle
				--- once for the left sample and once for the right

				if (i2s_bit_en_falling = '1') then					--- qualify with the enable
					if (i2s_bit_count(4 downto 0) = b"11111") then
						i2s_load <= '1';
					else
						i2s_load <= '0';
					end if;
				end if;

				--- i2s output shift register
				--- for any bit cycle, either loads a new sample or does a shift
				--- the sample can be either left or right value, the multiplexing is bundled into the code
				
				if (i2s_bit_en_falling = '1') then									--- qualify with the bit enable
					if (i2s_load = '1') then										--- load?
						if (i2s_bit_count(5) = '0') then							--- use lrclk to select what to load...
							i2s_sr(31 downto 32-input_resol) <= i2s_ldata_i(input_resol-1 downto 0);		    ---   left data
							i2s_sr(31-input_resol downto 0) <= (others => '0');					---   pad with zeroes
						else														--- or
							i2s_sr(31 downto 32-input_resol) <= i2s_rdata_i(input_resol-1 downto 0);		    ---   right data
							i2s_sr(31-input_resol downto 0) <= (others => '0');					---   pad with zeroes
						end if;
					else															--- else 
						i2s_sr(31 downto 1) <= i2s_sr(30 downto 0);	            ---   shift out the register contents
						i2s_sr(0) <= '0';	            							---   lsb
					end if;
				end if;

			end if;

		end process i2s_stuff;

	--- connect the external control signals to signals in the design
	--- (concurrent assignments, so that simulation matches the synthesised logic)

	i2s_bck_o <= prescale_count(1);		--- i2s bit clock
	i2s_lrck_o <= i2s_bit_count(5);		--- i2s left/right word clock
	i2s_data_o <= i2s_sr(31);				--- i2s data
	i2s_load_o <= i2s_load;				--- i2s load
	

end arch_i2s;
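
The clock arithmetic and the way a sample sits in its 32-bit slot can be checked with a small software model. This is only a sketch in Python, assuming the 12.288MHz clock, the divide-by-4 prescaler and the 64-bit frame described in the comments of i2s.vhd; the function names are made up for the illustration.

```python
MCLK_HZ = 12_288_000     # clock fed to the i2s component
PRESCALE = 4             # 2-bit prescaler, bit 1 is the bit clock
BITS_PER_FRAME = 64      # two 32-bit slots, left then right


def i2s_rates(mclk_hz=MCLK_HZ):
    """Return (bit clock, sample rate) in Hz for the given input clock."""
    bck_hz = mclk_hz // PRESCALE
    sample_rate_hz = bck_hz // BITS_PER_FRAME
    return bck_hz, sample_rate_hz


def slot_bits(sample, resol=16, slot=32):
    """Bits of one slot in the order they are shifted out (MSB first).

    The two's complement sample is left-justified in the slot and the
    remaining low-order bits are padded with zeroes.
    """
    word = (sample & ((1 << resol) - 1)) << (slot - resol)
    return [(word >> (slot - 1 - n)) & 1 for n in range(slot)]


if __name__ == "__main__":
    print(i2s_rates())
    print(slot_bits(-32768))
```

A DAC that accepts longer words than 16 bits should then simply see the extra low-order bits as zero.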

Results

Here are the output waveforms displayed on an oscilloscope.

[Image: oscilloscope display of the output waveforms]

This is the line output from the Pimoroni board. To achieve those high levels, the DAC part uses switched-capacitor charge pumps to generate its voltage rails.

This is the FFT of one of the waveforms.

[Image: FFT of one of the waveforms]

It's not all that pure - there are obvious traces of the harmonics poking up from the noise floor of the 8-bit scope sampling if you work across from the fundamental at 440Hz. This is the same thing I saw when the waves were generated with a Taylor series, so it strongly suggests that the problem lies with the output DAC and not the calculation.

Also, the frequency is a little bit off. It could well be that I calculated the phase increment wrongly, but I think that it's actually the fault of the 12.288MHz xtal oscillator. So I'll need to have a look at that.
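
The phase increment is easy to cross-check in software. A minimal Python sketch, assuming an exact 48ksps sample rate and the 32-bit phase accumulator from cordic_top.vhd (the function names are illustrative):

```python
SAMPLE_RATE_HZ = 48_000
TARGET_HZ = 440
PHASE_BITS = 32

# the literal assigned to phase_increment in cordic_top.vhd
VHDL_INCREMENT = 0b0000_0010_0101_1000_1011_1111_0010_0110


def phase_increment(f_hz=TARGET_HZ, fs_hz=SAMPLE_RATE_HZ, bits=PHASE_BITS):
    """Nearest integer to f/fs * 2^bits, using integer arithmetic only."""
    return (f_hz * (1 << bits) + fs_hz // 2) // fs_hz


def output_frequency(increment, fs_hz=SAMPLE_RATE_HZ, bits=PHASE_BITS):
    """Frequency produced by a given increment when the sample rate is exact."""
    return increment * fs_hz / (1 << bits)


if __name__ == "__main__":
    print(phase_increment(), VHDL_INCREMENT)
    print(output_frequency(VHDL_INCREMENT))
```

The rounded value agrees with the literal in the VHDL, and with an exact sample clock it lands within a few microhertz of 440Hz, so any visible offset is more likely to come from the oscillator than from the increment.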

Usage is about 30% of the '5K' XP2 device on the Brevia 2 board.

[Image: device usage for the design]

Miscellaneous Notes

The Taylor series converges more quickly, but needs multipliers.
The CORDIC needs a stage for each bit of resolution, but only needs modest amounts of logic (add/subtract of constants).
With sensible pipelining they'll both run very fast, but, with increasing resolution, carry chains become an issue, and the Taylor will slow once multiple multipliers need to be combined.
Both need embedded constants - for the CORDIC I did that 'automagically' with the VHDL math library, but be aware that VHDL only stipulates that maths be done to whatever the underlying platform can manage (presumably 64-bit floats from the FPU on a PC?), so for very high resolutions you'd probably need a different way to derive those constants.
The Taylor series is an approximation that's accurate around zero and steadily becomes less accurate as you move away.
The CORDIC is a form of successive approximation but, because the underlying algorithm is essentially a rotating vector, it will only work over half a turn (though it is easy to extend to a full turn by having two choices as to the start point of the vector). Depending on what your use of the CORDIC is, it can make sense to scale everything so that the phase is a binary fraction of a complete turn. I did that here to simplify the phase handling for waveform generation. If you want it for calculation purposes, you'd need to remove the normalisation I did and take it back to radians.
Potentially, both can be extended to cover a wider range of trig functions and not just the sine and cosine.
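
Two of the notes above can be illustrated with a floating-point model: a three-term Taylor sine losing accuracy away from zero, and a CORDIC that takes its phase as a fraction of a turn and covers the full turn by choosing between two start vectors. This is a sketch in Python under those assumptions, not the fixed-point arithmetic of the VHDL component, and the function names are hypothetical.

```python
import math

STAGES = 17   # one more than a 16-bit output resolution
ATAN_TURNS = [math.atan(2.0 ** -i) / (2.0 * math.pi) for i in range(STAGES)]
GAIN = 1.0
for _i in range(STAGES):
    GAIN *= math.sqrt(1.0 + 4.0 ** -_i)


def taylor_sin(x, terms=3):
    """Truncated Taylor series for sin(x) about zero, x in radians."""
    return sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
               for n in range(terms))


def cordic_sincos(phase_turns):
    """Return (sin, cos) for a phase given as a fraction of a full turn."""
    p = (phase_turns + 0.5) % 1.0 - 0.5        # wrap into [-0.5, 0.5)
    if -0.25 <= p < 0.25:
        x, y, z = 1.0 / GAIN, 0.0, p           # start on the +x axis
    else:
        x, y = -1.0 / GAIN, 0.0                # start on the -x axis
        z = p - 0.5 if p >= 0.25 else p + 0.5  # remaining angle to rotate
    for i in range(STAGES):
        d = 1.0 if z >= 0.0 else -1.0          # rotate towards zero residual
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ATAN_TURNS[i]
    return y, x


if __name__ == "__main__":
    for p in (0.01, 0.125, 0.25, 0.4):
        s, _ = cordic_sincos(p)
        ref = math.sin(2.0 * math.pi * p)
        print(p, abs(s - ref), abs(taylor_sin(2.0 * math.pi * p) - ref))
```

In this model the CORDIC error stays roughly constant across the turn, while the truncated Taylor error grows quickly with angle, which is why a Taylor implementation would normally fold the phase into a small range around zero first.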

What Next?

Although I'm tempted to develop this further on the XP2 board, I'm actually going to quickly port it to an iCE40UP5K evaluation board. That's because I want to do a simple 'roadtest' of that board and I also want to get familiar with using Radiant. To make that one a little bit more interesting, though, I'm going to rework the output to SPDIF rather than I2S. More 'trailing edge' than 'leading edge', as everyone now seems to be abandoning it in favour of USB, but there we go.

  • jc2048
    jc2048 1 day ago in reply to scottiebabe

    That sounds like it might be quite challenging to do. At least, to me it does - I don't have much knowledge of radio techniques, though I can sort of follow how quadrature mixing gives enough information to distinguish positive frequencies from negative ones after sampling and digitising. Good luck and be sure to show it to us when you get it going.

  • jc2048
    jc2048 1 day ago in reply to DAB

    Thanks DAB

  • jc2048
    jc2048 1 day ago in reply to michaelkellett

    They do seem like useful parts for small projects.

  • scottiebabe
    scottiebabe 2 days ago

    Very neat, I am experimenting with running an sdr on a pico and eventually will have to pick a fast atan2 approximation

  • DAB
    DAB 3 days ago

    Nice update.
