Summer of FPGAs -- Lattice MACHXO3LF Starter Kit - Review


RoadTest: Summer of FPGAs -- Lattice MACHXO3LF Starter Kit

Author: _david_

Creation date:

Evaluation Type: Development Boards & Tools

Did you receive all parts the manufacturer stated would be included in the package?: True

What other parts do you consider comparable to this product?:

What were the biggest problems encountered?:

Detailed Review:

MachXO3LF Starter Kit - RoadTest Review




This is a review of the MachXO3LF Starter Kit, which features a low-cost, low-power FPGA from Lattice Semiconductor.  Other features of the kit include an FTDI chip for USB communication, an external oscillator, SPI flash, an EEPROM, and voltage regulators.  Additionally, there are four DIP switches and a button which can be used for input, and eight LEDs which can be used for output.  With over 150 pins routed from the FPGA to through-holes on the PCB, there is very little standing in the way of prototyping with this EVM.



Note:  This is an image of the MachXO3L board which has the same layout as the MachXO3LF.



MachXO3LF Chip Overview


This section will describe the properties of the FPGA device used in the starter kit, but first, it is worthwhile to decipher the part number used.  Specifically, there are a few naming conventions to be aware of:

  1. Board: LCMXO3LF-6900C-S-EVN
  2. Device:  LMXO3LF-6900C-6BG256I
  3. Part Number:  LCMXO3LF-6900C-6BG256I


The names are all very similar, but have slight distinctions.  The "board" name is what appears on the PCB.  The "device" is the specific FPGA being targeted for development with Lattice Semiconductor software.  The "part number" is useful for ordering and is the best way to compare devices within the same family.  As shown below, the LMXO3LF-6900C-6BG256I device, which is used in this EVM, is one of the higher-end devices, offering approximately 6900 LUTs while reaching the highest speeds available.  I say approximately 6900 LUTs because the device actually has 6864 LUTs.
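To make the naming a little more concrete, here is a minimal Python sketch that splits the part number into fields.  The field names (family, density, supply, speed, package, grade) are my own labels for illustration and are an assumption about the ordering convention, so check them against the Lattice data sheet before relying on them.

```python
import re

# Hypothetical field names; the pattern only covers MachXO3L/LF style strings.
PART_RE = re.compile(
    r"^(?P<family>LCMXO3LF?)-(?P<density>\d+)(?P<supply>[CE])-"
    r"(?P<speed>\d)(?P<package>[A-Z]+\d+)(?P<grade>[CI])$"
)

def decode_part_number(part):
    """Split a part number such as LCMXO3LF-6900C-6BG256I into labelled fields."""
    match = PART_RE.match(part)
    if match is None:
        raise ValueError(f"unrecognized part number: {part}")
    return match.groupdict()

fields = decode_part_number("LCMXO3LF-6900C-6BG256I")
print(fields)
```

Comparing two devices in the same family then comes down to comparing the `density`, `speed`, and `package` fields.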




In many ways, the secret sauce for Lattice Semiconductor has always been its low-power silicon.  Power can be reduced further using static/dynamic power consumption management, which allows programmable low-swing differential I/O and the ability to turn off I/O banks, on-chip PLLs, and oscillators.


What distinguishes the MachXO3 family from other families is its ultra-low density footprint, allowing it to achieve the lowest cost per I/O while supporting the industry's latest standards.  These I/O standards allow neat features such as drive strength control, slew rate control, PCI compatibility, bus-keeper latches, pull-up and pull-down resistors, open drain outputs, and hot socketing.



PL Architecture


This section will explore the MachXO3LF architecture within the programmable logic (PL) fabric.  The fundamental unit of logic within the MachXO3 family is the slice.  Each slice contains the same resources:  2 LUT4s and 2 registers.  The diagram below shows a slice in detail, including the resources available and the buses used for control logic.



It may be tempting to think that a slice can be instantiated into an FPGA design directly so that each of these signals can be controlled; however, that is not the case.  Lattice Semiconductor only exposes a subset of the possible functions by creating "black box" primitives for various functions.  Thus, as a design decision, they use another level of hierarchy called the programmable function unit (PFU), which contains four slices, as shown below.



The main subtlety with the PFU is that although each slice contains the same dedicated resources, only slices 0-2 support RAM mode.  This is summarized in the table below.




Using these slices, Lattice Semiconductor is able to implement quite the assortment of "black box" primitives which can be used in designs (174 primitives to be exact!).  See the attached spreadsheet for more details about these primitives (primitives.xlsx).



PL Resources


As described in the "PL Architecture" section, the MachXO3LF FPGA is composed of four-slice PFU blocks.  The mode of operation of these slices is what determines the resources that are available on the FPGA.


Now for the numbers.  The MachXO3LF device used in this kit contains 3432 slices, which corresponds to 858 PFU blocks of four slices each.  Since each slice contains two LUT4s, there are 6864 LUT4s available when every slice is configured as logic.  That figure is rounded up to 6900, which is why the "part number" for this device is labelled LCMXO3LF-6900C-6BG256I.


Likewise, when implementing RAM, only slices 0-2 are used.  Slices 0-1 are used to create the 16x4-bit RAM address space, and slice 2 is used for control signals.  This means that if every PFU was configured as 4-bit wide distributed RAM (single or dual port), then there would be 858 x 16 = 13,728 addresses of 4-bit data, or roughly 54 kb in total.
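As a sanity check on the arithmetic, here is a small Python sketch that starts from the 6864 LUT4 figure and works backwards through the slice and PFU structure described earlier.  It assumes two LUT4s per slice, four slices per PFU, and one 16x4 distributed RAM per PFU; the constant and function names are my own.

```python
LUT4_TOTAL = 6864        # LUT count quoted for the -6900 device
LUT4_PER_SLICE = 2       # each slice holds two LUT4s
SLICES_PER_PFU = 4       # each PFU holds four slices
RAM_WORDS_PER_PFU = 16   # assumption: one 16x4 RAM per PFU
RAM_BITS_PER_WORD = 4

def fabric_totals(lut4_total=LUT4_TOTAL):
    """Derive slice, PFU, and distributed RAM totals from the LUT4 count."""
    slices = lut4_total // LUT4_PER_SLICE
    pfus = slices // SLICES_PER_PFU
    ram_words = pfus * RAM_WORDS_PER_PFU
    ram_bits = ram_words * RAM_BITS_PER_WORD
    return {"slices": slices, "pfus": pfus,
            "ram_words": ram_words, "ram_bits": ram_bits}

totals = fabric_totals()
print(totals)
```

The resulting bit count lands close to the 54 kb of distributed RAM listed in the resource summary.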


Numbers aside, here is a summary of PL resources and peripherals:


LUTs:  6900

Distributed RAM (kb):  54

EBR SRAM (kb):  240

UFM (kb):  256

Number of PLLs:  2

Hardened Functions:

        [2x] I2C

        [1x] SPI

        [1x] Timer/Counter

        [1x] Oscillator



Design Approaches:


There are two main approaches for creating a design using Lattice Diamond primitives.  One is IPexpress, and the other is PMI.  IPexpress is for people who prefer a GUI, and PMI is for people who prefer to work with source code.  In reality, most people use some combination of both.  In fact, even though PMI is considered the more "advanced" option, Lattice Semiconductor themselves say that some features can only be implemented using IPexpress.


Here is an example of creating a PLL module that can generate a 48MHz clock from a 12MHz reference clock.



As you can see, with just a few clicks, Lattice Diamond's IPexpress can generate a very useful module that can be instantiated into any design.
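For intuition about what the PLL has to do, here is a generic Python sketch of the frequency ratio involved.  It models an idealized integer-ratio PLL only; the function name and the `max_divider` limit are hypothetical and do not reflect the actual divider ranges or VCO constraints that IPexpress enforces.

```python
from fractions import Fraction

def pll_divider_ratio(f_in_hz, f_out_hz, max_divider=128):
    """Return (feedback, input) dividers with f_out = f_in * feedback / input."""
    ratio = Fraction(f_out_hz, f_in_hz)  # reduced to lowest terms automatically
    if ratio.numerator > max_divider or ratio.denominator > max_divider:
        raise ValueError("ratio not reachable with the assumed divider range")
    return ratio.numerator, ratio.denominator

# 12 MHz reference to 48 MHz output, as in the example above.
feedback_div, input_div = pll_divider_ratio(12_000_000, 48_000_000)
print(feedback_div, input_div)
```

For the 12MHz to 48MHz case the ratio is a simple multiply-by-four, which is part of why the GUI flow needs so few clicks.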


The alternative approach is to use PMI.  There are several ways to go about this.  One is to use the template editor (View->Template Editor).  The other way is to manually instantiate modules from the reference library.

For synthesis, use:  C:\lscc\diamond\3.12\cae_library\synthesis\verilog\pmi_def.v & C:\lscc\diamond\3.12\cae_library\synthesis\verilog\machxo3lf.v

For simulation, use:  C:\lscc\diamond\3.12\cae_library\simulation\verilog\pmi\* & C:\lscc\diamond\3.12\cae_library\simulation\verilog\machxo3l\*

My personal preference is to use the template editor whenever possible.  For instance, in one of my SPI implementations (I did several iterations), I used a PMI FIFO like the one below.




To me, this feels like I have more control over the design, which is what I prefer.  My opinion is that the GUI is kind of busy, but the upside is that it is much better at catching errors.



Project 0:  Blink Reference Design


Out of the box, the MachXO3LF Starter Kit comes preprogrammed with a blink reference design.  The design teaches the user how to use the internal and external oscillators as clocks, the DIP switches as a data input, the button as an asynchronous reset, and the LEDs as a data output.  The design should be quite intuitive for anyone who has worked with an HDL.  The table below shows the functional behavior of the reference design.  Although not a big deal, I noticed a small discrepancy between the user guide and the actual implementation, so I marked those differences in red.  Below the first table, I also provide a visual representation of the four blink modes that can be selected.  Later on in this review, I share a video which runs through each of these configurations.





Internal vs External Oscillator:


By default, the internal oscillator is enabled in every design even if it is not instantiated.  It can be inserted into the design using parameterized module instantiation (PMI) to override its default value.  The internal oscillator is good for a general-purpose reference clock, but is perhaps not the most precise clock.


The external clock is a crystal that is fixed at 12MHz, and is probably the most precise clock in the design.  It can be very useful when working with the PLLs of the MachXO3LF's clocking system.  Below are some images of the 7M-12.000MAAJ-T crystal used in the design.  Additionally, I provided an oscilloscope reading to verify its frequency.




Project 1:  UART


For my first project, I wanted to use the FTDI chip to communicate with a PC over UART.  It took me a while to figure out how to do this, so I will spend some time describing how this is done.  The first step is to look at the schematic diagram for the FTDI chip on page 21 of the MachXO3 Starter Kit User Guide.  Looking at this diagram, you can see signals for RS232_Rx_TTL and RS232_Tx_TTL.  If you look at page 23, you can also see that the RS232_Rx_TTL and RS232_Tx_TTL signals are connected to the FPGA at pins A11 and C11 respectively.  If you are like me and have not come across RS232 before, it is a long-standing serial communication standard; here the names simply label the TTL-level serial lines between the FTDI USB bridge and the FPGA, which can carry UART traffic.  One thing that took me a while to realize is that the R14 and R15 positions on the RS232 lines are not actually connected.  In order to use these signals, you must fit 0-ohm resistors there or bridge the pads with solder.  The schematic diagrams and soldered joints are shown below.




Once the board was modified to support RS232, it was time to write a UART controller.  I decided to keep things very simple by modifying the reference design that was provided with the EVM.  I wanted the UART instance to have the following specs:  9600 baud, 8 data bits, and 1 stop bit.  Since the reference design uses a reference clock of 12MHz, I used a clock enable to reduce the bit rate of the transfer to 9600 baud.  To keep the transfers simple, I assigned the 8 data bits to be {4'b0011, DIP_SW[3:0]}.  The reason for this is that the ASCII characters for 0-9 have hexadecimal values of 0x30-0x39.  Thus, using the DIP switches, you can easily correlate the input to the UART output.  To test the full system, I configured PuTTY as shown below.
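The numbers behind this are easy to check in software.  The Python sketch below is a model rather than the actual HDL:  it computes how many 12MHz clock cycles make up one bit period at 9600 baud, and builds the bit sequence for one frame.  It assumes no parity bit and LSB-first data, which is the usual UART convention; the function names are my own.

```python
CLOCK_HZ = 12_000_000   # reference clock on the board
BAUD = 9600             # target bit rate

def clock_enable_divisor(clock_hz=CLOCK_HZ, baud=BAUD):
    """Number of reference clock cycles per UART bit period."""
    divisor, remainder = divmod(clock_hz, baud)
    if remainder:
        raise ValueError("clock is not an integer multiple of the baud rate")
    return divisor

def uart_frame(dip_sw):
    """Return (data byte, frame bits) for {4'b0011, DIP_SW[3:0]}."""
    if not 0 <= dip_sw <= 0xF:
        raise ValueError("DIP switch value must fit in 4 bits")
    data = 0x30 | dip_sw
    # Start bit (0), eight data bits LSB first, stop bit (1); no parity assumed.
    bits = [0] + [(data >> i) & 1 for i in range(8)] + [1]
    return data, bits

print(clock_enable_divisor())   # cycles per bit
data, bits = uart_frame(5)
print(chr(data), bits)
```

Because 12MHz divides evenly by 9600, a counter that asserts the clock enable once every 1250 cycles gives the exact baud rate with no accumulated error.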




Once PuTTY is set up, the FPGA can be programmed and the terminal will receive data.  Note:  You will have to configure the UART port on the FTDI chip's EEPROM using FTDI's FT_Prog software as shown in the image below:




I have included the project source files as an attachment.  Additionally, a demo of this project is included in the video that I made for this RoadTest.



Project 2:  SPI LCD & BLDC Motor Control


This section will provide an overview of my SPI LCD and BLDC motor project.  I had much bigger plans than I was able to achieve in the time frame of this RoadTest, so I plan to provide an update to this project in the future.  Originally, I was planning to use an IMU to control a BLDC motor and display statistics on an LCD screen.  Since I was running out of time, I decided to focus on just the LCD and BLDC motor.


For now we will focus on the SPI LCD.  For the LCD, I chose to interface Mikroe's LCD mini click, which has two SPI slaves:  the MCP23S17 port expander and the MCP4161 digital potentiometer.  The digital potentiometer was rather trivial (only needs one instruction), so the focus will be on the port expander.  A pinout of the LCD mini click is shown below.  Since I am only interested in displaying text to the screen, the SDO, PWM, and INT signals were unused in my implementation.




In an effort to spare the reader most of the details of the implementation, I have shared a timing diagram and the packet format needed to communicate with the LCD mini click.  Both the port expander and digital potentiometer slaves use SPI mode 0, meaning the clock polarity (CPOL) and clock phase (CPHA) are both zero:  the clock idles low and data is sampled on the rising edge.
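To illustrate what mode 0 means in practice, here is a small Python model of one byte on the wire.  It is a behavioral sketch, not my RTL:  the master changes the data line while the clock is low, and the slave samples on each rising edge.  MSB-first ordering is assumed, and the function names are hypothetical.

```python
def spi_mode0_waveform(byte):
    """Return (sclk, mosi) samples for one byte, MSB first, CPOL=0, CPHA=0."""
    samples = []
    for i in range(7, -1, -1):
        bit = (byte >> i) & 1
        samples.append((0, bit))  # clock low: master sets up the data bit
        samples.append((1, bit))  # clock high: slave samples on this rising edge
    samples.append((0, 0))        # clock returns to its idle-low state
    return samples

def sample_on_rising_edges(samples):
    """Rebuild the byte the way a mode 0 slave would see it."""
    value = 0
    previous_sclk = 0
    for sclk, mosi in samples:
        if previous_sclk == 0 and sclk == 1:
            value = (value << 1) | mosi
        previous_sclk = sclk
    return value

wave = spi_mode0_waveform(0x40)
print(hex(sample_on_rising_edges(wave)))
```

A round trip through both functions returns the original byte, which is a handy property to check in a testbench as well.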




When developing this design, most of my time was spent using ModelSim Lattice-Edition.  I chose to use this simulator because this is what I am familiar with, as this was what I used in undergrad last year.  As someone who used it in conjunction with Quartus (Altera/Intel), I always felt like ModelSim was kind of clunky, but thankfully this is done much better in Lattice Diamond in my opinion and is not nearly as buggy.  Within Lattice Diamond, you can easily specify and distinguish between sources for simulation and sources for synthesis.  This removes a lot of the headache of using low-level primitives.


In my design, I have a ROM that stores SPI instructions.  In order to simulate, it is necessary to load a .mem file.




Note:  The .mem file used in the simulator is slightly different from the final version used in Lattice Diamond.



The full simulation takes about 500,000ns to complete.  I have included the full waveform as well as a close-up view below.  In this implementation, most packets send three bytes at a time.  The first two bytes are generally 0x40 (device opcode - write to hardware address 0b000) and 0x15 (register address of port buffer), and the third byte is the actual data to send.  The ROM is configured to send 4-bit data in the third byte, which means it takes several byte transfers to send a byte of printable data.  There are probably better implementations, but I did not want to spend too much time interfacing an LCD.
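The packet layout can be sketched in a few lines of Python.  This is a simplified model and not the contents of my ROM:  it only shows the three-byte framing and the split of one character into two 4-bit transfers.  Placing the nibble in the upper four bits of the data byte is an assumption for illustration, and the register-select and enable strobes that a real LCD transfer also needs are left out.

```python
OPCODE_WRITE = 0x40    # device opcode: write, hardware address 0
PORT_REGISTER = 0x15   # register address of the port buffer

def spi_packet(data):
    """Build one three-byte write packet: opcode, register, data."""
    return bytes([OPCODE_WRITE, PORT_REGISTER, data & 0xFF])

def char_to_packets(ch):
    """Split one character into high-nibble and low-nibble packets."""
    code = ord(ch)
    high, low = code >> 4, code & 0x0F
    # Assumption: the 4-bit value sits in the upper half of the port byte.
    return [spi_packet(high << 4), spi_packet(low << 4)]

for packet in char_to_packets("h"):
    print(packet.hex())
```

Adding the control strobes multiplies the packet count per character again, which is where the "several byte transfers" per printable byte come from.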



Unfortunately, like anything else, designs that pass simulation rarely work the first time in the real world...  I spent a long time trying to debug the issue.  Turns out I was overclocking the SPI LCD.  After reducing the SPI clock and introducing some delays to meet device-specific timings, I was able to get a functional design.  One of the most helpful features for figuring all of this out was Lattice Diamond's Reveal.  The tool is flat-out awesome, and is probably my favorite integrated logic analyzer of the vendors I have used (Altera/Intel, Xilinx).  From a GUI standpoint, I love the fact that I can just choose a net like I would in a simulator, and then Diamond just kind of does the rest.  Not to mention, since these devices are smaller, it doesn't take much time at all to insert the probes into the design and rebuild everything.  Below is an example of Reveal capturing what is going on in the hardware.




With the help of Lattice Reveal, I was able to rule out any issues with the RTL, which led me to experiment with slower clock speeds to communicate with the LCD mini click.  After revamping the design, I was able to get the LCD to work and display messages like "hello".  As long as the ROM is configured properly, every function within the LCD can be implemented.  My plan going forward is to add a FIFO which receives IMU statistics (i.e., post-processed IMU data) and displays them on the LCD screen.


YouTube Video


As mentioned in some of the previous sections of this RoadTest, I compiled all the footage taken during the RoadTest into a single video.  See the embedded video below.





I had a really good experience with the MachXO3LF Starter Kit.  Like anything else in electronics, the best FPGA is the one that meets your requirements at the lowest cost.  For low-power, low-cost applications, the MachXO3LF could be an absolute steal.  You could spend hundreds of dollars on an energy-inefficient chip with 250k LUTs or about ten dollars on a power-saving chip with 6.9k LUTs.  As someone who has worked with Xilinx and Intel chips, for a long time I thought they were the only real contenders in the FPGA market, but Lattice Semiconductor offers so many high-value chips that it would be a shame not to explore them.



Pros:

  • They give the developer a ton of control and they do so without forcing them into any overly complicated workflows.
  • The tools have a very intuitive user interface.  Most things can be figured out without looking over a user manual.
  • The HTML documentation within Lattice Diamond is very well organized.  I can find the information I want extremely quickly.
  • The architecture is fairly easy to understand so there isn't a huge learning curve to understand it.
  • The tools are QUICK.  You can do full-time development without ever feeling like you are killing time having to wait for a build to complete.  I also found the active syntax checker (or whatever it is called) extremely helpful.
  • The board starts up instantly.
  • The board is extremely cheap.
  • The hardware is reliable.



Cons:

  • No forums to get help.
  • Some software bugs appear here and there.  Most of the time it could be fixed by restarting Lattice Diamond.  On one occasion, I closed a project without saving changes.  It seems to have sent Lattice Diamond into a panic, and the only way to fix it was reinstalling the software.  Thankfully it only takes a few minutes to reinstall.  None of the other bugs were as severe.
  • I tried using some other reference designs like ones for SPI and UART, but I did not find them very intuitive.  Perhaps I needed to spend more time reading the docs, but I decided to not use them because of this.


These past several weeks have been a great experience, but I also feel like I barely scratched the surface.  In the future, I would love to find out more about the MachXO3LF's capabilities with MIPI and PCIe interfaces.  For instance, the MIPI DSI and CSI breakout board seems pretty cool.  I plan to make some follow-up posts as I continue to work with the board after the RoadTest!

  • Hi DAB!


    Depends on a lot of factors, but generally it's not too daunting.


    If you already have FPGA experience, even if it is with another vendor, then knowledge transfers very well.  I would recommend watching the following video straight through as a starting point.  I watched this once at 1.5x-2.0x speed, and then I'd refer back to it whenever I was using a feature for the first time.  This will get you set up with everything you need to write basic HDL & constraints so that you can program the device.  The only thing I thought was lacking from the video was how to simulate your designs.  The UI is fairly intuitive if you are familiar with simulators like ModelSim, but how to source files might not be (i.e., simulation files vs synthesis files).  Since simulation is important, I think it's best just to read Lattice Semi's manual for it.  In total, I'd say 30 minutes to watch the video, 2 hours to run through your own project without simulation, and then 5 hours to read the simulation manual and work some practical examples.


    After following those steps, I'd say you would have "learned" the tools.  The next major step would be to learn the architecture and resources available.  My review describes some of this, so some useful keywords might be:  PFU, slices, PMI, black box primitives (see the excel sheet from my review), etc.  You could spend a long time learning how to use every individual resource, but nobody has time for that.  The trick is learning how to find the resource you need, pulling up the documentation, and reading about it enough so you can simulate it and fit it into your design.  For instance, with the FIFO primitive, it took me ~15 minutes to understand the module I/O, ~30 minutes to simulate it and understand the behavior, and ~1 hour to fit it into a design intelligently.  There are more complicated primitives available that would take more time, but that just gives you an idea of what to expect when you first start out with the tools.


    If you have not used an FPGA before or are new to HDL, then the learning curve is much steeper, but that doesn't really have anything to do with Lattice Semi in my opinion.  I think the MachXO3 would make for a great starter board for a student for example or a hobbyist willing to read documentation.  I guess what I am getting at, is once you are comfortable with HDL using a simulator or something, the actual act of learning Lattice Semi's FPGA tools is not that bad at all.  If however, you do not have experience with HDL, then I think it would be a daunting process.


    When I read Lattice Semi documentation, it almost feels like it was written knowing that this isn't your first rodeo with an FPGA.  That's one reason why I felt like it was so quick to learn, but I could see how that would have the opposite effect for someone just starting out.


    Let me know if you would like to know anything else or need further clarification!