This ’blog is the first part of Chapter 12 of The XXICC Anthology rev 0.0k. For more information on XXICC, see the ’blog post XXICC (21st Century Co-design) release 0.0m and XXICC’s home page: xxicc.org.
Update 15 November 2014: XXICC release 0.0m adds a second Flavia implementation: FlaviaP48 for the Papilio One 500K. Rev 0.0m also allows the user to set the Flavia clock frequency to values from 1 Hz to 32 MHz instead of being fixed at the 2 Hz "proof of concept" frequency. See Section 12.5 of The XXICC Anthology rev 0.0m for details.
Update 20 January 2015: XXICC release 0.0n adds two Xilinx Spartan-6 implementations: FlaviaLP56 for the ValentF(x) LOGI-Pi board and FlaviaLB56 for the ValentF(x) LOGI-Bone. Both boards have a Spartan-6 LX9 FPGA. See Sections 12.6 and 12.7 of The XXICC Anthology rev 0.0n for details.
Update 15 May 2015: XXICC release 0.0p adds a Flavia implementation for the Gadget Factory Papilio DUO. This board has a Xilinx Spartan-6 LX9 FPGA. See Raspberry Pi 2 meets Papilio DUO for details.
Update 26 June 2015: The author is now aware of IceStorm, a wonderful project to reverse-engineer the Lattice iCE40 FPGA internal architecture and bitstream so we can write FaiF tools. IceStorm was first released in March 2015. For more information, see these Hackaday articles: Reverse-Engineering Lattice's iCE40 FPGA bitstream and An Open-Source Toolchain for iCE40 FPGAs.
Update 29 June 2015: XXICC rev 0.0q has improved the Papilio One Flavia implementations. They are now FlaviaP40 for the 250K and FlaviaP60 for the 500K. See Chapter 12 of The XXICC Anthology rev 0.0q.
Update 1 September 2017: XXICC rev 0.0r has added integer operators and nets so you don't have to express everything as Boolean expressions. Rev 0.0r also adds Flavia capability for Lattice iCE40 FPGAs using the open-source IceStorm tools. See Chapters 10 and 15 of The XXICC Anthology rev 0.0r.
Abstract: Flavia is a family of logic arrays that can be designed and programmed entirely using free-as-in-freedom (FaiF) software. This is in contrast to standard FPGA (Field-Programmable Gate Array) tools from vendors such as Xilinx, Altera, Lattice, Cypress, and MicroSemi (formerly Actel) where you must use the vendor’s software to design the FPGA. Except for a part from Atmel that never caught on [and now IceStorm: see above], the author is not aware of any commercial FPGA or CPLD (Complex Programmable Logic Device) that can be designed without running its vendor’s tools.
Disclaimers: The Flavia and XXICC software described in this chapter or article are distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for details.
An incorrect FPGA bitstream could cause damage to an FPGA and/or its connected circuitry, or cause a system to malfunction. Users must agree that the author(s) and copyright holder(s) of Flavia and XXICC software have NO LIABILITY for any damages. See the GNU General Public License for details.
Flavia is not associated in any way with Xilinx Inc. If you have problems with Flavia, please do not blame Xilinx and please don’t ask them for Flavia support. Instead, go to xxicc.org and follow the links you find there.
Flavia does not reverse-engineer secret parts of Xilinx bitstreams and does not provide a way to reverse-engineer other Xilinx-based products. The locations of FPGA look-up tables and flip-flops used by Flavia are from files openly documented by Xilinx, supplemented with suggestions provided by Xilinx engineering at forums.xilinx.com. They are only useful for Flavia and do not endanger the security of other Xilinx-based designs. This is described in detail in Taming the Wild Bitstream.
This ’blog references XXICC, XOE, and GCHD. For details on those topics see The XXICC Anthology rev 0.0k.
Flavia’s Origin
Flavia was inspired by a great discussion on element14 called: “Role for FPGA or CPLD with Raspberry Pi”. The discussion was mostly about how to use Raspberry Pi for teaching programmable logic, which is a challenge because vendor FPGA/CPLD design tools do not run on ARM computers like Raspberry Pi.
On 27 October 2012, Morgaine Dinova asked the question that inspired Flavia:
What is the most complex programmable logic device (in the most generic sense of the phrase but not including CPUs) on the market today (old is OK as long as the device is still sold) that can be fully examined and programmed with open-source EDA tools? Since no modern CPLDs let alone FPGAs are open enough for that, what’s the best that can be done?
I thought about Morgaine’s question for a day and I came up with this:
In terms of what can be done right now, there’s always Wheeler’s Law: “All problems in computer science can be solved by another level of indirection.” It’s an awful way to do it, but you could take a large FPGA and dedicate 80-90% of its LUTs* to form a routing matrix implemented as an array of multiplexers implemented using distributed RAM cells. The remaining LUTs would be RAMs for implementing n-input logic functions. I believe you can update the contents of distributed RAM cells over JTAG using Xilinx documentation, so you can update both the logic function and the routing. Since all routing would be through RAM cells instead of pass transistors, it would be way slower than using an FPGA properly, but as a teaching tool it’s “something that can be done now”.
I think I heard of this being done some decades ago, but I thought it sounded too silly to remember the details. Actually, it was I who was silly, thinking FPGA vendors would see the wisdom of opening up their configuration formats.
* LUT = Look-Up Table. Most FPGAs implement logic functions with small RAMs, each of which implements an n-input arbitrary logic function using brute-force table lookup, where n is 3 to 6 depending on the FPGA.
After thinking about it more, I decided that such an array could be a terrific resource for learning about programmable logic. I did some preliminary design and estimated that an array of 32 6-input arbitrary functions could fit into a 200K-gate Xilinx Spartan-3A or 250K-gate Spartan-3E. This is a terribly inefficient use of FPGA logic, but a 32-block PLD is large enough for learning about programmable logic. The free logic array would also be a fine addition to XXICC, maybe even a “killer application” (well, one can dream, right?)
Before I could do this yet-unnamed free logic array, I first needed to complete workable versions of XXICC’s hardware design tools like its figure editor and GCHD (GalaxC for Hardware Design). Workable versions were released on USA π Day 3.14.2014. After that, I turned my attention to the free logic array, hoping to have a first release on European π Day 22/7/2014.
The result is Flavia, which was completely inspired by Morgaine’s question, for which I am extremely grateful. The name Flavia is short for “Free Logic Array via —”, where “—” identifies the FPGA chip or board used for a particular Flavia implementation. For example, the first Flavia implementation is FlaviaP32, where P stands for the Gadget Factory Papilio One 250K and 32 is the number of function blocks implemented. The name Flavia may refer to the project’s hardware or software.
Flavia Advantages
Flavia is designed for new FPGA designers who want to play with FPGA technology without having to climb the steep learning curve associated with an FPGA vendor’s tool suite. Flavia arrays are fairly small, with the capabilities of a CPLD (Complex Programmable Logic Device) rather than a large FPGA. Flavia is intended to be an easy and fun way to learn basic programmable logic design so that learning vendor tools won’t be as daunting.
Here are Flavia’s key advantages:
- Flavia tools can run on non-Intel architectures. Specifically, they can run on ARM-based single-board computers like Raspberry Pi and BeagleBoard/Bone. Vendor tools usually only run on Intel x86 PCs. Most vendors offer both Microsoft Windows and GNU/Linux versions, but they don’t provide a version of the tool suite for ARM or other non-Intel processors.
You can download an FPGA configuration bitstream using any processor, but vendor tools for compiling your design and producing that bitstream usually only run on x86 PCs.
- Flavia tools are fast. Compiling a small design and downloading it to an FPGA using the Xilinx ISE tools generally takes around a minute on a reasonably fast PC. Flavia can compile and download a small design in less than a second.
- Flavia tools are FLOSS (Free-as-in-Liberty Open-Source Software). This means that you and the community can improve Flavia tools and adapt them to other purposes. Tools are not limited to the capabilities provided by FPGA vendors. Most vendors do provide a “free-as-in-beer” version of their tools, but you cannot modify the vendor tools and if there are bugs you cannot fix them yourself.
Flavia is licensed under GPLv3. Basically, this means you can use Flavia for any purpose -- commercial or non-commercial -- with the understanding that Flavia has no warranty and limits liability. If you redistribute Flavia, you must include source code, along with other requirements listed in GPLv3 [www.gnu.org/licenses/gpl.html].
- Flavia tools are small. The entire source code -- which includes all of XXICC -- is about a megabyte. In contrast, the Xilinx tools are gigabytes that must be downloaded from the Internet or loaded from a DVD-ROM.
- Vendor tools generally require designs to be written in Verilog or VHDL. Many people have found these languages difficult to learn and to use, though many people like one or both. Chacun à son goût (YMMV). There are books available to learn either language.
Flavia uses the much simpler GalaxC extensions for Hardware Design (GCHD). GCHD is similar to Verilog, but has (in the opinion of the author) cleaner syntax and semantics.
Flavia tools also include XOE’s figure editor so small designs can be drawn as logic diagrams, simulated using cartoon simulation, and then downloaded into an FPGA. Many vendor tool suites also include schematic editing. Logic diagrams can be a great tool for learning about logic design, but once designs get complex most designers find an HDL is more useful.
Since Flavia tools are FLOSS, you are not limited to GCHD as the only HDL. You or the community can create or adapt compilers for any hardware description language or other scheme to create designs and then interface them into the Flavia tools.
Flavia does have disadvantages, which pretty much limit its use to learning about FPGAs and small projects with loose performance requirements.
- Flavia designs are small: Flavia provides the capabilities of a CPLD, not an FPGA. This is because the multiplexers that route signals between logic functions consume most of the FPGA’s LUTs. If you need lots of gates, use vendor tools.
- Flavia designs are much slower than FPGA implementations. Again, this is because the routing is implemented using multiplexers, but also because the outputs of function blocks need to be routed throughout the FPGA. If you need to meet specific timing requirements, use vendor tools.
- While you can use Flavia for any purpose, Flavia designs are not intended for final products. They make very inefficient use of FPGA LUTs, and do not provide bitstream security if that’s important to you. For products, use smaller, cheaper FPGAs or CPLDs, and vendor tools.
- Flavia designs have functional limitations. For example, Flavia does not currently support bidirectional pins. FlaviaP32, the first Flavia implementation, restricts I/O pins to 3.3V LVTTL with pull-ups, and only has a single 2 Hz clock for flip-flops.
- XXICC 0.0k is the first release of Flavia software and there are plenty of rough edges and probably undetected bugs. Some capabilities are preliminary. For example, logic synthesis does not detect common sub-expressions so it’s up to the designer to enter efficient designs.
Even with these restrictions, you should find Flavia to be a good platform for learning about FPGAs. 0.0k is the “proof of concept” release. Later releases will have improved capabilities.
FlaviaP32 Architecture
We will now look at a specific Flavia implementation: FlaviaP32 for the Gadget Factory Papilio One 250K. We chose this board because at US$38 it’s the cheapest FPGA development board we could find that’s readily available in the USA and provides capabilities needed for a reasonably-sized Flavia, specifically:
- Xilinx XC3S250E Spartan-3E FPGA with 250K gates, which is large enough for 32 6-input arbitrary function blocks (AFBs) with full input multiplexing. This can probably be expanded to 40 AFBs, which will be the FlaviaP40.
- 48 I/Os on 0.1" sockets, which are easy to interface with external circuits. Papilio also has a nice collection of inexpensive “wings” such as the Button/LED Wing with 4 LEDs and 4 push-buttons, and a solderless Breadboard Wing for adding external ICs.
- FTDI FT2232D full-speed USB controller for programming the FPGA. FTDI is supported by the FLOSS library libftdi.so which provides I/O for sending bitstreams and other messages over USB. Reprogramming the XC3S250E takes a few tenths of a second and is done directly from the Flavia tools without having to run a separate program.
There’s also a free-as-in-beer Microsoft Windows driver, provided by FTDI. Flavia software dynamically links to the appropriate driver for the host operating system.
In the author’s opinion, Gadget Factory has a nice business model with open-source hardware and a friendly forum for answering questions.
Here is the high-level block diagram for FlaviaP32:
FlaviaP32 consists of 32 identical AFBs, each connected to an FPGA pin which can be either an input or an output. (Flavia software does not support bidirectional pins in release 0.0k, though the FlaviaP32 hardware does.) Each AFB implements an arbitrary 6-input Boolean function and has a D flip-flop (DFF). Each of the six inputs can come from any of the FPGA pins: each of the horizontal lines on the left side of the diagram represents a 32:1 (32-to-1) multiplexer, with each possible input shown as ‘x’.
In addition, each AFB has a set/reset (sr) input which is the asynchronous init signal for the AFB’s DFF, and an output enable (oe) input which controls whether the AFB’s pin is an input or an output. All FlaviaP32 I/O pins have 2.4 to 10.8 kΩ pull-up resistors, so unconnected inputs are pulled high.
Each AFB needs eight 32:1 multiplexers, which must be implemented using LUTs. Here is the block diagram for one of these multiplexers:
Each of the eight LUTs selects one of four inputs. Flavia can program a LUT with a single-input logic function that passes one of the inputs -- perhaps inverted -- and ignores the rest. Alternatively, Flavia can program a LUT with all zeros or all ones to produce a 0 or 1 output and ignore all inputs. The eight LUT outputs are combined using 2:1 multiplexers built into the Spartan-3E logic blocks. These 2:1 muxes are in addition to the LUTs and have very fast local connections that add minimal delay to the 32:1 mux.
The select lines for the 2:1 muxes (cfg0-cfg2) are FFs that are in the same logic cells as the LUTs. The FF states are also programmed by Flavia. Basically, to program a 32:1 multiplexer to select one of the 32 inputs, Flavia sets one of the LUTs to a single-input function and the rest to all zeroes, and then sets the FFs to select that LUT.
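The mux configuration described above can be sketched in Python. This is a minimal model, not Flavia's actual code: it assumes that input i is wired to LUT i // 4 on LUT input i % 4, that a LUT's 16-bit table is indexed by its four inputs with input 0 as the least significant address bit, and that cfg0 is the select for the first mux level. The function names are hypothetical.

```python
def pass_input_table(j, invert=False):
    """16-bit truth table for a 4-input LUT that passes input j (0-3),
    optionally inverted, and ignores the other three inputs."""
    table = 0
    for addr in range(16):
        bit = (addr >> j) & 1
        if invert:
            bit ^= 1
        table |= bit << addr
    return table


def configure_mux32(select, invert=False):
    """Return (lut_tables, cfg) that route input `select` (0-31) to the output.
    Assumed wiring: input i feeds LUT i // 4 on LUT input i % 4."""
    lut_index, lut_input = divmod(select, 4)
    tables = [0x0000] * 8                      # unused LUTs are all zeros
    tables[lut_index] = pass_input_table(lut_input, invert)
    cfg = [(lut_index >> k) & 1 for k in range(3)]   # cfg0, cfg1, cfg2
    return tables, cfg


def evaluate_mux32(tables, cfg, inputs):
    """Evaluate the configured mux for a list of 32 input bits."""
    level = []
    for k in range(8):
        addr = sum(inputs[4 * k + j] << j for j in range(4))
        level.append((tables[k] >> addr) & 1)
    for sel in cfg:                            # 2:1 mux tree: 8 -> 4 -> 2 -> 1
        level = [level[2 * i + sel] for i in range(len(level) // 2)]
    return level[0]


tables, cfg = configure_mux32(13)
print([hex(t) for t in tables], cfg)
```

Under these assumptions, selecting input 13 programs LUT 3 to pass its input 1 and sets the select FFs so the mux tree forwards LUT 3.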
So, how does Flavia program these LUTs and FFs? Well, each LUT and FF is in a well-defined location in the Xilinx bitstream. Normally, Xilinx does not give you the bitstream format, but they make an exception for LUTs and FFs. This is primarily so you can read back the FPGA bitstream and get the current values of LUTs (which can be used as RAMs) and FFs, but the information provided by Xilinx lets you write your own values into those locations. Flavia simply takes a generic Flavia bitstream (FlaviaP32.bit), writes the LUT and FF values required for each 32:1 mux into the proper locations in the bitstream, calculates a new checksum, and downloads the result into Papilio over the USB connection. All this takes less than a second on a reasonable PC.
Now that we have 32:1 muxes worked out, let’s take a look at the 6-input AFB. Here’s the block diagram:
The four LUTs on the left and their 2:1 muxes calculate an arbitrary 6-input Boolean function of the six inputs p0-p5. Here we make use of the Shannon expansion -- called development by George Boole -- which states that any n-input function can be decomposed by multiplexing two (n-1)-input functions as follows:
f(x1, ..., x(n-1), xn) = (~xn & f(x1, ..., x(n-1), 0)) | (xn & f(x1, ..., x(n-1), 1))
Applying this expansion twice, we can calculate any Boolean function of 6 inputs by multiplexing the outputs of four 4-input functions. We implement the 4-input functions using LUTs, all of which use the same inputs p0-p3. The remaining inputs p4 and p5 are the select inputs for the multiplexers.
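As a rough illustration of this decomposition, the Python sketch below splits an arbitrary 6-input function into four 16-bit LUT tables and then recombines them with p4 and p5 as mux selects. The table layout (p0 as the least significant address bit) and the function names are assumptions made for the example, not Flavia's internal representation.

```python
from itertools import product


def split_into_luts(f):
    """Apply the Shannon expansion on p5 and then p4: return four 16-bit
    tables, one per (p5, p4) combination, each a function of p0-p3."""
    tables = []
    for hi in range(4):                    # hi = 2*p5 + p4
        p4, p5 = hi & 1, hi >> 1
        table = 0
        for addr in range(16):
            p0, p1, p2, p3 = ((addr >> j) & 1 for j in range(4))
            table |= (f(p0, p1, p2, p3, p4, p5) & 1) << addr
        tables.append(table)
    return tables


def evaluate_afb_function(tables, p0, p1, p2, p3, p4, p5):
    """All four LUTs see p0-p3; p4 and p5 drive the 2:1 muxes."""
    addr = p0 | (p1 << 1) | (p2 << 2) | (p3 << 3)
    outs = [(t >> addr) & 1 for t in tables]
    low = outs[1] if p4 else outs[0]       # half of the function with p5 = 0
    high = outs[3] if p4 else outs[2]      # half of the function with p5 = 1
    return high if p5 else low


def example(p0, p1, p2, p3, p4, p5):
    """An arbitrary 6-input function used only to exercise the sketch."""
    return (p0 & p1 & p2) | (p3 ^ p4 ^ p5)


tables = split_into_luts(example)
matches = all(evaluate_afb_function(tables, *p) == example(*p)
              for p in product((0, 1), repeat=6))
print([hex(t) for t in tables], matches)
```

The exhaustive comparison over all 64 input combinations checks that the four LUTs plus two mux levels reproduce the original function.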
An AFB’s output can be the output of the 6-input function – i.e., combinational logic (C/L) – or the value of the DFF which is updated by the rising edge of a shared 2 Hz FlaviaP32 clock. We use a LUT as a multiplexer to support future applications. The DFF has an asynchronous init signal sr which sets the DFF to 1 or 0, configured by Flavia software. Xilinx FFs have clock enable inputs, but FlaviaP32 doesn’t use them: AFB FF clocks are always enabled.
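The output stage can be modeled behaviorally as follows. This is only a sketch: the class and parameter names are hypothetical, and it assumes the DFF starts out holding its configured init value.

```python
class AfbOutput:
    """Behavioral model of an AFB output stage: a D flip-flop with an
    asynchronous init (sr) and a selector between the combinational value
    and the registered value."""

    def __init__(self, registered, init_value=0):
        self.registered = registered    # True: drive the pin from the DFF
        self.init_value = init_value    # value forced while sr is asserted
        self.q = init_value             # assumption: DFF powers up at init_value
        self._prev_clk = 0

    def step(self, d, clk, sr):
        """Advance the model by one sample of (d, clk, sr); return the output."""
        if sr:
            self.q = self.init_value    # asynchronous: ignores the clock
        elif clk and not self._prev_clk:
            self.q = d                  # capture d on the rising clock edge
        self._prev_clk = clk
        return self.q if self.registered else d


afb = AfbOutput(registered=True, init_value=1)
print(afb.step(d=0, clk=0, sr=0), afb.step(d=0, clk=1, sr=0))
```

There is no clock-enable input in the model, matching the statement that AFB FF clocks are always enabled.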
As you can see, FlaviaP32 has a very simple architecture which means that synthesis tools can be simple. All AFBs are identical, and there are no routing restrictions so Flavia software can assign any 6-input function to any AFB, restricted only by how the user wants to assign pins to Papilio I/O sockets.
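A back-of-the-envelope LUT count follows from the figures given in this section. The arithmetic below counts only the LUTs named in the text (eight 32:1 muxes of eight LUTs each, four function LUTs, and one output LUT per AFB); the actual FlaviaP32 design may use additional logic that is not counted here.

```python
MUXES_PER_AFB = 8      # six function inputs plus sr and oe
LUTS_PER_MUX = 8       # eight 4-input LUTs per 32:1 mux
FUNCTION_LUTS = 4      # four 4-input LUTs for the 6-input function
OUTPUT_LUTS = 1        # LUT used as the output multiplexer
AFB_COUNT = 32

routing_luts = MUXES_PER_AFB * LUTS_PER_MUX
luts_per_afb = routing_luts + FUNCTION_LUTS + OUTPUT_LUTS
total_luts = luts_per_afb * AFB_COUNT
routing_fraction = routing_luts / luts_per_afb

print(luts_per_afb, total_luts, round(routing_fraction, 3))
```

On this count, roughly nine out of ten LUTs in each AFB go to routing rather than logic, which is why Flavia's capacity is closer to a CPLD than to the underlying FPGA.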
There are a number of improvements planned for future releases, and we hope others will be suggested by the community. For example, the AFB output LUT can implement some logic functions like detecting a state change. A major improvement is to replace each 32:1 multiplexer with a 32-input AND gate or OR gate with invertible inputs, i.e., a minterm or maxterm. This can significantly improve Flavia’s logic capacity with a modest increase in software complexity.
Using FlaviaP32
You can create FlaviaP32 designs using GCHD (GalaxC for Hardware Design) text, logic diagrams, or a combination. This section shows several example designs.
XXICC supports hierarchical design, where a large design is partitioned into smaller modules in the same way that a large program is partitioned into smaller functions. The top-level or root module shows how signals are connected to Papilio One I/O sockets. In a future release you will be able to create a table to show how your root signals are mapped to Papilio I/O, but in 0.0k the root module must use Papilio One signal names.
Papilio One has 48 I/Os arranged in three rows, each with 16 I/Os. Papilio One I/Os are numbered A0-A15, B15-B0, and C15-C0 from bottom to top. Each group of 8 I/Os has four terminals for power and ground. These are for powering Papilio add-on boards called “wings”.
FlaviaP32 has 32 I/Os, which are assigned to A0-A15 and C15-C0. Each I/O can be an input or an output. FlaviaP32 has pull-ups on all pins. Xilinx does not document where the bits are that control whether an I/O has a pull-up or pull-down, so we had to make a choice, and pull-ups let you do open-drain I/Os.
Example 1: Majority/Minority Circuit
Here is a simple GCHD example that produces the majority and minority functions. The inputs are A0, A2, and A4. Output A1 is high if a majority of A0, A2, and A4 are high. Output A3 is the opposite.
include "gchd.gi"; // This line is needed for GCHD designs.
digital hardware (synthesize)
{ // Boolean equations for majority and minority functions.
    net {A0, A2, A4, A6}; // Undriven nets are inputs.
    net A1 = A0 & A2 | A0 & A4 | A2 & A4;
    net A3 = ~A1;
    net {A5 = FALSE, A7 = FALSE}; // Turn off extra LEDs.
};
GCHD consists of extensions to the GalaxC programming language, so the above example is actually a GalaxC program. The first line includes the gchd.gi library, which is needed for GCHD programs. The “digital hardware” construct creates a root module for the logic. The net statements declare signals and optionally give them values. If a net does not have a value, Flavia software treats it as an input. If a net is assigned a value, it’s an output. We named the signals A0-A7 to assign them to specific Papilio One pins.
In the example, we set A1 to the majority function maj(a, b, c) = a b + a c + b c. A3 is the minority function, which is simply the complement of the majority function. A5 and A7 are FALSE to turn off extra LEDs.
To run this example, start up XXICC with the fla option, e.g.,
xxicc fla
where “xxicc” is the XXICC executable on your system – see Installing and Running XXICC rev 0.0k for details.
XXICC will compile and/or load the XXICC source code needed for Flavia and link in the run-time library to talk to Papilio’s FT2232D chip: libftdi.so on GNU/Linux or ftd2xx.dll on MS Windows. If the library is not present, you will get an error message. See Installing and Running XXICC rev 0.0k for how to deal with this.
When XXICC completes loading, it will bring up a file selection dialog. Open “majmin.gal” which contains the above example. You are now in the XXICC Object Editor, or XOE. Compile the example by pressing F6. GalaxC should compile the example, do Flavia logic synthesis (trivial in this case), and download the resulting bitstream to your Papilio One board, all in less than a second.
To see if the FPGA is programmed correctly, hook up LEDs to A1 and A3 through suitable current-limiting resistors. One or the other LED should be on. Then ground A0, A2, and A4 in various combinations. Since A0, A2, and A4 have pull-ups, if you don’t connect them to anything they’ll have the value 1 and the majority output A1 will be high and minority A3 will be low. If you ground one of A0, A2, and A4, you still have a majority high so A1 should still be high. If you ground two of them, then A1 should go low and A3 should go high.
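The expected LED states for each grounding combination can be tabulated with a short Python model. It is a sketch of the logic only: the function names are hypothetical, and unconnected pins are modeled as reading 1 because of the pull-ups.

```python
from itertools import product

PINS = ("A0", "A2", "A4")


def majority(a, b, c):
    """maj(a, b, c) = a b + a c + b c, as in the GCHD example."""
    return (a & b) | (a & c) | (b & c)


def led_outputs(grounded):
    """Return (A1, A3) given the set of input pins tied to ground.
    Pins that are not grounded read 1 because of the pull-ups."""
    a0, a2, a4 = (0 if pin in grounded else 1 for pin in PINS)
    a1 = majority(a0, a2, a4)
    return a1, 1 - a1                  # A3 is the complement of A1


for combo in product((False, True), repeat=3):
    grounded = {pin for pin, is_grounded in zip(PINS, combo) if is_grounded}
    print(sorted(grounded), led_outputs(grounded))
```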
Using Button/LED Wings
The easiest way to connect LEDs and manual inputs is Gadget Factory’s Button/LED Wing, currently US$8. FlaviaP32 uses B15-B0 for Button/LED Wings. The LEDs observe adjacent A1-A15 pins and the buttons invert adjacent A0-A14 inputs. For LED pins, FlaviaP32 simply connects odd-numbered A pins to adjacent B pins according to this table:
LED1 | LED2 | LED3 | LED4 | LED1 | LED2 | LED3 | LED4 |
---|---|---|---|---|---|---|---|
B14 | B12 | B10 | B8 | B6 | B4 | B2 | B0 |
A1 | A3 | A5 | A7 | A9 | A11 | A13 | A15 |
For button pins, odd-numbered B pins invert the input values of adjacent A pins:
PB1 | PB2 | PB3 | PB4 | PB1 | PB2 | PB3 | PB4 |
---|---|---|---|---|---|---|---|
B15 | B13 | B11 | B9 | B7 | B5 | B3 | B1 |
A0 | A2 | A4 | A6 | A8 | A10 | A12 | A14 |
A pins have pull-ups, so an unconnected A input has value 1. Odd-numbered B input pins for buttons have pull-downs. For example, if you press PB1 next to A0, that inverts A0. If A0 is unconnected, PB1 inverts its 1 value to produce 0. If A0 is grounded, PB1 inverts that 0 to produce 1.
The LED and PB connections via B0-B15 are built into FlaviaP32 and do not appear in Flavia designs.
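The button behavior amounts to an exclusive-OR of the pin value with the button state. A minimal Python model, with a hypothetical function name and the pull-up treated as "an ungrounded pin reads 1":

```python
def effective_input(pin_grounded, button_pressed):
    """Value a Flavia design sees on an even-numbered A pin.
    The pull-up makes an ungrounded pin read 1; pressing the adjacent
    push-button inverts whatever value the pin has."""
    pin_value = 0 if pin_grounded else 1
    return pin_value ^ int(button_pressed)


for grounded in (False, True):
    for pressed in (False, True):
        print(grounded, pressed, effective_input(grounded, pressed))
```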
Example 2: Binary Counter
Here’s a more interesting example, a 4-bit binary counter:
include "gchd.gi"; // This line is needed for GCHD designs.
digital hardware (synthesize)
{ // 4-bit binary up-counter.
    clock net clk; // 2 Hz FlaviaP32 clock.
    net reset; // Reset counter if high, otherwise enable counter.
    net up; // Count up if !reset and up, otherwise hold.
    reg {s3, s2, s1, s0}; // Counter state.
    // If reset, asynchronously reset s3-s0.
    if reset then s3 = s2 = s1 = s0 = FALSE else
    if clk rises then
    { // On rising clk, increment s3-s0 if up, otherwise hold s3-s0.
        s3 ^= up & s2 & s1 & s0;
        s2 ^= up & s1 & s0;
        s1 ^= up & s0;
        s0 ^= up;
    };
    // Assign counter I/Os to Papilio One pins.
    net {A0, A2}; reset = !A0; up = A2; // PB inputs.
    net {A1 = s3, A3 = s2, A5 = s1, A7 = s0}; // LED outputs.
};
In this case we have used meaningful net names. “reset” is an asynchronous reset signal that connects to AFB sr inputs. “clk” is the clock signal: FlaviaP32 only has a single 2 Hz “proof of concept” clock. It is built in and does not use any of the 32 I/Os.
The last lines of the example connect the reset and up signals to push-button inputs, and the counter state s3-s0 to LEDs. If A0 and A2 are not connected, they are pulled up to 1 and reset = 0 and up = 1. This makes s3-s0 begin counting immediately. If you ground A0 or press PB1, s3-s0 resets to 0000 and counts from 0000 when you release A0 or PB1. If you ground A2 or press PB2, s3-s0 stops at its current state and resumes counting when you release A2 or PB2.
To run this example, open “Bcount4.gal” in XOE and compile it using F6. If it compiles correctly, Flavia will download the bitstream and it will begin counting.
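If you want to check the counter equations before trying them on hardware, they can be simulated in a few lines of Python. This is a sketch of the logic only, with a hypothetical function name; it is not how Flavia evaluates GCHD. Each bit toggles when up is 1 and all lower bits are 1, and every right-hand side uses the state from before the clock edge.

```python
def counter_step(state, reset, up, clk_rises):
    """One evaluation of the 4-bit counter; state is (s3, s2, s1, s0)."""
    s3, s2, s1, s0 = state
    if reset:
        return (0, 0, 0, 0)            # asynchronous reset wins over the clock
    if clk_rises:
        return (s3 ^ (up & s2 & s1 & s0),
                s2 ^ (up & s1 & s0),
                s1 ^ (up & s0),
                s0 ^ up)
    return state                       # no clock edge: hold


state = (0, 0, 0, 0)
for _ in range(5):
    state = counter_step(state, reset=0, up=1, clk_rises=True)
print(state)
```

With up held at 0 the state is unchanged on every clock edge, which corresponds to grounding A2 or pressing PB2 in the example.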
Example 3: Möbius Counter
The third example describes the design as a logic diagram instead of text. This is a four-bit Möbius or Johnson counter that counts in the sequence 0000, 1000, 1100, 1110, 1111, 0111, 0011, 0001, and back to 0000.
To run this example, open “Moebius4.xoe” in XOE, but don’t compile using F6. Since this design only uses standard gates and flip-flops from gchd.gi, you compile it by clicking in the figure and then pressing F to issue the Flavia command. If it compiles correctly, Flavia will download the bitstream and it will begin counting. Ground A0 or press PB1 to reset to 0000.
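The Möbius counter's sequence is easy to reproduce in software: on each clock the bits shift one place to the right and the complement of the last bit enters on the left. The Python sketch below is an independent model of that behavior with a hypothetical function name; it is not derived from the Moebius4.xoe diagram.

```python
def johnson_step(bits):
    """Shift right by one place, feeding the inverted last bit into the first."""
    return (1 - bits[-1],) + bits[:-1]


state = (0, 0, 0, 0)
sequence = []
for _ in range(8):
    sequence.append("".join(str(b) for b in state))
    state = johnson_step(state)
print(", ".join(sequence))
```

A four-bit counter of this kind visits 8 of the 16 possible states before returning to 0000.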
Basically, that’s all there is to using Flavia. In rev 0.0k, Flavia is particularly easy because there's only one implementation – FlaviaP32 – so you don’t need to select an implementation, and there’s no pinout table. If you want to play with Flavia, you’ll need to read the relevant chapters of The XXICC Anthology rev 0.0k to learn more about XOE and GCHD.
This ’blog is the first part of Chapter 12 of The XXICC Anthology rev 0.0k. The rest of that chapter describes how Flavia synthesizes logic and configures FPGAs. It ends with a Hints section with various suggestions for using Flavia, and an Issues section that lists known problems with the current revision.