Storage Insights #4 - NAND Flash Based SSD Drives and the Flash Controller

7 Jan 2019

In Storage Insights #3, we highlighted the technical components of CompactFlash (CF) and Secure Digital (SD) cards. In Storage Insights #4 we will dive deeper into the primary components of SSD drives, with an emphasis on the Flash Controller.

General

Solid State Disk drives (SSD drives) are becoming increasingly common in personal computers, enterprise server systems, and industrial applications. They either replace mechanical drives or are used alongside them in mixed systems, depending on factors such as cost and reliability, which often trade off against each other.

NAND Flash Based SSD Drives

SSD drives eliminate the mechanical failures caused by shock, vibration and other sources of malfunction. Depending on configuration, they can also take much less space and power than mechanical drives. However, using NAND Flash (SLC, MLC or TLC) presents a challenge to the NAND Flash controller designer: matching the speed and corrected bit error rate of mechanical spinning drives. This is due to the architecture of NAND Flash. As we shall see, this becomes more difficult as process geometries shrink, especially with MLC and TLC types of Flash.

The makeup of an SSD is simple in hardware architecture:

NAND Flash Controller
NAND Flash devices (of type SLC, MLC (PSLC), TLC)
Power Supply Regulation
PCB (usually at least 4 layers)
Enclosure – depends on physical form factor.

NAND Flash Controllers (HW)

Next to the NAND Flash itself, the controller is the most important component in the SSD. The HW (hardware) and FW (firmware) work together to accomplish a very difficult task.

The controller hardware contains:

Host Interface (PATA, SATA, SD, SDIO, MMC, eMMC, USB, PCIe, etc.)- Communicates in both directions with the host device. The host interface implements the required protocol and uses Direct Memory and/or Flash Access to offload the main FW.
Flash Bus Interface(s) HW- Interfaces with one or more NAND Flash devices, be they SLC, MLC, PSLC, TLC or, rarely, a combination of these. This is usually a multi-channel bus (often with interleaving within each channel), which creates parallelism for data transfers and increases throughput.
Direct Memory and Flash Access (DMA/DFA) HW- Sector buffers can be transferred directly to and from RAM and Flash without CPU intervention. This also increases throughput, since it offloads the CPU.
Error Detection and Correction HW (ECC)- This HW block sits between the Flash and the CPU. It detects and corrects data that contains bit errors “on the fly”. This is a critical piece of HW and works hardest with small-process MLC/TLC, where 96 or more bits of correction are required.
Data Scrambler HW- Better-designed controllers use a dedicated scrambler unit, which modern MLC/TLC Flash requires. Some controllers use their encryption unit to accomplish this task. Combining the scrambler with Direct Memory and Flash Access speeds up transfers considerably.
Data Encryption HW- Encrypting and decrypting data to and from Flash, using a technique such as AES, is becoming a requirement for secure applications. This must be done in HW. Keys are generated and used to access the secure data, even the FW.
ISO 7816 Secure Serial Data Port (optional, but becoming required in applications such as SD and SDIO cards)
SRAM (with parity becoming a must)- FW runs in SRAM. Mapping tables are cached in SRAM. Temporary data storage and sector buffers reside here as well. SRAM is very limited in most controllers, but 256K+ is common. FW overlays are therefore common, loaded as needed by the resident FW.
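
To illustrate how a multi-channel Flash bus with interleaving creates parallelism, here is a minimal Python sketch. It assumes a simple round-robin striping scheme; the names FlashTarget and stripe_page, and the defaults of 4 channels and 2 interleaved devices ("ways") per channel, are hypothetical and not taken from any particular controller.

```python
from typing import NamedTuple


class FlashTarget(NamedTuple):
    channel: int  # which Flash bus channel carries the transfer
    way: int      # which interleaved device on that channel
    page: int     # page offset inside that device


def stripe_page(page_index: int, channels: int = 4, ways: int = 2) -> FlashTarget:
    """Map a sequential page index to (channel, way, page), round-robin.

    Assumed scheme: consecutive pages go to consecutive channels first,
    then to the next interleaved device, so neighbouring pages can be
    transferred in parallel.
    """
    channel = page_index % channels
    way = (page_index // channels) % ways
    page = page_index // (channels * ways)
    return FlashTarget(channel, way, page)


# Eight consecutive pages land on eight different (channel, way) pairs.
for i in range(8):
    print(i, stripe_page(i))
```

With these assumed defaults, a sequential transfer of eight pages touches every channel and every interleaved device once before any device is asked for a second page.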

NAND Flash Controller FW Basics

At the top of the importance list is the NAND Flash controller FW. Having the most sophisticated HW does no good if the FW isn’t written properly and doesn’t make optimum use of the HW features.

Every controller has a built-in ROM (or a Flash that is locked after programming).

The ROM code acts like a BIOS in a PC. It performs basic CPU initialization, places the host port in a busy status, and performs basic Flash reset and initialization. It then scans for the presence of initialized Flash devices and searches them for a key indicating that the FW and the basic required structures have been installed. This installation will already have been done by a preformatting process, provided by the controller vendor, that is performed prior to use.

If all goes right, the ROM loads the resident part of the FW that usually resides in the first Flash device.

At this point, the initialization part of the FW completes its power-up procedure: it scans for Flash errors in the last written data, corrects what is needed, and then releases the busy status at the host port. The controller is now able to receive commands from the host. It is important that this power-up initialization be kept as short as possible to avoid host/device issues.
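
The power-up sequence described above can be sketched roughly as follows. This is a simplified Python model, not real ROM code: the FW_KEY signature, the step names, and the "halt_no_firmware" outcome when no key is found are all assumptions made for illustration, since the actual key format and failure handling are vendor specific.

```python
FW_KEY = b"FWOK"  # hypothetical signature written by the preformat process


def find_firmware(devices):
    """Return (device_index, firmware_bytes) for the first device whose
    leading bytes match FW_KEY, or None if no such device exists.

    `devices` is a list with one entry per Flash device: a bytes object
    standing in for the start of the device, or None if it is absent.
    """
    for index, image in enumerate(devices):
        if image is None:
            continue  # device not present or not initialized
        if image[:len(FW_KEY)] == FW_KEY:
            return index, image[len(FW_KEY):]
    return None


def power_up(devices):
    """Return the ordered list of boot steps taken for these devices."""
    steps = ["cpu_init", "host_busy", "flash_reset"]
    if find_firmware(devices) is None:
        steps.append("halt_no_firmware")  # assumed behaviour, not from the text
        return steps
    steps += ["load_resident_fw", "scan_last_writes", "host_ready"]
    return steps


print(power_up([b"FWOK" + b"resident firmware image"]))
```

The host port stays busy from the second step until "host_ready", which is why the text stresses keeping the steps in between as short as possible.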

Firmware Structure

FW should be written in modular form, for example:

Power-up Procedures
Host Interface Procedures
Flash Translation Layer (FTL)*
Flash Read/Write procedures (aided by DMA and DFA HW)**
Encryption/Decryption (aided by HW)
Hooks for customer-specific add-ons
Debugging procedures

*Although all aspects of FW affect performance, the FTL is the most critical. It can make or break an SSD’s performance and life. The remainder of this document will cover FTL basics.

**Flash procedures include a common part that is applicable to all Flash, and Flash-specific procedures (for SLC, MLC, TLC, PSLC, and various vendors) that are loaded at preformat time.

Flash Translation Layer (FTL)

The FTL bears the brunt of the work of Controller FW. It is made up of the following:

Logical to Physical Mapping Procedures- The basic unit of transfer in disk drives is referred to as an LBA (Logical Block Address), or sector. This must be stored in physical media, which in the case of an SSD is the NAND Flash. As mentioned previously, this can be Single-Level Cell (SLC), Multi-Level Cell (MLC, 2 bits per cell), Triple-Level Cell (TLC), or Pseudo SLC (PSLC, a mode of MLC made to simulate SLC for special usage).
Logical to Physical Mapping Tables- These tables hold the information that allows locating and placing LBAs (Logical Blocks) in the PBAs (Physical Blocks) of the NAND Flash. These tables can be quite large, depending on the mapping scheme used. There are 2 basic mapping schemes, Block Based mapping and Page Based mapping, and many variants of each.
Defective Flash Block Tables- These tables hold the initial manufacturer-marked defects and can also be augmented with additional defects as blocks go bad. Some controllers only hold manufacturer defects here, and simply remove the dynamic defects from the mapping tables as they occur. Generally, Flash vendors guarantee that no more than 2% of the total blocks in the Flash device will be defective. This includes initial defects plus dynamic defects.
Flash Log Blocks- Part of the management tables, these blocks hold a history of the latest transactions as they occur. They are usually held in RAM and flushed to Flash after a transaction is completed.
Spare Flash Block Tables- These hold physical addresses of spare blocks used to replace dynamic defects as they occur. Enough spares must be allocated to cover anything over the initial defects.
Wear Leveling- This function ensures that all blocks in the Flash devices are used as evenly as possible. This is as important as the mapping scheme used, or more so. Without it, the SSD would not last long, because NAND Flash by itself has a fairly limited life span. The life of Flash is expressed in Block Program/Erase (P/E) cycles. The Flash is made up of Blocks of cells, and these Blocks can only be erased and programmed a limited number of times.
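
As a rough illustration of the wear leveling idea, the Python sketch below always hands out the free block with the fewest erases. It is a minimal model of one simple policy only (often called dynamic wear leveling), not the algorithm of any specific controller; the class name WearLeveler and the default limit of 3000 P/E cycles are assumptions chosen for the example.

```python
class WearLeveler:
    """Track per-block erase counts and allocate the least-worn free block."""

    def __init__(self, num_blocks, max_pe_cycles=3000):
        self.erase_counts = [0] * num_blocks
        self.free = set(range(num_blocks))
        self.max_pe_cycles = max_pe_cycles  # assumed endurance limit

    def allocate(self):
        """Return the free block with the lowest erase count."""
        usable = [b for b in self.free
                  if self.erase_counts[b] < self.max_pe_cycles]
        if not usable:
            raise RuntimeError("no usable free blocks left")
        # Ties are broken by block number so the result is deterministic.
        block = min(usable, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(block)
        return block

    def release(self, block):
        """Erase a block (one P/E cycle) and return it to the free pool."""
        self.erase_counts[block] += 1
        self.free.add(block)


wl = WearLeveler(num_blocks=4)
for _ in range(40):
    wl.release(wl.allocate())
print(wl.erase_counts)
```

After 40 allocate/erase rounds over 4 blocks, the erase counts stay within one cycle of each other instead of piling up on a single block.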

Error Detection and Correction (ECC)

This function is handled in HW, on the fly. Supporting FTL procedures are invoked to handle the administration of errors when they occur. The better controllers read the number of bits corrected on each read (the HW is designed to contain counters for this). When the errors reach a percentage of the correction capability, the block is scheduled for a refresh (Erase and Swap). Usually an interrupt is used to alert the CPU that this has occurred.

By doing this, errors caused by read/write disturbs and soft errors can be handled more efficiently. This is especially necessary with MLC or TLC Flash built on ever-shrinking processes.
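
The refresh decision described above can be sketched in a few lines of Python. The 96-bit correction capability echoes the figure quoted earlier in this post; the 75% threshold and the RefreshScheduler name are assumptions for illustration, since the actual percentage is a controller design choice.

```python
from collections import deque


class RefreshScheduler:
    """Queue blocks for refresh when corrected-bit counts get too high."""

    def __init__(self, correction_capability=96, threshold=0.75):
        # Number of corrected bits at which a refresh is scheduled.
        self.limit = correction_capability * threshold
        self.pending = deque()  # blocks waiting for Erase and Swap

    def on_read(self, block, corrected_bits):
        """Handle the corrected-bit count reported by the ECC HW for a read.

        Returns True if this read caused the block to be queued.
        """
        if corrected_bits >= self.limit and block not in self.pending:
            self.pending.append(block)
            return True
        return False


sched = RefreshScheduler()
sched.on_read(block=5, corrected_bits=10)   # well below the limit
sched.on_read(block=5, corrected_bits=72)   # reaches 75% of 96 bits
print(list(sched.pending))
```

Scheduling the refresh before the correction capability is exhausted is the point of the scheme: the data is still fully correctable when it is moved.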

Read Disturb Management

NAND Flash blocks support a limited number of reads without an intervening P/E cycle. This function maintains a read counter for each block. When the number of reads approaches a percentage of this limit, the data is moved to a block with fewer reads and the original block is erased. Typically, 1M reads is specified for SLC, and fewer for MLC or TLC.
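
A per-block read counter of this kind might look like the following Python sketch. The 1M read limit is the SLC figure quoted above; the 90% trigger point and the ReadDisturbMonitor name are assumptions made for the example.

```python
class ReadDisturbMonitor:
    """Count reads per block since the last erase and flag blocks to move."""

    def __init__(self, read_limit=1_000_000, threshold=0.9):
        # Read count at which the block's data should be relocated.
        self.trigger = int(read_limit * threshold)
        self.reads = {}  # block number -> reads since last erase

    def record_read(self, block):
        """Count one read; return True once the block needs relocating."""
        count = self.reads.get(block, 0) + 1
        self.reads[block] = count
        return count >= self.trigger

    def on_erase(self, block):
        """A P/E cycle resets the read-disturb exposure of the block."""
        self.reads[block] = 0


# Small numbers keep the demonstration short: trigger at 8 of 10 reads.
monitor = ReadDisturbMonitor(read_limit=10, threshold=0.8)
flags = [monitor.record_read(3) for _ in range(8)]
print(flags)
```

Only the eighth read returns True in this run; after the data is moved and on_erase is called, the counter for that block starts again from zero.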

NAND Flash and Logical to Physical Mapping

NAND Flash, regardless of type (MLC, TLC, SLC), consists of Flash blocks. Each block, depending on density, contains a number of pages. Pages contain a number of sectors (each 512 bytes), plus overhead. Page sizes vary: 4k (8 sectors), 8k (16 sectors), or 16k (32 sectors).
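
The sector, page and block arithmetic can be made concrete with a short Python sketch. The 512-byte sector and the page sizes come from the text; the figure of 128 pages per block and the function names are assumptions, since pages per block vary by device. The second function gives a rough sense of why page-based mapping tables are so much larger than block-based ones.

```python
SECTOR_BYTES = 512  # sector size stated in the text


def locate_sector(lba, page_bytes=4096, pages_per_block=128):
    """Split an LBA into (logical block, page in block, sector in page).

    This is only the logical position; the FTL mapping tables decide
    which physical block actually holds the data.
    """
    sectors_per_page = page_bytes // SECTOR_BYTES
    logical_page, sector_in_page = divmod(lba, sectors_per_page)
    logical_block, page_in_block = divmod(logical_page, pages_per_block)
    return logical_block, page_in_block, sector_in_page


def mapping_entries(capacity_bytes, page_bytes=4096, pages_per_block=128):
    """Count the table entries needed by each basic mapping scheme."""
    pages = capacity_bytes // page_bytes
    return {"page_based": pages, "block_based": pages // pages_per_block}


print(locate_sector(1029))
print(mapping_entries(1 << 30))  # 1 GiB of Flash
```

Under these assumptions, page-based mapping needs one entry per page, so it requires pages_per_block times as many entries as block-based mapping for the same capacity.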

To learn more:

Look for the next Storage Insights post soon. If you missed the previous Storage Insights posts, check them out here:

Storage Insights #1 - Why Not All SD Cards are Created Equal

Storage Insights #2 - Optimizing your Storage Device

Storage Insights #3 - Technical Guide for CompactFlash (CF) and Secure Digital (SD) Cards

Also, feel free to reach out to Delkin with your storage questions!

Delkin Devices or https://www.delkin.com/contact/

Excerpts from the original Delkin document: