A Comparison of Flash Devices for Embedded System Booting
April 23, 2019
This article explores the pros and cons of different types of Flash memory in the context of embedded system booting.
Many embedded systems utilize non-volatile memory to store boot code, configuration parameters, and other data that persist when the system is powered down. Today, Flash memories fulfill this role in the majority of embedded systems. Given the wide range of applications that require Flash, there are many kinds of architectures and feature sets to match the varying requirements of the applications.
Common Flash technologies include Parallel or SPI NOR flash, SLC NAND, and eMMC devices. Most recently, Universal Flash Storage (UFS) has also become an option. This article explores the pros and cons of these different types of Flash memory in the context of embedded system booting.
Complexity of Embedded System Booting
Booting up an embedded system may look easy from an outside point of view. However, booting up involves many steps and warrants careful design considerations if fast and reliable booting is required.
Typically, an embedded system boot-up sequence (see Figure 1) involves the following steps:
Power Up or Hardware Reset: This is the first step for booting an embedded system. It can either be turning on the system power or triggering the system’s hardware reset. From this point onwards, the system starts its code execution.
Boot ROM or Bootstrapped: The core jumps to the reset vector and attempts to execute the first code. Some processors have a small internal boot ROM that can be programmed at manufacturing time. Boot ROM code can perform some essential initialization of the processor, such as setting up the clocks, stacks, and interrupts. The boot ROM can also detect where the bootloader is stored; for example, in an external NOR or NAND flash device.
Some processors can be bootstrapped to directly execute code from an external Flash device. This normally requires the processor’s hardware to natively support the particular bus interface that communicates with the external Flash device since no software initialization has been completed yet.
Bootloader XIP or Shadowing: For the processor to execute code, random access to code storage is necessary. If a NOR Flash device is used to store the bootloader, the processor can directly run off the Flash device. This method is commonly called eXecute In Place (XIP). If a NAND or eMMC device is used, boot code first needs to be copied to the system’s RAM. Then the processor can jump to the RAM space and execute. This method is called shadowing or Store-and-Download (SnD).
The bootloader at this stage is sometimes referred to as a 2nd stage bootloader (for example, U-Boot for Linux applications). It is used to set up the system and load the remaining software such as the operating system and file system. It may also perform system initialization and continue the boot process via a peripheral device that is not natively supported by the boot ROM or hardware.
After bootloader initialization, the system can start handling basic interrupts and simple operating tasks.
Kernel OS and/or Filesystem: This is an optional step, depending on the system. If the embedded system uses an operating system or a file system, those software components also need to be loaded into the RAM memory. Due to the larger size of the software for an OS and file system, it will take longer for the system to complete this step and run in full operation mode.
After all software components are loaded, user applications can start to run.
A common use case is to use Flash to store the 2nd stage bootloader as well as the OS and file system software. After the bootloader comes up, the system has limited functionality and continues the boot process to load the OS and file system.
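The choice between XIP and shadowing in the steps above can be sketched as a small model. This is only an illustration in Python, not firmware: the names `FlashType`, `XIP_CAPABLE`, and `boot_plan`, as well as the step descriptions, are hypothetical and chosen for this example.

```python
from enum import Enum
from typing import List


class FlashType(Enum):
    NOR = "NOR"
    SLC_NAND = "SLC NAND"
    EMMC = "eMMC"
    UFS = "UFS"


# Assumption for this sketch: only NOR provides the random access needed for XIP.
XIP_CAPABLE = {FlashType.NOR}


def boot_plan(flash: FlashType, has_os: bool) -> List[str]:
    """Return the ordered boot steps for a given boot device (illustrative only)."""
    steps = ["power-up or hardware reset", "boot ROM or bootstrap"]
    if flash in XIP_CAPABLE:
        # XIP: the processor fetches bootloader code directly from Flash.
        steps.append(f"execute bootloader in place from {flash.value}")
    else:
        # Shadowing (store-and-download): copy to RAM first, then jump.
        steps.append(f"copy bootloader from {flash.value} to RAM")
        steps.append("jump to RAM and execute bootloader")
    if has_os:
        steps.append("load kernel/OS and file system into RAM")
    steps.append("run user applications")
    return steps


if __name__ == "__main__":
    for step in boot_plan(FlashType.EMMC, has_os=True):
        print(step)
```

With a NOR device the plan has one step fewer, because the copy-to-RAM stage disappears.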
Priorities in Different Target Applications
Before deciding the type of Flash to use for booting up an embedded system, consider the target application requirements and user expectations. Table 1 summarizes the top priorities of different market segments.
For Automotive and Industrial applications, the top priorities are as follows:
Functional Safety: Functional safety is about eliminating unreasonable risk due to hazards caused by malfunctioning behavior of electronic systems. It is a major consideration when designing an automotive or industrial application. Choosing a device that is designed with functional safety in mind helps to achieve the required automotive or industrial safety standards, such as ISO 26262. In cases where advanced levels of functional safety are required, using a device designed for functional safety is essential.
Reliability: When a flash device is used to store boot code, the correct data must be read reliably upon every power-up for the lifetime of the product. Data retention time of the Flash is important for the usually static bootloader code. For consumer products such as a cellphone, the expected life span is short. However, for automotive and industrial applications, Flash devices must last more than 15 years and are required to maintain data integrity during the entire life span of the system.
Security: Data security is becoming more and more important as devices continue to be interconnected. Data storage with robust security technologies can protect critical contents such as proprietary information and commercial secrets. Newer flash devices provide different levels of security to protect data from being overwritten, inadvertently erased, or copied to a cloned device. Through encryption and other cryptographic techniques, Flash devices can be used as part of the chain of trust for booting.
Performance: The performance of the boot device directly impacts system startup time and operation, especially in applications where the system must be guaranteed to be up and functioning within a certain time after power-up. For a flash device used for booting, performance depends not only on how fast data can be read out of the device, but also on how fast the device itself initializes after power is applied.
Endurance: Endurance in a Flash device defines how many times the memory can be programmed and erased while still maintaining its specified retention time. For many embedded systems, data must be reliable for years, even if the Flash is repeatedly read, erased, and programmed.
While NOR flash and SLC NAND typically have endurance in the 10K to 100K cycle range, MLC NAND may have only 5K cycles or fewer, and TLC NAND may offer cycles only on the order of hundreds. Typically, the denser the Flash cells are, the fewer erase and write cycles can be performed before permanent cell failure.
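To see what these cycle counts mean over a product lifetime, the sketch below estimates how many years a block lasts at a given erase rate. The function name and the example rate of 10 erase cycles per block per day are assumptions for illustration, as is the value of 500 cycles used to stand in for "hundreds"; real workloads and wear leveling change the picture considerably.

```python
def years_until_wearout(endurance_cycles: int, erases_per_block_per_day: float) -> float:
    """Years before a block reaches its rated program/erase cycle count."""
    if erases_per_block_per_day <= 0:
        raise ValueError("erase rate must be positive")
    return endurance_cycles / erases_per_block_per_day / 365.0


if __name__ == "__main__":
    # Order-of-magnitude endurance figures from the text; 500 is an assumed
    # stand-in for "on the order of hundreds".
    devices = [
        ("NOR / SLC NAND (100K cycles)", 100_000),
        ("MLC NAND (5K cycles)", 5_000),
        ("TLC NAND (500 cycles, assumed)", 500),
    ]
    for name, cycles in devices:
        years = years_until_wearout(cycles, erases_per_block_per_day=10)
        print(f"{name}: {years:.1f} years")
```

Under this assumed rate, a 100K-cycle block lasts roughly 27 years, while a 5K-cycle block lasts under a year and a half, which is why endurance matters for a 15-year design target.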
Comparison of NOR Flash, SLC NAND, eMMC, and UFS
By the nature of their underlying technology, each type of Flash device has characteristics that may be suitable for one application while not for another. Table 2 offers a comparison of characteristics that are relevant to embedded system design.
Some factors mentioned in the table are typically well known, such as density, supported temperature range, and read bandwidth. Designers can choose an appropriate boot device based on the specific bootloader. For example, a very large bootloader needs a correspondingly larger device to store it. Most bootloaders, however, such as U-Boot, are on the order of megabytes and well within the density range of NOR flash. This means users may have many options to consider.
Other important factors to consider include device initialization time, XIP capability, and data reliability.
Device Initialization: This is the time between when the device is powered and when it can output data reliably. If the system requires a very fast boot time, initialization time may be a significant factor. If the system needs to execute code directly from Flash (i.e., instead of shadowing to RAM), running on NOR flash is the only suitable choice, as shown below.
eXecute In Place (XIP): XIP capability allows the system to reduce the amount of expensive RAM it needs. Instead of shadowing code to RAM, the processor can execute directly from a NOR flash device. This approach can reduce the number of pins the processor needs to support the DRAM device, thus reducing PCB and overall system cost significantly.
Booting Requirements
Different applications have different requirements for booting. Here we choose an example from automotive applications to discuss specific booting requirements.
Figure 2 shows a typical automotive system. All subsystems are connected via the CAN bus or other networking protocols.
In automotive applications, the CAN bus has a startup requirement of 100 ms. This means a subsystem ECU (Electronic Control Unit) must be able to reply to a CAN message within 100 ms after power-on reset (POR). If a subsystem cannot boot within 100 ms, it may miss critical CAN messages, an outcome that is not acceptable. When designing an automotive subsystem, fast boot-up time is an important requirement to be considered in addition to all the usual requirements for automotive applications, such as functional safety, temperature range, etc.
For applications requiring very fast boot time, such as the automotive case illustrated above, a fast memory is needed as the booting device. One common mistake is to equate fast read bandwidth with fast boot time, as this focuses only on how long it takes to move code and data from the boot Flash to RAM. However, once device initialization time and bootloader size are taken into consideration, it becomes clear that read time from the boot flash is not the main bottleneck in the booting sequence.
Modern NOR Flash devices, such as Semper NOR Flash from Cypress, offer fast initialization time and high bandwidth to minimize boot-up time. The bandwidth of Semper NOR can go as high as 400 MB/s when it is used with the JEDEC xSPI interface in either the Octal or HyperBus bus protocol. Considering a typical U-Boot size of between 1 MB and 2 MB, a read bandwidth of 400 MB/s translates to a read time of 2.5 ms to 5 ms, plus a maximum 300 µs device initialization time for the Semper NOR Flash. Compare this with an eMMC initialization time of about 100 ms and a UFS initialization time of 50 ms. The total system boot-up time using Semper NOR Flash is significantly under the automotive 100 ms boot-up requirement. The NOR Flash device also meets the ISO 26262 standard and is compliant with ASIL-B.
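The arithmetic above can be checked with a short calculation. The sketch below uses only the figures quoted in this article (a 2 MB bootloader, 400 MB/s, 300 µs, 100 ms, and 50 ms). The function name is hypothetical, the eMMC and UFS lines count initialization time only because no read bandwidth is quoted for them here, and processor and boot ROM time are ignored.

```python
CAN_BUDGET_MS = 100.0  # CAN startup requirement cited in the text


def load_time_ms(image_mb: float, bandwidth_mb_s: float, init_ms: float) -> float:
    """Device initialization time plus time to read the boot image, in ms."""
    return init_ms + image_mb / bandwidth_mb_s * 1000.0


if __name__ == "__main__":
    # Worst case from the text: 2 MB image, 400 MB/s, 300 us (0.3 ms) init.
    nor = load_time_ms(image_mb=2.0, bandwidth_mb_s=400.0, init_ms=0.3)
    print(f"NOR: {nor:.1f} ms, margin {CAN_BUDGET_MS - nor:.1f} ms")

    # For eMMC and UFS only the initialization times are quoted.
    for name, init_ms in [("eMMC", 100.0), ("UFS", 50.0)]:
        share = init_ms / CAN_BUDGET_MS
        print(f"{name}: initialization alone uses {share:.0%} of the budget")
```

The NOR case comes to about 5.3 ms, leaving most of the 100 ms budget for the rest of the boot sequence, whereas eMMC initialization alone consumes the entire budget before any code is read.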
In some applications, such as Industrial or IoT, it is desirable to execute directly from the flash device (XIP) instead of copying the bootloader to RAM. Compare this to a store-and-download boot-up scheme using eMMC for storage and LPDDR2 RAM for code execution. The wide data bus of the DRAM requires many PCB layers to accommodate. If the processor runs XIP directly on NOR Flash using, for example, x8 Octal SPI Flash, the number of pins is significantly reduced (see Figure 3). The result is a savings of 2 to 4 PCB layers, leading to lower overall system cost.
As previously mentioned, automotive and industrial applications require Flash devices to operate reliably for more than 15 years and maintain stored data integrity. Typically, SLC NAND and MLC NAND have worse bit error rates than NOR devices. Bit errors may occur in writing to the memory array, or from electron leakage caused by read disturbs or other factors. To compensate for the risk of losing data, strong ECC schemes are required in NAND devices. Raw SLC NAND devices may even require ECC functionality on the host side. eMMC has its own controller that handles such functions. The need for error correction and bad block management in SLC NAND and eMMC devices adds to the overall system complexity and cost. It is also an important consideration in meeting functional safety and data reliability requirements.
NOR Flash can provide the endurance required for these types of applications. For example, the EnduraFlex technology implemented in Semper NOR Flash provides over 1 million cycles of endurance in 512 Mb density devices, and over 2.5 million cycles in 1 Gb devices. These devices can also be partitioned and provisioned to have high endurance regions and long retention regions that guarantee 25 years of data integrity. A single NOR Flash device can therefore store both bootloader code and file system code, which sit at opposite ends of the retention and endurance requirements; i.e., developers can provision long retention regions for the bootloader code while reserving other memory areas as high endurance regions for the file system.
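As a toy illustration of what error correction involves, the sketch below implements a Hamming(7,4) code, which corrects any single flipped bit in a 7-bit codeword. This is not the scheme used by any particular NAND or eMMC device (real devices typically use stronger codes such as BCH or LDPC over much larger blocks), and the function names are chosen for this example.

```python
from typing import List


def hamming74_encode(d: List[int]) -> List[int]:
    """Encode 4 data bits into a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(c: List[int]) -> List[int]:
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(c)  # work on a copy so the caller's list is not modified
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the bad bit, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1  # simulate a single bit error, e.g. from a read disturb
    print(hamming74_decode(word))
```

Even this minimal code stores 7 bits for every 4 bits of data and needs extra logic on every read, which hints at the complexity and cost that ECC adds when it must be handled on the host side.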
In conclusion, although SLC NAND, eMMC, and UFS have a lower cost per bit, NOR Flash devices can still be the best choice for an embedded system booting device, especially in applications that require very fast system boot-up time. With NOR Flash technology offering important reliability features such as fast initialization time, XIP capabilities, and the flexibility of configuring long retention and high endurance regions, it is quickly becoming the non-volatile memory of choice for systems that require fast and reliable boot-up.
Zhi Feng is the Senior Principal Applications Engineer in Cypress’ Flash group. He completed his bachelor’s degree in EE from South China University of Technology, and his Master of Applied Science degree from University of Ottawa in Canada. He has been with Cypress for 15 years, working in the NOR and NAND memory fields.