This learning module will discuss the functionality of an OS, key components of Linux, and Linux as an OS for embedded devices.
Embedded Software II: Embedded Linux
Sponsored by element14 Community
1. Introduction
In the late 1980s, Andrew Tanenbaum, Computer Scientist and Educator, developed an educational operating system (OS) named MINIX, a POSIX-compliant, UNIX-like OS with reduced functionality and a microkernel. The first version of MINIX was small enough that the full source code could be printed in Tanenbaum's Operating Systems: Design and Implementation textbook.
Some time later, Linus Torvalds encountered MINIX as a student in Helsinki, Finland. Inspired by MINIX and learning important concepts from it, Torvalds developed a monolithic kernel that came to be known as Linux. Around the same time, Richard Stallman, who shepherded the GNU Project through the Free Software Foundation, was looking for an acceptable kernel for the GNU Project (GNU is a recursive acronym for “GNU's Not Unix”). Linux seemed like a great fit for the project, and so, eventually, GNU/Linux (a combination of GNU Project software and the Linux kernel) was released. Stallman strongly advocates that the OS now known as Linux be called GNU/Linux for this reason, though this name does not exactly roll off the tongue.
Because of its versatility, stability, open source nature, and large community of users, Linux has emerged as one of the most widely used operating systems for embedded devices. This Essentials will discuss the functionality of an OS, key components of Linux, and Linux as an OS for embedded devices.
2. Objectives
Upon completion of this module, you will be able to:
- Explain what Linux is
- Explain what the term distribution means in the world of Linux
- Understand some of the core elements of Linux
- Understand the differences between Desktop and Embedded Linux
- Try out an Embedded Linux distribution
- Get started developing your own Embedded Linux Distribution
3. What Is an Operating System?
In general, when we use the term Operating System, we are referring to all of the software required to get the computer functional; functional in this case means more than just turning the computer on, as it has to also be usable by an end user. The end user should be able to perform tasks, and this will typically involve additional software, such as a spreadsheet, to get useful work done. The OS is thus a bundle of closely related software that makes the computer usable.
The OS includes the kernel, which provides software that controls and gives access to the different hardware resources of a computer, and provides services to the application software that a user may want to install to accomplish different tasks.
In a typical desktop OS, the OS also creates the virtual environment that gives the appearance that dozens, if not hundreds, of programs are running at the same time. Running programs are composed of processes and threads. In reality, only as many processes as the microprocessor has cores can truly execute simultaneously, but through multitasking (often loosely called multiprocessing), the OS manages to execute many processes on a small number of shared cores. In simple terms, the OS switches between dozens of processes many times a second, and thereby presents the illusion that all of them are running simultaneously. Modern microprocessors execute billions of instructions per second, so even if hundreds of processes are running, each one can still have millions of its instructions executed every second. The reality is, of course, much more complicated, and interested readers are encouraged to look into multitasking and process scheduling as areas to explore further.
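The switching described above can be sketched with a toy round-robin scheduler. This is only an illustration, assuming a single core and a fixed time slice (the `quantum`); the process names and time units are made up, and real schedulers, including the one in Linux, are far more sophisticated.

```python
from collections import deque


def round_robin(burst_times, quantum):
    """Simulate round-robin time slicing on a single core.

    burst_times maps a (hypothetical) process name to the CPU time units
    it still needs. Returns the executed slices as (name, units_run) pairs.
    """
    ready = deque(burst_times.items())  # queue of processes waiting for the CPU
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        run = min(quantum, remaining)   # run for one time slice at most
        timeline.append((name, run))
        if remaining > run:             # not finished: go to the back of the queue
            ready.append((name, remaining - run))
    return timeline


if __name__ == "__main__":
    jobs = {"editor": 3, "browser": 5, "player": 2}
    for name, units in round_robin(jobs, quantum=2):
        print(f"{name} runs for {units} unit(s)")
```

Because every process gets a turn after a short slice, all three appear to make progress together even though only one runs at any instant.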
In an embedded system, the number of processes that must run simultaneously is far smaller than in a typical desktop OS. As such, the multiprocessing system is less complex, and in some cases the embedded system can run a single execution environment, meaning a system that is only capable of running one process (or program) at a time. Of course, with a single execution environment, there is no need for the complexity of an OS like Linux; something extremely simple will do in these cases. An example of a single execution environment embedded system that readers may be aware of is the popular Arduino system; please note, however, that there are multi-core Arduino variants, such as Shieldbuddy, available now as well. We are using the example of Arduino for illustrative purposes.
Multiprocessing capabilities aside, a more important consideration for the software is making the entire system function seamlessly. On a desktop system, that means that, if I move the mouse, I should get some response from the system. The same goes for other input/output devices. All of the hardware works seamlessly in contemporary systems, because the OS takes charge of managing them and controlling access to them. The OS manages these tasks by having the kernel work closely in conjunction with specialized software called drivers. Drivers enable the control, access, and coordination of the devices and peripherals connected to a computer. Drivers are often provided by the manufacturer of the device or peripheral; however, standard drivers that can be used interchangeably between similar types of devices have become commonplace. One example is a mouse; you usually do not have to install a manufacturer's driver for a mouse.
Figure 1: OS Hardware Abstraction
The OS rarely lets application software directly access devices and peripherals; instead, access is usually controlled by the kernel acting as an intermediary. An application seeking to access a device, for example, a word processor saving a document, will request the OS to perform that function via a System Call.
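As a rough illustration of the word processor example, the sketch below saves and reloads a document using Python's `os.open`, `os.write`, `os.read`, and `os.close`, which are thin wrappers around the corresponding system calls. The function names `save_document` and `load_document` are hypothetical; the point is only that the application asks the kernel to do the device access on its behalf.

```python
import os
import tempfile


def save_document(path, text):
    """Ask the kernel to create the file and write the text to it."""
    data = text.encode("utf-8")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        offset = 0
        while offset < len(data):
            # os.write may write fewer bytes than requested, so loop.
            offset += os.write(fd, data[offset:])
    finally:
        os.close(fd)


def load_document(path):
    """Ask the kernel to read the file back, 4096 bytes at a time."""
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 4096)
            if not chunk:  # an empty result means end of file
                break
            chunks.append(chunk)
    finally:
        os.close(fd)
    return b"".join(chunks).decode("utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "letter.txt")
        save_document(path, "Dear reader,\n")
        print(load_document(path), end="")
```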
The kernel also controls access to the microprocessor; this is called process management. As noted above, process management is generally what makes multiprocessing or multitasking possible. We will look at the common functions and components of an OS in the context of Linux in a later section.
4. What is Linux?
Linux is a free, open source Operating System. As noted above, the name Linux actually applies to the kernel; however, it is now common to refer to both the kernel and a particular variant of the OS (kernel plus other software) as being Linux. There are Linux variants (distributions) that have fewer or almost no components of the GNU project. A popular example of this is the Android OS, which uses a modified Linux kernel and a variety of open source software, but very little from the GNU project.
In the case of Embedded Linux, because the target computer is radically less complex than a typical desktop computer, the bundle is composed of a reduced set of non-kernel utilities, as well as potentially a reduced kernel. This reduction in the footprint of the OS is discussed in the section “What Is Different in Embedded Linux?”, after the following overview of the main components of an OS.
4.1 What Are the Components of Linux?
The discussion in this section applies generally to Operating Systems; however, some Linux-specific details are noted throughout. The main components of the Linux OS are discussed below:
Process Management
We discussed earlier how the OS manages access to the microprocessor or cores. Every program is ultimately composed of executable instructions, which are executed on the microprocessor or cores. A running program is composed of processes and threads. At any given time, there are dozens or even hundreds of processes running. These can be seen via Windows Task Manager or by using the Linux ps or top commands (by default, these tools list processes rather than breaking down the individual threads). When to run which process, which process needs to wait, which process is to be given priority – these are all decisions made by the kernel through Process Management. Process Management, at a high level, involves the use of a Scheduling algorithm, as well as infrastructure that maintains the relevant information related to running processes. For example, when one running process is removed from the microprocessor to be replaced by another process, several pieces of information must be saved: which instruction was executing (Program Counter), the working values held in the processor (Registers), the state of the microprocessor (flags), etc. All of this information must be saved before the process can be switched out and the new process brought in. This information is saved to the outgoing process's Process Control Block (PCB), an entry in the kernel's Process Table. The state previously saved in the incoming process's PCB is then loaded into the processor so that the process resumes exactly where it left off. This save-and-restore step is known as a context switch. As complicated as this sounds, it is even more complicated in practice. Curious readers can begin with an Operating Systems book, such as those from Tanenbaum or Stallings. There are also entire books devoted to Process Management and Linux Process Management and Scheduling. In short, the OS, via the kernel, manages this complex function in close conjunction with the hardware.
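The save-and-restore step can be sketched in a few lines. This is a deliberately simplified model: the `PCB` and `CPU` classes and their fields are hypothetical stand-ins, and a real kernel performs this work in architecture-specific code with far more state.

```python
from dataclasses import dataclass, field


@dataclass
class PCB:
    """Simplified Process Control Block: the saved state of one process."""
    pid: int
    program_counter: int = 0
    registers: dict = field(default_factory=dict)
    flags: int = 0


@dataclass
class CPU:
    """Toy model of the state held inside the processor."""
    program_counter: int = 0
    registers: dict = field(default_factory=dict)
    flags: int = 0


def context_switch(cpu, outgoing, incoming):
    """Save the CPU state into `outgoing`, then load `incoming` onto the CPU."""
    # 1. Save the state of the process that is leaving the CPU.
    outgoing.program_counter = cpu.program_counter
    outgoing.registers = dict(cpu.registers)
    outgoing.flags = cpu.flags
    # 2. Restore the state of the process that is about to run.
    cpu.program_counter = incoming.program_counter
    cpu.registers = dict(incoming.registers)
    cpu.flags = incoming.flags


if __name__ == "__main__":
    cpu = CPU(program_counter=100, registers={"r0": 7}, flags=1)
    editor, player = PCB(pid=1), PCB(pid=2, program_counter=500, registers={"r0": 42})
    context_switch(cpu, outgoing=editor, incoming=player)
    print("CPU now at", cpu.program_counter, "; editor saved at", editor.program_counter)
```

Switching back later with the two PCBs swapped puts the first process back on the CPU at the instruction where it stopped.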
Please note that in OS and Process Management books the term running may refer to the currently executing process in the processor, with other states for a process called waiting, suspended, and more. We are using the term running to mean a program that is loaded into Main Memory (RAM) and is awaiting execution.
Memory Management
Keeping track of running programs and their memory footprint is actually quite a complex task. Programs will typically have a stack space and a heap space, as well as other segments that go in conjunction with a code space. In addition, because Main Memory capacity is limited, most contemporary OSes support Virtual Memory (Paging System) – that is, the total memory footprint of all running programs can exceed the Main Memory capacity available. In a Virtual Memory environment, a program is not only divided into segments, but further subdivided into pages. Not all pages of a program may actually reside in the Main Memory at all times. In a Virtual Memory system, the physical memory is virtually extended by utilizing some of the secondary storage space (in Windows this is called the Page File; in Linux/Unix systems this is the Swap Space). Pages are swapped in and out of Main Memory as needed. For example: the computer that this Essentials was written on has 16 GB of Main Memory. The Windows OS reports 12.4 GB as being currently used (in physical memory), while 25.6 GB is committed. That suggests that many pages are not resident in physical memory and are instead backed by the page file.
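A minimal sketch of demand paging follows, assuming 4 KiB pages and a simple first-in, first-out eviction policy. The `TinyPager` class is hypothetical and only counts page faults; real kernels use more elaborate replacement policies and hardware support for the translation itself.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes


class TinyPager:
    """Toy page table with a fixed number of physical frames (FIFO eviction)."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.page_table = OrderedDict()  # virtual page -> physical frame
        self.faults = 0

    def translate(self, virtual_address):
        """Return the physical address, loading the page first if needed."""
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page not in self.page_table:
            self.faults += 1  # page fault: the page is not in Main Memory
            if len(self.page_table) >= self.num_frames:
                # Memory is full: swap out the page that was loaded first
                # and reuse its frame.
                _, frame = self.page_table.popitem(last=False)
            else:
                frame = len(self.page_table)
            self.page_table[page] = frame
        return self.page_table[page] * PAGE_SIZE + offset


if __name__ == "__main__":
    pager = TinyPager(num_frames=2)
    for page in (0, 1, 0, 2, 0):
        pager.translate(page * PAGE_SIZE)
    print("page faults:", pager.faults)
```

With only two frames and three distinct pages in use, some accesses must wait for a page to be brought back in, which is the cost of extending memory onto storage.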
Caching and buffering are used to optimize the performance of memory. Files that are currently open are being cached in memory, and the OS uses a series of buffers in memory to manage data transfers between devices and the system (for example, Network Buffers). All of this complexity has to be managed by the OS through the Memory Management function.
Please note that there are anywhere between 3 and 7 layers of system cache in modern computing systems; these are typically handled by the hardware, with the OS fitting into the Virtual Memory translation process.
Storage Management and the File System
On a modern computer, active programs rely on the Main Memory for execution. Main Memory is typically volatile; it only retains its contents while the computer is powered on. In order for information to persist, it needs to be stored on media that is capable of storing that information when the computer is powered off. At a minimum, a computer has resident storage that enables it to store the OS programs and user applications. On a personal computer, the resident storage is also used to store user files such as documents and images. There may be more than one storage device, and the varieties of storage devices vary widely. The most common storage devices have been magnetic Hard Disk Drives (HDDs), and more recently electronic Solid State Drives (SSDs).
No matter what the storage device type, the OS presents a consistent interface for the storage and handling of files through the File System. Like the multiprocessing that we discussed earlier, the File System is an abstract creation that tells us what information (in the form of files organized in folders) is stored on a device. The OS also creates the illusion or virtual view of the File System. Even though all the necessary information to produce that view is stored on the storage device itself, the OS is the software that deciphers that information and presents the File System to the user and applications at runtime. The idea that the File System is a virtual/runtime entity may be difficult to comprehend; however, consider that an NTFS formatted drive often cannot be fully used natively on a Mac. It is not that the Mac doesn't see the storage device; macOS is just not designed to natively decipher and modify the contents of an NTFS drive and create the full runtime File System view.
Even though files and folders appear contiguous to us, they are often scattered in blocks all over the storage device, particularly in HDDs and SSDs. Formatting is essentially the process of setting up a storage device with the essential structure needed to support a File System. This includes both the block creation and the indexing information that is used to keep track of the blocks, such as which blocks belong to which files, and which folders files belong to. All of this is managed by the File System of the OS. The reading and writing of files by user applications is handled by the OS via system calls. Advanced information, such as symbolic links and shortcuts, is also maintained on the storage device by the OS.
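Some of the bookkeeping the File System keeps for each file can be inspected from a program. The sketch below, assuming a Linux or other POSIX system, creates a file and a symbolic link to it and then asks the OS for the index (inode) number and size of each. The file names and the `describe` helper are hypothetical.

```python
import os
import tempfile


def describe(path):
    """Return a few pieces of File System metadata for `path`."""
    info = os.lstat(path)  # lstat reports on a link itself, not its target
    return {
        "inode": info.st_ino,   # index entry the File System uses for this item
        "size": info.st_size,   # size in bytes
        "is_link": os.path.islink(path),
    }


def demo(folder):
    """Create a file plus a symbolic link to it and describe both."""
    target = os.path.join(folder, "report.txt")
    link = os.path.join(folder, "latest.txt")
    with open(target, "w") as handle:
        handle.write("quarterly numbers\n")
    os.symlink(target, link)
    with open(link) as handle:  # reading through the link reaches the target
        content = handle.read()
    return describe(target), describe(link), content


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        for item in demo(folder):
            print(item)
```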
The underlying storage technology is irrelevant, because the File System provides a consistent interface. Whether a file is stored on an SSD, a USB flash drive, an SD card, or a CD, it always looks like a file to the end user. The OS utilizes device drivers in order to handle the different hardware technologies and present us with a consistent interface through the System Calls Interface.
When working with Embedded Linux, you'll mostly be working through the Command Line interface, so you'll need to know the basic commands. Nearly every function in Linux is accessible through the command line. While it isn't as elegant as a GUI, once users learn the commands, they often find it to be faster. Additionally, Linux commands can be executed remotely via tools like SSH, and files can be transferred with tools like FTP, making remote administration of a Linux device very convenient.
File Management
One of the most important functions of an OS is file management. With any device, you'll need to know how to copy and delete files and directories.
ls – list the contents of a directory (use ls -a to include hidden files)
mkdir – create a directory
cd – change to a directory
cp – copy a file (use cp -r to copy a directory)
rm – remove a file (use rm -r to remove a directory and its contents, or rmdir for an empty directory)
cat – display a file
less – display a file by pages
head – display first 10 lines of a file
tail – display last 10 lines of a file
grep – search for a text pattern within files (use find to search for a file within a directory)
chmod – change file or folder permissions
diff – compare files or directories
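When these operations need to be scripted on a device, the same tasks can be carried out from a program. The sketch below uses Python's standard `pathlib` and `shutil` modules as rough equivalents of mkdir, cp, ls, grep (searching for text inside a file), head, chmod, and rm -r. The directory and file names are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def file_management_demo(root):
    """Run rough Python equivalents of common file commands inside `root`."""
    root = Path(root)
    docs = root / "docs"
    docs.mkdir()                                        # mkdir docs
    notes = docs / "notes.txt"
    notes.write_text("alpha\nbeta\ngamma\n")
    backup = root / "backup"
    backup.mkdir()                                      # mkdir backup
    shutil.copy(notes, backup / "notes.txt")            # cp docs/notes.txt backup/
    listing = sorted(p.name for p in root.iterdir())    # ls
    lines = notes.read_text().splitlines()
    matches = [line for line in lines if "eta" in line]  # grep eta docs/notes.txt
    first_two = lines[:2]                               # head -n 2 docs/notes.txt
    notes.chmod(0o600)                                  # chmod 600 docs/notes.txt
    shutil.rmtree(backup)                               # rm -r backup
    remaining = sorted(p.name for p in root.iterdir())  # ls
    return listing, matches, first_two, remaining


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        print(file_management_demo(folder))
```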
Managing Processes
ps – show running processes
kill -9 <PID> – forcibly kill a process (identified by its process ID) by sending it the SIGKILL signal
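The same idea can be shown from a program. This sketch, which assumes a Linux or other POSIX system, starts a child process that would otherwise sleep for a minute, sends it the SIGKILL signal (the signal that kill -9 sends), and reports how the process ended. The function name `start_and_kill` is hypothetical.

```python
import os
import signal
import subprocess
import sys


def start_and_kill():
    """Start a sleeping child process, then terminate it with SIGKILL."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    pid = child.pid                 # the process ID that ps would show
    os.kill(pid, signal.SIGKILL)    # same effect as: kill -9 <PID>
    child.wait(timeout=10)          # collect the exit status
    # On POSIX, a negative return code -N means "terminated by signal N".
    return pid, child.returncode


if __name__ == "__main__":
    pid, code = start_and_kill()
    print(f"process {pid} ended with return code {code}")
```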
Getting Help
<command> --help – for example, chmod --help shows the help for chmod
Other Helpful Commands
sudo – execute command as root
ifconfig -a – display network interfaces
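ping <IP address> – test if another machine is reachable over the network
tar czf <archive> <files> – create a gzip-compressed archive
tar xvf <archive> – extract archive
wc – count the lines, words, and bytes in a file
df – show size and usage data for mounted file systems (partitions)
du – show the disk space used by a directory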
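For scripted maintenance, a few of these commands have direct counterparts in Python's standard library. The sketch below creates a gzip-compressed archive and reads it back with `tarfile` (roughly tar czf and tar xf), counts lines, words, and bytes (as wc does), and queries disk usage with `shutil.disk_usage` (similar in spirit to df). The file names and contents are invented for the example.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def archive_round_trip(root):
    """Archive one small file, read it back, and gather wc/df-style numbers."""
    root = Path(root)
    source = root / "config.txt"
    source.write_text("hostname=board1\nport=22\n")
    archive = root / "backup.tar.gz"

    with tarfile.open(archive, "w:gz") as tar:        # tar czf backup.tar.gz config.txt
        tar.add(source, arcname="config.txt")

    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()                        # tar tzf backup.tar.gz
        restored = tar.extractfile("config.txt").read().decode("utf-8")

    text = source.read_text()
    counts = (len(text.splitlines()), len(text.split()), len(text.encode("utf-8")))  # wc
    usage = shutil.disk_usage(root)                   # total, used, free in bytes
    return names, restored, counts, usage


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        names, restored, counts, usage = archive_round_trip(folder)
        print(names, counts, f"{usage.free} bytes free")
```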
Editing Text Files
Configuration files and other text files can be edited using vi, the built-in text editor.
vi <file name> - opens the file
i – enter Editing Mode. In Editing Mode, text can be navigated and edited using standard keystrokes, such as cursor keys, backspace, etc.
ESC – enter command mode. Command Mode commands are as follows.
G – go to end
NG or :N – go to line N
yy – copy current line into buffer
p – paste buffer after current line
x – delete current character
dd – delete current line
u – undo
/<string> – find string after cursor
?<string> – find string before cursor
n – find next
:w – save file
:wq – save and quit
Device Management
Device Management refers to the management and control of all the hardware resources that are part of or connected to the computer. This is carried out through the device drivers. Device drivers, in turn, enable the OS to present a consistent interface for access to hardware resources.
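The idea of a consistent interface over different hardware can be sketched with a small class hierarchy. Everything here is hypothetical: `BlockDriver`, `RamDiskDriver`, and `FlashDriver` are invented names, and real Linux drivers are written in C against kernel interfaces. The sketch only shows why code written against the common interface does not need to know which device it is using.

```python
from abc import ABC, abstractmethod


class BlockDriver(ABC):
    """Hypothetical common interface that every storage driver implements."""

    @abstractmethod
    def read_block(self, index):
        """Return the bytes stored in block `index`."""

    @abstractmethod
    def write_block(self, index, data):
        """Store `data` in block `index`."""


class RamDiskDriver(BlockDriver):
    """Keeps blocks in a dictionary, standing in for a RAM disk."""

    def __init__(self):
        self.blocks = {}

    def read_block(self, index):
        return self.blocks.get(index, b"")

    def write_block(self, index, data):
        self.blocks[index] = bytes(data)


class FlashDriver(RamDiskDriver):
    """Same interface, but also counts writes, as flash wear tracking might."""

    def __init__(self):
        super().__init__()
        self.write_count = 0

    def write_block(self, index, data):
        self.write_count += 1
        super().write_block(index, data)


def copy_block(source, destination, index):
    """Works with any pair of drivers because it only uses the interface."""
    destination.write_block(index, source.read_block(index))


if __name__ == "__main__":
    ram, flash = RamDiskDriver(), FlashDriver()
    ram.write_block(0, b"boot data")
    copy_block(ram, flash, 0)
    print(flash.read_block(0), "writes to flash:", flash.write_count)
```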
Apart from using drivers, there are also other things that the Device Management subsystem of an OS has to deal with in order to make hardware use seamless. If a file is being downloaded from an FTP server, the OS is not only using the drivers of the network interface card (NIC), but also buffering the fragments of the file that are coming in piecemeal, and in some cases performing an integrity check. Similarly, as we mentioned in the Storage Management section, storage media may be storing information in blocks. When opening a file, the OS is buffering the contents so that enough contiguous information is available for useful work before presenting it to the user.
The easiest way to understand this is to think about streaming video, and the spinning animation that appears when the video freezes. The entire video file (which may be gigabytes or even terabytes in size) doesn't have to be downloaded completely before you can view it, because the OS, in conjunction with the browser, is able to buffer enough content that the video playback can begin. While frames are being read from the buffer, more frames are being downloaded into it. If you run out of frames to play because the Internet connection is slow or unstable, playback stops. The process of playing a video from local media is the same; the entire contents of a Blu-ray/4K movie are never actually loaded into memory. Frames of the movie are buffered by the OS, this time in conjunction with the video player. Because playing a Blu-ray does not rely on an Internet connection, all frames are readily available and the video will not freeze (unless there are other issues, such as a scratched disc).
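The playback behavior described above can be modeled with a small simulation. It assumes a simplified world in which time advances in ticks, one frame is played per tick, and playback starts once a chosen number of frames is buffered; the function name and the arrival numbers are invented for illustration.

```python
from collections import deque


def simulate_playback(arrivals, start_threshold=2):
    """Simulate buffered playback.

    arrivals[i] is the number of frames downloaded during tick i.
    Returns (frames_played, stalls), where a stall is a tick in which
    playback had started but the buffer was empty.
    """
    buffer = deque()
    next_frame = 0
    played = []
    stalls = 0
    started = False
    for count in arrivals:
        for _ in range(count):           # downloaded frames enter the buffer
            buffer.append(next_frame)
            next_frame += 1
        if not started and len(buffer) >= start_threshold:
            started = True               # enough content buffered to begin
        if started:
            if buffer:
                played.append(buffer.popleft())  # play the oldest frame
            else:
                stalls += 1              # nothing to play: playback freezes
    return played, stalls


if __name__ == "__main__":
    frames, stalls = simulate_playback([2, 1, 0, 0, 0, 2])
    print("played:", frames, "stalled ticks:", stalls)
```

When frames stop arriving for a few ticks, the buffer drains and the stall counter rises; a steady supply, as with local media, never stalls once playback has started.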
In computing systems, there are often temporary stores of information in buffers or caches that are necessary for communication to and from devices. Even a simple keystroke can potentially be written into several locations.
The File System on a contemporary Linux system (for example, ext4 or XFS) also has advanced logging and journaling functionality to improve File System integrity.
Network Management
Nowadays, with the widespread use of the Internet and the use of Web applications, as well as the rise of the Internet of Things (IoT), it makes sense to discuss network management separately. Network Management includes not only the management of the Network Interface Devices (Ethernet, Wi-Fi, Bluetooth), but also the software that supports networking: the TCP/IP stack. For a brief discussion on the TCP/IP stack, refer to the Essentials module on Wireless MCUs. Finally, as we mentioned in the previous section, there are buffering functions that are supported by the OS. In modern computers, the network can also be used to extend the File System through network drives. In these cases, the Network Management and File System work in conjunction with each other.
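As a small illustration of the OS providing the TCP/IP stack and its buffers, the sketch below opens a TCP connection over the loopback interface within a single program. The message sits in kernel buffers between being sent and being received. It assumes the loopback address 127.0.0.1 is available and that the message is small enough to fit in those buffers; the function name is hypothetical.

```python
import socket


def loopback_round_trip(message):
    """Send `message` to ourselves over TCP on the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        # The kernel completes the connection handshake for listening
        # sockets, so connecting before calling accept() works here.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            connection, _ = server.accept()
            with connection:
                connection.settimeout(5)
                client.sendall(message)             # data goes into OS buffers
                client.shutdown(socket.SHUT_WR)     # signal that we are done sending
                received = b""
                while True:
                    chunk = connection.recv(1024)   # data comes out of OS buffers
                    if not chunk:
                        break
                    received += chunk
    return received


if __name__ == "__main__":
    print(loopback_round_trip(b"hello from the TCP/IP stack"))
```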
Development Environment
In an open source system like Linux, it is expected that the end user can and will make changes to the OS. Even apart from OS modifications, a user may be interested in developing their own programs. For both of these reasons, a Development Environment comes standard in most Linux distributions. For example, gcc (GNU Compiler Collection) and glibc (GNU C Library) are included as part of most Linux distributions and can be used to compile C and C++ code. Standard Linux distributions often include support for development in other languages such as Python, Perl, PHP, and more.
The kernel can be modified and recompiled in Linux. When you recompile the kernel, the modified kernel typically becomes one of the available options in the bootloader.
Utilities
In addition to all of the above, all contemporary OSes have other utilities that come standard. Examples of this range from a file compression utility to a text file editor. LibreOffice, an open source productivity suite, is included with many distributions of desktop Linux. Many utilities are included in standard Linux distributions that may not be part of commercial home OSes, such as a DHCP server and PostgreSQL.
An important part of a desktop OS is the graphical user interface (GUI). GUIs facilitate interaction with the computer by presenting an easy-to-use interface. GNOME is a popular desktop environment (GUI) for Linux.
On the other hand, for an embedded system, a GUI may not be necessary. If you are working with a wireless router that has Linux installed, the graphical interface provided through a web server (via remote access) is more than sufficient to access all of the functions of the embedded system.
Distributions and Versions
A distribution (or distro) is a bundle of the Linux kernel and all of the other OS components. Some common distributions include Ubuntu, Debian, SUSE, Mint, CentOS, Kali Linux, Red Hat Enterprise Linux, and its community counterpart Fedora. There are hundreds of distributions. You can visit https://distrowatch.com/ to learn more about Linux distributions. When the term version is used in the Linux context, it is used to refer to the version of the Linux kernel. Distributions also carry their own version numbers, although these are more commonly referred to as releases. To reduce confusion, releases are often given a nickname. For example, release 20.04 of Ubuntu is called Focal Fossa (focal for short), and it shipped with Linux kernel version 5.4.
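The difference between a distribution release and a kernel version can be seen from a program. Many Linux distributions describe themselves in a text file, commonly /etc/os-release, made up of KEY=value lines, while the kernel release is reported separately. The sketch below parses a short sample in that format and pairs it with `platform.release()`; the sample text is illustrative only and is not copied from a real system.

```python
import platform

# Illustrative sample in the KEY=value style of an os-release file.
SAMPLE_OS_RELEASE = '''NAME="Ubuntu"
VERSION_ID="20.04"
VERSION_CODENAME=focal
'''


def parse_os_release(text):
    """Turn KEY=value lines into a dictionary, removing optional quotes."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip("\"'")
    return info


def describe_system(os_release_text):
    """Combine distribution details with the running kernel's release."""
    info = parse_os_release(os_release_text)
    return {
        "distribution": info.get("NAME"),
        "release": info.get("VERSION_ID"),
        "nickname": info.get("VERSION_CODENAME"),
        "kernel": platform.release(),  # kernel release of the machine running this
    }


if __name__ == "__main__":
    print(describe_system(SAMPLE_OS_RELEASE))
```

On a real device, the contents of the os-release file could be passed in instead of the sample string.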
Loading the OS: The Boot Process
To prepare us for the discussion of Embedded Linux, we will briefly discuss the boot process of a computer.
The term boot is derived from bootstrapping. Booting refers to the bootstrapping of the OS: starting from the moment the system powers up, a small piece of startup software loads progressively larger pieces until the complete OS is running.
A simplified chain of events is as follows:
- When the system is powered ON, a basic software system resident in ROM or firmware starts the basic functions of the hardware. This was originally done by the BIOS (Basic Input Output System), but is being gradually replaced by the UEFI (Unified Extensible Firmware Interface). In essence, this basic system has enough functionality to run basic input and output – at least the keyboard and display – and get a base set of hardware going through the use of simple drivers. A basic diagnostic may also run here. It also scans to see what hardware, including storage devices, is actively connected to the system. The BIOS/UEFI has a preferred sequence of storage devices and goes through them sequentially to determine if one of them has a bootloader. Control is then handed over to the bootloader.
- The bootloader is software that inherits the basic “started up state” of the computer and loads up or bootstraps the OS from storage or network into Main Memory. The bootloader is a very small piece of software (typically kilobytes to a few megabytes, far smaller than the OS it loads) and loads the kernel and other essential elements of the OS. Having a bootloader in place is what makes a given storage device bootable.
- At this point, the main OS begins executing the rest of the drivers and utilities, as well as loading the application software that the user has specified to begin at startup.
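The chain of events above can be sketched as a toy model. The device names, the dictionary layout, and the log messages are all hypothetical; the sketch only mirrors the order of the steps, in which firmware walks its preferred boot sequence, the first device with a bootloader wins, and the bootloader then loads the kernel.

```python
def find_boot_device(boot_order, devices):
    """Return the first device in `boot_order` that is present and bootable."""
    for name in boot_order:
        device = devices.get(name)       # None if the device is not connected
        if device is not None and device.get("bootloader"):
            return name
    return None


def boot(boot_order, devices):
    """Return a log of the simplified boot sequence."""
    log = ["firmware: initialize hardware and scan connected devices"]
    name = find_boot_device(boot_order, devices)
    if name is None:
        log.append("firmware: no bootable device found")
        return log
    log.append(f"firmware: hand control to the bootloader on {name}")
    log.append(f"bootloader: load the kernel from {name} into Main Memory")
    log.append("kernel: start drivers, utilities, and startup applications")
    return log


if __name__ == "__main__":
    connected = {
        "usb": {"bootloader": False},    # plugged in, but not bootable
        "ssd": {"bootloader": True},
    }
    for line in boot(["usb", "network", "ssd"], connected):
        print(line)
```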
The booting process is essentially the same even for much simpler computers. On a heavily simplified Linux system, there may only be a single storage device, and the bootloader may simply be a pointer to where the beginning of the OS is; however, the booting process will still be very similar. Please note that for devices as simple as an Arduino, you would not be trying to run Embedded Linux, because there is no need for the advanced functionality.
5. What Is Different in Embedded Linux?
Many of the functions discussed in the previous section are significantly simplified in an embedded system. There may not be a need for Virtual Memory/Paging, and there may only be a handful of simultaneous processes. Device Management is similar; a significantly reduced set of drivers is needed, because far fewer hardware devices will need to be connected. All of this reduces the footprint of the OS-bundle.
An embedded system is a simpler computer and does not need much of the functionality built into a desktop OS. The utilities on an embedded OS bundle will also be significantly reduced. There may be no need for a DHCP server or GUI capability. Utilities such as SSH or Telnet are necessary, however, because they may be the only means of accessing the functionality of the embedded system.
At a minimum, we need the kernel and the bootloader, as well as any necessary drivers and utilities. In the “Getting Started with Embedded Linux” section, we will briefly go over how to build an Embedded Linux distribution.
6. Trying Out Embedded Linux
There are a variety of commercially supported Embedded Linux distributions, as well as several community-driven Embedded Linux projects. In this section, we look at some of the available distributions. For example, OpenWRT can be installed on an older wireless router, as well as a Raspberry Pi. Android can also be installed on a single board computer.
Raspberry Pi OS (formerly Raspbian)
The Raspberry Pi remains one of the most popular single board computers for DIY enthusiasts. The Raspberry Pi OS is a Linux distribution based on Debian. The Pi board can also support a multitude of other distributions, including Android. The Raspberry Pi supports typical input and output (I/O) devices, such as keyboards and monitors. In addition, the Raspberry Pi has built-in networking capabilities (both Wi-Fi and Ethernet) and is easily accessible through SSH. All of this is typically available for under $100 USD.
Although Python is often used as a language for building functionality on a Pi, it also supports C/C++, as well as other languages.
Get started with the Raspberry Pi page here.
BeagleBoard
BeagleBoard is another single board computer platform that features basic I/O support and the capability to support a number of Linux distributions. There are a number of BeagleBoard variants available under the BeagleBone and BeagleBoard trademarks, ranging in price from $50-$150 USD, typically.
Check out the Getting Started with the BeagleBone Black page here.
OpenWRT
The OpenWRT project is targeted mainly towards wireless routers. OpenWRT makes functionality in a wireless router available to the end user, making it a much more useful tool. Both SSH and a web-based interface are available for command-and-control. Although intended mainly for networking devices, OpenWRT is a good candidate for projects that rely heavily on networking for their functionality, such as IoT applications. OpenWRT replaces a device's firmware, typically extending the capabilities of the device on which it is installed. As it is open source, customization is relatively easy. OpenWRT can also be installed on the Raspberry Pi.
Android
Android is one of the two most popular mobile phone OSes. Android is built on a modified Linux kernel, and can be installed on many devices that support Linux, including the Raspberry Pi and the MaaXBoard, a quad-core single board computer capable of running Android. See example here.
6.1 Getting Started with Embedded Linux
If you are looking to use an Embedded Linux distribution for a specific embedded application, building your own (BYO), also known as rolling your own (RYO), is a common approach. Building an embedded distribution requires creating the right package of bootloader, kernel, and utilities.
The good news is that there are open source projects that make the process of building your own Embedded Linux distribution easier. These are some of the most popular:
Yocto/OpenEmbedded
Yocto is a toolset for packaging Embedded Linux distributions. Yocto uses OpenEmbedded as the build system. The end result of a Yocto project contains everything needed to deploy an Embedded Linux Distribution. Yocto uses a Layer Model; you can build on the base OS by adding features and additional functionality, even after the base OS has already been deployed on your target device. This provides great flexibility and the potential for future development.
PetaLinux
PetaLinux is a development toolset for creating Embedded Linux distros for Xilinx-based targets. To get started with PetaLinux, we recommend a guide by Adam Taylor: MicroZed Chronicles: PetaLinux Edition
Buildroot
Buildroot is a simpler tool. It's great for DIY enthusiasts, because one can easily get an Embedded Linux distro going using it. The main drawback is that additional features and functionality cannot be added to a deployed system. To add additional functionality, a new version of the distro must be created, then reinstalled onto your target device. However, if you've tinkered with single-board computers like the Raspberry Pi, and standard distributions, such as Raspberry Pi OS and Android, using Buildroot to create your own Embedded Linux distribution is the logical next step.
To get started setting up an Embedded Linux distro with Buildroot, use the Mastering Embedded Linux, Part 1: Concepts guide.
Have fun embedding!
Related Components
Listed are the recommended Linux compatible Single Board Computers.
Raspberry Pi 4 Model B
BeagleBone AI
BeagleBone Black
Zedboard
Ultra96 V2
Test Your Knowledge
Are you ready to demonstrate your Embedded Linux knowledge? Then take this 15-question quiz. To earn the Embedded Software II Badge, read through the module, attain 100% in the quiz, and leave us some feedback in the comments section.