How to Accelerate Edge AI with the i.MX 8M Plus Processor

Have You Wondered What "AI at the Edge" Means?

A typical industrial network produces a colossal amount of data from field devices, sensors, and actuators, mostly combined with IoT capabilities. Artificial Intelligence (AI) processes this massive data and recognizes patterns in it using powerful algorithms to make accurate decisions.

For decades, AI lived in data centers, which had enough computing power for processor-demanding cognitive tasks. As a result, AI has centered on the cloud: massive, centralized servers in remote locations, operated by providers such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform. While this approach has proven reliable, the time it takes to transfer data to and from the cloud introduces latency that can hamper real-time decision making.

Moreover, the cloud approach requires substantial connectivity, storage, and centralized processing power. This is where edge computing comes in: AI at the edge can be thought of as a form of decentralization.

Figure 1. Real-time data processing at the Edge

Moving Away from the Cloud to the Edge

In simple terms, edge computing moves the collection, storage, and analysis of data away from the cloud and onto the device, where decisions can be made in real time. Edge AI means that AI algorithms run locally on the hardware, so that even the tiniest devices and machines can sense their environment, interpret it, and react to it. Combined with AI, edge computing drastically reduces the payloads sent to the cloud. A device using Edge AI no longer needs a network connection to work correctly; it can process data and make decisions independently, based on models trained in advance.

Benefits of AI at the Edge

Edge computing manages AI or Machine Learning (a subset of AI) algorithms locally on-device and unlocks additional application areas, with an increase in productivity. We will now summarize the benefits edge computing offers in AI processing capabilities:

Edge AI allows real-time decision making with reduced latency, necessary for mission-critical applications.
It stores and processes data locally, which helps keep information secure.
Edge AI reduces costs for data communication, storage, and bandwidth, as little data is transmitted.
Applications continue to run even when the system is offline.
Edge-based AI requires little specialist expertise to operate and maintain, as its devices are self-contained.
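
To make the bandwidth point concrete, here is a rough back-of-the-envelope sketch. Every number in it (camera resolution, frame rate, detections per frame, bytes per detection) is hypothetical and chosen only to illustrate the arithmetic; real deployments would usually compress video, which narrows the gap.

```python
# Rough comparison of upstream data rates: streaming raw camera frames to the
# cloud versus sending only the results of on-device inference.
# All figures below are hypothetical, chosen only to illustrate the arithmetic.

FRAME_WIDTH = 1920          # pixels (assumed camera resolution)
FRAME_HEIGHT = 1080
BYTES_PER_PIXEL = 3         # uncompressed RGB
FRAMES_PER_SECOND = 30

DETECTIONS_PER_FRAME = 10   # assumed average number of detected objects
BYTES_PER_DETECTION = 32    # assumed size of one result record (label, box, score)


def raw_stream_bytes_per_second() -> int:
    """Bytes per second needed to ship every uncompressed frame upstream."""
    return FRAME_WIDTH * FRAME_HEIGHT * BYTES_PER_PIXEL * FRAMES_PER_SECOND


def results_bytes_per_second() -> int:
    """Bytes per second needed to ship only the inference results upstream."""
    return DETECTIONS_PER_FRAME * BYTES_PER_DETECTION * FRAMES_PER_SECOND


if __name__ == "__main__":
    raw = raw_stream_bytes_per_second()
    results = results_bytes_per_second()
    print(f"raw frames:   {raw / 1e6:.1f} MB/s")
    print(f"results only: {results / 1e3:.1f} kB/s")
    print(f"reduction:    {raw / results:,.0f}x")
```

Under these assumptions the device sends a few kilobytes per second instead of well over a hundred megabytes per second.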

While it makes sense to combine AI with edge computing, both the hardware and the AI software face multiple challenges. The most significant are processing vast amounts of data and limiting power consumption. Usually, the energy-intensive training occurs in the cloud, and the trained model is then deployed to the edge for the relatively low-energy task of prediction (or inference). When training also shifts to the edge, it places more demand on the edge hardware’s processing capability. For edge AI-enabled IoT devices, repetitive memory access slows the system and depletes the battery.

Deploying Edge-based AI Solutions

One way to overcome these constraints is to rethink the hardware design and software architecture. Many companies are devising hardware with higher processing power and lower energy consumption, along with software that learns and infers more efficiently. Hardware platforms for AI include high-performance Application-Specific Integrated Circuits (ASICs), Graphics Processing Units (GPUs), Field Programmable Gate Arrays (FPGAs), and Central Processing Units (CPUs). To bridge the gap between the data center and the edge, manufacturers are building niche, purpose-built accelerators that significantly speed up model inferencing. These modern processors assist the CPU of the edge device by taking over the compute-intensive tasks. Other approaches rely on pre-trained models for inferencing, model compression techniques, and AI co-processors that execute heterogeneous computations at the edge.
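
As an illustration of model compression, one widely used technique is post-training quantization, which stores weights as 8-bit integers instead of 32-bit floats and so cuts weight storage to a quarter. The following is a minimal sketch of symmetric int8 quantization in plain Python. It is not taken from any NXP tool, the function names are illustrative, and the article does not say which compression techniques it has in mind.

```python
from typing import List, Tuple


def quantize_symmetric_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map float weights to integers in [-127, 127] using a single scale factor."""
    max_abs = max(abs(w) for w in weights)
    # One quantization step in float units; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from their int8 representation."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # hypothetical layer weights
    q, scale = quantize_symmetric_int8(weights)
    restored = dequantize(q, scale)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print("int8 values:", q)
    print(f"scale: {scale:.6f}, worst-case error: {worst:.6f}")
```

The rounding error per weight is at most half a quantization step, which is the accuracy cost paid for the smaller model and for integer arithmetic.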

Let's Take a Closer Look at the NXP i.MX 8M Plus Heterogeneous Application Processor

An excellent example of the possibilities is the NXP i.MX 8M Plus heterogeneous application processor, whose integrated machine learning accelerator can process neural networks about thirty times faster than Arm processor cores. It provides dedicated machine learning (ML) hardware in the form of a neural processing unit (NPU) from VeriSilicon (the Vivante VIP8000). Developers can offload machine learning inference to the NPU, freeing the high-performance Cortex-A53 and Cortex-M7 cores, the DSP, and the GPU to execute other system-level or user application tasks, as shown in Figure 2.

Figure 2. Block Diagram of the i.MX 8M Plus Application Processor

Given the need for ML at the edge, the question is how much performance is required. One common measure of an ML accelerator is TOPS, short for tera (trillion) operations per second. The i.MX 8M Plus offers 2.3 TOPS of inference acceleration for endpoint devices in the consumer and industrial Internet of Things, enough for applications such as multiple-object identification, speech recognition with vocabularies of up to 40,000 words, or even medical imaging (MobileNet v1 at 500 images per second). It can also execute several complex neural networks simultaneously.
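
To relate the TOPS rating to the quoted MobileNet v1 throughput, the sketch below estimates how much compute 500 images per second actually consumes. The 2.3 TOPS and 500 images per second come from the text above. The figure of roughly 569 million multiply-accumulates per image is an assumption taken from the published MobileNet v1 results for the full-width 224×224 variant, and counting each multiply-accumulate as two operations is a common convention rather than anything stated here.

```python
PEAK_TOPS = 2.3              # quoted peak rating of the NPU
IMAGES_PER_SECOND = 500      # quoted MobileNet v1 throughput
MACS_PER_IMAGE = 569e6       # assumption: MobileNet v1 (1.0, 224x224) multiply-accumulates
OPS_PER_MAC = 2              # convention: one multiply plus one add


def sustained_tops(images_per_second: float, macs_per_image: float) -> float:
    """Tera-operations per second consumed at the given inference rate."""
    return images_per_second * macs_per_image * OPS_PER_MAC / 1e12


def latency_ms(images_per_second: float) -> float:
    """Average time per image, assuming images are processed one after another."""
    return 1000.0 / images_per_second


if __name__ == "__main__":
    used = sustained_tops(IMAGES_PER_SECOND, MACS_PER_IMAGE)
    print(f"compute used:  {used:.3f} TOPS")
    print(f"share of peak: {used / PEAK_TOPS:.0%}")
    print(f"time per image: {latency_ms(IMAGES_PER_SECOND):.1f} ms")
```

Under these assumptions the workload uses about a quarter of the peak rating, a reminder that a TOPS figure is an upper bound and that sustained throughput also depends on memory bandwidth and how well a given network maps onto the accelerator.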

The NPU, combined with the dual image signal processors (ISPs) and GPU of the i.MX 8M Plus, enables real-time image processing applications. To make this a complete solution, NXP has given it eyes and ears: two camera ISP inputs of 12 megapixels with high dynamic range (HDR) and fisheye lens correction support, plus eight PDM microphone inputs. The integrated ISP offers high-quality, optimized imaging, particularly at resolutions of 2 megapixels and higher. The processor also incorporates an independent 800 MHz Arm Cortex-M7 for real-time tasks and low-power support, H.265 and H.264 video encode and decode, and an 800 MHz HiFi 4 DSP for voice recognition. Industrial IoT features include Gigabit Ethernet with time-sensitive networking (TSN), two CAN FD interfaces, and error correction code (ECC) support.

The i.MX 8M family of application processors supports high-definition audio and video, high-performance 3D graphics acceleration, and dual camera inputs, providing best-in-class multimedia performance along with multiple high-speed interfaces. Until now, the architecture was restricted to power-optimized Cortex-A53 and Cortex-M4 cores that enabled concurrent execution of multiple software environments. All previous i.MX 8 chips run local AI inference on their Cortex-A53 cores, and four of those general-purpose CPUs can manage only 0.03 TOPS. Compared with these earlier family members, the i.MX 8M Plus, with its ML accelerator, is a far better fit for AI edge applications.

Figure 3. Object Detection with Artificial Intelligence at the Edge

Running AI at the edge, the i.MX 8M Plus enables voice, face, speaker, and gesture recognition; object detection and segmentation; augmented reality; and environmental sensing and control for anomaly detection. With its industrial-grade features, the i.MX 8M Plus is expected to bring a new level of intuitive human interface to the smart factory by enabling image, voice, and gesture control. For example, the NPU's machine learning capability can not only recognize people with accuracy comparable to human vision but also predict behavior and make decisions based on it.

Security is a critical requirement at the edge, and keeping data on the device gives edge systems an advantage over the cloud. Needed capabilities include on-chip cryptography, secure provisioning, mutual device authentication, secure device management, over-the-air (OTA) secure updates, and life cycle management. To support this, the i.MX 8M Plus draws on the scalable EdgeLock portfolio, which includes a resource domain controller, TrustZone technology, High Assurance Boot (HAB), encrypted boot, and public-key cryptography with RSA and elliptic curve algorithms. The i.MX 8M Plus application processor is specified for the industrial temperature range of -40°C to 105°C and a 100 percent power-on profile, and is planned to be part of NXP’s industry-best longevity commitment of up to 15 years.

AI software is as essential as the hardware. NXP’s eIQ machine learning development environment supports the i.MX 8M Plus and lets developers switch seamlessly between running their models on the CPU, GPU, or NPU. The eIQ ML software environment includes inference engines, neural network compilers, and optimized libraries built on advances in open-source machine learning technologies, and it is fully integrated into NXP’s Yocto development environments. NXP also supports technologies such as TensorFlow Lite, OpenCV, CMSIS-NN, and Arm NN, which offer alternative ways to deploy trained neural network (NN) models.
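
The idea of choosing where a model runs can be sketched with the TensorFlow Lite Python runtime, which accepts optional delegate libraries that hand work to an accelerator. The sketch below is a hedged illustration, not NXP's reference code: the delegate path is an assumption about where the NPU delegate library sits on the target image (check the documentation for your BSP release), and `delegate_path_for` and `make_interpreter` are hypothetical helper names.

```python
from typing import Optional

# Assumption: location of the NPU delegate library on the target image.
# The actual path depends on the BSP release; verify it on your board.
NPU_DELEGATE_PATH = "/usr/lib/libvx_delegate.so"


def delegate_path_for(backend: str) -> Optional[str]:
    """Return the delegate library for a backend, or None for plain CPU execution."""
    backend = backend.lower()
    if backend == "cpu":
        return None
    if backend == "npu":
        return NPU_DELEGATE_PATH
    raise ValueError(f"unsupported backend: {backend!r}")


def make_interpreter(model_path: str, backend: str = "cpu"):
    """Build a TensorFlow Lite interpreter bound to the chosen backend."""
    # Imported lazily so the helper above also works where tflite_runtime is absent.
    from tflite_runtime.interpreter import Interpreter, load_delegate

    path = delegate_path_for(backend)
    delegates = [load_delegate(path)] if path else []
    interpreter = Interpreter(model_path=model_path, experimental_delegates=delegates)
    interpreter.allocate_tensors()
    return interpreter
```

On a board, a call such as `make_interpreter("model.tflite", backend="npu")` (with `model.tflite` standing in for your own model file) would then be followed by the usual `set_tensor`, `invoke`, and `get_tensor` calls; switching back to the CPU only changes the `backend` argument.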

Developers can take their existing TensorFlow models and convert them to TensorFlow Lite models for edge applications. The eIQ software comes with sample applications for object detection and voice recognition, which provide a starting point for ML at the edge. Optimizing specific algorithms for the resource-constrained devices deployed at the edge is an effective way to accelerate the move to the edge. For example, MobileNet is a pre-trained neural network model for image classification that aims for high accuracy while significantly reducing the computing resources needed. The model reduces the amount of computing required at the edge by a factor of 50, which lets resource-constrained edge hardware perform sophisticated ML processing in less time.
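
Much of MobileNet's saving comes from replacing standard convolutions with depthwise separable convolutions, a detail from the MobileNet design rather than from this article. The sketch below counts multiply-accumulates for a single layer both ways. The layer dimensions are hypothetical, and the per-layer saving it shows is a different quantity from the whole-model factor of 50 quoted above, whose baseline is not stated.

```python
def standard_conv_macs(kernel: int, in_ch: int, out_ch: int, out_size: int) -> int:
    """Multiply-accumulates for a standard convolution over a square feature map."""
    return kernel * kernel * in_ch * out_ch * out_size * out_size


def depthwise_separable_macs(kernel: int, in_ch: int, out_ch: int, out_size: int) -> int:
    """Multiply-accumulates for a depthwise convolution followed by a 1x1 pointwise one."""
    depthwise = kernel * kernel * in_ch * out_size * out_size   # one filter per input channel
    pointwise = in_ch * out_ch * out_size * out_size            # 1x1 conv mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3x3 kernel, 256 -> 256 channels, 14x14 output map.
    layer = dict(kernel=3, in_ch=256, out_ch=256, out_size=14)
    standard = standard_conv_macs(**layer)
    separable = depthwise_separable_macs(**layer)
    print(f"standard:  {standard:,} MACs")
    print(f"separable: {separable:,} MACs")
    print(f"reduction: {standard / separable:.1f}x")
```

For 3×3 kernels the reduction approaches a factor of 9 as the number of output channels grows, since the cost ratio is 1/out_ch + 1/kernel².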

Let's Recap: AI at the Edge

AI and ML have brought about a seismic shift in the computer industry by bringing intelligence to the device. Edge computing, with specialized hardware, software, and developer environments, is likely to increase operational reliability, enable real-time predictions, and improve data security. Edge AI chips not only greatly enhance the capabilities of existing devices; they also permit radically new kinds of devices with unique abilities and markets. Products optimized for AI and ML applications at the edge, such as the i.MX 8M Plus applications processor, will help drive this shift forward.