AI Prototyping on the Edge with the Intel Neural Compute Stick 2

IoT applications typically include any application that monitors a physical environment. Examples include airplane telemetry, wildlife tracking, voice-controlled blinds, and drone surveillance. Many such applications require Artificial Intelligence (AI) capabilities, including image, audio, and video analysis. AI is an umbrella term; the systems discussed here are built with Machine Learning Algorithms (MLAs) that are trained on data.

The term "Edge AI" is borrowed from edge computing, which means computation close to the data source. In AI terms, it generally refers to any processing that happens outside data centers or bulky computers, on devices such as drones, cellphones, and autonomous vehicles. These Edge AI devices vary in physical size and are designed or supported by multiple vendors. In this article, we will focus on pocket-sized platforms that individuals and small companies purchase and use.

In such Edge use cases, latency requirements mean that data cannot be dispatched to the cloud for processing; instead, the algorithms must run locally with sufficient computing power. Edge devices such as vehicles or drones often lack that computational power, which calls for dedicated hardware like the Intel Neural Compute Stick 2 (NCS2), a neural network inference accelerator that provides the additional performance.

Figure 1: Intel Neural Compute Stick 2

The Movidius VPU is chiefly designed to execute AI workloads, that is, to run inference on already trained models. NVIDIA's GPUs serve the same function and can also train models. The choice thus depends on whether the device under consideration will work in execute-only mode or must also be able to refine or re-train its models (brains). Both are valid options, as long as the tasks are completed within a reasonable period.

In a typical work cycle, developers train AI models and transfer the trained model to the NCS2, which is connected to a low-cost computer such as a Raspberry Pi dedicated to its immediate task. Intel-provided use case examples include image classification, object detection, and motion detection. AI even helps to stabilize videos.

Processing on the Edge device provides the significant advantage of saving time and cost. This is particularly evident when the alternative is uploading streams of data, like photos of each vehicle that drives up to a security gate. Instead, computer vision reads the number plate characters and passes them to the device, which checks them against its registered number plate database. The outcome either raises the boom gate (authorized vehicle) or lowers it (unauthorized/unwelcome entity). If you use a connected device, only updates to the registered number plate database (a compact dataset) need to be pushed to it. If business requirements call for log entries, for example in secure areas, a compact data log can be transmitted from the Edge device back to the server, with the expensive processing done on the NCS2.
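The gate decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the plate strings, the REGISTERED_PLATES set, and the normalize, decide_gate, and log_entry helpers are hypothetical names, and the plate text is assumed to have already been read by the vision model running on the NCS2.

```python
import json
from datetime import datetime, timezone

# Hypothetical registered-plate database kept on the Edge device.
# Only updates to this small set would need to be pushed from the server.
REGISTERED_PLATES = {"ABC123", "XYZ789"}


def normalize(plate: str) -> str:
    """Drop spaces and dashes and upper-case the recognized characters."""
    return "".join(ch for ch in plate.upper() if ch.isalnum())


def decide_gate(plate: str, registered=REGISTERED_PLATES) -> str:
    """Return 'raise' for an authorized vehicle, 'lower' otherwise."""
    return "raise" if normalize(plate) in registered else "lower"


def log_entry(plate: str, action: str) -> str:
    """Build a compact JSON log record to send back to the server."""
    record = {
        "time": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "plate": normalize(plate),
        "action": action,
    }
    return json.dumps(record, separators=(",", ":"))


if __name__ == "__main__":
    for seen in ("abc-123", "QQQ 000"):
        print(log_entry(seen, decide_gate(seen)))
```

Only the short JSON record leaves the device, while the images themselves never need to be uploaded.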

AI Prototyping

The Intel Movidius Myriad X VPU powers the Intel NCS2 and is the first VPU to be equipped with a Neural Compute Engine, a dedicated hardware neural network inference accelerator that delivers additional performance. Paired with the Intel Distribution of the OpenVINO toolkit, which supports additional networks, the Intel NCS2 provides exceptional prototyping flexibility to developers.

Working with Neural Compute Stick 2

With merely a laptop and the Intel NCS2, developers can get AI and computer vision applications up and running in minutes. The Intel NCS2 plugs into a standard USB 3.0 port and needs no additional hardware, enabling users to convert PC-trained models and then deploy them natively to a broad range of devices. No internet or cloud connectivity is required.

Intel has simplified project deployment on embedded devices through its OpenVINO toolkit, which is designed to profile, tune, and then deploy convolutional neural networks (CNNs). It targets applications that need real-time, low-power inferencing. This toolset simplifies deployment across various Intel AI solutions and supports models in ONNX, Caffe, TensorFlow, and MXNet formats.

OpenVINO, a primary software development toolkit for the NCS2 and other Intel hardware, allows the development and deployment of machine vision solutions with high inferencing speed and accuracy. OpenVINO combines camera processing, computer vision (CV) acceleration tools, and optimized deep learning (DL) computation for heterogeneous execution environments. This means that CNN-based solutions built with this toolkit can maximize performance by spreading their workloads across Intel hardware (including CPUs, GPUs, FPGAs, VPUs, and IPUs) using a single common API. Compared with the NCSDK, OpenVINO also allows CNN-based inference at the edge, but with more pre-optimized kernels and with calls to the OpenCV API.

The OpenVINO development workflow depicted in Figure 2 starts with training a CNN model in one of the machine learning (ML) libraries. The Model Optimizer is then used to produce the Intermediate Representation (IR) of the model graph. The IR consists of two files: a topology description in XML format and a binary file with the model weights. The Inference Engine reads and loads the IR and runs inference on it, providing unified functions that span multiple Intel platforms. User applications integrate this API to execute deep learning inference with the model IR.
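The read, load, and infer steps can be sketched as follows. This is a minimal illustration, not the project's actual code: it uses OpenCV's DNN module with the Inference Engine backend rather than the Inference Engine API directly, and it assumes an OpenVINO-enabled OpenCV build. The helper names (ir_pair, load_ir_on_myriad, infer) and the 416x416 input size are assumptions made for this example; the correct input size depends on the model.

```python
from pathlib import Path


def ir_pair(xml_path):
    """Return (topology .xml, weights .bin) for an IR model, checking both exist."""
    xml = Path(xml_path)
    weights = xml.with_suffix(".bin")
    for part in (xml, weights):
        if not part.is_file():
            raise FileNotFoundError(f"missing IR file: {part}")
    return xml, weights


def load_ir_on_myriad(xml_path):
    """Read an IR model and target the Myriad VPU (needs OpenVINO's OpenCV build)."""
    import cv2  # imported here so the path check above works without OpenCV

    xml, weights = ir_pair(xml_path)
    net = cv2.dnn.readNet(str(weights), str(xml))
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD)
    return net


def infer(net, frame, size=(416, 416)):
    """Run one forward pass; the input size is an assumed placeholder."""
    import cv2

    blob = cv2.dnn.blobFromImage(frame, size=size, swapRB=True)
    net.setInput(blob)
    return net.forward()
```

Checking for both IR files before loading gives a clear error when only the XML or only the weights file was copied to the device.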

Figure 2: The workflow diagram of the OpenVINO toolkit

Getting Started with the Neural Compute Stick 2 and Inference on the Raspberry Pi

We will now learn how to use the OpenVINO toolkit together with OpenCV for faster DL inference on a Raspberry Pi. TinyYOLO, a compact version of the You Only Look Once (YOLO) DL model, is taken as an example. This model is heavy for a Raspberry Pi on its own, but with the Neural Compute Stick 2 we will achieve a better frame rate than with the Raspberry Pi alone.

We need a Raspberry Pi 4 Model B, a USB cable, and the Neural Compute Stick 2. A camera (USB or Raspberry Pi camera) is needed only if you are not loading video from disk.

Figure 3: Intel NCS2 on Raspberry Pi 4

Let us first update the system and install a few dependencies required by OpenCV and OpenVINO:

$ sudo apt-get update && sudo apt-get upgrade

$ sudo apt-get install build-essential cmake unzip pkg-config

Next, install a selection of image and video libraries, which are needed to work with image and video files:

$ sudo apt-get install libjpeg-dev libpng-dev libtiff-dev

$ sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev

$ sudo apt-get install libxvidcore-dev libx264-dev

$ sudo apt-get install libgtk-3-dev

$ sudo apt-get install libcanberra-gtk*

Next, we need two packages which contain numerical optimizations for OpenCV:

$ sudo apt-get install libatlas-base-dev gfortran

Run the following command to install the Python 3 development headers:

$ sudo apt-get install python3-dev

Installation of OpenVINO's optimized OpenCV on the Raspberry Pi

First, download the OpenVINO runtime package for Raspbian:

$ wget https://download.01.org/opencv/2020/openvinotoolkit/2020.1/l_openvino_toolkit_runtime_raspbian_p_2020.1.023.tgz

We will now unpack the archive and rename the extracted folder for better readability:

$ tar -xf l_openvino_toolkit_runtime_raspbian_p_2020.1.023.tgz

$ mv l_openvino_toolkit_runtime_raspbian_p_2020.1.023 openvino

Configure OpenVINO for use with the Raspberry Pi by editing the .bashrc file:

$ nano ~/.bashrc

Add the lines below to the end of the .bashrc file and save it:

# OpenVINO

source ~/openvino/bin/setupvars.sh

Close the editor and source the file:

$ source ~/.bashrc

Next, we need to add the current user to the Raspbian users group:

$ sudo usermod -a -G users "$(whoami)"

Reboot the Pi:

$ sudo reboot

Reopen the terminal and set the USB rules:

$ cd ~

$ sh openvino/install_dependencies/install_NCS_udev_rules.sh

It is good practice to create a virtual environment so that the project's libraries stay isolated from those installed system-wide. We will now create a virtual environment for this OpenVINO toolkit project:

$ wget https://bootstrap.pypa.io/get-pip.py

$ sudo python3 get-pip.py

$ sudo pip install virtualenv virtualenvwrapper

$ sudo rm -rf ~/get-pip.py ~/.cache/pip

$ nano ~/.bashrc

Include the following lines in the .bashrc file:

# virtualenv

export WORKON_HOME=$HOME/.virtualenvs

export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3

source /usr/local/bin/virtualenvwrapper.sh

VIRTUALENVWRAPPER_ENV_BIN_DIR=bin

We will now source the file:

$ source ~/.bashrc

We will now create a virtual environment and name it NCS2. The following command creates a Python 3 virtual environment:

$ mkvirtualenv NCS2 -p python3

Activate this virtual environment and install the libraries needed for this project:

$ workon NCS2

$ pip install numpy

$ pip install "picamera[array]"

$ pip install imutils

$ pip install pillow

Then run the following command to set up the OpenVINO environment for the Movidius NCS:

$ source ~/openvino/bin/setupvars.sh
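Before running the project, it can help to confirm that the environment sees every module installed so far. The following check is a small sketch, not part of the project download: the REQUIRED tuple and the missing_modules helper are names invented here, and cv2 is expected to come from the OpenVINO runtime rather than from pip.

```python
import importlib.util

# Import names of the packages installed in the steps above;
# cv2 is expected to be provided by the OpenVINO runtime, not by pip.
REQUIRED = ("cv2", "numpy", "imutils", "PIL", "picamera")


def missing_modules(names=REQUIRED):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If cv2 is reported as not found, the most likely cause is that setupvars.sh has not been sourced in the current shell.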

You can download the project source code and the model from here.

Unzip the file after downloading and open the terminal. Navigate to the directory that contains ncs2.py and type the following in the terminal:

$ cd /home/pi/Downloads/ncs2

$ workon NCS2

$ source ~/openvino/bin/setupvars.sh

$ python ncs2.py

Figure 4: Image detection on Raspberry Pi

The script loads a video from disk and begins object detection. This video is included in the project download. You can, however, use your own video. To do this, provide the video file location on line 37 of the source file:

vs = cv2.VideoCapture("/home/pi/Downloads/ncs2/videos/vid.mp4")

If you prefer to use a camera instead of a video file, set cam = True on line 15 of the source file. The USB camera is the default. If you prefer the Raspberry Pi camera, comment out line 31 and uncomment line 32 in the source file:

vs = VideoStream(src=0).start() #USB CAMERA

#vs = VideoStream(usePiCamera=True).start() # RASPBERRY PI CAMERA

We can achieve a frame rate of 5 to 6 frames per second (fps) on a video loaded from disk, and 10 to 12 fps when a USB camera is used. Without the Neural Compute Stick 2, the frame rate will be noticeably slower, not even reaching one fps.
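To reproduce frame-rate figures like these on your own setup, you can count processed frames against elapsed time. The class below is a minimal sketch and is not taken from the project source; FPSCounter and its methods are hypothetical names, and the sleep call merely stands in for grabbing a frame and running inference.

```python
import time


class FPSCounter:
    """Count processed frames and report the average frames per second."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._start = None
        self._frames = 0

    def start(self):
        """Reset the frame count and record the start time."""
        self._start = self._clock()
        self._frames = 0
        return self

    def update(self):
        """Call once per processed frame."""
        self._frames += 1

    def fps(self):
        """Average frames per second since start(); 0.0 if not started."""
        if self._start is None:
            return 0.0
        elapsed = self._clock() - self._start
        return self._frames / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    counter = FPSCounter().start()
    for _ in range(20):
        time.sleep(0.01)  # stand-in for reading a frame and running inference
        counter.update()
    print(f"average: {counter.fps():.1f} fps")
```

In a detection loop, call update() once after each frame has been processed and print fps() when the video ends.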

Conclusion

Intel also offers a complete AI development kit, which combines an Intel Core processor, an Intel Movidius Myriad X Vision Processing Unit (VPU), and Intel integrated graphics for high-performance, low-power AI workloads. The three hardware engines can run diverse AI workloads, delivering comprehensive raw AI capability for contemporary PCs. The kit comprises an Intel NUC pre-loaded with Windows 10, AI development tools, and code samples to help developers build new AI applications faster. Tutorials are included.