Any human task that can be automated falls under the artificial intelligence (AI) umbrella. AI covers numerous tools, algorithms, and systems, including machine learning (ML). Computing power rises by roughly a factor of 10 every year, driven mainly by new classes of custom hardware and processor architectures. This computing boom is a key driver of AI progress.
AI aims to mirror the structure and function of the human brain so that machines can solve problems the way humans would. ML is a subset of AI that uses statistical techniques to give computers the ability to learn without being explicitly programmed. An ML algorithm analyzes data and then makes a prediction based on its interpretation of that data. It depends on a tightly coupled combination of processing power and software. The goal of ML is to automate complex analytical tasks as much as possible.
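The "analyze data, then predict" loop described above can be illustrated with a minimal sketch. It assumes a simple least-squares line fit as the learning algorithm, which is only one of many possible techniques; the function names and the training data are hypothetical and chosen purely for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Learn slope and intercept of y = slope * x + intercept by least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply the learned parameters to an input the model has not seen."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: sensor readings and the values measured for them.
readings = [1.0, 2.0, 3.0, 4.0]
values = [2.1, 3.9, 6.2, 7.8]

model = fit_line(readings, values)   # "learning": parameters come from the data
print(predict(model, 5.0))           # prediction for a new reading
```

No rule relating readings to values is written into the program; the relationship is estimated from the examples, which is the sense in which the computer learns "without being explicitly programmed".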
Hardware Core for AI
Optimized hardware and processing architectures are a must for AI to function in edge environments. These environments come with specific needs linked to connectivity, processing capacity, energy efficiency, and security.
The range of AI chipset products includes graphics processing units (GPUs), central processing units (CPUs), neural network processors (NNPs), reduced instruction set computer (RISC) processors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), accelerators, and more. Some chipsets are dedicated to edge processing or edge devices, some are earmarked for servers used in cloud computing, and others serve autonomous vehicles and ML platforms. A few products form part of AI computational frameworks and AI training platforms. Both edge and server chipsets must be optimized for high performance at ultra-low power, delivering anywhere from under one to 30 trillion operations per second (TOPS).
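To make the TOPS figures above concrete, here is a rough back-of-the-envelope sketch. The helper names, the utilization factor, and all example numbers are assumptions for illustration and do not describe any particular chipset; real throughput also depends on memory bandwidth, numeric precision, and how well a model maps onto the hardware.

```python
def max_inferences_per_second(tops, ops_per_inference, utilization=1.0):
    """Upper bound on inference rate for a chipset rated at `tops` TOPS.

    `utilization` is the assumed fraction of peak throughput actually achieved.
    """
    if ops_per_inference <= 0:
        raise ValueError("ops_per_inference must be positive")
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return tops * 1e12 * utilization / ops_per_inference


def tops_per_watt(tops, watts):
    """Energy efficiency, a common figure of merit for edge chipsets."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tops / watts


# Hypothetical example: a 4 TOPS accelerator drawing 2 W, running a model that
# needs 2 billion operations per inference at an assumed 50% utilization.
rate = max_inferences_per_second(tops=4, ops_per_inference=2e9, utilization=0.5)
efficiency = tops_per_watt(tops=4, watts=2)
print(rate, efficiency)
```

The result is an upper bound rather than a benchmark, but it shows why both raw TOPS and TOPS per watt matter when a chipset is sized for an edge workload.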
Computer vision (CV), ML, and deep learning (DL) libraries are now available as compiled packages. A few notable libraries are OpenCV, GPU4Vision, OpenVIDIA, and scikit-learn. OpenGL (Open Graphics Library) is a graphics API used to render 3D computer-generated objects; it uses graphics accelerators for faster rasterization of scenes. OpenCL (Open Computing Language) is an open standard for parallel computing across heterogeneous computing environments such as CPUs, GPUs, FPGAs, and special accelerator devices. OpenCV is an open-source CV library for analyzing images and video that can use OpenGL, DirectX, OpenCL, CUDA, and many others. OpenCV has C++, C, Python, and other interfaces and runs on Windows, Linux, Android, and macOS.
These libraries take advantage of multi-core processing and, through OpenCL, of hardware acceleration on the underlying heterogeneous compute platform. DL frameworks such as TensorFlow, Torch/PyTorch, and Caffe are used on top of this acceleration. They form a higher-level layer than OpenGL, and OpenCL is also supported.
AI computing, and ML computing in particular, can be carried out remotely in the cloud, within a data center, or at the "edge", right on a device. Edge computing makes the most sense when real-time response and the lowest possible latency become critical.
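The latency argument above can be sketched as a simple comparison. The `Placement` class, the `choose_placement` helper, and every timing figure below are hypothetical values chosen for illustration, not measurements of any real deployment.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Where inference runs, with assumed timings in milliseconds."""
    name: str
    inference_ms: float
    network_round_trip_ms: float = 0.0  # zero when the model runs on the device

    @property
    def total_ms(self):
        return self.inference_ms + self.network_round_trip_ms


def choose_placement(options, budget_ms):
    """Return the lowest-latency placement that meets the budget, or None."""
    feasible = [p for p in options if p.total_ms <= budget_ms]
    return min(feasible, key=lambda p: p.total_ms) if feasible else None


# Hypothetical figures: the cloud server computes faster, but the network
# round trip dominates its end-to-end response time.
options = [
    Placement("cloud", inference_ms=5.0, network_round_trip_ms=80.0),
    Placement("edge", inference_ms=30.0),
]

best = choose_placement(options, budget_ms=50.0)
print(best.name if best else "no placement meets the budget")
```

With these assumed numbers the edge device wins despite its slower processor, because it avoids the network round trip entirely.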
Raspberry Pi 4 (RPi 4), Ultra96-V2, and BeagleBone AI (BBAI) are the single-board computer (SBC) platforms presented by element14 for AI-based applications. Ease of use is an important consideration when choosing a board for AI-at-the-edge applications.
To help engineers select the optimal solution for their AI-centric projects, element14 has created an AI configurator.
The following sections examine the hardware and software requirements for AI-at-the-edge applications and touch on the SBCs for AI that element14 users favor most.
1. Raspberry Pi 4 (RPi 4)
The new Raspberry Pi 4 targets AI and embedded IoT. The RPi has long outgrown its hobbyist origins and evolved into an IoT developer platform able to handle ML applications.
The RPi 4 is built around the Broadcom BCM2711, a quad-core Cortex-A72 (Armv8) 64-bit SoC running at 1.5GHz. Its VideoCore VI GPU manages all graphical input and output and supports OpenGL ES 3.x and 4Kp60 hardware decode of HEVC (H.265) video. The board supports dual monitors at 4K resolution, along with video scaling, camera input, and HDMI and composite video outputs. The RPi 4 comes with Gigabit Ethernet and supports Bluetooth 5.0 with Bluetooth Low Energy and dual-band 802.11ac wireless networking. Explore more features on the element14 community page: Raspberry Pi
The RPi 4 is widely used by makers and hobbyists because of the manufacturer's technical support and the community ecosystem that comes with the purchase. Its performance makes it a useful proposition for embedded engineers who wish to develop consumer-grade IoT products and run ML inferencing at the edge.
DL algorithms are computationally expensive, which is a big problem for the resource-constrained RPi. Additional hardware is a must to run these computationally intensive algorithms on the RPi. The Intel Movidius Neural Compute Stick (NCS) and the Coral USB Accelerator bring powerful ML inferencing capabilities to existing Linux systems. These small but powerful USB accelerators are based on ASIC designs. They act as “coprocessors” that augment the capabilities of the primary CPU. Combined with the optimized libraries provided by Intel and Google, they deliver significant performance gains.
2. BeagleBone AI
The BeagleBone AI board is designed specifically for AI workloads at the edge. The TI AM5729 SoC combines a dual-core Arm Cortex-A15 processor with a dual-core C66x DSP and four Embedded Vision Engine (EVE) cores, paired with dual-core Arm Cortex-M4 CPUs that serve as an image-processing unit. This delivers up to 8x the performance per watt when running calculations for CV models compared with running them on an Arm Cortex-A15-based CPU. The SoC also includes a dual-core PowerVR SGX544 3D GPU, which supports OpenGL ES 2.x for graphics applications, and a Vivante GC320 2D accelerator.
The 3D GPU supports almost all general embedded applications. The GPU can simultaneously process different data types like general-purpose data, pixel data, video data, and vertex data.
A detailed technical specification can be read here: BeagleBoard
This SBC is aimed at developers interested in experimenting with ML and CV. Its TI C66x DSP cores and EVE cores are accessed through an optimized TI Deep Learning (TIDL) OpenCL API with pre-installed tools, which gives a "zero-download out-of-box software experience".
Most element14 community users prefer the TensorFlow, PyTorch, OpenCL, and OpenCV frameworks on the BeagleBone AI.
3. Ultra96-V2 Development Board
Ultra96 is an Arm-based Xilinx Zynq UltraScale+ MPSoC development board built to the Linaro 96Boards specification. It integrates an Arm multicore, multiprocessing system with programmable logic that can be used to hardware-accelerate compute-intensive tasks. The 96Boards specifications are open and define a standard board layout for development platforms that can be used by developers of software applications, hardware devices, kernels, and system software. Ultra96 occupies a unique position by supporting a wide range of potential peripherals and acceleration engines in programmable logic.
The Zynq UltraScale+ MPSoC ZU3EG device contains a quad-core Arm Cortex-A53 MPCore with CoreSight as the application processing unit, a dual-core Arm Cortex-R5 with CoreSight as the real-time processing unit, and an Arm Mali-400 MP2 GPU. The Ultra96-V2 board comes with 2GB of LPDDR4 RAM and a 16GB microSD card. Wireless options include 802.11b/g/n Wi-Fi and Bluetooth 5 Low Energy. Explore more features on the element14 community page: Introducing Ultra96-V2
The Ultra96-V2 can be accessed through a web server using its integrated wireless access point capability. Alternatively, the PetaLinux desktop environment can be used and, if needed, viewed on the integrated Mini DisplayPort video output. The board provides low-speed and high-speed expansion connectors for adding compatible peripheral accessories such as the MikroE Click Mezzanine for 96Boards.
The Ultra96-V2 runs PetaLinux from a microSD card and targets AI, ML, IoT/cloud connectivity for add-on sensors, embedded computing, robotics, and entry-level Zynq UltraScale+ MPSoC development, as well as training, prototyping, and proof-of-concept demos.
Join the discussion of the design challenges faced by designers who use Ultra96 hardware: Ultra96 Hardware Design Forum