The AI Camera is built around a Sony IMX500 image sensor with an integrated AI accelerator. It can run a wide variety of popular neural network models, with low power consumption and low latency, leaving the processor in your Raspberry Pi free to perform other tasks.
Key features of the Raspberry Pi AI Camera include:
- 12 MP Sony IMX500 Intelligent Vision Sensor
- Sensor modes: 4056×3040 at 10fps, 2028×1520 at 30fps
- 1.55 µm × 1.55 µm cell size
- 78-degree field of view with manually adjustable focus
- Integrated RP2040 for neural network and firmware management
The AI Camera can be connected to all Raspberry Pi models, including Raspberry Pi Zero, using our regular camera ribbon cables.
Under the hood
How does the Raspberry Pi AI Camera Work?
To make use of the integrated AI accelerator, we must first upload a model. On older Raspberry Pi devices this process uses the I2C protocol, while on Raspberry Pi 5 we are able to use a much faster custom two-wire protocol. The camera end of the link is managed by an on-board RP2040 microcontroller; an attached 16MB flash device caches recently used models, allowing us to skip the upload step in many cases.
Once the sensor has started streaming, the IMX500 operates as a standard Bayer image sensor, much like the one on Raspberry Pi Camera Module 3. An integrated Image Signal Processor (ISP) performs basic image processing steps on the sensor frame (principally Bayer-to-RGB conversion and cropping/rescaling), and feeds the processed frame directly into the AI accelerator. Once the neural network model has processed the frame, its output is transferred to the host Raspberry Pi together with the Bayer frame over the CSI-2 camera bus.
The Raspberry Pi AI Camera works differently from traditional AI-based camera image processing systems, as shown in the diagram below:
The left side demonstrates the architecture of a traditional AI camera system. In such a system, the camera delivers images to the Raspberry Pi. The Raspberry Pi processes the images and then performs AI inference. Traditional systems may use external AI accelerators (as shown) or rely exclusively on the CPU.
The right side demonstrates the architecture of a system that uses IMX500. The camera module contains a small Image Signal Processor (ISP) which turns the raw camera image data into an input tensor. The camera module sends this tensor directly into the AI accelerator within the camera, which produces an output tensor that contains the inferencing results. The AI accelerator sends this tensor to the Raspberry Pi. There is no need for an external accelerator, nor for the Raspberry Pi to run neural network software on the CPU.
To fully understand this system, familiarise yourself with the following concepts:
- Input Tensor
-
The part of the sensor image passed to the AI engine for inferencing. Produced by a small on-board ISP which also crops and scales the camera image to the dimensions expected by the neural network that has been loaded. The input tensor is not normally made available to applications, though it is possible to access it for debugging purposes.
- Region of Interest (ROI)
-
Specifies exactly which part of the sensor image is cropped out before being rescaled to the size demanded by the neural network. Can be queried and set by an application. The units used are always pixels in the full resolution sensor output. The default ROI setting uses the full image received from the sensor, cropping no data.
- Output Tensor
-
The results of inferencing performed by the neural network. The precise number and shape of the outputs depend on the neural network. Application code must understand how to handle the tensor.
Raspberry Pi AI Camera System Architecture
The diagram below shows the various camera software components (in green) used during our imaging/inference use case with the Raspberry Pi AI Camera module hardware (in red):
Raspberry Pi Ai Camera Pose Estimation
A key benefit of the AI Camera is its seamless integration with our Raspberry Pi camera software stack. Under the hood, libcamera processes the Bayer frame using our own ISP, just as it would for any sensor.
We also parse the neural network results to generate an output tensor, and synchronise it with the processed Bayer frame. Both of these are returned to the application during libcamera’s request completion step.
Raspberry Pi Ai Camera - Object Dectection Demo
Getting Started
Check out our Getting Started Guide. There you’ll find instructions on installing the AI Camera hardware, setting up the software environment, and running the examples and neural networks in our model zoo.
Sony’s AITRIOS Developer site has more technical resources on the IMX500 sensor, in particular the IMX500 Converter and IMX500 Package documentation, which will be useful for users who want to run custom-trained networks on the AI Camera.
https://www.raspberrypi.com/documentation/accessories/ai-camera.html
How to Label, Train and Deploy a custom AI Model in Sony AITRIOS
https://developer.aitrios.sony-semicon.com/en/studio/brain-builder