The Experimenting with Sensor Fusion Design Challenge from Element14 is an exciting chance to combine video and accelerometer data to solve a problem of our choosing. The provided kit consists of a Digilent Pcam 5C 5-megapixel camera module, a Digilent Pmod NAV 9-axis IMU (plus barometer) module, and a Xilinx Spartan-7 SP701 FPGA Evaluation Kit.
As one of the competitors, I will be exploring Visual Simultaneous Localization and Mapping (VSLAM) for indoor spaces. VSLAM is a class of algorithms that combines image sequences with pose information to construct a map of a device's surroundings while simultaneously estimating the device's location within that map.
Why VSLAM?
Visual SLAM algorithms typically combine data from one or more cameras with an inertial measurement unit (IMU), so they are a good match for the components provided in the challenge kit. Simultaneous localization and mapping, aka "SLAM", is relevant to robotics, model construction, 3D scanning, self-driving vehicles, and other applications where determining device position and orientation matters. An autonomous system with a map can navigate a space efficiently without requiring external guides or constantly bumping into walls and objects.
Why not just use GPS?
The Global Positioning System (GPS) relies on a network of satellites and a local receiver to provide remarkably accurate coordinates. This allows the positions of automobiles, farm equipment, smartphones, and other GPS-enabled devices to be determined in real time. While GPS can determine positions accurately, a separate map still needs to be provided to identify navigable and non-navigable regions. Additionally, GPS signals are unreliable indoors, under tree canopies, and anywhere else the satellite signals are blocked. VSLAM algorithms require little or no external infrastructure and can be deployed in a wide variety of environments.
My Plan
While VSLAM is an area of active research that is steadily incorporating the latest developments in deep learning, computer vision, and object detection and identification, I will focus on "classical" VSLAM techniques for this design challenge given the time constraints. "Classical" VSLAM consists of a set of front-end algorithms that acquire and process the sensor data and a set of back-end algorithms that estimate the device's pose and build the map. These algorithms will be implemented locally on the FPGA using Verilog, Xilinx IP cores, and a soft microprocessor core. The result of this project will be a stream of position data that can be used to construct a map of the system's environment.
- Prerequisites
Prior to implementing VSLAM, I will need to familiarize myself with the Xilinx SP701 FPGA development board as well as the sensors. I will then create a "Hello World" demonstration using an Arm Cortex-M4 DesignStart FPGA soft processor core to show that I can read and modify sensor data and communicate over UART.
- Front-End
1. Data Acquisition
A time-stamped sequence of camera images and IMU readings is acquired and pre-processed.
2. Visual Odometry
A visual odometry algorithm is applied to successive camera frames to estimate the motion (rotation and translation) between them, from which distance traveled can be derived. A minimal prototyping sketch appears after this outline.
- Back-End
3. Sensor Fusion/Filtering & Optimization
An extended Kalman filter will fuse the motion estimated by visual odometry with orientation information from the IMU to produce an estimate of the system's location within the map (see the EKF sketch after this outline).
4. Loop Closing
Loop closing reduces accumulated error in the estimated location by recognizing previously visited points in the map and correcting the trajectory accordingly.
5. Map Reconstruction
The visited locations are combined into a map whose form depends on the application. For this project, the plan is to stream the location data points off the board and reconstruct the map offline (a small reconstruction sketch appears below).
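
To make the front-end concrete before diving into RTL, here is a minimal visual odometry sketch in Python using OpenCV. This is a desktop prototyping illustration only, not the FPGA implementation; the intrinsic matrix K below is a placeholder that would be replaced by the Pcam 5C's actual calibration, and the frames are assumed to already be grayscale and undistorted (the pre-processing from step 1).

```python
# Minimal two-frame visual odometry sketch (desktop prototype, not the FPGA design).
# Assumes a calibrated camera; K holds placeholder intrinsic values.
import cv2
import numpy as np

K = np.array([[800.0,   0.0, 320.0],   # fx,  0, cx  (placeholder calibration)
              [  0.0, 800.0, 240.0],   #  0, fy, cy
              [  0.0,   0.0,   1.0]])

def relative_motion(img_prev, img_curr):
    """Estimate rotation R and unit-scale translation t between two grayscale frames."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(img_prev, None)
    kp2, des2 = orb.detectAndCompute(img_curr, None)
    if des1 is None or des2 is None:
        return None

    # Match ORB descriptors with a brute-force Hamming matcher.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    if len(matches) < 8:
        return None

    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # Essential matrix with RANSAC rejects outlier matches.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, prob=0.999, threshold=1.0)
    if E is None:
        return None

    # Recover the relative pose; with a single camera, t is only known up to scale.
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t
```

Because a monocular camera only recovers translation up to an unknown scale, the IMU data is what makes the "distance traveled" figure meaningful in the fusion step.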
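For the back-end fusion step, here is a correspondingly minimal planar extended Kalman filter sketch, again in Python for prototyping rather than the final FPGA design. It tracks a three-element state (x, y, heading): the prediction propagates the pose by the displacement and heading change reported by visual odometry, and the update corrects the heading with an IMU-derived yaw measurement. The noise covariances are placeholder values, and the real design may use a richer state.

```python
# Minimal planar EKF sketch: fuse VO motion (prediction) with IMU heading (update).
# Placeholder noise values; illustration only, not the FPGA implementation.
import numpy as np

def wrap_angle(a):
    """Wrap an angle to the range [-pi, pi)."""
    return (a + np.pi) % (2.0 * np.pi) - np.pi

class PlanarEKF:
    def __init__(self):
        self.x = np.zeros(3)                  # state: [x, y, heading]
        self.P = np.eye(3) * 0.1              # state covariance
        self.Q = np.diag([0.02, 0.02, 0.01])  # process noise (placeholder)
        self.R = np.array([[0.05]])           # IMU heading noise (placeholder)

    def predict(self, d, dtheta):
        """Propagate the pose by the VO-estimated displacement d and heading change dtheta."""
        theta = self.x[2]
        self.x += np.array([d * np.cos(theta), d * np.sin(theta), dtheta])
        self.x[2] = wrap_angle(self.x[2])
        # Jacobian of the motion model with respect to the state.
        F = np.array([[1.0, 0.0, -d * np.sin(theta)],
                      [0.0, 1.0,  d * np.cos(theta)],
                      [0.0, 0.0,  1.0]])
        self.P = F @ self.P @ F.T + self.Q

    def update_heading(self, theta_imu):
        """Correct the heading with an IMU-derived yaw measurement."""
        H = np.array([[0.0, 0.0, 1.0]])
        y = np.array([wrap_angle(theta_imu - self.x[2])])  # innovation
        S = H @ self.P @ H.T + self.R
        K = self.P @ H.T @ np.linalg.inv(S)                # Kalman gain
        self.x += (K @ y).ravel()
        self.x[2] = wrap_angle(self.x[2])
        self.P = (np.eye(3) - K @ H) @ self.P
```

In use, each processed camera frame would drive one predict() call (with the VO translation scaled using the IMU) and each IMU orientation sample one update_heading() call; logging ekf.x after every step produces the stream of position estimates mentioned above.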
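Finally, a small sketch of the offline map reconstruction: the logged pose estimates are binned into a coarse 2D grid of visited cells. The CSV file name, column names, and cell size are arbitrary choices for illustration; the final map representation will depend on the application.

```python
# Offline reconstruction sketch: turn logged (x, y) pose estimates into a coarse grid of visited cells.
import csv
import numpy as np

def load_trajectory(path):
    """Read x,y pose estimates streamed from the board (CSV with 'x' and 'y' columns assumed)."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    return np.array([[float(r["x"]), float(r["y"])] for r in rows])

def visited_grid(trajectory, cell_size=0.1):
    """Mark each grid cell the trajectory passes through; returns the grid and its origin."""
    origin = trajectory.min(axis=0)
    cells = np.floor((trajectory - origin) / cell_size).astype(int)
    grid = np.zeros(cells.max(axis=0) + 1, dtype=bool)
    grid[cells[:, 0], cells[:, 1]] = True
    return grid, origin

if __name__ == "__main__":
    traj = load_trajectory("poses.csv")  # hypothetical log captured from the UART stream
    grid, origin = visited_grid(traj)
    print(f"visited {grid.sum()} of {grid.size} cells")
```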
About Me
My background is in electrical engineering, and I was first exposed to FPGAs as an undergraduate over 20 years ago. My day job is mostly in software and electromagnetic modeling, but I try to keep my embedded skills sharp with FPGA, microcontroller, robotics, and RF projects.