Hello,
The main purpose of this post is to break the long silence in the Robotics section of element14, and also to show off a bit of my progress in my personal learning experience.
Please let me know what you think and whether you like these kinds of blogs on element14.
Introduction
I would like to learn about mapping, and one of the relatively affordable ways to do that is experimenting with Intel RealSense sensors. They are not cheap, but they are much cheaper than industrial LIDARs. There are even more affordable LIDAR options, like the Slamtec LIDARs, but those only work in 2D.
So here we go. I have chosen RealSense sensors for my learning.
- The D435i, an RGB-D sensor, which means it outputs depth information as well as color.
- The T265, which uses two gray-scale fish-eye cameras and outputs the pose of the camera relative to its location at power-on. It performs the visual odometry on the camera itself, which frees the CPU and the application writer from that calculation.
- Both connect through USB, and USB 3.0 gives the best performance.
The following image shows my setup. The top sensor is the D435i and the bottom one is the T265. They are both connected to an NVIDIA Jetson Nano (the very first version), and the whole system is mounted on a 3D-printed chassis that I modified and printed from the DonkeyCar project. You can find the Fusion 360 model of the chassis here: https://a360.co/2E9mb4H . I tried to use the preview embedding feature of Fusion 360 here, but it did not work.
[Gallery: My Setup]
What I have done
To start, I needed a way to read the sensor data on the Jetson Nano. Unfortunately, Intel does not release binaries for the Jetson, but there are instructions on how to compile the library from source and install it. Here I documented how I compile librealsense for the Jetson and for a specific Python version.
The other challenge is that there are not as many Python examples as C++ examples, so I had to convert their C++ examples into Python. What I like about Python is the pydoc module, which shows the documentation and all available functions and classes in all the installed modules.
python3 -m pydoc -p 9999
This runs a web server at localhost:9999 with the documentation of all installed modules. Using the docs and the C++ examples, I could write some Python scripts to capture the data from these two sensors.
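To give an idea of what those scripts look like, here is a minimal sketch of reading the two sensors with pyrealsense2. It assumes both sensors are connected; the serial numbers are placeholders you would replace with your own, and error handling is left out.

```python
import numpy as np
import pyrealsense2 as rs

# One pipeline per device. The serial numbers are placeholders;
# query your own devices with rs.context().query_devices().
T265_SERIAL = "0000000000"   # placeholder
D435I_SERIAL = "1111111111"  # placeholder

# T265: stream only the pose (position + orientation quaternion).
pose_cfg = rs.config()
pose_cfg.enable_device(T265_SERIAL)
pose_cfg.enable_stream(rs.stream.pose)
pose_pipe = rs.pipeline()
pose_pipe.start(pose_cfg)

# D435i: stream depth frames, which we turn into a point cloud.
depth_cfg = rs.config()
depth_cfg.enable_device(D435I_SERIAL)
depth_cfg.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
depth_pipe = rs.pipeline()
depth_pipe.start(depth_cfg)

pc = rs.pointcloud()
try:
    while True:
        # Latest pose of the T265, relative to where it started.
        pose_frame = pose_pipe.wait_for_frames().get_pose_frame()
        pose = pose_frame.get_pose_data()
        t = np.array([pose.translation.x, pose.translation.y, pose.translation.z])
        q = (pose.rotation.x, pose.rotation.y, pose.rotation.z, pose.rotation.w)

        # Latest depth frame of the D435i, converted to an (N, 3) vertex array.
        depth_frame = depth_pipe.wait_for_frames().get_depth_frame()
        points = pc.calculate(depth_frame)
        verts = np.asanyarray(points.get_vertices()).view(np.float32).reshape(-1, 3)

        print(t, q, verts.shape)
except KeyboardInterrupt:
    pose_pipe.stop()
    depth_pipe.stop()
```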
For visualization of the 3D data I decided to use Pangolin. Pangolin is a lightweight wrapper around OpenGL, and I have seen it used in some open-source SLAM (Simultaneous Localization and Mapping) projects. It was quite fast and easy for me to draw something on the screen in 3D, considering that I am not very familiar with computer graphics. And of course it has Python bindings.
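If you are curious what Pangolin looks like from Python, below is a bare-bones sketch based on the examples shipped with the community Python bindings. The exact function names (CreateWindowAndBind, DrawPoints, and so on) follow those examples and may differ slightly depending on which binding you install, and it scatters random points instead of live sensor data.

```python
import numpy as np
import OpenGL.GL as gl
import pangolin

pangolin.CreateWindowAndBind('Point Cloud', 640, 480)
gl.glEnable(gl.GL_DEPTH_TEST)

# Virtual camera: projection matrix plus an initial viewpoint.
scam = pangolin.OpenGlRenderState(
    pangolin.ProjectionMatrix(640, 480, 420, 420, 320, 240, 0.1, 100),
    pangolin.ModelViewLookAt(0, -2, -2, 0, 0, 0, pangolin.AxisDirection.AxisY))
handler = pangolin.Handler3D(scam)

dcam = pangolin.CreateDisplay()
dcam.SetBounds(0.0, 1.0, 0.0, 1.0, -640.0 / 480.0)
dcam.SetHandler(handler)

# Random points standing in for the live point cloud.
points = np.random.random((10000, 3)) * 3 - 1.5

while not pangolin.ShouldQuit():
    gl.glClear(gl.GL_COLOR_BUFFER_BIT | gl.GL_DEPTH_BUFFER_BIT)
    gl.glClearColor(0.0, 0.0, 0.0, 1.0)
    dcam.Activate(scam)

    gl.glPointSize(2)
    gl.glColor3f(0.0, 1.0, 0.0)
    pangolin.DrawPoints(points)  # in the real script, the sensor point cloud goes here

    pangolin.FinishFrame()
```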
I had some results drawing the sensor measurements live on the screen as a point cloud, leading up to the following GIF, which I am very happy about.
Why am I happy? Here you see a mouse sitting on a table beside some orchids. The point cloud data is captured from different sensor locations in space, but the objects are drawn on the screen in a relatively fixed location. I know it is not perfect; they move around a bit. But it means that, to some extent, the system knows where the sensor is in 3D space and from where it is capturing the measurements.
To be clearer, this is what you are seeing.
Why isn't it perfect? The biggest factor is that I am still not accounting for the displacement between the two sensors. I get the pose from the T265 and the point cloud from the D435i, but these two are not exactly on top of each other, so using the pose of the T265 for the D435i is not exactly correct. The next step is therefore to calibrate the two sensors relative to each other and find the displacement between them.
The sensor measurements are in different coordinate systems. The point cloud is in the following coordinate system, relative to the D435i:
and the poses from the T265 are in the following coordinate system:
Of course, I took this rotation into account when calculating the points relative to a fixed coordinate system, but I am still ignoring the displacement between the two sensors. Intel has good articles about the coordinate systems of these two sensors here and here.
The following image shows the math I have used so far. Currently I am ignoring the translation part of the transformation matrix between the two sensors.
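In code form, and with plain numpy, the chain looks roughly like the sketch below. The names are placeholders of mine: R_td stands for the fixed rotation between the D435i frame and the T265 frame (its exact value depends on the mounting and the two coordinate conventions), and its translation counterpart is exactly the part I am ignoring for now.

```python
import numpy as np

def quat_to_rot(x, y, z, w):
    # Convert a unit quaternion (as reported by the T265) into a 3x3 rotation matrix.
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ])

def transform_points(points_d435i, R_wt, t_wt, R_td):
    # points_d435i: (N, 3) vertices from the D435i point cloud.
    # R_wt, t_wt:   pose of the T265 in the world frame (its start frame).
    # R_td:         fixed rotation from the D435i frame to the T265 frame;
    #               the translation between the two sensors is ignored for now.
    points_t265 = points_d435i @ R_td.T          # D435i frame -> T265 frame
    points_world = points_t265 @ R_wt.T + t_wt   # T265 frame -> world frame
    return points_world
```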
Next step
is a proper calibration of these two sensors. I have some ideas about how to do it. I had to do something similar as an assignment for a job interview in the US (although I did not get that job). It was a fun experience and I documented it here: https://github.com/yosoufe/CameraLidarCalibration
Since I am just playing with this stuff, there is no pressure on what to do next. There are other things to learn as well. For example, I recently learned a bit about deep learning for the classification of point cloud data. It would be interesting to try it on live sensor measurements.
Some Links
- I am doing this work in my GitHub repository here.
- Intel RealSense