In this blog series, I attempt to bring two of my passions together … AI and rock climbing.
In my last blog, I talked about the 2020 Tokyo Olympics, now expected to take place in 2021, which will feature rock climbing as a discipline, one in which humans perform spectacular movements that defy gravity.
Applying AI to climbing … The Long Road to the 2020 Tokyo Olympics
Having been studying machine learning for a while now, I decided to see what the AI algorithms could make of these unique images.
I used some images of the Olympic athletes for my initial investigation, and was surprised by the very strange classifications being generated: airplane, boat, chair, dog, …
To illustrate, here are two images of Olympic climbers that MobileNet incorrectly classified.
In this first example, Great Britain's Shauna Coxsey is mis-classified as a boat … probably because she appears to be floating up the wall
In this next example, Canada's Sean McColl is mis-classified as an airplane … although I do agree that he is literally flying up this wall
Jokes aside, and to be fair, there are a lot of images where the climbers are being correctly classified as "person", but the algorithm is clearly having a really hard time with these images.
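If you want to reproduce this kind of experiment, a minimal sketch with Keras' pretrained MobileNetV2 looks roughly like this. Note the assumptions: the random array below is just a stand-in so the snippet runs on its own; in practice you would load an actual climber photo (e.g. with `tf.keras.utils.load_img("climber.jpg", target_size=(224, 224))`, where the file name is a placeholder).

```python
import numpy as np
from tensorflow.keras.applications.mobilenet_v2 import (
    MobileNetV2, decode_predictions, preprocess_input)

# Pretrained ImageNet weights -- this is what produces labels
# like "boat" or "airplane" on unusual images.
model = MobileNetV2(weights="imagenet")

# Stand-in for a real photo; replace with a loaded 224x224 RGB image.
img = np.random.default_rng(0).uniform(0, 255, (1, 224, 224, 3)).astype("float32")

preds = model.predict(preprocess_input(img))
top5 = decode_predictions(preds, top=5)[0]   # [(synset, label, score), ...]
for _, label, score in top5:
    print(f"{label}: {score:.3f}")
```

Running this on the climbing photos above is how the odd top-5 labels were observed.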
What is going on? Why are these climbers' images being incorrectly classified?
Clearly the background has something to do with these mis-classifications … the brightly colored triangular structures on the wall are giving false clues, resembling other objects the model has been trained on.
My initial investigation with pose estimation was equally disappointing, since the first step in pose estimation is the detection of a "person". No "person", no pose estimation.
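That dependency can be sketched in a few lines. This is a toy illustration of the two-stage idea, not any real detector's API — `Box`, `detect`, and `pose_net` are all hypothetical stand-ins:

```python
from dataclasses import dataclass

@dataclass
class Box:
    label: str
    score: float
    xyxy: tuple

def estimate_poses(frame, detect, pose_net, threshold=0.5):
    # Stage 1: person detection. Stage 2 only runs on detected people,
    # so a missed detection means no pose output at all.
    people = [b for b in detect(frame)
              if b.label == "person" and b.score >= threshold]
    return [pose_net(frame, b) for b in people]

# Toy stand-ins illustrating the failure mode:
frame = "climbing_frame"                                         # placeholder
detect_boat = lambda f: [Box("boat", 0.9, (0, 0, 100, 200))]     # mis-detection
detect_person = lambda f: [Box("person", 0.8, (0, 0, 100, 200))]
pose_net = lambda f, b: {"keypoints": 17}                        # dummy pose

print(estimate_poses(frame, detect_boat, pose_net))    # []  (no person, no pose)
print(estimate_poses(frame, detect_person, pose_net))  # [{'keypoints': 17}]
```

When the detector calls the climber a "boat", the pose stage never even runs — which matches what I saw in the video below.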
Here is a video of a side-by-side comparison of two climbers that I prepared for this exercise:
I experimented with a two-stage detector (SSD + SPNet), as well as a single-stage detector (OpenPose) on the above video. In both cases, the pose estimation models rarely detected the climber. On the rare occasions where the climber was detected, the pose estimation was often incorrect, as shown in the following images.
When the climbers touched the ground, however, things changed …
Why are these pose estimation algorithms having such a hard time?
If we take a step back, this is understandable, and even to be expected. The images used to train these models show humans standing, sitting, perhaps lying down, but never hanging from walls.
This is a common phenomenon in AI, known as "bias". We often hear about training data having color or gender bias. There are many different "biases" in public data sets, and we should be aware of them before re-using such a data set or training a model on it.
As a simple example, the very famous ImageNet dataset ( http://image-net.org/ ) has a bias towards dogs: roughly 120 of the 1,000 classes in its widely used ILSVRC subset are dog breeds.
Researchers at MIT created an unusual dataset called ObjectNet ( http://objectnet.dev/ ), which shows everyday objects in unexpected rotations, viewpoints, and backgrounds. The goal of this dataset is to demonstrate how easily trained models can be challenged by such images.
The "bias" I want to highlight is the assumption of how a human is expected to be seen … usually in the direction that gravity is pulling … not hanging from the ceiling or walls. For the purpose of this on-going discussion, let's call this the "odd human" data set.
I do not know if there exists a public data set that includes climbers, but this is clearly the first place to look.
Please share your feedback:
- What approach would you take to support these "odd" human poses?
- Are you aware of models that have been specifically trained for climbing images ?
- Are you aware of public datasets that include climbers ?
References:
- ImageNet dataset : http://image-net.org/
- ObjectNet dataset : http://objectnet.dev/
- Odd Objects : https://news.mit.edu/2019/object-recognition-dataset-stumped-worlds-best-computer-vision-models-1210