In this blog series, I attempt to bring two of my passions together … AI and rock climbing.
In my last blog, I talked about the 2020 Tokyo Olympics, now expected to take place in 2021, which will feature rock climbing as a discipline, one in which humans perform spectacular movements that defy gravity.
Applying AI to climbing … The Long Road to the 2020 Tokyo Olympics
Having been studying machine learning for a while now, I decided to see what the AI algorithms could make of these unique images.
I used some images of the Olympic athletes for my initial investigation, and was surprised by the very strange classifications being generated: airplane, boat, chair, dog, …
To illustrate, here are two images of Olympic climbers that MobileNet incorrectly classified.
In this first example, Great Britain's Shauna Coxsey is mis-classified as a boat … probably because she appears to be floating up the wall
In this next example, Canada's Sean McColl is mis-classified as an airplane … although I do agree that he is literally flying up this wall
Jokes aside, and to be fair, there are a lot of images where the climbers are being correctly classified as "person", but the algorithm is clearly having a really hard time with these images.
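If you want to reproduce this kind of experiment, a minimal sketch with Keras' pretrained MobileNetV2 looks roughly like this. Note the assumptions: the random array below is just a stand-in so the snippet runs on its own; in practice you would load an actual climber photo (e.g. with `tf.keras.utils.load_img("climber.jpg", target_size=(224, 224))`, where the file name is a placeholder).

```python
import numpy as np
from tensorflow.keras.applications.mobilenet_v2 import (
    MobileNetV2, decode_predictions, preprocess_input)

# Pretrained ImageNet weights -- this is what produces labels
# like "boat" or "airplane" on unusual images.
model = MobileNetV2(weights="imagenet")

# Stand-in for a real photo; replace with a loaded 224x224 RGB image.
img = np.random.default_rng(0).uniform(0, 255, (1, 224, 224, 3)).astype("float32")

preds = model.predict(preprocess_input(img))
top5 = decode_predictions(preds, top=5)[0]   # [(synset, label, score), ...]
for _, label, score in top5:
    print(f"{label}: {score:.3f}")
```

Running this on the climbing photos above is how the odd top-5 labels were observed.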
What is going on? Why are these climbers' images being incorrectly classified?
Clearly the background has something to do with these mis-classifications … the brightly colored triangular structures on the wall are giving false clues, resembling other objects the model has been trained on.
My initial investigation with pose estimation was equally disappointing, since the first step in pose estimation is the detection of a "person". No "person", no pose estimation.
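That dependency can be sketched in a few lines. This is a toy illustration of the two-stage idea, not any real detector's API — `Box`, `detect`, and `pose_net` are all hypothetical stand-ins:

```python
from dataclasses import dataclass

@dataclass
class Box:
    label: str
    score: float
    xyxy: tuple

def estimate_poses(frame, detect, pose_net, threshold=0.5):
    # Stage 1: person detection. Stage 2 only runs on detected people,
    # so a missed detection means no pose output at all.
    people = [b for b in detect(frame)
              if b.label == "person" and b.score >= threshold]
    return [pose_net(frame, b) for b in people]

# Toy stand-ins illustrating the failure mode:
frame = "climbing_frame"                                         # placeholder
detect_boat = lambda f: [Box("boat", 0.9, (0, 0, 100, 200))]     # mis-detection
detect_person = lambda f: [Box("person", 0.8, (0, 0, 100, 200))]
pose_net = lambda f, b: {"keypoints": 17}                        # dummy pose

print(estimate_poses(frame, detect_boat, pose_net))    # []  (no person, no pose)
print(estimate_poses(frame, detect_person, pose_net))  # [{'keypoints': 17}]
```

When the detector calls the climber a "boat", the pose stage never even runs — which matches what I saw in the video below.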
Here is a video of a side-by-side comparison of two climbers that I prepared for this exercise:
I experimented with a two-stage detector (SSD + SPNet), as well as a single-stage detector (OpenPose) on the above video. In both cases, the pose estimation models rarely detected the climber. On the rare occasions where the climber was detected, the pose estimation was often incorrect, as shown in the following images.
When the climbers touched the ground, however, things changed …
Why are these pose estimation algorithms having such a hard time?
If we take a step back, this is understandable, and even to be expected. The images used to train these models show humans standing, sitting, perhaps lying down, but never hanging from walls.
This is a common phenomenon in AI, known as "bias". We often hear about training data having color or gender bias. There are many different "biases" in public data sets, and we should be aware of them before re-using such a data set or training a model on it.
As a simple example, the very famous ImageNet dataset ( http://image-net.org/ ) has a bias towards dogs: roughly 120 of the 1,000 classes in its widely used ILSVRC subset are dog breeds.
Researchers at MIT created an unusual dataset called ObjectNet ( http://objectnet.dev/ ), which shows everyday objects in unexpected rotations, viewpoints, and backgrounds. The goal of this dataset is to demonstrate how easily trained models can be challenged by such images.
The "bias" I want to highlight is the assumption of how a human is expected to be seen … usually in the direction that gravity is pulling … not hanging from the ceiling or walls. For the purpose of this on-going discussion, let's call this the "odd human" data set.
I do not know if there exists a public data set that includes climbers, but this is clearly the first place to look.
Please share your feedback:
- What approach would you take to support these "odd" human poses?
- Are you aware of models that have been specifically trained for climbing images ?
- Are you aware of public datasets that include climbers ?
References:
- ImageNet dataset : http://image-net.org/
- ObjectNet dataset : http://objectnet.dev/
- Odd Objects : https://news.mit.edu/2019/object-recognition-dataset-stumped-worlds-best-computer-vision-models-1210