A more in-depth look at alternatives to YOLO v8
Use a different algorithm
- Background subtraction
- FOMO
- Other versions of YOLO
Optimize YOLOv8
- Find a way to use more CPU cores for video processing.
- Use a higher camera resolution so the app can detect cars at a longer distance. However, it may produce many false positives, which does not work for my use case.
- Crop the frame to reduce the processing time.
- Skip some frames during video processing.
Background subtraction
OpenCV has a capability that works great for object detection and tracking against relatively static backgrounds (BackgroundSubtractorMOG2). It is very fast and can be used on constrained devices like the RPi4.
So far, I haven't been able to make it work reliably with fast-changing backgrounds. Here is an example where it misses a car.
Here is a video of the processing using this method.
As reliability is critical for my use case, I've dropped this option from further consideration.
FOMO
FOMO is 30x faster than MobileNet SSD and can run in under 200 KB of RAM. However, it has significant limitations that make it unusable for my project:
- It does not output bounding boxes, so the size of an object is not available. And I need that size to calculate the distance between the cyclist and the cars.
- Objects shouldn't be too close to each other, but cars can be quite close together on the road.
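To illustrate why bounding-box size matters, here is a sketch of distance estimation from box width using a pinhole-camera model. The assumed car width of 1.8 m and the 62.2° horizontal FOV (the RPi Camera Module v2 spec) are my assumptions for illustration, not calibrated values:

```python
import math

def estimate_distance_m(box_width_px, frame_width_px,
                        real_width_m=1.8, hfov_deg=62.2):
    """Estimate distance to a car from its bounding-box width.

    Pinhole model: distance = real_width * focal_length / pixel_width,
    where the focal length in pixels is derived from the horizontal FOV.
    """
    focal_length_px = (frame_width_px / 2) / math.tan(math.radians(hfov_deg) / 2)
    return real_width_m * focal_length_px / box_width_px
```

Without a bounding box there is no `box_width_px`, so FOMO's centroid-only output is not enough for this calculation.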
YOLOv8 with optimization
After additional research, I discovered that the YOLO family is designed around GPU-accelerated hardware for real-time processing. While the RPi4 has a GPU, there are no useful drivers that YOLO can leverage for a performance boost, so I started looking at additional optimization options.
The reason I'd like to try it again is its reliability: it can consistently detect different classes of objects, including cars, trucks, motorcycles, and buses.
As a first step, I've added support for parallel processing. The RPi4 has four cores, and they can process video frames in parallel. Multi-threading is a bit tricky in Python and requires some expertise; the YOLO documentation has good pointers on how to achieve it.
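The fan-out can be sketched like this. The `detect_objects` body is a placeholder for per-frame inference (the Ultralytics docs recommend a separate model instance per thread for thread safety); everything else is standard library:

```python
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 4  # one worker per RPi4 core

def detect_objects(frame):
    # Placeholder for per-frame YOLO inference. In the real pipeline each
    # worker would hold its own model instance and return its detections.
    return {"frame": frame, "boxes": []}

def process_frames(frames):
    """Run detection on frames in parallel; map() preserves input order."""
    with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
        return list(pool.map(detect_objects, frames))
```

Keeping the results in frame order matters, because the tracker downstream reasons about how cars move from one frame to the next.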
Another optimization I've used is to skip frames when the RPi's compute is still busy processing previous frames.
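One simple way to express that policy is a gate that drops incoming frames once too many are in flight, so the pipeline tracks the camera in real time instead of falling behind. This is a sketch of the idea, not my exact implementation:

```python
class FrameGate:
    """Admit a frame only if fewer than max_in_flight frames are still
    being processed; otherwise drop it and count the drop."""

    def __init__(self, max_in_flight=4):
        self.max_in_flight = max_in_flight
        self.in_flight = 0
        self.dropped = 0

    def try_accept(self):
        """Called when a new frame arrives. True means: process it."""
        if self.in_flight >= self.max_in_flight:
            self.dropped += 1
            return False
        self.in_flight += 1
        return True

    def done(self):
        """Called by a worker when it finishes a frame."""
        self.in_flight -= 1
```

Dropping frames is acceptable here: a car does not move far between two consecutive frames, and staying current is more important than processing every frame.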
I've decided to use a custom object tracker instead of the one built into YOLOv8. This saved more than a hundred milliseconds of compute time per frame, and I've added extra attributes that help reason about the road situation across frames.
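The core of such a tracker can be sketched as nearest-centroid matching: each new detection is assigned to the closest existing track, and unmatched detections get fresh IDs. A minimal version of the idea (my real tracker also carries the extra per-car attributes mentioned above):

```python
import math

class CentroidTracker:
    """Greedy nearest-centroid tracker: stable IDs across frames."""

    def __init__(self, max_dist=80):
        self.max_dist = max_dist  # pixels; beyond this it's a new object
        self.next_id = 0
        self.tracks = {}  # id -> last known centroid (cx, cy)

    def update(self, boxes):
        """boxes: list of (x, y, w, h). Returns {id: centroid}."""
        assigned = {}
        for x, y, w, h in boxes:
            c = (x + w / 2, y + h / 2)
            best_id, best_d = None, self.max_dist
            for tid, tc in self.tracks.items():
                if tid in assigned:
                    continue  # each track matches at most one detection
                d = math.hypot(c[0] - tc[0], c[1] - tc[1])
                if d < best_d:
                    best_id, best_d = tid, d
            if best_id is None:  # no track close enough -> new object
                best_id = self.next_id
                self.next_id += 1
            assigned[best_id] = c
        self.tracks = assigned
        return assigned
```

Because it only compares centroids, this runs in microseconds per frame, which is where the per-frame savings over a heavier built-in tracker come from.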
Here is the result of processing the recorded test drive using YOLOv8 with my custom tracker. Bounding boxes around cars have different colors: red represents potential danger, when a car is close to the cyclist. When such a situation is detected, the RPi4 will notify the cyclist about the approaching danger. The other colors helped me tune the control parameters. The numbers on top of the bounding boxes are the cars' IDs for tracking purposes.
Next Steps
I did my tests on a PC to iterate on code development and testing faster. Now I need to deploy it on the RPi4, add the cyclist-notification feature, and run another live test.
I've also found an issue with storing the video while running video processing in parallel; the above video was recorded in single-threaded mode. I need to clean up my code a bit to resolve it.
And there are a few more optimization ideas to play with.