Project goals
The primary goal of the CycleSafe project is to improve the safety of cyclists on the road. The secondary goal is to make the solution affordable for a wider community.
Project Overview
The project will use a Raspberry Pi 4 (RPi4), a camera, and machine learning algorithms to detect cars and trucks behind the cyclist and warn of potential collisions. The system will continuously classify objects in the camera feed and calculate the distance to them. If an object is recognized as a car or a truck, then by measuring the distance to it and how that distance changes, the system can warn the cyclist of a potential collision.
Object Detection and Classification
Implement an ML algorithm for efficient and accurate object recognition.
Continuously analyze the camera feed to identify objects in the cyclist’s vicinity.
Distance Calculation
Use computer vision techniques to estimate the distance between the cyclist and detected objects.
Combine depth perception with the apparent object size to calculate accurate distances (see the sketch below).
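A minimal sketch of the size-based part of this estimate, assuming a simple pinhole-camera model; the focal length and the real-world car width below are illustrative assumptions, not calibrated values from this project:

# Pinhole-camera estimate: distance = focal_length_px * real_width_m / bbox_width_px
# The constants are illustrative assumptions, not calibrated values.
FOCAL_LENGTH_PX = 500.0   # camera focal length expressed in pixels (assumed)
CAR_WIDTH_M = 1.8         # typical passenger-car width in meters (assumed)

def estimate_distance_m(bbox_width_px: float) -> float:
    """Estimate the distance to a car from the width of its bounding box."""
    return FOCAL_LENGTH_PX * CAR_WIDTH_M / bbox_width_px

# Example: a bounding box 45 pixels wide corresponds to roughly 20 m.
print(round(estimate_distance_m(45), 1))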
Car and Truck Recognition and Tracking
Focus on identifying cars and trucks specifically.
Maintain a list of recognized car objects and their positions.
Collision Warning System
Monitor changes in distance to approaching cars.
If the distance decreases rapidly, trigger a warning signal (an LED for the cyclist, a flashing taillight to alert drivers behind, and a buzzer); a minimal sketch of this trigger logic follows below.
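A minimal sketch of the intended trigger rule, assuming the distance to the nearest tracked car is re-estimated once per processed frame; the thresholds are illustrative assumptions, not final design values:

# Hypothetical warning rule: warn when a tracked car is already near and closing fast.
# Both thresholds are assumptions for illustration only.
CLOSING_SPEED_THRESHOLD_MPS = 8.0   # ~29 km/h of closing speed (assumed)
NEAR_DISTANCE_M = 30.0              # only warn when the car is inside this range (assumed)

def collision_warning(prev_distance_m: float, curr_distance_m: float, dt_s: float) -> bool:
    """Return True if the tracked car is close and approaching quickly."""
    closing_speed = (prev_distance_m - curr_distance_m) / dt_s
    return curr_distance_m < NEAR_DISTANCE_M and closing_speed > CLOSING_SPEED_THRESHOLD_MPS

# Example: a car that went from 35 m to 25 m in one second is closing at 10 m/s -> warn.
print(collision_warning(35.0, 25.0, 1.0))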
User Interface and Alerts
Design a simple user interface (UI) for the cyclist.
Communicate real-time information about detected cars/trucks and their proximity.
Provide clear alerts when collision risk is high.
Testing and Optimization
Conduct extensive testing in various scenarios (day/night, different speeds, and different environmental conditions, including rain, snow, vibration, and dust).
Optimize the system for accuracy, low latency, and minimal false positives/negatives.
Power Efficiency and Durability
Optimize for energy-efficient components to prolong battery life.
Ensure the system is robust and weather-resistant.
Components
HAMMOND 1554VA2GYCL Plastic Enclosure, Watertight, Clear Lid, PCB Box, Polycarbonate, 88.9 mm, 160 mm, 240 mm, IP68
RASPBERRY-PI CM4104000 Raspberry Pi Compute Module 4 Lite, 4GB RAM, Wireless, BCM2711, ARM Cortex-A72
RASPBERRY-PI CM4IO Compute Module 4 I/O Board, Raspberry Pi, BCM2711, ARM Cortex-A72
RASPBERRY-PI RPI 8MP CAMERA BOARD Daughter Board, Raspberry Pi Camera Board, Version 2, Sony IMX219 8-Megapixel Sensor
Portable power supply
Bike rear rack
Fixtures
Potential Challenges
Creating a machine learning algorithm that can process video (object detection and classification) in near real time on the RPi4
Achieving acceptable distance-calculation precision
Handling the environmental impact (vibration, rain, snow) on the hardware and on video processing
Securely mounting the hardware on a bicycle
Optimizing power consumption to allow at least 60 minutes of riding without recharging.
Why do I need to test the enclosure?
CycleSafe will be used outdoors, where it will be subject to all kinds of environmental conditions, including snow and vibration. The role of the enclosure is to host and protect the sensitive electronic components. So I'd like to understand what the temperature and humidity inside the enclosure will be when I expose it to the environment.
Test Preparation
I've been using Home Assistant for a few years and I have a few XIAOMI Mijia Bluetooth-compatible Thermometers 2 connected to it. So I've placed one of them into the HAMMOND 1554VA2GYCL enclosure. I've added an element14 robot there too.
I've closed the enclosure and placed it outside. Fortunately, the weather forecast called for snow and rain during my test period.
Test Data
The test data represents two periods when the enclosure was outside (at the beginning and at the end of the graph). There was a period in the middle when the enclosure was inside.
Snow Test
A lot of snow was coming.
I've prepared the enclosure for the test and set it outside.
There were a few centimeters of snow the next day. So the enclosure was covered with snow.
The next day, the snow started melting.
When I zoomed in, I was able to see the sensor data: 14.4 °C and 64% humidity.
Enclosure Test Conclusions
The enclosure protected the sensor quite well from rain and snow. The sensor was functioning all the time.
It can't protect against temperature changes. In fact, it can increase the temperature inside the enclosure due to solar radiation, which may be helpful during the cold season but harmful during the hot season, so this should be taken into consideration.
The humidity changed over time. At one point, at noon on April 14 during heavy rain, it rose above 70%. While my sensor kept functioning the whole time, the specifications of the electronics must be validated before placing them into the enclosure and exposing it to the environment.
Design Changes
When I tried to connect the RPi Compute Module 4 I/O board to the RPi Camera v2, I realized that they were not compatible. As I already have a USB camera, I've decided to reuse it instead of purchasing an additional camera. Here is the new design diagram.
Need for Data and Test Preparation
I need to collect video recordings so I can use them to test the application and its car (obstacle) detection processing. To do that, I need to complete a bike ride with the prototype. I've decided to use a power bank with a 12V output as the power source. The battery can power the RPi for more than a day, and it fits well inside the enclosure.
I've added some packing material to isolate the components inside the enclosure and closed it.
I've purchased a Large-Capacity Rear Seat Bag for Outdoor Cycling and a Bicycle Rear Seat Carrier - Luggage Cargo Rack to host my HAMMOND 1554VA2GYCL Plastic Enclosure and other components. Once the order was delivered, I installed the rack and the bag on my bicycle and put the prototype inside the bag.
[Gallery: Bicycle with a rack — the rack back view, the rack top view, inside the bag]
Video Recorder App
I wrote a Python app using the OpenCV library to record 20 minutes of video. I then configured it to start immediately after the RPi boots (see the systemd sketch after the code).
import cv2
import datetime


def get_current_datetime_string():
    """Returns the current date and time as a string in the format "YYYYMMDDHHMMSS"."""
    now = datetime.datetime.now()
    return now.strftime("%Y%m%d%H%M%S")


# Open the video capture device
cap = cv2.VideoCapture("/dev/video0", cv2.CAP_V4L2)
if not cap.isOpened():
    print("Error: Could not open video source.")
    exit()

"""
uvcdynctrl -f
Ubisoft camera
  Pixel format: YUYV (YUYV 4:2:2; MIME type: video/x-raw-yuv)
  Frame sizes: 640x480 (30, 20, 15, 10, 5, 1 fps), 320x240, 160x120, 176x144, 352x288
DYNEX camera
  Pixel format: MJPG (Motion-JPEG; MIME type: image/jpeg)
  Frame sizes: 640x480, 352x288, 320x240, 176x144, 160x120 (30, 25, 20, 15, 10, 5 fps), 1280x1024 (15, 10, 5 fps)
"""
# Request a small frame size to keep the recording lightweight
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 320)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 320)

# Frame rate (frames per second)
fps = 30

# Width and height of the frames in the video stream
size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))

# Create a VideoWriter object.
# FourCC is a 4-byte code used to specify the video codec (see fourcc.org); it is platform dependent.
# The writer also needs the frame rate and the frame size, and optionally the isColor flag.
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
videoWriter = cv2.VideoWriter('/var/lib/tail/tail-' + get_current_datetime_string() + '.mp4', fourcc, fps, size)

success, frame = cap.read()

# 20 minutes maximum recording
numFramesRemaining = 20 * 60 * fps

# Loop until there are no more frames or the frame budget is exhausted
while success and numFramesRemaining > 0:
    videoWriter.write(frame)
    success, frame = cap.read()
    if not success:
        print("Failed to grab the frame from the video source, trying again...")
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    numFramesRemaining -= 1

# Close the capturing device and the video file
cap.release()
videoWriter.release()
cv2.destroyAllWindows()
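For reference, one common way to start such a script at boot is a small systemd unit; the service name and script path below are assumptions for illustration, not the exact configuration I used:

# /etc/systemd/system/cyclesafe-recorder.service  (hypothetical name and paths)
[Unit]
Description=CycleSafe video recorder

[Service]
Type=simple
ExecStart=/usr/bin/python3 /home/pi/cyclesafe/recorder.py
Restart=on-failure

[Install]
WantedBy=multi-user.target

After placing the file, running sudo systemctl daemon-reload and sudo systemctl enable cyclesafe-recorder.service makes the recorder start on every boot.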
Ready to test
At this point, I was ready to record the video and start further development.
Video Recording
I've recorded my bicycle ride from the tail-camera perspective near sunset on a nice spring evening. I've used the components and the app described in my previous blog, CycleSafe - #3 - The Preparations for the Road Test and the Data Collection. The length of the recording I've published is 8 minutes, as the other 12 minutes don't show any cars, so I've cut them.
What is YOLOv8
YOLO v8 is a recent version of the popular YOLO (You Only Look Once) object detection algorithm, released by Ultralytics in 2023. It introduces several improvements and new features over previous versions, making it a powerful tool for real-time object detection and instance segmentation tasks.
Key Features of YOLO v8
- Instance Segmentation: In addition to object detection, YOLO v8 can perform instance segmentation, which means it can identify and segment individual objects within an image, providing pixel-level masks for each instance.
- Improved Accuracy: YOLO v8 incorporates new techniques and architectural changes that enhance its accuracy in detecting and localizing objects, especially small objects, compared to previous versions.
- Faster Inference Speed: YOLO v8 has been optimized for faster inference, making it suitable for real-time applications that require high processing speeds, such as surveillance systems and autonomous vehicles.
- New Loss Function: YOLO v8 utilizes a new loss function called "focal loss," which helps improve the detection of small objects by down-weighting well-classified examples and focusing on hard-to-detect objects.
- Higher Resolution: YOLO v8 processes images at a higher default resolution (640x640 pixels) compared to previous versions, allowing for better detection of smaller objects and improved overall accuracy.
- Trainable Bag-of-Freebies: YOLO v8 introduces a "trainable bag-of-freebies" technique, which involves training the model with various data augmentation and regularization techniques to improve its performance further.
I've selected it for its accuracy, its real-time object detection capabilities, and its built-in ability to classify cars, trucks, and bicycles.
YOLOv8 comes in several model sizes. The smallest one, nano, is suitable for running on constrained devices with limited computing power.
Video Processing
I wrote a Python app using OpenCV and the Ultralytics library with the YOLOv8 nano model to process the recorded video.
import cv2
import datetime
from ultralytics import YOLO
import supervision as sv

# Load the YOLOv8n model
model = YOLO('yolov8n.pt')
model.info()


def get_current_datetime_string():
    """Returns the current date and time as a string in the format "YYYYMMDDHHMMSS"."""
    now = datetime.datetime.now()
    return now.strftime("%Y%m%d%H%M%S")


input_video_path = 'tail-video.mp4'
cap = cv2.VideoCapture(input_video_path)
if not cap.isOpened():
    print("Error: Could not open video source.")
    exit()

# Frame rate (frames per second)
fps = 30

# Width and height of the frames in the video stream
size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))

fourcc = cv2.VideoWriter_fourcc(*'mp4v')
videoWriter = cv2.VideoWriter('/var/lib/tail/tail-detection' + get_current_datetime_string() + '.mp4', fourcc, fps, size)

success, frame = cap.read()
count = 30

# Loop until there are no more frames; process only one frame per second of video
while success:
    success, frame = cap.read()
    if not success:
        print("Failed to grab the frame from the video source, trying again...")
        continue
    if count < fps:
        count = count + 1
    else:
        count = 0
        # Perform object detection: https://docs.ultralytics.com/modes/predict/#inference-sources
        # Class IDs for car, motorcycle, bus, truck
        results = model.predict(frame, classes=[2, 3, 5, 7], conf=0.25)
        # Visualize the results on the frame
        annotated_frame = results[0].plot()
        videoWriter.write(annotated_frame)

cap.release()
videoWriter.release()
cv2.destroyAllWindows()
The recording was processed using the YOLO v8 nano model. Processing took almost 2 seconds per frame on the RPi4.
And it used only 1 of the 4 CPU cores.
I've decided to process only 1 frame per second of the original recording, so the result plays back like a fast movie at 30x speed. I've uploaded the video to YouTube, but I found its resolution is not great.
Some math
In my estimation, a cyclist needs at least 2 seconds to react to a potential danger. A car traveling at 100 km/h covers ~28 m per second. My app needs to process at least two frames to measure the distance and speed of a car relative to the cyclist, which takes ~4 seconds with YOLOv8 nano without further optimization. So the system needs to detect the car 6 seconds in advance, which corresponds to a distance of about 168 meters. That is not realistic to expect with my current setup. If a car is approaching at 50 km/h, the required detection distance drops to about 84 meters, which is a more realistic scenario, but it also reduces the usefulness of the solution.
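As a quick sanity check of these numbers, here is the same calculation as a small script; the reaction time and the per-frame processing time are the estimates from the paragraph above:

# Required detection distance = (reaction time + time to process two frames) * car speed
REACTION_TIME_S = 2.0        # estimated cyclist reaction time
FRAME_PROCESSING_S = 2.0     # measured YOLOv8 nano time per frame on the RPi4
FRAMES_NEEDED = 2            # two frames are needed to estimate distance and closing speed

def detection_distance_m(speed_kmh: float) -> float:
    speed_mps = speed_kmh * 1000 / 3600
    total_time_s = REACTION_TIME_S + FRAMES_NEEDED * FRAME_PROCESSING_S
    return total_time_s * speed_mps

print(round(detection_distance_m(100)))  # ~167 m (about 168 m if the speed is rounded to 28 m/s)
print(round(detection_distance_m(50)))   # ~83 m (about 84 m with the rounded figure)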
Alternatives
Use a higher camera resolution so the app can detect cars at a longer distance, but it may result in many false positives, which would make the solution useless.
Find a way to use more CPU cores for video processing.
Use a different algorithm (FOMO, other versions of YOLO).
Crop the frame to reduce the processing time.
A more in-depth look at alternatives to YOLO v8
Use a different algorithm
- Background subtraction
- FOMO
- Other versions of YOLO
Optimize YOLOv8
- Find a way to use more CPU cores for video processing.
- Use a higher camera resolution so the app can detect cars at a longer distance, but it may result in many false positives, which does not work for my use case.
- Crop the frame to reduce the processing time.
- Skip some frames during video processing
Background subtraction
OpenCV has a capability that works well for object detection and tracking against relatively static backgrounds (BackgroundSubtractorMOG2). It is very fast and can be used on constrained devices like the RPi4.
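A minimal sketch of this approach; the input file name, the subtractor parameters, and the contour-area threshold are illustrative assumptions:

import cv2

cap = cv2.VideoCapture('tail-video.mp4')  # assumed input file
# MOG2 background subtractor; history and variance threshold are illustrative values
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=50, detectShadows=False)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)          # foreground mask
    mask = cv2.medianBlur(mask, 5)          # suppress noise
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) > 500:        # ignore small blobs (assumed threshold)
            x, y, w, h = cv2.boundingRect(c)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

cap.release()

On a moving bicycle, however, the whole background shifts every frame, which is exactly why the subtractor struggles to separate cars from the scenery.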
But so far I haven't been able to make it work reliably with fast-changing backgrounds. Here is an example where it misses a car.
Here is a video of the processing using this method.
As reliability is critical for my use case I've dropped this option from further consideration.
FOMO
FOMO is 30x faster than MobileNet SSD and can run in less than 200K of RAM; however, it has significant limitations that make it unusable for my project:
- It does not output bounding boxes, so the size of an object is not available, and I need the size to calculate the distance between the cyclist and the cars.
- Objects shouldn’t be too close to each other. But cars can be quite close on the road.
YOLOv8 with optimization
After additional research, I discovered that the YOLO family is geared toward GPU-accelerated hardware for real-time processing. While the RPi4 has a GPU, there are no usable drivers that YOLO can leverage to get a performance boost. So I started looking at additional optimization options.
The reason I'd like to try it again is its reliability: it can reliably detect different classes of objects, including cars, trucks, motorcycles, and buses.
As the first step, I've added support for parallel processing. The RPi4 has 4 cores, and they can process video frames in parallel. Multi-threading is a bit tricky in Python and requires some expertise; the YOLO documentation has some good pointers on how to achieve it (a minimal per-thread sketch follows below).
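As a minimal illustration of the thread-safety guidance in the Ultralytics docs, each worker thread can get its own model instance instead of sharing one; the clip names here are placeholders. My actual implementation, shown later, uses a thread pool with a lazily initialized model instead:

from threading import Thread
from ultralytics import YOLO

def predict_in_thread(video_path):
    """Each thread creates its own YOLO instance, so inference calls don't share state."""
    local_model = YOLO('yolov8n.pt')
    local_model.predict(video_path, classes=[2, 3, 5, 7], conf=0.25)

# Hypothetical clips processed in parallel
threads = [Thread(target=predict_in_thread, args=(clip,)) for clip in ('clip1.mp4', 'clip2.mp4')]
for t in threads:
    t.start()
for t in threads:
    t.join()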
Another optimization I've used was to skip frames when the RPi's compute was still busy processing previous frames.
I've decided to use a custom object tracker instead of the one built into YOLOv8. It saved more than a hundred milliseconds of compute time per frame, and I've added additional attributes that help reason about the road situation across frames.
Here is the result of processing the recorded test drive using YOLOv8 with my custom tracker. Bounding boxes around cars have different colors. Red represents potential danger, when a car is close to the cyclist; when such a situation is detected, the RPi4 will notify the cyclist about the approaching danger. The other colors helped me adjust the control parameters. The numbers on top of the bounding boxes identify the cars for tracking purposes.
Testing on RPi4
I did my tests on a PC to iterate on code development and testing faster. Now I need to deploy the code on the RPi4, add the cyclist notification feature, and run another live test.
After I migrated the code, I ran the test in multi-threaded mode. I got a "terminate called without an active exception" error and a core dump.
Ultralytics YOLOv8.2.27 🚀 Python-3.11.6 torch-2.3.0 CPU (Cortex-A72)
YOLOv8n summary (fused): 168 layers, 3151904 parameters, 0 gradients, 8.7 GFLOPs
YOLOv8n summary (fused): 168 layers, 3151904 parameters, 0 gradients, 8.7 GFLOPs
terminate called without an active exception
Aborted (core dumped)
This error has been reported by other developers, so I've added a comment as well.
Here is the main part of the code:
#!/usr/bin/env python
import numpy as np
import cv2 as cv
from multiprocessing.pool import ThreadPool
from collections import deque
from ultralytics import YOLO
from functools import lru_cache
from tracker import *
from datetime import datetime


class DummyTask:
    """Wraps a synchronously computed result so it can be used like an async task."""
    def __init__(self, data):
        self.data = data

    def ready(self):
        return True

    def get(self):
        return self.data


def get_current_datetime_string():
    """Returns the current date and time as a string in the format "YYYYMMDDHHMMSS"."""
    now = datetime.now()
    return now.strftime("%Y%m%d%H%M%S")


def main():
    #folder = "C:/Users/serge/Videos/Captures/"
    folder = "/var/lib/tail/"
    file_name = folder + "tail-trim20240601083612.mp4"
    cap = cv.VideoCapture(file_name)
    cap.set(cv.CAP_PROP_POS_FRAMES, 30 * 280)

    # Frame rate (frames per second)
    fps = 30
    # Width and height of the frames in the video stream
    size = (int(cap.get(cv.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv.CAP_PROP_FRAME_HEIGHT)))
    fourcc = cv.VideoWriter_fourcc(*'mp4v')
    videoWriter = cv.VideoWriter(folder + '/tail-detection' + get_current_datetime_string() + '.mp4', fourcc, fps, size)

    @lru_cache(maxsize=None)
    def _initializer():
        # Load and fuse the model once; lru_cache makes repeated calls a no-op
        global model
        model = YOLO("yolov8n.pt")
        model.fuse()

    RED = (0, 0, 255)
    YELLOW = (0, 255, 255)
    BLUE = (255, 0, 0)
    GREEN = (0, 255, 0)
    obstacles_classes = [2, 3, 5, 7]  # Class IDs for car, motorcycle, bus, truck

    def process_frame(frame, mytracker):
        _initializer()
        # Perform object detection: https://docs.ultralytics.com/modes/predict/#inference-sources
        results = model.predict(frame, classes=obstacles_classes, conf=0.25)
        for res in results:
            boxes = res.boxes
            if len(boxes) > 0:
                points = []
                for c in boxes:
                    if int(c.cls.numpy()) in obstacles_classes:
                        points.append(np.round(c.xywh[0].numpy()).astype(int))
                if len(points) > 0:
                    point = mytracker.update(points)
                    for i in point:
                        x, y, w, h, id, d = i
                        if d > 0:                           # the box got wider, so the car is approaching
                            if w > 45:                      # the car is close
                                if abs(388 / 2 - x) < 50:   # and roughly centered in the frame
                                    color = RED
                                else:
                                    color = YELLOW
                            else:
                                color = BLUE
                        else:
                            color = GREEN
                        cv.rectangle(frame, (x - w // 2, y - h // 2), (x + w - w // 2, y + h - h // 2), color, 2)
                        cv.putText(frame, str(id), (x, y - 1), cv.FONT_HERSHEY_COMPLEX, 1, (255, 0, 0), 2)
        return frame

    threadn = cv.getNumberOfCPUs()
    pool = ThreadPool(processes=threadn - 1)
    pending = deque()
    threaded_mode = True
    tracker = Tracker()

    while True:
        # Drain finished tasks in order and write the annotated frames
        while len(pending) > 0 and pending[0].ready():
            res = pending.popleft().get()
            videoWriter.write(res)
        _ret, frame = cap.read()
        if frame is None:
            break
        # Process only if compute is available; otherwise skip frames
        if len(pending) < threadn:
            if threaded_mode:
                task = pool.apply_async(process_frame, (frame.copy(), tracker))
            else:
                task = DummyTask(process_frame(frame, tracker))
            pending.append(task)

    cap.release()
    videoWriter.release()


if __name__ == '__main__':
    main()
And here is my tracker code:
import math


class Tracker:
    def __init__(self):
        # Store the center positions of the objects
        self.center_points = {}
        # Keep the count of the IDs;
        # each time a new object is detected, the count increases by one
        self.id_count = 0

    def update(self, objects_rect):
        # Object boxes and ids
        objects_bbs_ids = []
        PROXIMITY_THRESHOLD = 35

        # Get the center point of each new object
        for rect in objects_rect:
            x, y, w, h = rect
            cx = x + w // 2
            cy = y + h // 2

            # Find out if that object was detected already
            same_object_detected = False
            for id, pt in self.center_points.items():
                dist = math.hypot(cx - pt[0], cy - pt[1])
                if dist < PROXIMITY_THRESHOLD:
                    self.center_points[id] = (cx, cy, w)
                    # The last element is the change in box width since the previous update,
                    # used as a proxy for the object getting closer
                    objects_bbs_ids.append([x, y, w, h, id, w - pt[2]])
                    same_object_detected = True
                    break

            # A new object is detected; assign a new ID to it
            if same_object_detected is False:
                self.center_points[self.id_count] = (cx, cy, w)
                objects_bbs_ids.append([x, y, w, h, self.id_count, 0])
                self.id_count += 1

        # Clean the dictionary of center points to remove IDs not used anymore
        new_center_points = {}
        for obj_bb_id in objects_bbs_ids:
            _, _, _, _, object_id, _ = obj_bb_id
            center = self.center_points[object_id]
            new_center_points[object_id] = center

        # Update the dictionary with unused IDs removed
        self.center_points = new_center_points.copy()
        return objects_bbs_ids
I'll need to troubleshoot it a bit more to resolve the issue and complete the integration with the alerting component.
Conclusion
It was a great experience to dive into video processing again to solve a real problem.
All the electronic components and the enclosure worked very well for my project. The software part is rapidly evolving and requires additional attention. It would be great if the RPi4 GPU got OpenCV/YOLOv8 support to gain more computing performance.