I really like the NVIDIA Jetson Nano. 128 CUDA cores is a lot of power for an $89 small form factor computer, especially given what you can do with it. I bought mine for $99 from SeeedStudio (and, without advertising for them, I see the price has dropped to $89 these days). It doesn't have WiFi or an active cooler, so I also bought a b/g/n WiFi module plus two antennas from AliExpress and used a 5V fan for the huge heatsink. I got myself a metal case and started experimenting.
There's a lot it can teach you about machine learning: from detecting animals in JPEG images to identifying people with bags (and only those with bags) in live feeds, from recognizing objects in real time to mapping terrain with Lidar. My PhD thesis in Sociology is about how online protests get started and expand into the streets, things like the Arab Spring or Black Lives Matter. I've used the Jetson Nano to identify people in static images taken from news sites reporting on the protests in France.
I've also used it to test Python code that identifies bottles and random objects around my home:
I've even tried OpenALPR (Open Automatic License Plate Recognition) on Romanian license plates and got very decent results:
But when some of my teachers invited me to present at a workshop about the use of modern technology in social sciences research, I needed something better, something the public could interact with. They were not technical people, and from previous workshops on the same subject I knew their attention would fade quickly if I bombarded them with technical details. That's how I found this video. It seemed just what I needed.
The system is called FAAM and is basically a class attendance cataloging system. It uses pictures from a yearbook to help the teacher learn and store the names of the students attending his class. Pretty useful if you're working in the education system, but I only needed part of the code. FAAM uses a school-bell ring detection routine to start face identification and then an online spreadsheet to store names. I didn't need that. I only wanted to take the Facebook profile pictures of some of the people I knew would attend my workshop, so the Jetson Nano could print their names on the screen when they came into camera view. Like in this shot of me sitting in the bathroom, because I thought a white background would look nice in the screenshot:
The profile picture I used for live identification is this:
The Python scripts of the original developer (whose full name I sadly don't know, but who goes by the nickname "sirbendarby" on YouTube) were documented, just not for my needs, so I had to stumble around a bit until I figured out what does what.
I modified Face-Encoding.py to take a bunch of Facebook profile images stored in a folder called /imagini ("images" in Romanian). I also didn't like the bitmap font used, so I opted for plain old Arial, as it has special characters and a "technical" feel to it. I dropped arial.ttf in the same folder the /imagini folder resides in. I edited Face-Encoding.py like so:
import face_recognition
import pickle

# This looks for images and analyzes faces. You'll need to change the directory to where your images are stored
ciuraru_claudiu_image = face_recognition.load_image_file("/home/cypress/attendance/imagini/ciuraru_claudiu.jpg")
ciuraru_claudiu_face_encoding = face_recognition.face_encodings(ciuraru_claudiu_image)[0]

ioana_filip_image = face_recognition.load_image_file("/home/cypress/attendance/imagini/ioana_filip.jpg")
ioana_filip_face_encoding = face_recognition.face_encodings(ioana_filip_image)[0]

ionut_butean_image = face_recognition.load_image_file("/home/cypress/attendance/imagini/ionut_butean.jpg")
ionut_butean_face_encoding = face_recognition.face_encodings(ionut_butean_image)[0]

lorena_nemes_image = face_recognition.load_image_file("/home/cypress/attendance/imagini/lorena_nemes.jpg")
lorena_nemes_face_encoding = face_recognition.face_encodings(lorena_nemes_image)[0]

palfi_levente_image = face_recognition.load_image_file("/home/cypress/attendance/imagini/palfi_levente.jpg")
palfi_levente_face_encoding = face_recognition.face_encodings(palfi_levente_image)[0]

razvan_coloja_image = face_recognition.load_image_file("/home/cypress/attendance/imagini/razvan_coloja.jpg")
razvan_coloja_face_encoding = face_recognition.face_encodings(razvan_coloja_image)[0]

roxana_alexandra_image = face_recognition.load_image_file("/home/cypress/attendance/imagini/roxana_alexandra.jpg")
roxana_alexandra_face_encoding = face_recognition.face_encodings(roxana_alexandra_image)[0]

roxana_mihaela_image = face_recognition.load_image_file("/home/cypress/attendance/imagini/roxana_mihaela.jpg")
roxana_mihaela_face_encoding = face_recognition.face_encodings(roxana_mihaela_image)[0]

raluca_buhas_image = face_recognition.load_image_file("/home/cypress/attendance/imagini/raluca_buhas.jpg")
raluca_buhas_face_encoding = face_recognition.face_encodings(raluca_buhas_image)[0]

# coloja_zsuzsi_image = face_recognition.load_image_file("/home/cypress/attendance/imagini/coloja_zsuzsi.jpg")
# coloja_zsuzsi_face_encoding = face_recognition.face_encodings(coloja_zsuzsi_image)[0]

# This collects the face encodings of the analyzed images
known_face_encodings = [
    ciuraru_claudiu_face_encoding,
    ioana_filip_face_encoding,
    ionut_butean_face_encoding,
    lorena_nemes_face_encoding,
    palfi_levente_face_encoding,
    razvan_coloja_face_encoding,
    roxana_alexandra_face_encoding,
    roxana_mihaela_face_encoding,
    raluca_buhas_face_encoding
    # coloja_zsuzsi_face_encoding
]

# This creates the face encoding data file that Attendance.py uses
with open('data_set_faces', 'wb') as f:
    pickle.dump(known_face_encodings, f)
As you can see, for this part you will need to manually assign a name to each image. For example, the full path to my profile image is
/home/cypress/attendance/imagini/razvan_coloja.jpg
I named each JPG after the person's name, in lowercase, for ease of use.
Upon launch, the script creates a database of these images and their identifying variables and names it data_set_faces. You only need to launch this script once to create the database, then re-launch it every time you add new pictures to the folder and define them in the script.
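If adding a pair of lines for every new guest feels tedious, the encoding step can also be automated. Here's a minimal sketch of that idea (my own variation, not part of the original scripts). It assumes the lowercase firstname_lastname.jpg naming convention described above and writes a hypothetical data_set_faces_named file that keeps each name paired with its encoding:

import os
import pickle
import face_recognition

IMAGE_DIR = "/home/cypress/attendance/imagini"
known = {}

# Walk the folder, derive "Firstname Lastname" from each file name and encode the face
for filename in sorted(os.listdir(IMAGE_DIR)):
    if not filename.lower().endswith(".jpg"):
        continue
    name = filename[:-4].replace("_", " ").title()  # razvan_coloja.jpg -> Razvan Coloja
    image = face_recognition.load_image_file(os.path.join(IMAGE_DIR, filename))
    encodings = face_recognition.face_encodings(image)
    if encodings:
        known[name] = encodings[0]
    else:
        # Skip pictures where no face was detected instead of crashing on [0]
        print(f"No face found in {filename}, skipping")

# Store names and encodings together so they cannot fall out of sync
with open("data_set_faces_named", "wb") as f:
    pickle.dump(known, f)

As a bonus, this checks for an empty result before indexing; the hand-written version raises an IndexError if face_encodings() finds no face in a picture.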
The second script is called Attendance.py and is the one that runs everything. In it, you'll have to manually assign names corresponding to each JPG file in the /imagini folder. Here it is in its entirety:
#!/usr/bin/python3
import face_recognition
import cv2
import numpy as np
import platform
import datetime
import subprocess
import sys
import os
import pickle
import threading
from PIL import ImageFont

# Kept from the original script: a check for the Jetson Nano's aarch64 architecture
platform.machine() == "aarch64"

# This makes OpenCV work with the Jetson Nano's camera. This resolution works well on a 5 inch monitor
# def get_jetson_gstreamer_source(capture_width=1280, capture_height=720, display_width=1280, display_height=720, framerate=60, flip_method=0):
def get_jetson_gstreamer_source(capture_width=640, capture_height=380, display_width=640, display_height=380, framerate=60, flip_method=0):
    """
    Return an OpenCV-compatible video source description that uses gstreamer to capture video from the camera on a Jetson Nano
    """
    return (
        f'nvarguscamerasrc ! video/x-raw(memory:NVMM), ' +
        f'width=(int){capture_width}, height=(int){capture_height}, ' +
        f'format=(string)NV12, framerate=(fraction){framerate}/1 ! ' +
        f'nvvidconv flip-method={flip_method} ! ' +
        f'video/x-raw, width=(int){display_width}, height=(int){display_height}, format=(string)BGRx ! ' +
        'videoconvert ! video/x-raw, format=(string)BGR ! appsink'
    )

# This initializes some needed variables
face_locations = []
face_encodings = []
face_names = []
face_names_all = []
face_names_total = []
process_this_frame = True

# This function clears face_names_all every 3 seconds to limit false positive identifications
def clear():
    threading.Timer(3, clear).start()
    del face_names_all[:]
clear()

# This accesses the pre-computed face encodings (written to a data file by Face-Encoding.py)
with open('/home/cypress/attendance/data_set_faces', 'rb') as f:
    known_face_encodings = pickle.load(f)

# This defines how long you want the camera to run
endTime = datetime.datetime.now() + datetime.timedelta(minutes=15)

# Get a reference to webcam #0 (the default one)
video_capture = cv2.VideoCapture(get_jetson_gstreamer_source(), cv2.CAP_GSTREAMER)

# This is a list of names you want displayed when face_recognition identifies a face.
# List in order of face encodings in Face-Encoding.py
known_face_names = [
    "Ciuraru Claudiu",
    "Ioana Filip",
    "Ionut Butean",
    "Lorena Nemes",
    "Palfi Levente",
    "Razvan Coloja",
    "Roxana Alexandra",
    "Roxana Mihaela",
    "Raluca Buhas"
    # "Zsuzsi Coloja"
]

# This loop processes the video stream and identifies faces
while True:
    # Grab a single frame of video
    ret, frame = video_capture.read()

    # Resize frame of video to 1/4 size for faster face recognition processing
    small_frame = cv2.resize(frame, (0, 0), fx=0.25, fy=0.25)

    # Convert the image from BGR color (which OpenCV uses) to RGB color (which face_recognition uses)
    # rgb_small_frame = small_frame[:, :, ::-1]
    rgb_small_frame = cv2.cvtColor(small_frame, cv2.COLOR_BGR2RGB)

    # Only process every other frame of video to save time
    if process_this_frame:
        # Find all the faces and face encodings in the current frame of video
        face_locations = face_recognition.face_locations(rgb_small_frame)
        face_encodings = face_recognition.face_encodings(rgb_small_frame, face_locations)

        face_names = []
        for face_encoding in face_encodings:
            # See if the face is a match for the known face(s)
            matches = face_recognition.compare_faces(known_face_encodings, face_encoding, tolerance=0.6)
            name = "Necunoscut"

            # If a match was found in known_face_encodings, just use the first one.
            # if True in matches:
            #     first_match_index = matches.index(True)
            #     name = known_face_names[first_match_index]

            # Or instead, use the known face with the smallest distance to the new face
            face_distances = face_recognition.face_distance(known_face_encodings, face_encoding)
            best_match_index = np.argmin(face_distances)
            if matches[best_match_index]:
                name = known_face_names[best_match_index]

            face_names.append(name)
            face_names_all.append(name)
            if face_names_all.count(name) > 10:
                if name not in face_names_total:
                    face_names_total.append(name)

    process_this_frame = not process_this_frame

    # Display the results. When a face is positively identified and stored, the box around it turns green
    for (top, right, bottom, left), name in zip(face_locations, face_names):
        # Scale back up face locations since the frame we detected in was scaled to 1/4 size
        top *= 4
        right *= 4
        bottom *= 4
        left *= 4

        if name in face_names_total:
            # Draw a green box around the face
            cv2.rectangle(frame, (left, top), (right, bottom), (0, 255, 0), 2)

            # Draw a label with a name below the face
            cv2.rectangle(frame, (left, bottom - 35), (right, bottom), (0, 255, 0), cv2.FILLED)
            font = cv2.FONT_HERSHEY_DUPLEX
            # font = ImageFont.truetype("/home/cypress/attendance/arial.ttf", 12)
            cv2.putText(frame, name, (left + 6, bottom - 6), font, 0.5, (255, 255, 255), 1)
        else:
            # Draw a red box around the face
            cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), 2)

            # Draw a label with a name below the face
            cv2.rectangle(frame, (left, bottom - 35), (right, bottom), (0, 0, 255), cv2.FILLED)
            font = cv2.FONT_HERSHEY_DUPLEX
            # font = ImageFont.truetype("/home/cypress/attendance/arial.ttf", 12)
            cv2.putText(frame, name, (left + 6, bottom - 6), font, 0.5, (255, 255, 255), 1)

    # Display the resulting image
    cv2.imshow('Video', frame)

    # Hit 'q' on the keyboard to quit!
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

    # Breaks the loop after the time you specified in endTime
    if datetime.datetime.now() >= endTime:
        break

# Release handle to the webcam
video_capture.release()
cv2.destroyAllWindows()

# Prints the list of names identified
print(face_names_total)
sys.exit()
I modified a few things from the original script. First, there was a line that converted BGR to RGB, and it went like this:
rgb_small_frame = small_frame[:,:,::-1]
But cv2 can already do this with COLOR_BGR2RGB, which is more practical; it also returns a new contiguous array instead of a reversed view, which some dlib builds refuse to accept:
rgb_small_frame = cv2.cvtColor(small_frame,cv2.COLOR_BGR2RGB)
You can also use TrueType fonts instead of the bitmap Hershey Duplex one with

font = ImageFont.truetype

but note that cv2.putText() only accepts OpenCV's built-in Hershey font constants, so a TrueType font actually has to be drawn with Pillow itself, as sketched below.
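Here is a minimal sketch of that Pillow round trip (my own addition, not code from the original scripts). It assumes arial.ttf sits in the attendance folder as described earlier:

import cv2
import numpy as np
from PIL import Image, ImageDraw, ImageFont

# Load the TrueType font once, outside the video loop
ttf_font = ImageFont.truetype("/home/cypress/attendance/arial.ttf", 18)

def put_ttf_text(frame, text, position):
    # Convert the BGR OpenCV frame to a Pillow image, draw the text, convert back
    pil_image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    draw = ImageDraw.Draw(pil_image)
    draw.text(position, text, font=ttf_font, fill=(255, 255, 255))
    return cv2.cvtColor(np.array(pil_image), cv2.COLOR_RGB2BGR)

# Inside the display loop this would replace the cv2.putText() call:
# frame = put_ttf_text(frame, name, (left + 6, bottom - 30))

The two extra color conversions per labeled frame cost a bit of performance, which is worth keeping in mind on the Nano.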
Lastly, you need to define the names of the people in the images, in the known_face_names list:
known_face_names = [
    "Ciuraru Claudiu",
    "Ioana Filip",
    "Ionut Butean",
    "Lorena Nemes",
    "Palfi Levente",
    "Razvan Coloja",
    "Roxana Alexandra",
    "Roxana Mihaela",
    "Raluca Buhas"
    # "Zsuzsi Coloja"
]
The list has to be in the same order as the face encodings defined in Face-Encoding.py. Don't use special characters in the names.
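If you use the folder-scanning variant sketched earlier, Attendance.py can rebuild both lists from the same pickle and the ordering problem disappears (again my own sketch, built on the hypothetical data_set_faces_named file, not the original code):

import pickle

# Load the name -> encoding pairs written by the folder-scanning encoder
with open('/home/cypress/attendance/data_set_faces_named', 'rb') as f:
    known = pickle.load(f)

# Rebuild the two parallel lists the rest of Attendance.py expects
known_face_names = list(known.keys())
known_face_encodings = list(known.values())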
Run Attendance.py with sudo python3 Attendance.py. With the Raspberry Pi NoIR v2.1 camera I used, the video stream is presented in a 640x380 px window and the video feed's framerate is set to 60. The small resolution is there to avoid lag, but I left a higher-resolution example commented out in the script above.
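If the window stays black or the script dies on the cv2.resize() line, the camera pipeline probably never opened and read() is returning empty frames. A quick sanity check I'd run first (my own debugging snippet, not part of the original; paste the get_jetson_gstreamer_source() function from Attendance.py above it):

import cv2

# get_jetson_gstreamer_source() is the pipeline builder from Attendance.py
video_capture = cv2.VideoCapture(get_jetson_gstreamer_source(), cv2.CAP_GSTREAMER)
print("Camera opened:", video_capture.isOpened())

ret, frame = video_capture.read()
print("Frame grabbed:", ret, "shape:", frame.shape if ret else None)
video_capture.release()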
Tips:
- Use flip_method=2 instead of flip_method=0 if your Raspberry Pi camera is upside-down.
- You have to use administrative rights to launch the scripts.
- You have to use Python 3.
- You have to power the NVIDIA Jetson Nano through the barrel connector, as the 5V micro-USB port can't always handle intensive GPU usage, especially when a lot of people come into the camera's view.
- In name = "Necunoscut", "Necunoscut" is Romanian for "unknown". It's the on-screen label for people not defined in the two Python scripts.
- Pillow has to be installed (sudo apt install python3-pil or pip3 install pillow).
Note: I take no credit for these scripts. They belong to their original author. I've just modified them to fit my needs.