Neural style transfer
Among the many applications of neural networks, there is an interesting field called neural style transfer. The term refers to an algorithm that takes as input a content image (e.g. a turtle) and a style image (e.g. artistic waves) and returns the content of the content image as if it were ‘painted’ using the artistic style of the style image. The technique was initially proposed by Gatys et al. in 2015, and the good thing is that it does not require any new foundation: it just uses well-known loss functions. In short, we define two loss functions, one for the content (DC) and one for the style (DS). DC measures how different the content is between two images, while DS measures how different the style is between two images. Then, we take a third image, the input (e.g. white noise), and we transform it in order to minimize both its content distance from the content image and its style distance from the style image.
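Put differently, if we call the generated image x, the algorithm looks for the image that minimizes a weighted sum of the two distances (the weights alpha and beta are my own notation for the trade-off, they are not part of the description above):

x* = argmin over x of [ alpha * DC(x, content-image) + beta * DS(x, style-image) ]

Increasing alpha keeps the result closer to the original content, while increasing beta pushes it closer to the chosen style.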
Johnson et al. (2016) built on the work of Gatys et al., proposing a neural style transfer algorithm that is up to three orders of magnitude faster. The Johnson et al. method frames neural style transfer as a super-resolution-like problem based on perceptual loss functions. While the Johnson et al. method is certainly fast, the biggest downside is that you cannot arbitrarily select your style images like you could in the Gatys et al. method. Instead, you first need to explicitly train a network to reproduce the style of your desired image. Once the network is trained, you can then apply it to any content image you wish. You should see the Johnson et al. method as more of an “investment” in your style image — you had better like your style image, as you’ll be training your own network to reproduce its style on content images.
Johnson et al. provide documentation on how to train your own neural style transfer models on their official GitHub page.
Finally, it’s also worth noting that in Ulyanov et al.’s 2017 publication, Instance Normalization: The Missing Ingredient for Fast Stylization, it was found that swapping batch normalization for instance normalization (and applying instance normalization at both training and testing time) leads to even faster real-time performance and arguably more aesthetically pleasing results as well.
Python implementation
There are many implementations of this algorithm on the web. I started from the code I extracted from this tutorial because it depends only on python3 and opencv-python. There are other feature-rich and fast implementations that use pytorch and cuda, but I experienced some installation issues and, for this reason, I temporarily gave up on them.
To create the environment, I installed the opencv-python package
pip3 install opencv-python
Then I simply executed the code extracted from the tutorial on a sample image. The first thing to do is to download the models and norms from Johnson's web pages
python3 init.py --download
The style transfer can now be tested on a sample image
python3 init.py --image ./image.jpg
To run a real-time style transfer on the video captured by the camera, simply run
python3 init.py
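For the curious, the single-style prediction step performed by init.py can be sketched with OpenCV's dnn module as follows. The model path is a placeholder and the mean values are the ones commonly used with the Johnson et al. models, so the actual script downloaded above may differ in the details.

import cv2 as cv

# Load one of the pre-trained Johnson et al. models (path is a placeholder)
net = cv.dnn.readNetFromTorch("models/instance_norm/mosaic.t7")

# Mean values commonly used with these models (adjust if your model differs)
meanX, meanY, meanZ = 103.939, 116.779, 123.680

def predict(blob, net, h, w):
    # Run a forward pass and turn the raw output back into a displayable image
    net.setInput(blob)
    out = net.forward()
    out = out.reshape(3, out.shape[2], out.shape[3])
    out[0] += meanX
    out[1] += meanY
    out[2] += meanZ
    out /= 255.0
    return out.transpose(1, 2, 0)

img = cv.imread("image.jpg")
(h, w) = img.shape[:2]
blob = cv.dnn.blobFromImage(img, 1.0, (w, h), (meanX, meanY, meanZ), swapRB=False, crop=False)
cv.imshow("Styled", predict(blob, net, h, w))
cv.waitKey(0)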
Emotion settings
The next part of the project is to create the controls the user will adjust according to his or her feelings at the moment the picture is taken
I implemented a control panel in PyQt. The control panel includes a push button to save the picture and six sliders, one for each of the following base feelings
- fear
- sadness
- surprise
- happiness
- disgust
- anger
Each basic feeling is associated with a certain style, and the percentage assigned to each feeling is used to make a weighted composition of the styles applied to the original image.
The control panel updates a file on the Raspberry Pi's file system whenever one of the slider values changes. The file is then read by the style transfer application
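To give an idea of how simple this mechanism is, here is a minimal sketch of such a panel in PyQt5. The file name emotions.txt, the 0-100 slider range and the "share" flag are my assumptions, not necessarily what the actual controls.py does.

import sys
from PyQt5.QtCore import Qt
from PyQt5.QtWidgets import QApplication, QWidget, QVBoxLayout, QSlider, QLabel, QPushButton

FEELINGS = ["fear", "sadness", "surprise", "happiness", "disgust", "anger"]

class ControlPanel(QWidget):
    def __init__(self):
        super().__init__()
        layout = QVBoxLayout(self)
        self.sliders = []
        for name in FEELINGS:
            layout.addWidget(QLabel(name))
            slider = QSlider(Qt.Horizontal)
            slider.setRange(0, 100)                        # percentage for this feeling
            slider.valueChanged.connect(self.save_values)  # rewrite the file on every change
            layout.addWidget(slider)
            self.sliders.append(slider)
        save_btn = QPushButton("Save picture")
        # Assumption: pressing the button creates the "share" flag checked by the Telegram bot
        save_btn.clicked.connect(lambda: open("./share", "w").close())
        layout.addWidget(save_btn)

    def save_values(self, _value=None):
        # One percentage per line; the style transfer script re-reads this file
        with open("emotions.txt", "w") as f:
            for slider in self.sliders:
                f.write(str(slider.value()) + "\n")

app = QApplication(sys.argv)
panel = ControlPanel()
panel.show()
sys.exit(app.exec_())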
The algorithm to mix styles is implemented in the function predict_all. Here we re-scale the percentages of each style and create an array of weights to apply to each image generated by the style transfer
def predict_all(img, values, h, w):
    # Prepare the input blob once; it is shared by all the style networks
    blob = cv.dnn.blobFromImage(img, 1.0, (w, h), (meanX, meanY, meanZ), swapRB=False, crop=False)

    # Normalize the slider values so that the weights sum to 1
    sum = 0
    for value in values:
        sum += value
    if sum == 0:
        return

    weights = []
    for value in values:
        weights.append(value / sum)

    # Blend the outputs of the networks with a non-zero weight
    num_models = 0
    for i in range(0, len(nets)):
        if weights[i] != 0:
            net = nets[i]
            print("[INFO] Applying model " + str(i) + ", weight: " + str(weights[i]))
            if num_models == 0:
                out = predict(blob, net, h, w) * weights[i]
            else:
                out += predict(blob, net, h, w) * weights[i]
            num_models = num_models + 1

    return out
The output of the predict_all function is then visualized by means of OpenCV's imshow function
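Conceptually, the real-time loop that ties the camera, the slider file and predict_all together looks something like this. It is a simplified sketch: read_slider_values is a hypothetical helper that parses the file written by the control panel, not a function from the actual script.

def read_slider_values(path="emotions.txt"):
    # Hypothetical helper: one percentage per line, as written by the control panel sketch above
    try:
        with open(path) as f:
            return [float(line) for line in f if line.strip()]
    except OSError:
        return []

cap = cv.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    values = read_slider_values()
    out = predict_all(frame, values, frame.shape[0], frame.shape[1])
    if out is not None:
        cv.imshow("Emoticam", out)
    if cv.waitKey(1) & 0xFF == ord('q'):
        break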
Sharing pictures
My initial plan was to install either a Pimoroni Enviro or a Pimoroni Automation HAT to get some feedback from the environment that could have been "merged" into the picture, but there were some mechanical issues with the supports of the plexiglass case. So for the moment I put this feature on hold and instead implemented the ability to send pictures to your Telegram account. To make your Emoticam talk with your Telegram account, follow these steps
- Search for a Telegram contact named "botfather"
- Type "/start" to start chatting with the bot
- Type "/newbot" to create a new bot. You will be asked to enter a name and a username for the new bot. Botfather will reply with the token you need to insert in your python script
- Because we need to send notifications to the Telegram account, we need the unique identifier of the user. There is no easy way to find the user ID other than invoking a specific API to get the latest messages sent to the bot and reading the ID there. So, in your Telegram account, search for the newly-created bot and send it a message. Then, in your browser, open the URL https://api.telegram.org/bot<YOUR_BOT_TOKEN>/getUpdates. This will show the list of pending messages. The ID we are looking for is the value of the "id" field inside the "chat" object
- Install the Python Telegram library
pip3 install telepot
- We have everything we need to create our bot. The skeleton of the bot is in the listing below
import datetime                        # Importing the datetime library
import os                              # Needed to check for the shared picture and remove it
import telepot                         # Importing the telepot library
from telepot.loop import MessageLoop   # Library function to communicate with telegram bot
from time import sleep                 # Importing the time library to provide the delays in program

def handle(msg):
    chat_id = msg['chat']['id']        # Receiving the message from telegram
    command = msg['text']              # Getting text from the message
    print('Received:')
    print(command)
    # code to handle incoming commands

# Insert your telegram token below
bot = telepot.Bot('<YOUR TOKEN>')
print(bot.getMe())

# Start listening to the telegram bot; whenever a message is received, the handle function will be called.
MessageLoop(bot, handle).run_as_thread()
print('Listening....')

while 1:
    sleep(10)
    if os.path.exists('./share') and os.path.exists('./picture.jpg'):
        bot.sendPhoto(chat_id='<CHAT ID>', photo=open('./picture.jpg', 'rb'))
        os.system('rm -f ./picture.jpg')
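Note that the bot only sends a picture when both ./share and ./picture.jpg exist, so the style transfer application is expected to write the current styled frame to ./picture.jpg when the user asks to save it. A minimal sketch of that saving step (my assumption of how the pieces fit together, not the project's exact code):

import cv2 as cv

def save_for_sharing(out):
    # 'out' is the float image returned by predict_all (values roughly in 0..1);
    # scale it back to 0-255 before writing it where the Telegram bot expects it
    cv.imwrite("./picture.jpg", (out * 255).clip(0, 255).astype("uint8"))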
Building the Emoticam
The components of the Emoticam are
- a Raspberry Pi 4
- A 5" 800x480 DSI display
- A Raspberry Pi HW camera
- 2 mm Plexiglass
Assemble the Raspberry Pi and the display by means of four bolts as shown in the picture below
Then, you need 5 pieces of Plexiglass:
- 12.5 x 8.6 cm
- 12.1 x 4 cm (2 pieces)
- 8.6 x 4 cm (2 pieces)
Glue the pieces to make a box, then drill 4 holes to fix the box to the Raspberry Pi and display assembly and another 4 holes to fix the Raspberry Pi HW camera
Use screws and bolts to keep everything in place
Final touches
Now some final touches
Hide desktop taskbar
The procedure to hide the desktop taskbar is quite simple
- Open an SSH terminal
- Go to the /etc/xdg/lxsession/LXDE-pi folder
cd /etc/xdg/lxsession/LXDE-pi
- Edit file autostart
sudo nano autostart
- Comment out the line "@lxpanel --profile LXDE-pi" by inserting a "#" character at the start of the line
- Press CTRL-X, then confirm with Y, to save the changes and exit the editor
Create the script
To run the three Python scripts that make up the Emoticam, I created a small script and saved it into the folder /etc/init.d, where all the initialization scripts are stored
The script invokes the three Python applications and makes them run in background (see the "&" at the end of each line)
#!/bin/sh
cd /home/pi/emoticam
python3 init.py &
python3 controls.py &
python3 tg.py &
Be sure to give the file you just created execution permission
sudo chmod +x /etc/init.d/emoticam.sh
Run the script at boot
To start the Emoticam every time the Raspberry Pi boots,
- create a directory autostart in /home/pi/.config
mkdir /home/pi/.config/autostart
- move into this directory
cd /home/pi/.config/autostart
- create a file Emoticam.desktop with the following content
[Desktop Entry]
Name=Emoticam
Type=Application
Comment=
Exec=/home/pi/emoticam/emoticam.sh
- make sure the file has execute permissions
chmod +x Emoticam.desktop
Demo
The video below shows the Emoticam in action
The source code for this project is available at
https://github.com/ambrogio-galbusera/emoticam
I hope you enjoyed this project. Thanks for reading!