
CHAPTER
13
COMPUTER VISION WITH NOVA:
FACE TRACKING

Face tracking is one of the most popular and exciting computer vision applications. To do face tracking with Nova, we will be using the Arduino Software (IDE), the Processing (IDE) and the OpenCV library. The Arduino Software (IDE) will be used to program Nova, telling it how to move when it receives data about the position of a detected face. The Processing (IDE) will be used to detect a face through the camera, to get the position of the face in the video frame, and then to send its coordinates to Nova so that it moves and tries to keep the face in the centre of the frame. Lastly, OpenCV, an open-source computer vision library, will be included to allow us to use its pre-defined face detection functions.
Before diving into the coding and the practical example, it is useful to clearly understand the difference between face recognition, face detection and face tracking.

Face recognition, face detection and face tracking account for a large share of computer vision use in today's world. These three sub-branches of computer vision have different applications and require different amounts of processing power.
Face detection is a broader term than face recognition. Face detection just means that a system is able to identify that there is a human face present in an image or video. Face detection has several applications, only one of which is facial recognition.
While the process is somewhat complex, face detection algorithms often begin by searching for human eyes. Eyes constitute what is known as a valley region and are one of the easiest features to detect. Once eyes are detected, the algorithm might then attempt to detect facial regions including eyebrows, the mouth, nose, nostrils and the iris. Once the algorithm surmises that it has detected a facial region, it can then apply additional tests to validate whether it has, in fact, detected a face.
The face detection feature of OpenCV (the open-source computer vision library we will be using with Nova) works in real time, so you can easily detect a face in every frame. You might therefore ask why we need face tracking in the first place. Below, we explore the reasons you may want to track objects in a video rather than simply run repeated detections.
Tracking algorithms are usually faster than detection algorithms, and the reason is simple. When you are tracking an object that was detected in the previous frame, you already know a lot about its appearance, its location in the previous frame, and the direction and speed of its motion. In the next frame, you can use all of this information to predict where the object will be and search only a small region around that expected location. A good tracking algorithm uses everything it knows about the object up to that point, while a detection algorithm always starts from scratch. Therefore, an efficient system usually runs object detection on every nth frame and employs the tracking algorithm on the n-1 frames in between.
It is true that tracking benefits from the extra information it has, but you can also lose track of an object if it goes behind an obstacle for an extended period of time, or if it moves so fast that the tracking algorithm cannot catch up. It is also common for tracking algorithms to accumulate errors, so that the bounding box slowly drifts away from the object it is tracking. To fix these problems, a detection algorithm is run every so often. Detection algorithms are trained on a large number of examples of the object, so they have more knowledge about the general class of the object. Tracking algorithms, on the other hand, know more about the specific instance they are tracking. Tracking can also help when detection fails: if you are running a face detector on a video and the person's face gets occluded by an object, the face detector will most likely fail, whereas a good tracking algorithm will handle some level of occlusion.
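To make the detect-on-every-nth-frame pattern concrete, below is a minimal scheduling sketch in Processing-style Java. The detector and tracker calls are hypothetical placeholders (the OpenCV library for Processing we use in this chapter does not include a tracker); only the frame-counting logic is the point here.
int n = 10;            // run the expensive detector on every nth frame
int frameCounter = 0;  // Processing already defines frameCount, so we use our own
void processFrame() {
  if (frameCounter % n == 0) {
    // detection frame: scan the whole image from scratch
    // box = detector.detect(frame);   // hypothetical detector call
    // tracker.init(frame, box);       // re-seed the tracker with the fresh detection
  } else {
    // tracking frame: search only a small region around the last known box
    // box = tracker.update(frame);    // hypothetical tracker call
  }
  frameCounter++;
}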
The main algorithm we will use for detecting and tracking faces is based on pre-defined human face proportions. As mentioned earlier, the eyes are easy to find because they are darker than their surroundings; in fact there are three such spots in the human face that are relatively darker than the rest: the two eyes and the mouth. The algorithm tries to find these darker spots arranged in proportions similar to a human face. This is not the most accurate way of detecting and tracking a face, but it is one of the methods that requires the least processing power; in other words, one of the fastest.

Never forget that there is no single perfect method for a given task. With many engineering problems, you have to analyse each method and strategy, evaluate their advantages and disadvantages, and investigate the trade-offs involved in making a choice. For face tracking, there are many different algorithms and methods for identifying and tracking a face. You can pick any of them, keeping in mind aspects such as the processing power of your PC, the quality of the camera, the ambient light level, the required accuracy, and so on. You can always switch to a better algorithm based on your hardware specifications and the requirements of a specific project.
The one we will work on in this chapter is based on the three darker spots in the human face described in the paragraph above. It is a very simple algorithm that will most likely run on any computer without exceeding its processing limits. The disadvantage of this method is that it cannot detect or track a face that is turned left or right relative to the camera, in other words when both eyes are not visible to the camera. So the face should always point towards the camera while it is being detected and tracked.
Let's start by opening Processing (IDE) and creating our sketch.
PROCESSING (IDE) & OPENCV
For computer vision and image processing projects, we will mainly be working with OpenCV, an open-source computer vision library. A library that makes OpenCV available in the Processing (IDE) exists; it is based on OpenCV's official Java bindings and attempts to provide convenient wrappers for common OpenCV functions that are friendly to beginners and feel familiar to the Processing environment.
Let's begin by downloading the OpenCV library and including it in our sketch. In the Processing (IDE), go to "Sketch" > "Import Library..." > "Add Library...". A new window will open; type "OpenCV" in the search box. The OpenCV library for Processing, prepared by Greg Borenstein, should come up. Click "Install" to install the library.


Additionally, it is useful to download two more libraries for the Processing (IDE). When you search for Arduino within the libraries, one should come up as "Arduino (Firmata)". Then search for video; a library named "Video GStreamer-based video library for Processing" should appear. Download and install these two libraries as well; the video library is needed for capturing frames in this chapter.
Let's start by importing the necessary classes. Add the below lines at the top of your sketch:
import gab.opencv.*;
import java.awt.Rectangle;
import processing.serial.*;
import processing.video.*;
We will be using "OpenCV" library for the face tracking algorithms, "Rectangle" to draw rectangle around the detected faces, "Serial" to establish a serial connection between Nova and your PC and to allow data transfer, and finally "video" to capture video through the camera module of Nova.

After importing the libraries, we need to create the objects we will use throughout the sketch. Add the lines below:
OpenCV opencv;
Serial CreoqodeNova_Port;
Rectangle[] faces;
Capture video;
Following this step, we will create our setup() function. This should be familiar to you, as the role of the setup() function is exactly the same as in the Arduino Software (IDE): it is executed once each time the sketch is compiled and run. Create your setup() function as below:
void setup() {
}

Inside the setup() function we have just created, we need to add the lines below. Make sure you add them between the curly brackets.
size(320 , 240);
video = new Capture(this, 320, 240);
opencv = new OpenCV(this, 320, 240);
opencv.loadCascade(OpenCV.CASCADE_FRONTALFACE);
video.start();
CreoqodeNova_Port = new Serial(this, "COM4", 9600);
CreoqodeNova_Port.bufferUntil('\n');
faces = opencv.detect();
With the above commands, we create a video frame of 320 x 240 pixels. You can change the resolution as you wish, but do not forget that the processing power needed for higher resolutions increases dramatically.
In the following lines, we load the classifier required for face detection from the OpenCV library, named CASCADE_FRONTALFACE. The Serial() command establishes a connection between your PC and Nova. Note that the communication port in the example is "COM4", as this is the name of the port assigned by the Arduino Software (IDE); change it to the port you use to upload sketches to Nova. The Processing (IDE) will send data (the coordinates of a face, in this tutorial) to Nova over this same port. The number 9600 is the baud rate, which sets the data rate in bits per second (baud) for the serial transmission. For communication between your PC and Nova, keep the baud rate at 9600.
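For reference, the completed setup() function should now look like this (with "COM4" replaced by your own port name):
void setup() {
  size(320, 240);                                     // window matches the capture resolution
  video = new Capture(this, 320, 240);                // camera capture at 320 x 240
  opencv = new OpenCV(this, 320, 240);                // OpenCV buffer of the same size
  opencv.loadCascade(OpenCV.CASCADE_FRONTALFACE);     // load the frontal-face classifier
  video.start();                                      // begin capturing frames
  CreoqodeNova_Port = new Serial(this, "COM4", 9600); // serial link to Nova
  CreoqodeNova_Port.bufferUntil('\n');
  faces = opencv.detect();                            // initial detection (refreshed every frame in draw())
}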

Since we have completed our setup() function, it is now time to create the main body of our sketch, which loops continuously until it is stopped manually. This is the same as the loop() function in the Arduino Software (IDE), but in the Processing (IDE) this function is called draw(). Let's create our draw() function by adding the lines below to our sketch:
void draw() {
}
Inside the draw() function, add the lines below:
scale(1);
opencv.loadImage(video);
image(video, 0, 0 );
noFill();
stroke(0, 255, 0);
strokeWeight(3);
Rectangle[] faces = opencv.detect();
println(faces.length);
The above lines command your PC to capture video through the camera, detect a face in the frame, and draw a green rectangle around it. You can always play with this piece of code and change the colour and thickness of the rectangle. The stroke() function uses the RGB colour system; each channel takes a value between 0 and 255. Red is (255, 0, 0), green is (0, 255, 0), blue is (0, 0, 255), and by mixing these values you can achieve any colour you wish.

Now, we will be creating a large for-loop in order to get the average position of the face in every frame of the captured video.
for (int i = 0; i < faces.length; i++) {
println(faces[i].x + "," + faces[i].y + "," + faces[i].width + "," + faces[i].height);
rect(faces[i].x, faces[i].y, faces[i].width, faces[i].height);
avgX = faces[i].x + faces[i].width/2;
avgY = faces[i].y + faces[i].height/2;
fill(255);
strokeWeight(2.0);
stroke(0);
ellipse(avgX, avgY, 8, 8);
stroke(255, 0, 0);
point(avgX, avgY);
}
This for-loop first prints the coordinates of the top-left corner of a detected face in the video frame, in pixels, followed by the width and height of the detected face, to the console of the Processing (IDE).
Then, the centre position for both x and y coordinates is saved in the variables avgX and avgY. You have to create these two variables at the beginning of the sketch, before the setup() function, as below:
int avgX;
int avgY;
The average for "x" is calculated by summing up the coordinates of the most left pixel of the detected face and half of the width of the detected face. The average for "y" is calculated by summing up the coordinates of the most bottom pixel of the detected face and half of the height of the detected face. This provides the exact centre point of a face.
Finally, to visualise the centre point of the detected face in the video frame, we are using both ellipse() and point() functions. You can always play with their colour and shapes by changing the variables as described before.
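At this point, the declarations at the top of your Processing sketch should read as follows:
import gab.opencv.*;
import java.awt.Rectangle;
import processing.serial.*;
import processing.video.*;
OpenCV opencv;
Serial CreoqodeNova_Port;
Rectangle[] faces;
Capture video;
int avgX;   // x coordinate of the face centre, in pixels
int avgY;   // y coordinate of the face centre, in pixels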

Now, we need to create an additional function called captureEvent(). The video library calls this function automatically whenever a new frame from the camera is available, and c.read() loads that frame for use in draw().
void captureEvent(Capture c) {
c.read();
}
Make sure this function is placed outside the curly brackets of the draw() function.

Finally, we need to add a few more lines of code to send the average position of the detected face to Nova, so that it can move accordingly and keep the face in the centre of the frame. Let's add the lines below at the end of the for-loop (inside it) in the draw() function.
avgX = ( avgX * 100 / 177 );
avgY = ( avgY * 100 / 133 );
CreoqodeNova_Port.write(avgX);
CreoqodeNova_Port.write(avgY);

As you will notice, instead of sending the raw coordinates of the detected face to Nova, we first multiply the data by constant coefficients. The exact centre of the frame is (160, 120), since the resolution we are working with is 320 x 240 pixels. The coefficients above convert the coordinates of that centre point to (90, 90): 160 x 100 / 177 ≈ 90 and 120 x 100 / 133 ≈ 90. This is useful when the data is transferred to the Creoqode Mini Mega and used in the PID controllers, whose setpoints are set to 90 for both servos. We will go into detail about this in the upcoming paragraphs.
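As a side note, the same scaling can be expressed approximately with Processing's built-in map() function, which makes the intent of the constants more readable. This is an optional alternative to the two lines above, not part of the original sketch:
// map 0..320 (or 0..240) pixels onto roughly 0..180 "servo degrees",
// so the frame centre (160, 120) lands near the (90, 90) setpoints
avgX = round(map(avgX, 0, 320, 0, 180));
avgY = round(map(avgY, 0, 240, 0, 180));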
Finally, with the write() function we send the centre coordinates of the detected face to Nova, via the communication port assigned to the Creoqode Mini Mega. Since the communication is serial (over USB), only one byte can be sent at a time; this is the main reason we send the two coordinates individually, with two separate commands. The situation will be very similar when we start coding in the Arduino Software (IDE) to program Nova and read the data sent from your PC.
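Putting all of this together, the completed draw() function looks like this:
void draw() {
  scale(1);
  opencv.loadImage(video);                 // hand the current frame to OpenCV
  image(video, 0, 0);                      // display the frame
  noFill();
  stroke(0, 255, 0);                       // green rectangle, 3 px thick
  strokeWeight(3);
  Rectangle[] faces = opencv.detect();     // detect faces in this frame
  println(faces.length);
  for (int i = 0; i < faces.length; i++) {
    println(faces[i].x + "," + faces[i].y + "," + faces[i].width + "," + faces[i].height);
    rect(faces[i].x, faces[i].y, faces[i].width, faces[i].height);
    avgX = faces[i].x + faces[i].width/2;  // centre of the detected face
    avgY = faces[i].y + faces[i].height/2;
    fill(255);
    strokeWeight(2.0);
    stroke(0);
    ellipse(avgX, avgY, 8, 8);             // mark the centre point
    stroke(255, 0, 0);
    point(avgX, avgY);
    avgX = ( avgX * 100 / 177 );           // scale so the frame centre maps to 90
    avgY = ( avgY * 100 / 133 );
    CreoqodeNova_Port.write(avgX);         // send x, then y, one byte each
    CreoqodeNova_Port.write(avgY);
  }
}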
That's it! We have finished preparing the code part for Processing (IDE).
Before starting to code with Arduino Software (IDE), let's connect the camera of Nova to your PC with the USB cable provided. Once you connect the camera to your PC, it should be automatically detected.

After connecting it, go to the Device Manager or the equivalent application on your PC, and disable the built-in camera or webcam. This lets your PC, and therefore the Processing (IDE), automatically select Nova's camera for capturing video.

Up to this stage, we have created a sketch in the Processing (IDE) that captures the video, detects a face, and sends the coordinates of the face over the serial communication port assigned to Creoqode Nova. We have then connected Nova's camera to the PC and disabled the built-in webcam in the device manager, so the sketch uses Nova's camera for video capture.
Now we have to create a new sketch in the Arduino Software (IDE) that receives the coordinates sent by the Processing (IDE), uses them in a closed-loop PID controller, and feeds the output of the PID controller to the servo motors to produce accurate movements that track the face.
Let's start by opening the Arduino Software (IDE) and including the servo library, creating the servo motors, and assigning them to the pins they are attached to, just as we did in the calibration chapter.

For face tracking, we will only use two servos: NovaServo_3, connected to digital pin 36, and NovaServo_4, connected to digital pin 38. NovaServo_3 rotates the face of Nova, allowing it to look up and down, whereas NovaServo_4 rotates the body, allowing it to look left and right. With only these two servos, Nova can track objects over quite a wide range: approximately 180 degrees of horizontal scanning and 130 degrees of vertical scanning.
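If you do not have your calibration sketch at hand, a minimal version of these declarations might look like the lines below. The servo names and pin numbers come from this chapter; everything else is the standard Arduino Servo library.
#include <Servo.h>
Servo NovaServo_3;   // head servo: up / down, digital pin 36
Servo NovaServo_4;   // body servo: left / right, digital pin 38
void setup() {
  NovaServo_3.attach(36);   // assign each servo to its pin
  NovaServo_4.attach(38);
}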
So, initialize these two servos in the setup() function as well, setting them to a 90-degree angle by adding the lines below:
NovaServo_3.write(90);
NovaServo_4.write(90);
After this step we will be creating a PID controller for our sketch and setting up its constants.
If you are new to control theory and PID controllers, please refer to the Control Theory chapter of the Educational Guide which explains in detail the concept of control theory, PI, PD and PID controllers, their practical applications and explains how to use the PID library in Arduino Software (IDE) and how to integrate it to various projects.
First, we will start by including the PID library in our sketch with the line below:
#include <PID_v1.h>
Then, we will create 2 PID controllers, one for NovaServo_3 and one for NovaServo_4 by adding the lines below:
PID PID1(&Input_1, &Output_1, &Setpoint_1, Kp_1, Ki_1, Kd_1, DIRECT);
PID PID2(&Input_2, &Output_2, &Setpoint_2, Kp_2, Ki_2, Kd_2, DIRECT);

We also have to define the Kp, Ki and Kd constants for the two PID controllers. We provide the constants we arrived at after a long period of trial and error; you can always change these values and improve on them. Add the lines below at the beginning of the sketch to define the constants and assign their values:
double Kp_1 = 0.016;
double Ki_1 = 0.012;
double Kd_1 = 0;
double Kp_2 = 0.028;
double Ki_2 = 0.026;
double Kd_2 = 0;
We managed to achieve smooth and fast movement with no overshoot using only a PI controller, i.e. by setting the derivative constant to 0.

Let's also create the variables used by the PID controllers: Input, Output and Setpoint. Assign a value of 90 to both setpoints, as this is the exact centre of the frame. The main purpose of tracking is to keep the face in the middle of the frame, and by setting the setpoint to 90, the error is always calculated with reference to the centre point of the frame. Place the lines below above the PID constants we just created.
double Setpoint_1 = 90;
double Input_1;
double Output_1;
double Setpoint_2 = 90;
double Input_2;
double Output_2;
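With these additions in place, the top of the Arduino sketch (everything above setup()) should look roughly like this; note that the PID objects must be created after the variables and constants their constructors reference:
#include <Servo.h>
#include <PID_v1.h>
Servo NovaServo_3;
Servo NovaServo_4;
double Setpoint_1 = 90;   // keep the face at the centre of the frame
double Input_1;
double Output_1;
double Setpoint_2 = 90;
double Input_2;
double Output_2;
double Kp_1 = 0.016;
double Ki_1 = 0.012;
double Kd_1 = 0;
double Kp_2 = 0.028;
double Ki_2 = 0.026;
double Kd_2 = 0;
PID PID1(&Input_1, &Output_1, &Setpoint_1, Kp_1, Ki_1, Kd_1, DIRECT);
PID PID2(&Input_2, &Output_2, &Setpoint_2, Kp_2, Ki_2, Kd_2, DIRECT);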
Following this, we need to initialize the PID controllers in the setup() function. You can use the lines below; make sure you place them inside the brackets of the setup() function.
PID1.SetMode(AUTOMATIC);
PID1.SetSampleTime(1);
PID1.SetOutputLimits(-35, 35);
PID2.SetMode(AUTOMATIC);
PID2.SetSampleTime(1);
PID2.SetOutputLimits(-35, 35);
Above, the SetSampleTime() function defines how often, in milliseconds, the PID controller recalculates its output, and the SetOutputLimits() function defines the minimum and maximum output values generated from the error.
At the beginning of the setup() function, inside the curly brackets, add the below line:
Serial.begin(9600);
This initiates the serial communication through the assigned communication port at a baud rate of 9600. Remember, this is the same rate we set in the Processing (IDE).
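The completed setup() function should now look like this:
void setup() {
  Serial.begin(9600);             // same baud rate as the Processing sketch
  NovaServo_3.attach(36);
  NovaServo_4.attach(38);
  NovaServo_3.write(90);          // start with both servos centred
  NovaServo_4.write(90);
  PID1.SetMode(AUTOMATIC);
  PID1.SetSampleTime(1);          // recompute the output every millisecond
  PID1.SetOutputLimits(-35, 35);  // limit each correction step
  PID2.SetMode(AUTOMATIC);
  PID2.SetSampleTime(1);
  PID2.SetOutputLimits(-35, 35);
}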


Before getting into the main body of the sketch, we need to create some variables:
int serialCount = 0;
int serialInArray[2];
int posX = 90;
int posY = 90;
int errorX;
int errorY;
Place these lines before the setup() function, where we define all other variables. "serialCount" and "serialInArray" will be used to store the data sent from the Processing (IDE). As mentioned earlier, serial communication transfers only one byte at a time, which is why we create an array and store each incoming byte in a different element; we then use those elements to move Nova. "posX" and "posY" hold the angle values, constantly updated with the outputs generated by the PID controllers; these are the two variables that go into the servo write() functions. "errorX" and "errorY" are reserved for the difference between input and setpoint used by the PID controllers. Now that we have created the necessary variables, we can start shaping the loop() function. Let's start with the lines below, which go inside the curly brackets of loop():
while(Serial.available() == 0);
serialInArray[serialCount] = Serial.read();
serialCount++;
This means: as long as no data is arriving on the serial port, do nothing. When data arrives, assign the first byte to the first element of "serialInArray", the second byte to the second element, and so on. For this project, the only data coming through is the 2D coordinate of a face in the video frame, which consists of two bytes: the x and y positions. So we create an if block that resets the counter each time a complete pair has arrived; this way "serialInArray" always holds the latest two elements (x, y). Look at the code lines below, and it should make more sense:
if (serialCount > 1){
Input_1 = serialInArray[1];
Input_2 = serialInArray[0];
PID1.Compute();
PID2.Compute();
posX = posX + Output_2;
posY = posY + Output_1;
NovaServo_4.write(posX);
if(posY > 75)NovaServo_3.write(posY);
serialCount = 0;
}
This block of code means the following (a consolidated loop() is given after the list):
1. If two bytes of data have been recorded in the array, assign these values to Input_1 and Input_2 respectively.
2. Run the PID controllers and generate two outputs using the inputs and setpoints.
3. Add the generated outputs to posX and posY to update the angle values, then move Nova accordingly by placing them in the servo write() functions. Note the posY > 75 condition, which stops angles of 75 degrees or below from being written to the head servo, presumably to keep it within a safe range of motion.
4. Finally, set "serialCount" to 0, so that new bits of data can be read and this procedure can be repeated.

Congratulations!
Now Nova is ready to do face tracking. Click the upload symbol in the Arduino Software (IDE) to upload your sketch to Nova, so that it knows what to do before we send the data from the Processing (IDE). After Nova is programmed, open the sketch you created in the Processing (IDE) and run it. A 320 x 240 pixel video frame should open, and your face should appear inside a green rectangle with a dot in the middle; this dot marks the exact point whose coordinates are sent over to Nova by the Processing (IDE).
When you move your face, Nova should receive the updated location of the detected face and move so that your face is again in the middle of the frame. Play with the code to understand it thoroughly by changing the variables, the constants, even the algorithm. There will always be a better and smoother way of tracking a face, and it is up to you to improve it and share it with our community!