Story
Both YouTube and TikTok are becoming more and more popular, which means more and more people want to record and upload videos to the Internet. However, when recording video, you always have to move the device by hand to get the right angle. So why not use a more convenient way?
We can create a webcam that tracks your face and moves automatically. No matter where your face is, the camera follows it. It's smart.
I'm in the early stages of exploring it, and there isn't much in the way of actual construction yet, just ideas about functionality. Of course, I'm doing some experimenting.
Hardware
The hardware for this project is pretty straightforward:
- 2 Servos (such as MG996)
- Arduino Nano
- Logitech 720p webcam
- 3D printed gimbal parts
I didn't iterate on or perfect the printed parts, and I'm sure there are improvements to be made to them.
Software
I used Python, OBS Studio, and OpenCV to make the camera track my face and send the video to a video call:
- Python monitors video from a webcam.
- Use OBS Studio to create a virtual webcam, so the video can be passed to Zoom, WebEx, etc. It took a few tries, but after running the installer multiple times, I was able to create a "dummy" webcam that my Python code could send images to. Then just select the virtual camera as the video source in Zoom.
- Use OpenCV to track faces in the video; it has built-in face detection. I also implemented tracking of my headphones (they have bright blue bars that are great for tracking) and ArUco tags (similar to QR codes).
Track faces
To track my face reliably, I had to implement multiple steps of tracking:
1. Front-face tracking
1.1 Python variable primary_cascade
2. If 1 doesn't find a face, check with a different front-face tracking algorithm
2.1 Python variable secondary_cascade
3. If 2 fails, try looking for a face in profile (from the side)
3.1 Python variable tertiary_cascade
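The fallback order above can be sketched as a short loop. This is only a minimal illustration, not the project's actual code: `find_face` and the `Detector` type are hypothetical names, each detector is assumed to be a callable that returns a list of `(x, y, width, height)` boxes, and picking the largest box when several are found is my own assumption.

```python
from typing import Callable, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]            # (x, y, width, height) in pixels
Detector = Callable[[object], Sequence[Box]]


def find_face(frame, detectors: Sequence[Detector]) -> Optional[Box]:
    """Try each detector in order and stop at the first one that finds a face.

    Returns the largest box from that detector, or None if every
    detector comes back empty.
    """
    for detect in detectors:
        boxes = list(detect(frame))
        if len(boxes) > 0:
            # Assumption: the largest box is the face closest to the camera.
            return max(boxes, key=lambda b: b[2] * b[3])
    return None


# With OpenCV installed, each detector could wrap a cascade classifier,
# for example (the cascade file is an assumption, not taken from the project):
#   primary_cascade = cv2.CascadeClassifier(
#       cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
#   detectors = [lambda gray: primary_cascade.detectMultiScale(gray), ...]
```

Keeping the detectors in an ordered list means the cheaper or more reliable detector always runs first, and the profile detector only runs when both front-face detectors fail.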
Once the detection step has run, the camera decides how to move:
1. If the face is in the deadband (center rectangle of the screen) do nothing.
1.1 This prevents the camera from moving every time my head moves an inch
2. If the face is not in the deadband, move the camera towards it
2.1 I do a lot of filtering here. I implemented a crude PID system with "velocity" averaging
3. If no face was found, but one was found recently, keep moving in the direction the camera was already moving
3.1 The problem: Whenever I walked out of frame too quickly, the camera usually caught me as I started moving but then lost me. This makes it a tricky control loop problem to solve: if the controller is too far off the real value, no more measurements can be taken.
3.2 The solution: I implemented a "momentum" value so that the camera keeps moving in the direction in which the face was last seen. This momentum lasts for 1.5 seconds after the face was last seen.
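To make the deadband and the filtering in steps 1 and 2 concrete, here is a rough sketch. It is not the project's code: the names `deadband_error` and `SmoothedAxis`, the 20% deadband size, the gain, and the averaging window are all assumptions, and the sketch uses only a proportional term with a moving average rather than a full PID controller.

```python
from collections import deque


def deadband_error(face_center, frame_size, deadband_frac=0.2):
    """Return the (x, y) pixel offset of the face from the frame center.

    If the face center lies inside the central deadband rectangle,
    return (0.0, 0.0) so the camera does not move.
    deadband_frac is the rectangle size as a fraction of the frame (assumed).
    """
    cx, cy = frame_size[0] / 2, frame_size[1] / 2
    ex, ey = face_center[0] - cx, face_center[1] - cy
    half_w = frame_size[0] * deadband_frac / 2
    half_h = frame_size[1] * deadband_frac / 2
    if abs(ex) <= half_w and abs(ey) <= half_h:
        return 0.0, 0.0
    return ex, ey


class SmoothedAxis:
    """Proportional command for one servo axis, averaged over recent steps."""

    def __init__(self, gain=0.02, window=5):
        self.gain = gain                      # servo degrees per pixel of error (assumed)
        self.history = deque(maxlen=window)   # recent raw commands

    def step(self, error_px):
        """Add the newest command and return the moving average."""
        self.history.append(self.gain * error_px)
        return sum(self.history) / len(self.history)
```

In use, one `SmoothedAxis` would handle pan and another tilt, each fed the matching component of `deadband_error` on every frame. Averaging the commands trades a little lag for much less jitter.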
Sometimes no face is seen for a while, and the camera ends up staring at the ceiling (very unhelpful). I implemented a timeout so that if no face has been seen for 7 seconds, the camera turns back to center (its power-up default). This sometimes helps it find my face again, if I'm somewhere near the middle of the room.
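The momentum window and the timeout can be viewed as one small decision function. The sketch below is a hypothetical illustration: the function name, the action labels, and the "hold" behavior between 1.5 and 7 seconds are assumptions (the write-up does not say what the camera does in that gap); only the two time constants come from the text.

```python
MOMENTUM_S = 1.5  # keep moving this long after the face was last seen
TIMEOUT_S = 7.0   # return to center after this long without a face


def choose_action(face_found: bool, seconds_since_face: float) -> str:
    """Pick what the camera should do on this frame.

    "track"    - a face is visible, follow it
    "coast"    - face just lost, keep moving in the last direction
    "hold"     - momentum expired, stay put (assumed behavior)
    "recenter" - no face for too long, go back to the power-up position
    """
    if face_found:
        return "track"
    if seconds_since_face <= MOMENTUM_S:
        return "coast"
    if seconds_since_face >= TIMEOUT_S:
        return "recenter"
    return "hold"
```

In a real loop, `seconds_since_face` could be computed as `time.monotonic() - last_seen`, with `last_seen` updated on every frame that contains a face.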