Create a real-time object tracking camera with TensorFlow and Raspberry Pi

Are you just getting started with machine/deep learning, TensorFlow, or Raspberry Pi?

I created rpi-deep-pantilt as an interactive demo of object detection in the wild, and in this article, I'll show you how to reproduce the video below, which depicts a camera panning and tilting to track my movement across a room.

^{Raspberry Pi 4GB, Pi Camera v2.1, Pimoroni Pan-Tilt HAT, Coral Edge TPU USB Accelerator}

I'm just a girl, standing in front of a tiny computer, reminding you most computing problems can be solved by sheer force of will.

MobileNetv3 + SSD @TensorFlow model I converted #TFLite #RaspberryPi + @pimoroni pantilt hat, PID controller.

Write-up soon! ✨ https://t.co/v63KSJtJHO pic.twitter.com/dmyAlWCnWk

— Leigh (@grepLeigh) November 28, 2019

This article will cover:

Build materials and hardware assembly instructions.
Deploying a TensorFlow Lite object-detection model (MobileNetV3-SSD) to a Raspberry Pi.
Sending tracking instructions to pan/tilt servo motors using a proportional–integral–derivative (PID) controller.
Accelerating inferences of any TensorFlow Lite model with Coral's USB Edge TPU Accelerator and Edge TPU Compiler.

Terms and references

Raspberry Pi: A small, affordable computer popular with educators, hardware hobbyists, and robot enthusiasts.
Raspbian: The Raspberry Pi Foundation's official operating system for the Pi. Raspbian is derived from Debian Linux.
TensorFlow: An open source framework for dataflow programming used for machine learning and deep neural networks.
TensorFlow Lite: An open source framework for deploying TensorFlow models on mobile and embedded devices.
Convolutional neural network: CNN is a type of neural network architecture that is well-suited for image classification and object detection tasks.
Single-shot detector: SSD is a type of CNN architecture specialized for real-time object detection, classification, and bounding box localization.
MobileNetV3: A state-of-the-art computer vision model optimized for performance on modest mobile phone processors.
MobileNetV3-SSD: An SSD based on MobileNet architecture. This tutorial will use MobileNetV3-SSD models available through TensorFlow's object-detection model zoo.

^{Comparison of computer vision neural networks}
Edge TPU: A tensor processing unit (TPU) is an integrated circuit for accelerating computations performed by TensorFlow. The Edge TPU is a TPU with a small footprint, developed for mobile and embedded devices "at the edge."

^{Cloud TPUs (left and center) accelerate TensorFlow model training and inference. Edge TPUs (right) accelerate inferences in mobile devices.}

Build list

Essential

Raspberry Pi 4 (4GB recommended)
Raspberry Pi Camera V2
Pimoroni Pan-Tilt HAT Kit
MicroSD card (16GB or more)
Micro-HDMI cable

Optional

12" CSI/DSI ribbon for Raspberry Pi Camera: The Pi Camera's stock cable is too short for the Pan-Tilt HAT's full range of motion.
RGB NeoPixel Stick: This component adds a consistent light source to your project.
Coral Edge TPU USB Accelerator: Accelerates inference (prediction) speed on the Raspberry Pi. You don't need this to reproduce the demo.

Looking for a project with fewer moving pieces? Check out Portable Computer Vision: TensorFlow 2.0 on a Raspberry Pi to create a hand-held image classifier.

Set up the Raspberry Pi

There are two ways you can install Raspbian to your MicroSD card:

NOOBS ("New Out Of Box Software") is a GUI operating system installation manager. If this is your first Raspberry Pi project, I'd recommend starting here.
Write the Raspbian image to an SD card.

This tutorial and supporting software were written using Raspbian (Buster). If you're using a different version of Raspbian or another platform, you may run into compatibility problems.

Before proceeding, you'll need to:

Install software

Install system dependencies:

$ sudo apt-get update && sudo apt-get install -y python3-dev libjpeg-dev libatlas-base-dev raspi-gpio libhdf5-dev python3-smbus

Create a new project directory:

$ mkdir rpi-deep-pantilt && cd rpi-deep-pantilt

Create a new virtual environment:
```
$ python3 -m venv .venv
```

Activate the virtual environment and upgrade pip:

$ source .venv/bin/activate && python3 -m pip install --upgrade pip

Install TensorFlow 2.0 from a community-built wheel:

$ python3 -m pip install "https://github.com/leigh-johnson/Tensorflow-bin/blob/master/tensorflow-2.0.0-cp37-cp37m-linux_armv7l.whl?raw=true"

Install the rpi-deep-pantilt Python package:
```
$ python3 -m pip install rpi-deep-pantilt
```
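
The community-built wheel above carries the tags cp37-cp37m-linux_armv7l, which means it targets CPython 3.7 on 32-bit ARM Linux. If the install fails with a "not a supported wheel" error, a rough check like the following sketch may help narrow down the cause. The helper names are hypothetical, and the comparison is deliberately simplified (it assumes CPython and ignores the ABI tag), so treat it as a diagnostic aid rather than a replacement for pip's own compatibility logic.

```python
import platform
import sys


def parse_wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename.

    Wheel filenames end with -{python tag}-{abi tag}-{platform tag}.whl.
    """
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    return tuple(stem.split("-")[-3:])


def matches_this_interpreter(filename):
    """Rough check: does the wheel target this Python version and CPU?

    Assumption: the interpreter is CPython, so its tag looks like cp37.
    """
    python_tag, _abi_tag, platform_tag = parse_wheel_tags(filename)
    expected = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    return python_tag == expected and platform_tag.endswith(platform.machine())


if __name__ == "__main__":
    wheel = "tensorflow-2.0.0-cp37-cp37m-linux_armv7l.whl"
    print(parse_wheel_tags(wheel))
    print("compatible with this interpreter:", matches_this_interpreter(wheel))
```

On Raspbian Buster with the stock Python 3.7, I'd expect the check to pass; on a desktop machine or a newer Python, it should report a mismatch.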

Assemble Pan-Tilt HAT hardware

If you purchased a pre-assembled Pan-Tilt HAT kit, you can skip to the next section. Otherwise, follow the steps in Assembling Pan-Tilt HAT before proceeding.

Connect the Pi Camera

Turn off the Raspberry Pi.
Locate the camera port between the USB ports and the HDMI ports.
Unlock the black plastic clip by (gently) pulling upward.
Insert the camera module's ribbon cable (with metal connectors facing away from the Ethernet/USB ports on a Raspberry Pi 4).
Lock the black plastic clip.

Enable the Pi Camera

Turn the Raspberry Pi on.
Run sudo raspi-config and select Interfacing Options from the Raspberry Pi Software Configuration Tool's main menu. Press Enter.
Select the Enable Camera menu option and press Enter.
In the next menu, use the Right arrow key to highlight Enable and press Enter.

Test the Pan-Tilt HAT

Next, test the installation and setup of your Pan-Tilt HAT module.

SSH into your Raspberry Pi.
Activate your virtual environment:
```
$ source .venv/bin/activate
```
Run:
```
$ rpi-deep-pantilt test pantilt
```
Exit the test with Ctrl+C.

If you installed the HAT correctly, you should see both servos moving in a smooth sinusoidal motion while the test is running.
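
To give a feel for what "smooth sinusoidal motion" means in terms of servo angles, here is a minimal sketch that only computes the angles and never touches hardware. The amplitude of 90 degrees and the four-second period are assumptions chosen for illustration, and this is not necessarily how the rpi-deep-pantilt test command is implemented.

```python
import math


def sweep_angle(t, amplitude=90.0, period=4.0):
    """Servo angle in degrees at time t (seconds) for a sinusoidal sweep.

    amplitude and period are illustrative assumptions, not values taken
    from rpi-deep-pantilt.
    """
    return amplitude * math.sin(2.0 * math.pi * t / period)


if __name__ == "__main__":
    # Sample one full period every half second. On real hardware, each
    # value could be sent to the pan and tilt servos instead of printed.
    for step in range(9):
        t = step * 0.5
        print("t={:.1f}s angle={:+.1f}".format(t, sweep_angle(t)))
```

Because the angle follows a sine wave, the servo slows down as it approaches each end of its travel and speeds up through the middle, which is why the motion looks smooth rather than jerky.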

Test the Pi Camera

Next, verify that the Pi Camera is installed correctly by starting the camera's preview overlay. The overlay will render on the Pi's primary display (HDMI).

Plug your Raspberry Pi into an HDMI screen.
SSH into your Raspberry Pi.
Activate your virtual environment:
```
$ source .venv/bin/activate
```
Run:
```
$ rpi-deep-pantilt test camera
```
Exit the test with Ctrl+C.

If you installed the Pi Camera correctly, you should see footage from the camera rendered on your HDMI or composite display.

Test object detection

Next, verify you can run an object-detection model (MobileNetV3-SSD) on your Raspberry Pi.

SSH into your Raspberry Pi.
Activate your virtual environment:
```
$ source .venv/bin/activate
```
Run:
```
$ rpi-deep-pantilt detect
```

Your Raspberry Pi should detect objects, attempt to classify them, and draw bounding boxes around them. Note that the default MobileNetV3-SSD model can detect and track only the following objects:

$ rpi-deep-pantilt list-labels
['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']

Track objects at ~8FPS

This is the moment you've been waiting for! Take the following steps to track an object at roughly eight frames per second (FPS) using the Pan-Tilt HAT.

SSH into your Raspberry Pi.
Activate your virtual environment:
```
$ source .venv/bin/activate
```
Run:
```
$ rpi-deep-pantilt track
```

By default, this will track objects with the label person. You can track a different type of object using the --label parameter.

For example, to track a banana, you would run:

$ rpi-deep-pantilt track --label=banana
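
Conceptually, tracking by label means discarding every detection except those with the requested class, then following the most confident one. The sketch below illustrates that idea in plain Python. The Detection type, the pick_target helper, and the 0.5 score threshold are hypothetical names and values for illustration; they are not the rpi-deep-pantilt API.

```python
from typing import NamedTuple, Tuple


class Detection(NamedTuple):
    """One detected object (hypothetical structure for illustration)."""
    label: str
    score: float
    box: Tuple[float, float, float, float]  # (ymin, xmin, ymax, xmax), 0..1


def pick_target(detections, label="person", min_score=0.5):
    """Return the highest-scoring detection with the given label, or None."""
    candidates = [
        d for d in detections
        if d.label == label and d.score >= min_score
    ]
    return max(candidates, key=lambda d: d.score, default=None)


if __name__ == "__main__":
    frame = [
        Detection("person", 0.91, (0.1, 0.2, 0.9, 0.6)),
        Detection("banana", 0.62, (0.5, 0.5, 0.7, 0.8)),
        Detection("banana", 0.84, (0.4, 0.1, 0.6, 0.3)),
    ]
    print(pick_target(frame, label="banana"))
```

Returning None when nothing matches lets the caller decide what to do with an empty frame, for example holding the servos in place until the target reappears.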

On a Raspberry Pi 4 (4GB), I benchmarked my model at roughly 8FPS.

INFO:root:FPS: 8.100870481091935
INFO:root:FPS: 8.130448201926173
INFO:root:FPS: 7.6518234817241355
INFO:root:FPS: 7.657477766009717
INFO:root:FPS: 7.861758172395542
INFO:root:FPS: 7.8549541944597
INFO:root:FPS: 7.907857699044301
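
If you'd like to summarize a run instead of eyeballing the log, a short script along these lines can average the FPS lines and convert the result to a per-frame time. This is only a sketch: the function names are my own, and it assumes each log line contains "FPS:" followed by a number, as in the output above.

```python
import re
import statistics

# Matches the number after "FPS:" in lines like "INFO:root:FPS: 8.10".
FPS_PATTERN = re.compile(r"FPS:\s*([0-9]+(?:\.[0-9]+)?)")

SAMPLE_LOG = """\
INFO:root:FPS: 8.100870481091935
INFO:root:FPS: 8.130448201926173
INFO:root:FPS: 7.6518234817241355
INFO:root:FPS: 7.657477766009717
INFO:root:FPS: 7.861758172395542
INFO:root:FPS: 7.8549541944597
INFO:root:FPS: 7.907857699044301
"""


def parse_fps(log_text):
    """Extract every FPS reading from the log text as a float."""
    values = []
    for line in log_text.splitlines():
        match = FPS_PATTERN.search(line)
        if match:
            values.append(float(match.group(1)))
    return values


def summarize(values):
    """Return the mean, min, and max FPS plus frame time at the mean rate."""
    mean_fps = statistics.mean(values)
    return {
        "mean_fps": mean_fps,
        "min_fps": min(values),
        "max_fps": max(values),
        "frame_ms_at_mean_fps": 1000.0 / mean_fps,
    }


if __name__ == "__main__":
    for key, value in summarize(parse_fps(SAMPLE_LOG)).items():
        print("{}: {:.2f}".format(key, value))
```

The per-frame time is a handy way to think about the budget: at roughly 8FPS, each frame takes a little over an eighth of a second to capture, run through the model, and render.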

Track objects in real-time with Edge TPU

You can accelerate model inference speed with Coral's USB Accelerator. The USB Accelerator contains an Edge TPU, which is an application-specific integrated circuit (ASIC) specialized for TensorFlow Lite operations. For more info, check out Getting started with the USB Accelerator.

SSH into your Raspberry Pi.

Install the Edge TPU runtime:

$ echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list

$ curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

$ sudo apt-get update && sudo apt-get install libedgetpu1-std

Plug in the Edge TPU (preferably into a USB 3.0 port). If your Edge TPU was already plugged in, remove and re-plug it so the udev device manager can detect it.
Try the detect command with the --edge-tpu option. You should be able to detect objects in real-time!
```
$ rpi-deep-pantilt detect --edge-tpu --loglevel=INFO
```
Note that --loglevel=INFO will show you the FPS at which objects are detected and bounding boxes are rendered to the Raspberry Pi Camera's overlay.

You should see roughly 24FPS, which is the rate at which frames are sampled from the Pi Camera into a frame buffer:
```
INFO:root:FPS: 24.716493958392558
INFO:root:FPS: 24.836166606505206
INFO:root:FPS: 23.031063233367547
INFO:root:FPS: 25.467177106703623
INFO:root:FPS: 27.480438524486594
INFO:root:FPS: 25.41399952505432
```
Try the track command with the --edge-tpu option:
```
$ rpi-deep-pantilt track --edge-tpu
```

Wrapping up

Congratulations! You're now the proud owner of a DIY object-tracking system, which uses a single-shot detector (a type of convolutional neural network) to classify and localize objects.

PID controller

The pan/tilt tracking system uses a proportional–integral–derivative (PID) controller to track the centroid of a bounding box smoothly.
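
For readers who haven't met a PID controller before, the following is a minimal textbook-style sketch, not the controller shipped in rpi-deep-pantilt. The error is the offset between the bounding box's centroid and the center of the frame, and the controller output is a correction that would be applied to the servo angle. The gains, the 320x320 frame size, and the helper names are all assumptions for illustration.

```python
class PID:
    """Textbook PID controller (illustrative; gains are assumptions)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        """Return the correction for the current error after dt seconds."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no previous sample yet
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


def centroid_error(box, frame_w, frame_h):
    """Offset of the box centroid from the frame center, in pixels.

    box is (xmin, ymin, xmax, ymax) in pixels. A positive x error means
    the object is right of center; a positive y error means below center.
    """
    cx = (box[0] + box[2]) / 2.0
    cy = (box[1] + box[3]) / 2.0
    return cx - frame_w / 2.0, cy - frame_h / 2.0


if __name__ == "__main__":
    pan_pid = PID(kp=0.05, ki=0.01, kd=0.0)
    # Pretend the object sits still, right of center, for three frames.
    for _ in range(3):
        err_x, _err_y = centroid_error((200, 100, 280, 220), 320, 320)
        print("pan correction:", pan_pid.update(err_x, dt=0.125))
```

The proportional term reacts to how far off-center the object is now, the integral term removes any steady offset that lingers, and the derivative term damps overshoot. Running one controller per axis (pan and tilt) keeps the two motions independent.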

TensorFlow model zoo

The models in this tutorial are derived from ssd_mobilenet_v3_small_coco and ssd_mobilenet_edgetpu_coco in the TensorFlow detection model zoo.

My models are available for download via GitHub release notes in leigh-johnson/rpi-deep-pantilt.

I added the custom TFLite_Detection_PostProcess operation, which implements a variation of non-maximum suppression (NMS) on model output. NMS is a technique that filters many bounding box proposals using set operations.
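
To make the idea concrete, here is a sketch of classic greedy NMS in plain Python. It is not the TFLite_Detection_PostProcess operation itself, which implements its own variation; the function names and the 0.5 overlap threshold are assumptions for illustration. The "set operations" are the intersection and union of box areas, combined into an intersection-over-union (IoU) score.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the boxes to keep, best score first.

    A box is kept only if it does not overlap an already-kept,
    higher-scoring box by more than iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    # The second box overlaps the first heavily, so it is suppressed.
    print(nms(boxes, scores))
```

Without this step, a detector typically reports several slightly shifted boxes for the same object, and the tracker would have no single centroid to follow.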

Special thanks and acknowledgments

MobileNetEdgeTPU SSDLite contributors: Yunyang Xiong, Bo Chen, Suyog Gupta, Hanxiao Liu, Gabriel Bender, Mingxing Tan, Berkin Akin, Zhichao Lu, Quoc Le
MobileNetV3 SSDLite contributors: Bo Chen, Zhichao Lu, Vivek Rathod, Jonathan Huang
Adrian Rosebrock for writing Pan/tilt face tracking with a Raspberry Pi and OpenCV, which was the inspiration for this whole project
Jason Zaman for reviewing this article and early release candidates

This article was originally published on the Towards Data Science Medium channel and is reused with permission.


This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.
