Are you just getting started with machine/deep learning, TensorFlow, or Raspberry Pi?
I created rpi-deep-pantilt as an interactive demo of object detection in the wild, and in this article, I'll show you how to reproduce the video below, which depicts a camera panning and tilting to track my movement across a room.
[Embedded tweet from @grepLeigh, November 28, 2019: "I'm just a girl, standing in front of a tiny computer, reminding you most computing problems can be solved by sheer force of will. MobileNetv3 + SSD TensorFlow model I converted to TFLite, Raspberry Pi + Pimoroni pan-tilt HAT, PID controller. Write-up soon!" https://t.co/v63KSJtJHO pic.twitter.com/dmyAlWCnWk]
This article will cover:
- Build materials and hardware assembly instructions.
- Deploying a TensorFlow Lite object-detection model (MobileNetV3-SSD) to a Raspberry Pi.
- Sending tracking instructions to pan/tilt servo motors using a proportional–integral–derivative (PID) controller.
- Accelerating inferences of any TensorFlow Lite model with Coral's USB Edge TPU Accelerator and Edge TPU Compiler.
Terms and references
- Raspberry Pi: A small, affordable computer popular with educators, hardware hobbyists, and robot enthusiasts.
- Raspbian: The Raspberry Pi Foundation's official operating system for the Pi. Raspbian is derived from Debian Linux.
- TensorFlow: An open source framework for dataflow programming, used for machine learning and deep neural networks.
- TensorFlow Lite: An open source framework for deploying TensorFlow models on mobile and embedded devices.
- Convolutional neural network (CNN): A type of neural network architecture that is well-suited to image classification and object-detection tasks.
- Single-shot detector (SSD): A type of CNN architecture specialized for real-time object detection, classification, and bounding-box localization.
- MobileNetV3: A state-of-the-art computer vision model optimized for performance on modest mobile phone processors.
- MobileNetV3-SSD: An SSD model based on the MobileNetV3 architecture. This tutorial uses MobileNetV3-SSD models available through TensorFlow's object-detection model zoo.
- Edge TPU: A tensor processing unit (TPU) is an integrated circuit that accelerates computations performed by TensorFlow. The Edge TPU is a small-footprint TPU developed for mobile and embedded devices "at the edge."
Cloud TPUs (left and center) accelerate TensorFlow model training and inference. Edge TPUs (right) accelerate inferences in mobile devices.
Build list
Essential
- Raspberry Pi 4 (4GB recommended)
- Raspberry Pi Camera V2
- Pimoroni Pan-Tilt HAT Kit
- MicroSD card (16GB or more)
- Micro-HDMI cable
Optional
- 12" CSI/DSI ribbon for Raspberry Pi Camera: The Pi Camera's stock cable is too short for the Pan-Tilt HAT's full range of motion.
- RGB NeoPixel Stick: This component adds a consistent light source to your project.
- Coral Edge TPU USB Accelerator: Accelerates inference (prediction) speed on the Raspberry Pi. You don't need this to reproduce the demo.
Looking for a project with fewer moving pieces? Check out Portable Computer Vision: TensorFlow 2.0 on a Raspberry Pi to create a hand-held image classifier.
Set up the Raspberry Pi
There are two ways you can install Raspbian to your MicroSD card:
- NOOBS ("New Out Of Box Software") is a GUI operating system installation manager. If this is your first Raspberry Pi project, I'd recommend starting here.
- Write the Raspbian image to an SD card.
This tutorial and its supporting software were written using Raspbian (Buster). If you're using a different version of Raspbian or another platform, you'll probably run into some issues.
Before proceeding, make sure Raspbian is installed, your Raspberry Pi is connected to your network, and you can SSH into it.
Install software
- Install system dependencies:
$ sudo apt-get update && sudo apt-get install -y python3-dev libjpeg-dev libatlas-base-dev raspi-gpio libhdf5-dev python3-smbus
- Create a new project directory:
$ mkdir rpi-deep-pantilt && cd rpi-deep-pantilt
- Create a new virtual environment:
$ python3 -m venv .venv
- Activate the virtual environment:
$ source .venv/bin/activate && python3 -m pip install --upgrade pip
- Install TensorFlow 2.0 from a community-built wheel:
$ pip install https://github.com/leigh-johnson/Tensorflow-bin/blob/master/tensorflow-2.0.0-cp37-cp37m-linux_armv7l.whl?raw=true
- Install the rpi-deep-pantilt Python package:
$ python3 -m pip install rpi-deep-pantilt
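Before moving on, it's worth sanity-checking the TensorFlow installation; the wheel above is built for Python 3.7 on 32-bit ARM (armv7l), and an import error here usually means a mismatched Raspbian or Python version. This quick check isn't part of the original steps, just a suggestion:
$ python3 -c "import tensorflow as tf; print(tf.__version__)"
2.0.0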
Assemble Pan-Tilt HAT hardware
If you purchased a pre-assembled Pan-Tilt HAT kit, you can skip to the next section. Otherwise, follow the steps in Assembling Pan-Tilt HAT before proceeding.
Connect the Pi Camera
- Turn off the Raspberry Pi.
- Locate the camera module port between the USB and HDMI ports.
- Unlock the black plastic clip by (gently) pulling upward.
- Insert the camera module's ribbon cable (with metal connectors facing away from the Ethernet/USB ports on a Raspberry Pi 4).
- Lock the black plastic clip.
Enable the Pi Camera
- Turn the Raspberry Pi on.
- Run sudo raspi-config and select Interfacing Options from the Raspberry Pi Software Configuration Tool's main menu. Press Enter.
- Select the Enable Camera menu option and press Enter.
- In the next menu, use the Right arrow key to highlight Enable and press Enter.
Test the Pan-Tilt HAT
Next, test the installation and setup of your Pan-Tilt HAT module.
- SSH into your Raspberry Pi.
- Activate your virtual environment:
source .venv/bin/activate
- Run:
rpi-deep-pantilt test pantilt
- Exit the test with Ctrl+C.
If you installed the HAT correctly, you should see both servos moving in a smooth sinusoidal motion while the test is running.
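If you'd like to drive the servos by hand, Pimoroni's pantilthat Python library exposes simple pan() and tilt() calls that accept angles between -90 and 90 degrees. The sketch below is an illustration of a sinusoidal sweep similar to what the test performs, not the package's actual test code (install pantilthat with pip if it isn't already present):

import math
import time

import pantilthat  # Pimoroni's Pan-Tilt HAT driver

# Sweep both servos back and forth along a sine wave; stop with Ctrl+C
start = time.time()
while True:
    angle = 90 * math.sin(time.time() - start)  # -90 to 90 degrees
    pantilthat.pan(angle)
    pantilthat.tilt(angle)
    time.sleep(0.01)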
Test the Pi Camera
Next, verify that the Pi Camera is installed correctly by starting the camera's preview overlay. The overlay will render on the Pi's primary display (HDMI).
- Plug your Raspberry Pi into an HDMI screen.
- SSH into your Raspberry Pi.
- Activate your virtual environment:
$ source .venv/bin/activate
- Run:
$ rpi-deep-pantilt test camera
- Exit the test with Ctrl+C.
If you installed the Pi Camera correctly, you should see footage from the camera rendered on your HDMI or composite display.
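The camera test relies on the Pi Camera's preview overlay, which you can also exercise directly with the standard picamera library. Here's a minimal, hypothetical equivalent (not the code rpi-deep-pantilt runs, and the resolution/framerate values are just examples) for confirming the camera on its own:

import time
from picamera import PiCamera

# Render the camera's preview overlay on the Pi's primary display for 10 seconds
with PiCamera(resolution=(320, 240), framerate=24) as camera:
    camera.start_preview()
    time.sleep(10)
    camera.stop_preview()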
Test object detection
Next, verify you can run an object-detection model (MobileNetV3-SSD) on your Raspberry Pi.
- SSH into your Raspberry Pi.
- Activate your virtual environment:
$ source .venv/bin/activate
- Run:
$ rpi-deep-pantilt detect
Your Raspberry Pi should detect objects, attempt to classify them, and draw bounding boxes around them. Note: Only the following objects can be detected and tracked using the default MobileNetV3-SSD model. You can print the full label list with:
$ rpi-deep-pantilt list-labels
['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']
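If you're curious what the detect command is doing under the hood, the core loop boils down to running camera frames through a TensorFlow Lite interpreter. Here's a stripped-down sketch using a hypothetical detect.tflite model file; the output tensor order shown is the usual TFLite_Detection_PostProcess layout (boxes, classes, scores), but verify it against whichever model you load:

import numpy as np
import tensorflow as tf

# Load the TFLite model and allocate its input/output tensors
interpreter = tf.lite.Interpreter(model_path="detect.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# SSD models expect a single frame resized to the model's input shape
_, height, width, _ = input_details[0]["shape"]
frame = np.zeros((1, height, width, 3), dtype=input_details[0]["dtype"])

interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()

boxes = interpreter.get_tensor(output_details[0]["index"])    # normalized [ymin, xmin, ymax, xmax]
classes = interpreter.get_tensor(output_details[1]["index"])  # label indices
scores = interpreter.get_tensor(output_details[2]["index"])   # confidence per detection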
Track objects at ~8FPS
This is the moment you've been waiting for! Take the following steps to track an object at roughly eight frames per second (FPS) using the Pan-Tilt HAT.
- SSH into your Raspberry Pi.
- Activate your virtual environment:
$ source .venv/bin/activate
- Run:
$ rpi-deep-pantilt track
By default, this will track objects with the label person. You can track a different type of object using the --label parameter.
For example, to track a banana, you would run:
$ rpi-deep-pantilt track --label=banana
On a Raspberry Pi 4 (4GB), I benchmarked my model at roughly 8FPS.
INFO:root:FPS: 8.100870481091935
INFO:root:FPS: 8.130448201926173
INFO:root:FPS: 7.6518234817241355
INFO:root:FPS: 7.657477766009717
INFO:root:FPS: 7.861758172395542
INFO:root:FPS: 7.8549541944597
INFO:root:FPS: 7.907857699044301
Track objects in real-time with Edge TPU
You can accelerate model inference speed with Coral's USB Accelerator. The USB Accelerator contains an Edge TPU, which is an ASIC chip specialized for TensorFlow Lite operations. For more info, check out Getting started with the USB Accelerator.
- SSH into your Raspberry Pi.
- Install the Edge TPU runtime:
$ echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list $ curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add - $ sudo apt-get update && sudo apt-get install libedgetpu1-std
- Plug in the Edge TPU (preferably into a USB 3.0 port). If your Edge TPU was already plugged in, remove and re-plug it so the udev device manager can detect it.
- Try the detect command with the --edge-tpu option. You should be able to detect objects in real-time!
$ rpi-deep-pantilt detect --edge-tpu --loglevel=INFO
Note that --loglevel=INFO will show you the FPS at which objects are detected and bounding boxes are rendered to the Raspberry Pi Camera's overlay.
You should see around 24 FPS, which is the rate at which frames are sampled from the Pi Camera into a frame buffer:
INFO:root:FPS: 24.716493958392558
INFO:root:FPS: 24.836166606505206
INFO:root:FPS: 23.031063233367547
INFO:root:FPS: 25.467177106703623
INFO:root:FPS: 27.480438524486594
INFO:root:FPS: 25.41399952505432
- Try the track command with the --edge-tpu option:
$ rpi-deep-pantilt track --edge-tpu
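Under the hood, Edge TPU acceleration comes down to loading a model compiled for the Edge TPU and handing it the libedgetpu delegate. Here's a minimal sketch, assuming the tflite_runtime Python package is installed and using a hypothetical placeholder filename (Edge TPU models are conventionally suffixed _edgetpu.tflite):

import tflite_runtime.interpreter as tflite

# Delegate supported ops to the Edge TPU via libedgetpu; unsupported ops fall back to the CPU
interpreter = tflite.Interpreter(
    model_path="detect_edgetpu.tflite",
    experimental_delegates=[tflite.load_delegate("libedgetpu.so.1")],
)
interpreter.allocate_tensors()
# From here, set_tensor()/invoke()/get_tensor() work the same as on the CPU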
Wrapping up
Congratulations! You're now the proud owner of a DIY object-tracking system, which uses a single-shot detector (a type of convolutional neural network) to classify and localize objects.
PID controller
The pan/tilt tracking system uses a proportional–integral–derivative (PID) controller to track the centroid of a bounding box smoothly.
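In broad strokes, the controller measures the error between the frame's center and the detected object's centroid, then outputs a correction that nudges the servo toward zero error. The sketch below is a generic textbook PID rather than the exact implementation in rpi-deep-pantilt, and the gains are made-up illustrative values:

import time

class PID:
    """output = kP * error + kI * integral(error) + kD * d(error)/dt"""
    def __init__(self, kP, kI, kD):
        self.kP, self.kI, self.kD = kP, kI, kD
        self.prev_time = time.time()
        self.prev_error = 0.0
        self.integral = 0.0

    def update(self, error):
        now = time.time()
        dt = now - self.prev_time
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt if dt > 0 else 0.0
        self.prev_time, self.prev_error = now, error
        return self.kP * error + self.kI * self.integral + self.kD * derivative

# One controller per axis; error is "frame center minus bounding-box centroid" in pixels
pan_pid = PID(kP=0.05, kI=0.1, kD=0.0)
pan_angle = 0.0
error_x = 160 - 200            # e.g., centroid is 40px right of center in a 320px-wide frame
pan_angle += pan_pid.update(error_x)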
TensorFlow model zoo
The models in this tutorial are derived from ssd_mobilenet_v3_small_coco and ssd_mobilenet_edgetpu_coco in the TensorFlow detection model zoo.
My models are available for download via the GitHub release notes in leigh-johnson/rpi-deep-pantilt.
I added the custom TFLite_Detection_PostProcess operation, which implements a variation of non-maximum suppression (NMS) on model output. NMS is a technique that filters many bounding box proposals using set operations.
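For reference, greedy NMS in plain NumPy looks like this: keep the highest-scoring box, discard any remaining proposal whose intersection-over-union (IoU) with it exceeds a threshold, and repeat. This is the textbook algorithm, not the exact TFLite_Detection_PostProcess implementation:

import numpy as np

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """boxes: (N, 4) array of [ymin, xmin, ymax, xmax]; returns indices of kept boxes."""
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = np.argsort(scores)[::-1]   # highest-confidence proposals first
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(best)
        # Intersection of the kept box with every remaining proposal
        ymin = np.maximum(boxes[best, 0], boxes[rest, 0])
        xmin = np.maximum(boxes[best, 1], boxes[rest, 1])
        ymax = np.minimum(boxes[best, 2], boxes[rest, 2])
        xmax = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.maximum(0.0, ymax - ymin) * np.maximum(0.0, xmax - xmin)
        iou = inter / (areas[best] + areas[rest] - inter)
        # Drop proposals that overlap the kept box too much
        order = rest[iou <= iou_threshold]
    return keep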
Special thanks and acknowledgments
- MobileNetEdgeTPU SSDLite contributors: Yunyang Xiong, Bo Chen, Suyog Gupta, Hanxiao Liu, Gabriel Bender, Mingxing Tan, Berkin Akin, Zhichao Lu, Quoc Le
- MobileNetV3 SSDLite contributors: Bo Chen, Zhichao Lu, Vivek Rathod, Jonathan Huang
- Adrian Rosebrock for writing Pan/tilt face tracking with a Raspberry Pi and OpenCV, which was the inspiration for this whole project
- Jason Zaman for reviewing this article and early release candidates
This article was originally published on the Towards Data Science Medium channel and is reused with permission.