Getting started with a TensorFlow surgery classifier with TensorBoard data viz

Train your own neural network to classify images, then use TensorBoard to visualize what's happening.

The most challenging part of deep learning is labeling, as you'll see in part one of this two-part series, Learn how to classify images with TensorFlow. Proper training is critical to effective future classification, and for training to work, we need lots of accurately labeled data. In part one, I skipped over this challenge by downloading 3,000 prelabeled images. I then showed you how to use this labeled data to train your classifier with TensorFlow. In this part we'll train with a new data set, and I'll introduce the TensorBoard suite of data visualization tools to make it easier to understand, debug, and optimize our TensorFlow code.

Given my work as VP of engineering and compliance at healthcare technology company C-SATS, I was eager to build a classifier for something related to surgery. Suturing seemed like a great place to start. It is immediately useful, and I know how to recognize it. It is useful because, for example, if a machine can see when suturing is occurring, it can automatically identify the step (phase) of a surgical procedure where suturing takes place, e.g. anastomosis. And I can recognize it because the needle and thread of a surgical suture are distinct, even to my layperson's eyes.

My goal was to train a machine to identify suturing in medical videos. 

I have access to billions of frames of non-identifiable surgical video, many of which contain suturing. But I'm back to the labeling problem. Luckily, C-SATS has an army of experienced annotators who are experts at doing exactly this. My source data were video files and annotations in JSON.

The annotations look like this:

[
    {
        "annotations": [
            {
                "endSeconds": 2115.215,
                "label": "suturing",
                "startSeconds": 2319.541
            },
            {
                "endSeconds": 2976.301,
                "label": "suturing",
                "startSeconds": 2528.884
            }
        ],
        "durationSeconds": 2975,
        "videoId": 5
    },
    {
        "annotations": [
        // ...etc...
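
Since everything downstream depends on these segments, it can be worth sanity-checking them before grabbing any frames. The following is a minimal sketch, assuming the file is a list of video objects shaped like the excerpt above; the function name find_suspect_segments and the example data are hypothetical, not part of my pipeline.

```python
def find_suspect_segments(videos):
    """Return (videoId, annotation index, reason) for segments that look wrong."""
    problems = []
    for video in videos:
        for index, annotation in enumerate(video['annotations']):
            start = annotation['startSeconds']
            end = annotation['endSeconds']
            if start < 0:
                problems.append((video['videoId'], index, 'negative start'))
            elif start >= end:
                problems.append((video['videoId'], index, 'start is not before end'))
    return problems


# Hypothetical data for illustration, not taken from the real annotation file.
example_videos = [
    {
        'videoId': 1,
        'durationSeconds': 100,
        'annotations': [
            {'label': 'suturing', 'startSeconds': 10.0, 'endSeconds': 25.5},
            {'label': 'suturing', 'startSeconds': 60.0, 'endSeconds': 40.0},
        ],
    },
]

for problem in find_suspect_segments(example_videos):
    print(problem)
```

A segment whose start is not before its end can never contain a timepoint, so every frame in it would quietly be labeled as not suturing. Catching that early is cheaper than retraining.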

I wrote a Python script that uses the JSON annotations to decide which frames to grab from the .mp4 video files; ffmpeg does the actual grabbing. I decided to grab at most one frame per second, then divided the total number of video seconds by four to get 10k seconds (10k frames). After I figured out which seconds to grab, I ran a quick test to see whether each second fell inside or outside a segment annotated as suturing (isWithinSuturingSegment() in the code below). Here's the script:

# Grab frames from videos with ffmpeg. Use multiple cores.
# Minimum resolution is 1 second--this is a shortcut to get fewer frames.
# (C)2017 Adam Monsen. License: AGPL v3 or later.
import json
import os
import subprocess
from multiprocessing import Pool

frameList = []

def isWithinSuturingSegment(annotations, timepointSeconds):
    # True if the timepoint falls strictly inside any annotated segment
    for annotation in annotations:
        startSeconds = annotation['startSeconds']
        endSeconds = annotation['endSeconds']
        if timepointSeconds > startSeconds and timepointSeconds < endSeconds:
            return True
    return False

with open('available-suturing-segments.json') as f:
    j = json.load(f)
    for video in j:
        videoId = video['videoId']
        videoDuration = video['durationSeconds']
        # generate many ffmpeg frame-grabbing commands
        start = 1
        stop = videoDuration
        step = 4 # Reduce to grab more frames
        for timepointSeconds in range(start, stop, step):
            inputFilename = '/home/adam/Downloads/suturing-videos/{}.mp4'.format(videoId)
            outputFilename = '{}-{}.jpg'.format(video['videoId'], timepointSeconds)
            # sort each frame into a directory named after its label
            if isWithinSuturingSegment(video['annotations'], timepointSeconds):
                outputFilename = 'suturing/{}'.format(outputFilename)
            else:
                outputFilename = 'not-suturing/{}'.format(outputFilename)
            outputFilename = '/home/adam/local/{}'.format(outputFilename)
            commandString = 'ffmpeg -loglevel quiet -ss {} -i {} -frames:v 1 {}'.format(
                timepointSeconds, inputFilename, outputFilename)
            frameList.append({
                'outputFilename': outputFilename,
                'commandString': commandString,
            })

def grabFrame(f):
    # skip frames already on disk so the script can be rerun safely
    if os.path.isfile(f['outputFilename']):
        print('already completed {}'.format(f['outputFilename']))
    else:
        print('processing {}'.format(f['outputFilename']))
        # run the ffmpeg command built above (paths contain no spaces)
        subprocess.check_call(f['commandString'].split())

if __name__ == '__main__':
    p = Pool(4) # for my 4-core laptop
    p.map(grabFrame, frameList)
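
Before spending ten minutes on ffmpeg, it can help to do a dry run that only counts how many frames would land in each class. Here is a minimal sketch of that idea, assuming the same annotation layout and the same one-frame-every-four-seconds sampling; count_frames, is_within_segment, and the example video are hypothetical names and data of my own invention.

```python
from collections import Counter


def is_within_segment(annotations, timepoint_seconds):
    """True if the timepoint falls strictly inside any annotated segment."""
    return any(a['startSeconds'] < timepoint_seconds < a['endSeconds']
               for a in annotations)


def count_frames(videos, step=4):
    """Count frames per class when sampling one frame every `step` seconds."""
    counts = Counter()
    for video in videos:
        for t in range(1, video['durationSeconds'], step):
            if is_within_segment(video['annotations'], t):
                counts['suturing'] += 1
            else:
                counts['not-suturing'] += 1
    return counts


# Hypothetical 20-second video with a single suturing segment.
example_videos = [{
    'videoId': 1,
    'durationSeconds': 20,
    'annotations': [{'label': 'suturing', 'startSeconds': 4.5, 'endSeconds': 12.0}],
}]

counts = count_frames(example_videos)
print(dict(counts))  # sampled seconds are 1, 5, 9, 13, 17
```

In this example, seconds 5 and 9 fall inside the segment and the other three do not. A heavily lopsided count is a hint to adjust the step or add more video before retraining.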

Now we're ready to retrain the model again, exactly as before.

Snipping out 10k frames with this script took me about 10 minutes, and retraining Inception to recognize suturing at 90% accuracy took another hour or so. I did spot checks with new data that wasn't from the training set, and every frame I tried was correctly identified (mean confidence score: 88%, median confidence score: 91%).

Here are my spot checks. (WARNING: Contains links to images of blood and guts.)

Image                Not suturing score  Suturing score
Not-Suturing-01.jpg  0.71053             0.28947
Not-Suturing-02.jpg  0.94890             0.05110
Not-Suturing-03.jpg  0.99825             0.00175
Suturing-01.jpg      0.08392             0.91608
Suturing-02.jpg      0.08851             0.91149
Suturing-03.jpg      0.18495             0.81505
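
The mean and median quoted above can be reproduced from this table. The sketch below assumes that "confidence score" means the score of whichever label scored higher for each image; the helper names are hypothetical.

```python
from statistics import mean, median

# (image, not-suturing score, suturing score), copied from the table above.
spot_checks = [
    ('Not-Suturing-01.jpg', 0.71053, 0.28947),
    ('Not-Suturing-02.jpg', 0.94890, 0.05110),
    ('Not-Suturing-03.jpg', 0.99825, 0.00175),
    ('Suturing-01.jpg', 0.08392, 0.91608),
    ('Suturing-02.jpg', 0.08851, 0.91149),
    ('Suturing-03.jpg', 0.18495, 0.81505),
]


def predicted_label(row):
    """Label with the higher score, spelled like the filename prefix."""
    _, not_suturing, suturing = row
    return 'Suturing' if suturing > not_suturing else 'Not-Suturing'


def winning_score(row):
    """Score of the predicted (higher-scoring) label."""
    _, not_suturing, suturing = row
    return max(not_suturing, suturing)


scores = [winning_score(row) for row in spot_checks]
all_correct = all(row[0].startswith(predicted_label(row)) for row in spot_checks)

print('all correct:', all_correct)
print('mean: {:.0%}, median: {:.0%}'.format(mean(scores), median(scores)))
# prints "all correct: True" and "mean: 88%, median: 91%"
```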

How to use TensorBoard

Visualizing what's happening under the hood and communicating it to others is at least as hard with deep learning as it is with any other kind of software. TensorBoard to the rescue! The retraining script from part one automatically generates the files TensorBoard uses to build graphs representing what happened during retraining.

To set up TensorBoard, run the following inside the container after retraining:

pip install tensorboard
tensorboard --logdir /tmp/retrain_logs

Watch the output and open the printed URL in a browser.

Starting TensorBoard 41 on port 6006
(You can navigate to

You'll see something like this:

I hope this will help; if not, you'll at least have something cool to show. During retraining, I found it helpful to see under the "SCALARS" tab how accuracy increases while cross-entropy decreases as we perform more training steps. This is what we want.
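
To see why those two curves move in opposite directions, here is a minimal sketch of cross-entropy for a single example. It assumes the standard definition for a one-hot label, the negative log of the probability assigned to the correct class; the loss the retraining script logs is averaged over a batch and may differ in detail. The probabilities are illustrative values, not measurements from my run.

```python
import math


def cross_entropy(prob_of_true_label):
    """Cross-entropy for one example with a one-hot label: -ln(p)."""
    return -math.log(prob_of_true_label)


# Probabilities a model might assign to the correct label as training
# progresses (illustrative values only).
for p in (0.5, 0.7, 0.9, 0.99):
    print('p = {:.2f}  cross-entropy = {:.4f}'.format(p, cross_entropy(p)))
```

As the model grows more confident in the correct label, the loss falls toward zero, which matches the downward cross-entropy curve in TensorBoard.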

Learn more

If you'd like to chat about this topic, please drop by the ##tfadam topical channel on Freenode IRC. You can also email me or leave a comment below.

This series would never have happened without great feedback from Eva Monsen, Brian C. Lane, Rob Smith, Alex Simes, VM Brasseur, Bri Hatch, and the editors at

About the author

Adam Monsen is the VP of Engineering at C-SATS, where he leads all hardware and software efforts to assess and improve healthcare professionals. Adam is also a co-founder of SeaGL, the Seattle GNU/Linux Conference.