How to build deep learning inference through Knative serverless framework

Using deep learning to classify images when they arrive in object storage.

Deep learning is gaining tremendous momentum in certain academic and industry circles. Inference—the capability to retrieve information from real-world data based on pre-trained models—is at the core of deep learning applications.

Deep learning inference can be used to classify images when they arrive in object storage, whether the storage is hosted on a public cloud, such as Amazon S3 or Azure Blob, or on-premises behind an interface such as Ceph RADOS Gateway (RGW). In the conventional workflow for this use case, creating or updating an image triggers an event, and the object storage publishes the event to its subscribers. The subscribers then download the image and send it to an inference service. On AWS, for example, S3 event notifications can be delivered to Simple Queue Service (SQS) to provide this kind of event triggering.

The recently emerged Ceph RGW PubSub project expands opportunities for on-premises solutions. We have integrated RGW PubSub with Knative, the Kubernetes-native serverless framework, and developed rgw-pubsub-api for deep learning inference serving functions.

RGW PubSub

RGW PubSub leverages RGW's multi-site system and provides a plugin mechanism that makes it easy to create different actions on object changes. One example is the metadata indexing plugin that indexes metadata of any object in the system. The added pubsub sync module provides a mechanism to publish and subscribe to object modifications in the system.

RGW PubSub provides a REST API that can be used to configure and use this feature.

Everything is user based. A user can define a topic. Changes are published to this topic by defining a notification that includes a bucket (owned by the user) and the event type (object created, object deleted). A single topic can have multiple publishers. A subscription must be created to receive these events, and there can be more than one subscription per topic. Events can be pulled from a subscription and then acked, an action that removes the event. A push mechanism has also been implemented; however, a reliable and scalable push is still in development. Events that are not acked will eventually be removed.
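
The relationships between topics, notifications, subscriptions, pull, and ack can be illustrated with a small in-memory model. This is only a sketch in Python: it is not the RGW PubSub REST API or the Go client, and every name in it (`Topic`, `Subscription`, `add_notification`, and the `OBJECT_CREATE` event type string) is hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Event:
    event_id: int
    bucket: str
    key: str
    event_type: str  # illustrative values: "OBJECT_CREATE", "OBJECT_DELETE"


@dataclass
class Subscription:
    name: str
    pending: dict = field(default_factory=dict)  # event_id -> Event, in arrival order

    def pull(self, max_events=10):
        """Return up to max_events pending events without removing them."""
        return list(self.pending.values())[:max_events]

    def ack(self, event_id):
        """Acknowledge an event, which removes it from this subscription."""
        return self.pending.pop(event_id, None) is not None


class Topic:
    def __init__(self, name):
        self.name = name
        self.notifications = set()  # (bucket, event_type) pairs that publish here
        self.subscriptions = {}
        self._ids = count(1)

    def add_notification(self, bucket, event_type):
        self.notifications.add((bucket, event_type))

    def subscribe(self, name):
        sub = Subscription(name)
        self.subscriptions[name] = sub
        return sub

    def publish(self, bucket, key, event_type):
        """Fan an object change out to every subscription if a notification matches."""
        if (bucket, event_type) not in self.notifications:
            return None
        event = Event(next(self._ids), bucket, key, event_type)
        for sub in self.subscriptions.values():
            sub.pending[event.event_id] = event
        return event


# Usage: one topic, one notification on bucket "buck", one subscriber.
topic = Topic("images")
topic.add_notification("buck", "OBJECT_CREATE")
inference = topic.subscribe("inference")
topic.publish("buck", "cat1.jpg", "OBJECT_CREATE")
for ev in inference.pull():
    inference.ack(ev.event_id)  # acking removes the event
```

In this model each subscription tracks its own unacked events, so acking an event in one subscription leaves it pending in any other subscription on the same topic.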

We implemented a Go client to interact with the RGW PubSub service. The Go client is used by both an RGW PubSub command-line interface (CLI) client and the Knative eventing source.

RGW PubSub integration with Knative

Knative has three core components: eventing, build, and serving. Eventing is the framework that pulls external events from various sources such as GitHub, GCP PubSub, and Kubernetes events. Once the events are pulled, they are dispatched to event sinks. A commonly used sink is a Knative Service, an internal endpoint running HTTP(S) services hosted by a serving function.

We modeled RGW PubSub's Knative eventing source after Knative's ContainerSource, which runs container images that generate events and dispatch them to an event sink.

The PubSub eventing source uses the RGW PubSub API to pull events periodically, post the events to an event sink, and delete them. On the other end of the sink, we implemented two inference serving functions: Google Vision and ResNet.
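
The pull, post, and delete cycle can be sketched as a simple polling loop. The Python below is an illustration rather than the actual eventing source: the `pull`, `post`, and `ack` callables and their signatures are hypothetical stand-ins for the RGW PubSub client and the HTTP post to the sink. Acking only after a successful post is an assumption made here so that failed deliveries are retried.

```python
import time


def run_source(pull, post, ack, interval_s=1.0, max_cycles=None, sleep=time.sleep):
    """Pull events, post each to the sink, and ack the ones the sink accepted.

    pull()        -> iterable of (event_id, payload) pairs (hypothetical shape)
    post(payload) -> True if the sink accepted the event
    ack(event_id) -> removes the event from the subscription
    Returns the number of events delivered and acked.
    """
    delivered = 0
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        for event_id, payload in pull():
            if post(payload):
                ack(event_id)  # delete only after a successful post
                delivered += 1
            # otherwise leave the event unacked so a later cycle retries it
        cycle += 1
        sleep(interval_s)
    return delivered
```

Passing `max_cycles` and a custom `sleep` makes the loop easy to exercise in a test; a long-running source would leave `max_cycles` as `None`.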

Inference serving functions

Google Vision serving function

This serving function runs an HTTP service that accepts events posted by the RGW PubSub eventing source, downloads the image from RGW through the Amazon S3 API, and posts the image to the Google Vision service to get the annotation.
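
The request path of such a serving function can be sketched as a single handler. This Python sketch is not the project's implementation: the JSON field names `bucket` and `key` are assumed for illustration, and `download` and `annotate` are hypothetical callables standing in for the S3 download from RGW and the call to the annotation service.

```python
import json


def handle_event(body, download, annotate):
    """Process one event posted by the eventing source.

    body     -- raw JSON bytes; the "bucket" and "key" field names are assumed
    download -- callable(bucket, key) -> image bytes (for example, an S3 GET)
    annotate -- callable(image bytes) -> list of (label, score) pairs
    Returns a tuple of (HTTP status code, log line).
    """
    try:
        event = json.loads(body)
        bucket, key = event["bucket"], event["key"]
    except (ValueError, KeyError, TypeError):
        return 400, "malformed event"
    image = download(bucket, key)
    labels = annotate(image)
    if not labels:
        return 200, f'Object: "{key}" Bucket: "{bucket}" no labels'
    # Report the highest-scoring label.
    label, score = max(labels, key=lambda pair: pair[1])
    return 200, f"label: {label}, Score: {score:.6f}"
```

Keeping the download and annotation steps as injected callables means the same handler shape could sit in front of a different inference backend.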

To get this function running, start by cloning the repository and editing service-entry.yaml and subscription.yaml to reflect your local RGW settings and your Google Vision API Key.

Then run the following:

kubectl apply -f deploy/google-vision-svc/service-entry.yaml
kubectl apply -f deploy/google-vision-svc/subscription.yaml

After the serving function is running, test it by uploading an image of a cat into RGW:

# wget https://r.hswstatic.com/w_907/gif/tesla-cat.jpg
# ./s3 put buck/cat1.jpg --in-file=./tesla-cat.jpg
# ./s3 put buck/cat2.jpg --in-file=./tesla-cat.jpg

Check the serving container's log:

# kubectl logs -lserving.knative.dev/service=rgwpubsub-svc -c user-container
2018/11/29 16:22:49 Ready and listening on port 8080
2018/11/29 16:23:42 [2018-11-29T16:23:41Z] application/json rgwpubsub. Object: "cat1.jpg"  Bucket: "buck"
2018/11/29 16:23:43 label: cat, Score: 0.993347
2018/11/29 16:25:01 [2018-11-29T16:25:01Z] application/json rgwpubsub. Object: "cat2.jpg"  Bucket: "buck"
2018/11/29 16:25:02 label: cat, Score: 0.993347

The cat is identified!

ResNet serving function

This serving function uses ResNet with a pre-trained ImageNet model to classify an image.

First, edit service-entry.yaml and subscription-resnet.yaml to reflect your local RGW settings and TensorFlow Serving endpoint.

Next, run the following:

kubectl apply -f deploy/resnet-grpc/service-entry.yaml
kubectl apply -f deploy/resnet-grpc/subscription-grpc.yaml

Then upload cat and dog images to RGW:

# wget https://r.hswstatic.com/w_907/gif/tesla-cat.jpg
# wget https://upload.wikimedia.org/wikipedia/commons/d/d9/Collage_of_Nine_Dogs.jpg
# ./s3 put buck/dogs.jpg --in-file=./Collage_of_Nine_Dogs.jpg
# ./s3 put buck/telsa-cat.jpg --in-file=./tesla-cat.jpg

Check the serving container's log:

# kubectl logs -lserving.knative.dev/service=rgwpubsub-svc -c user-container
2018/11/29 18:52:23 Ready and listening on port 8080
2018/11/29 18:54:31 [2018-11-29T18:54:31Z] application/json rgwpubsub. Object: "dogs.jpg"  Bucket: "buck"
2018/11/29 18:54:32 classes: [162]
2018/11/29 18:57:54 [2018-11-29T18:57:51Z] application/json rgwpubsub. Object: "telsa-cat.jpg"  Bucket: "buck"
2018/11/29 18:57:54 classes: [286]

Regarding ImageNet classes, class 162 is beagle, and class 286 is cougar, puma, catamount, mountain lion, painter, panther, Felis concolor. The classifier is close enough.
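
Turning the numeric classes in the log into readable labels is a simple lookup. The Python sketch below is illustrative only: the function name and regular expression are hypothetical, and the table holds just the two class indices named in this article, where a real deployment would load the full 1,000-entry ImageNet label list.

```python
import re

# Only the two ImageNet class indices mentioned in this article.
IMAGENET_LABELS = {
    162: "beagle",
    286: "cougar, puma, catamount, mountain lion, painter, panther, Felis concolor",
}

_CLASSES_RE = re.compile(r"classes: \[([\d\s,]*)\]")


def labels_from_log_line(line):
    """Extract class indices from a 'classes: [...]' log line and name them."""
    match = _CLASSES_RE.search(line)
    if match is None:
        return []
    tokens = re.split(r"[\s,]+", match.group(1).strip())
    indices = [int(tok) for tok in tokens if tok]
    return [(i, IMAGENET_LABELS.get(i, "unknown")) for i in indices]
```

Calling `labels_from_log_line` on the `dogs.jpg` log line above yields the pair `(162, "beagle")`; lines without a `classes:` field yield an empty list.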

Looking ahead

We look forward to more community involvement around Ceph RGW PubSub and Knative. Contributions, testing, and bug reports are welcome. 


Huamin Chen & Yehuda Sadeh-Weinraub will present How to Build Deep Learning Inference Through Knative Serverless Framework at KubeCon + CloudNativeCon North America, December 10-13 in Seattle.

About the authors

Yehuda Sadeh-Weinraub - Yehuda has been involved in Ceph since 2008 and has worked on various related projects and subsystems. He is the original developer of the RADOS Gateway (RGW), which he currently co-leads as part of his work at Red Hat. He has also worked on multiple other Ceph projects, such as the Linux kernel Ceph filesystem module and RBD. Other notable Ceph modules that he initiated along with Sage Weil include the Linux kernel RBD module, the RADOS object classes, and cephx authentication.

Huamin Chen - Dr. Huamin Chen works at Red Hat's CTO office and is a passionate inventor and developer of storage and cloud technologies. He is one of the founding members of Kubernetes SIG-Storage and created the storage volume plugins for Kubernetes and OpenShift. He is also a member of Rook and Ceph.