Why and how to handle exceptions in Python Flask

In my last Python Flask article, I walked you through the building of a simple application to take in a Threat Stack webhook and archive the alert to AWS S3. In this post, I'll dive into Python exception handling and how to do it in a secure manner.

I wrote the code in the last article to be as simple and readable as possible, but what happens if something goes wrong in my application? I didn't include any error or exception handling. If something goes wrong (for example, you hit a bug or receive bad data), you can't do anything about it in the application. Instead of returning a parseable JSON (JavaScript Object Notation) response, the app will just spit back a backtrace embedded in an HTML document. The entity sending the request to your service is then left trying to figure out what may have gone wrong.

What do you need to handle?

Some words of wisdom:

A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.

—Leslie Lamport, computer scientist and 2013 A.M. Turing Award winner.

You can start by replacing "computer" with "service" in the preceding Lamport quotation. Your application talks to Threat Stack and AWS S3. A failure communicating with either of those can cause your own service to fail. A failure might be caused by the service being down, being unresponsive, or returning an unexpected response. Any number of issues can cause a communication failure between systems.

You also need to handle input validation. Your service has two different requests that take input:

Sending alerts to the service requires a JSON document to be sent and parsed.
Searching for alerts can take optional date parameters.

The input to your service might not be what you expect through a simple mistake, such as a typo or a misunderstanding of what's required. Worse, some people will intentionally send bad data to see what happens. Fuzzing is a technique used in application penetration testing where malformed or semi-formed data is sent to a service to uncover bugs.
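As a minimal sketch of what validating one of those inputs could look like, assume the optional date parameters arrive as YYYY-MM-DD strings. The format, the parse_date_param() helper, and the InvalidDateError class are illustrative assumptions, not part of the original service:

```python
from datetime import datetime


class InvalidDateError(ValueError):
    '''Hypothetical error raised when a date parameter cannot be parsed.'''


def parse_date_param(value, fmt='%Y-%m-%d'):
    '''
    Parse an optional date string; return None when no value was sent.

    The YYYY-MM-DD format is an assumption made for this sketch.
    '''
    if value is None or value == '':
        return None

    try:
        return datetime.strptime(value, fmt).date()
    except (TypeError, ValueError):
        # Report only that the input was rejected; don't echo internals.
        raise InvalidDateError('Invalid date; expected YYYY-MM-DD.')
```

Rejecting bad input at the edge of the service with a known error class means a typo and a fuzzing attempt both end in the same controlled failure rather than in an unhandled exception deeper in the code.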

What is the worst that could happen?

Other than being an unreliable service that's regularly breaking? I mentioned before that on an error the application will return a backtrace. Let's see what happens when an unparseable date is sent to your service:

[Image: the backtrace returned when an unparseable date is sent to your service]

You're returning your own code back to the requester. This code is reasonably benign, so take a look at another example. If there were a Threat Stack communication issue (one that might happen completely at random, though hopefully not), this would appear:

[Image: the backtrace returned on a Threat Stack communication issue]

You're leaking the location of the service you're talking to, and if a developer had used poor practices, you might have even leaked your API key to a random person.

Exception catching and handling

Now that you know why handling exceptions in your application is important, I'll turn my focus to how to handle them properly. You want to accomplish the following when you start handling exceptions:

Identify what could go wrong
Return useful information to the client
Don't leak too much information

I'll admit that I did many things dangerously, or even incorrectly, until I wrote this post and finally made corrections. While searching for answers, I found that many other people had similar questions about how to do things correctly. Even if you think this is a trivial topic, why not take a refresher?

Catching exceptions in app.models.threatstack

I'll walk through a part of this module to highlight a few different situations for you to handle. This is your function for getting alert detail from Threat Stack for a given alert ID:

def get_alert_by_id(alert_id):
    '''
    Retrieve an alert from Threat Stack by alert ID.
    '''
    alerts_url = '{}/alerts/{}'.format(THREATSTACK_BASE_URL, alert_id)

    resp = requests.get(
        alerts_url,
        headers={'Authorization': THREATSTACK_API_KEY}
    )

    return resp.json()

The function is straightforward. It constructs a URL, makes a request to Threat Stack, and returns the response's JSON content. So what can go wrong? Of those three statements, two can easily go wrong. When making a request to Threat Stack, a communication error that results in failure can occur. If you do get a response, you expect to parse a JSON document. What if there is no JSON document in the response?

Let's start with a failed request to Threat Stack. Put requests.get() into a try/except block that will catch the exception type requests.exceptions.RequestException:

try:
    resp = requests.get(
        alerts_url,
        headers={'Authorization': THREATSTACK_API_KEY}
    )

except requests.exceptions.RequestException as e:
    pass

If you fail, this lets you perform any additional actions that you feel are necessary. If you were working with a database, you might roll back a transaction. You also might want to log the error for analysis later. (You would probably do that if you had already written the logging component for this application.) Notice that you're specifying the exception type to catch. Do not blanket catch all exceptions. You may be tempted to do this to save time, but it will potentially make your life harder down the road as you find yourself unable to understand why your application is failing. Take the time now to understand why your application might fail and for what reasons.

What do you want to do if the app fails to communicate with Threat Stack? You're going to raise a new exception. This is called catch and reraise. This technique makes organizing exception handling a little easier. You're going to define a set of exception classes inside the app.models.threatstack module that describe what could go wrong. Doing this will make it easier later, when you add a handler to the application and tell it how to handle exceptions from the app.models.threatstack module.

You'll start by adding two exception classes. The first is the base exception class, which inherits from Python's built-in Exception class. Every subsequent exception class will inherit from the new base exception class. At first this may just seem like extra work, but it will be useful down the road. The next class will be for request failures. You'll even add a Threat Stack API error that you'll use later. You want the class name to be descriptive, so that you'll understand why your application failed just by reading it:

class ThreatStackError(Exception):
    '''Base Threat Stack error.'''

class ThreatStackRequestError(ThreatStackError):
    '''Threat Stack request error.'''

class ThreatStackAPIError(ThreatStackError):
    '''Threat Stack API error.'''
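
To see why the shared base class pays off, here is a small sketch. The fail_with() helper is hypothetical and exists purely for demonstration: a single except ThreatStackError clause catches every error class that derives from it.

```python
class ThreatStackError(Exception):
    '''Base Threat Stack error.'''


class ThreatStackRequestError(ThreatStackError):
    '''Threat Stack request error.'''


class ThreatStackAPIError(ThreatStackError):
    '''Threat Stack API error.'''


def fail_with(error_class):
    '''Hypothetical helper: raise the given error, report what was caught.'''
    try:
        raise error_class('something went wrong')
    except ThreatStackError as e:
        # Catching the base class also catches every subclass.
        return e.__class__.__name__


caught = [fail_with(ThreatStackRequestError), fail_with(ThreatStackAPIError)]
```

This is the property an application-wide handler can rely on: it only needs to know about the base class, and any new subclass added to the module is covered automatically.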

With the Exception classes in place, you can catch and reraise an exception:

try:
    resp = requests.get(
      alerts_url,
      headers={'Authorization': THREATSTACK_API_KEY}
    )

except requests.exceptions.RequestException as e:
    exc_info = sys.exc_info()
    raise ThreatStackRequestError, ThreatStackRequestError(e), exc_info[2]

What's going on after you catch the exception? Why didn't you just do this?

except requests.exceptions.RequestException as e:
   raise ThreatStackRequestError(e.args)

This mistake is very common when people catch and reraise exceptions. If you did the above, you would lose the application backtrace. Inspecting the backtrace would show that you entered get_alert_by_id() and then you raised an error. You wouldn't see the further context of why requests.get() failed. The previous example is the correct way to catch and reraise errors in Python 2. Your code will throw an exception named for a class that you know, and it will give you the code trace that leads to the exception so you can better debug it.
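
The three-argument raise is Python 2 syntax and is a syntax error in Python 3. In Python 3, raise ... from chains the exceptions and keeps the original one, with its traceback, available on __cause__. A minimal sketch follows; the built-in ConnectionError and the hypothetical fetch() function are stand-ins for the requests call so the example runs without a network:

```python
class ThreatStackError(Exception):
    '''Base Threat Stack error.'''


class ThreatStackRequestError(ThreatStackError):
    '''Threat Stack request error.'''


def fetch():
    '''Hypothetical stand-in for requests.get() that always fails.'''
    raise ConnectionError('connection refused')


def get_alert():
    try:
        return fetch()
    except ConnectionError as e:
        # "from e" records the original exception (and its traceback)
        # as __cause__, so the full chain is visible when debugging.
        raise ThreatStackRequestError(*e.args) from e


try:
    get_alert()
except ThreatStackRequestError as err:
    original = err.__cause__
```

With the real library, the except clause would name requests.exceptions.RequestException instead of ConnectionError; the chaining works the same way.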

You have made a request, communicated with Threat Stack correctly, and are ready to return the response at the end of this function:

      return resp.json()

What can go wrong here? For one thing, the response may not have been a JSON body, which would cause you to throw an exception while attempting to parse it. The API is always supposed to return JSON, even on an error, but it is possible that something might still go unexpectedly wrong. Maybe an application issue spews a backtrace on error just as your application does right now. Maybe a load balancer has an issue and returns a 503 with a "Service Unavailable" page. API failures can also occur. You might be sent back a perfectly parseable JSON response that only tells you your request failed for some reason, for example, when you try to retrieve an alert that does not exist. Simply put, you need to make sure that your request returned a successful response. If you didn't get a successful response, you raise an error. You might have received a communication error or an API error, so depending on what you received, you'll raise either ThreatStackRequestError or ThreatStackAPIError:

    if not resp.ok:
        # Default to '' so a missing Content-Type header doesn't raise TypeError.
        if 'application/json' in resp.headers.get('Content-Type', ''):
            raise ThreatStackAPIError(resp.reason,
                                      resp.status_code,
                                      resp.json()
                                      )
        else:
            raise ThreatStackRequestError(resp.reason, resp.status_code)

    return resp.json()

If the request was successful, resp.ok will be True. If it is not, then you'll try to determine what sort of failure occurred: communication or API? You'll use a very simple approach to figuring out the difference. If the response header indicates JSON, assume you were able to talk to the API and the API sent you an error. Otherwise assume that something else along the way failed and you never made it to the Threat Stack API, and that it is a communication error.
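
One gap remains: the final resp.json() call can still raise ValueError if a successful response carries a body that isn't JSON. Here is a hedged sketch of one way to cover that case. The parse_response() helper and the FakeResponse class (a stand-in exposing only the requests.Response attributes used here) are assumptions for illustration, not part of the original module:

```python
import json


class ThreatStackError(Exception):
    '''Base Threat Stack error.'''


class ThreatStackRequestError(ThreatStackError):
    '''Threat Stack request error.'''


class ThreatStackAPIError(ThreatStackError):
    '''Threat Stack API error.'''


class FakeResponse(object):
    '''Stand-in with the few requests.Response attributes used below.'''

    def __init__(self, status_code, reason, body, content_type=None):
        self.status_code = status_code
        self.reason = reason
        self.text = body
        self.headers = {}
        if content_type is not None:
            self.headers['Content-Type'] = content_type
        self.ok = status_code < 400

    def json(self):
        return json.loads(self.text)


def parse_response(resp):
    '''Return the JSON body, or raise a Threat Stack error.'''
    if not resp.ok:
        # Default to '' so a missing header doesn't raise TypeError.
        if 'application/json' in resp.headers.get('Content-Type', ''):
            raise ThreatStackAPIError(
                resp.reason, resp.status_code, resp.json()
            )
        raise ThreatStackRequestError(resp.reason, resp.status_code)

    try:
        return resp.json()
    except ValueError:
        # A "successful" response whose body is not JSON.
        raise ThreatStackRequestError(
            'Invalid JSON in response', resp.status_code
        )
```

Treating an unparseable success response as a request error is a judgment call; the point is that the caller only ever sees one of the module's own exception classes.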

Handling exceptions

Thus far you've been catching exceptions only to reraise a new exception. It might feel that you're not that much further from where you started. You're just raising exceptions and returning a backtrace to the client, but with your own class name.

[Image: a backtrace returned to the client, but with your own class name]

You're still leaking code, potentially leaking secrets, and providing someone with greater intelligence about your environment than you really want to. Now you need to start handling these exceptions.

Flask's documentation provides a good overview of handling exceptions. You're just going to tweak it slightly due to the simplicity of our application. Start by associating HTTP status codes with your error classes. Let's revisit your Threat Stack error classes in app.models.threatstack:

app.models.threatstack

class ThreatStackError(Exception):
    '''Base Threat Stack error.'''

class ThreatStackRequestError(ThreatStackError):
    '''Threat Stack request error.'''

class ThreatStackAPIError(ThreatStackError):
    '''Threat Stack API error.'''

You raise these exceptions when your service attempts to talk to Threat Stack and something unexpected happens. These can arguably be considered 500 level server errors. (Note: You can make a case that an invalid alert ID passed to get_alert_by_id(), which raises a ThreatStackAPIError exception, should actually be a 400 Bad Request, but I'm not that concerned. My own preference is to simply consider model level exceptions as 500 level and view level exceptions as 400 level.) Recall when I suggested creating a base ThreatStackError class? Here's where you'll first use it:

app.models.threatstack

class ThreatStackError(Exception):
    '''Base Threat Stack error.'''
    status_code = 500

class ThreatStackRequestError(ThreatStackError):
    '''Threat Stack request error.'''

class ThreatStackAPIError(ThreatStackError):
    '''Threat Stack API error.'''

Repeat this process for adding status_codes in app.models.s3 and app.views.s3, too.
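
For the view module, the same pattern might look like the sketch below. The class names S3ViewError and S3ViewDateParseError are hypothetical, and the 400 status follows the preference stated earlier that view level exceptions are treated as client errors:

```python
class S3ViewError(Exception):
    '''Hypothetical base error for the S3 view.'''
    status_code = 400


class S3ViewDateParseError(S3ViewError):
    '''Hypothetical error for an unparseable date parameter.'''


# Subclasses inherit status_code from the base class automatically.
error = S3ViewDateParseError('Unable to parse date')
```

Because status_code is a class attribute on the base, you set it once per module and only override it in a subclass that genuinely needs a different code.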

Now that your error classes have an HTTP status code, you'll add a handler for application exceptions. Flask's documentation uses the errorhandler() decorator. You would add the decorator and a function to the app.views.s3 module just as if you were adding another endpoint to your application:

app.views.s3

@s3.route('/status', methods=['GET'])
def is_available():
    # <SNIP>

@s3.errorhandler(Exception)
def handle_error(error):
    # <SNIP>

This is great for larger apps, which perhaps require more organization and different views that require their own error handling, but let's keep your code a little simpler. Instead you'll add a single Flask blueprint for handling errors that will handle all application exceptions:

app.errors

'''Application error handlers.'''
from flask import Blueprint, jsonify

errors = Blueprint('errors', __name__)

@errors.app_errorhandler(Exception)
def handle_error(error):
    message = [str(x) for x in error.args]
    status_code = error.status_code
    success = False
    response = {
        'success': success,
        'error': {
            'type': error.__class__.__name__,
            'message': message
        }
    }

    return jsonify(response), status_code

This is good to start with, but you're going to make an additional tweak. The handler assumes that all Exception objects have a status_code attribute, which is simply not true. You would like to think that you are prepared to catch every possible exception case in your code, but people make mistakes. For that reason, you'll have two error handler functions. One will handle the error classes you know about (there are your base exception classes again), and the other will be for unexpected errors.

Another important thing to notice is that the application blindly returns the message associated with errors you catch. You're still at risk of potentially revealing information about your infrastructure, how your application works, or your secrets. In this particular application's case, you don't have to be as worried because you're aware of the types of exceptions you catch and reraise along with the information those exceptions return. For those exceptions you didn't anticipate, you always return the same error message as a precaution. I will revisit this in a later article when I discuss logging. Because this application currently has no logging, you're relying on the error response to be highly descriptive.

When you're returning API errors, ask yourself who will be using your service. Does the requester need to know as much as you're returning? A developer might appreciate the added context to help them debug their own service. An external third party probably doesn't need to know how your backend failed.

app.errors

'''Application error handlers.'''
from app.models.s3 import S3ClientError
from app.models.threatstack import ThreatStackError
from flask import Blueprint, jsonify

errors = Blueprint('errors', __name__)

@errors.app_errorhandler(S3ClientError)
@errors.app_errorhandler(ThreatStackError)
def handle_error(error):
    message = [str(x) for x in error.args]
    status_code = error.status_code
    success = False
    response = {
        'success': success,
        'error': {
            'type': error.__class__.__name__,
            'message': message
        }
    }

    return jsonify(response), status_code

@errors.app_errorhandler(Exception)
def handle_unexpected_error(error):
    status_code = 500
    success = False
    response = {
        'success': success,
        'error': {
            'type': 'UnexpectedException',
            'message': 'An unexpected error has occurred.'
        }
    }

    return jsonify(response), status_code
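
One way to act on the question of how much to reveal is to make the level of detail a decision in a single place. The sketch below is framework-free so it can run on its own; build_error_payload() and its expose_details flag are hypothetical names, not part of the application:

```python
def build_error_payload(error, expose_details=False):
    '''
    Build the JSON-serializable body for an error response.

    expose_details is a hypothetical switch: True for trusted
    developers, False for external third parties.
    '''
    if expose_details:
        message = [str(x) for x in error.args]
    else:
        # Same generic text for every error, so nothing internal leaks.
        message = ['An error occurred while processing the request.']

    return {
        'success': False,
        'error': {
            'type': error.__class__.__name__,
            'message': message
        }
    }
```

A handler could pass the result straight to jsonify(). Whether the flag comes from configuration or from who the caller is depends on your service.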

Finally, you'll hook this blueprint up to the application in the app module. You add an additional function called _initialize_errorhandlers(), which will import the blueprint and add it to your application:

app

def _initialize_errorhandlers(application):
    '''
    Initialize error handlers
    '''
    from app.errors import errors
    application.register_blueprint(errors)

def create_app():
    '''
    Create an app by initializing components.
    '''
    application = Flask(__name__)

    _initialize_errorhandlers(application)
    _initialize_blueprints(application)

    # Do it!
    return application

Now you have functional error handling when the application throws an exception, so instead of throwing a backtrace and revealing code as well as potentially returning sensitive information, the app returns a JSON doc that describes the error.
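
To check that behavior without deploying anything, you could exercise the handlers with Flask's test client. The sketch below is a condensed, single-file version of the pieces above; the /known and /unknown routes are hypothetical and exist only to trigger each handler:

```python
from flask import Blueprint, Flask, jsonify


class ThreatStackError(Exception):
    '''Base Threat Stack error.'''
    status_code = 500


class ThreatStackRequestError(ThreatStackError):
    '''Threat Stack request error.'''


errors = Blueprint('errors', __name__)


@errors.app_errorhandler(ThreatStackError)
def handle_error(error):
    response = {
        'success': False,
        'error': {
            'type': error.__class__.__name__,
            'message': [str(x) for x in error.args]
        }
    }
    return jsonify(response), error.status_code


@errors.app_errorhandler(Exception)
def handle_unexpected_error(error):
    response = {
        'success': False,
        'error': {
            'type': 'UnexpectedException',
            'message': 'An unexpected error has occurred.'
        }
    }
    return jsonify(response), 500


application = Flask(__name__)
application.register_blueprint(errors)


@application.route('/known')
def known_failure():
    # Hypothetical route that simulates a Threat Stack failure.
    raise ThreatStackRequestError('Service Unavailable', 503)


@application.route('/unknown')
def unknown_failure():
    # Hypothetical route that simulates a bug you didn't anticipate.
    raise KeyError('secret-internal-detail')


client = application.test_client()
known = client.get('/known')
unknown = client.get('/unknown')
```

Both responses are JSON with a 500 status. The known error carries its class name and arguments, while the unexpected one returns only the generic message, so the internal detail in the KeyError never reaches the client.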

Final thoughts

You have made your threatstack-to-s3 service far more resilient to failure, but you can probably see that there is more to do. In an upcoming post, I'll discuss logging.

View the finished product from this post.

This article originally appeared on the Threat Stack blog. Reposted with permission.