Why you should use Python and Rust together

Rust and Python have complementary strengths and weaknesses. Prototype in Python and move performance bottlenecks to Rust.

Opensource.com

Python and Rust are very different languages, but they actually go together rather well. But before discussing how to combine Python with Rust, I want to introduce Rust itself. You've likely heard of the language but may not have heard details about how it works.

What is Rust?

Rust is a low-level language. This means that the things programmers deal with are close to the way computers "really" work.

For example, integer types are defined by bit size and correspond to CPU-supported types. While it is tempting to say that this means a+b in Rust corresponds to one machine instruction, it does not mean quite that!

Rust's compiler toolchain is non-trivial, so one line of source does not map neatly onto one instruction. Still, it is useful as a first approximation to treat statements like that as "kind of" true.
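As a rough illustration of bit-sized integer types, here is a minimal sketch that uses only the standard library. The function names are made up for this example. It shows that each type has a fixed size and that the programmer chooses what happens when a sum does not fit, which is one place where the "one instruction" picture needs care.

```rust
use std::mem::size_of;

/// Sizes in bytes of a few fixed-width integer types.
fn integer_sizes() -> (usize, usize, usize) {
    (size_of::<u8>(), size_of::<i32>(), size_of::<u64>())
}

/// Adds two u8 values, returning None when the sum does not fit in 8 bits.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Adds two u8 values, wrapping around modulo 256 on overflow.
fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

fn main() {
    println!("{:?}", integer_sizes());
    println!("{:?}", add_checked(200, 100));
    println!("{}", add_wrapping(200, 100));
}
```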

Rust is designed for zero-cost abstraction, meaning many of the abstractions available at the language level are compiled away and add no cost at runtime.

For example, objects are allocated on the stack unless heap allocation is explicitly requested. The result is that creating a local object in Rust has no runtime allocation cost (though initialization might have a cost).
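A small sketch of the difference, assuming a hypothetical Point type that is not part of the article's example: a plain value lives on the stack, and heap allocation only happens when it is requested through something like Box.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Builds a Point as an ordinary value; no heap allocation is involved.
fn on_stack() -> Point {
    Point { x: 1, y: 2 }
}

/// Explicitly asks for heap storage by wrapping the value in a Box.
fn on_heap() -> Box<Point> {
    Box::new(Point { x: 1, y: 2 })
}

/// Reads both fields; works the same for stack and heap values.
fn sum(p: &Point) -> i32 {
    p.x + p.y
}

fn main() {
    let local = on_stack();
    let boxed = on_heap();
    println!("{:?} {:?} {}", local, *boxed, sum(&local) + sum(&boxed));
}
```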

Finally, Rust is a memory-safe language. There are other memory-safe languages and other zero-cost abstraction languages. Usually, those are different languages.

Memory safety does not mean it is impossible to have memory violations in Rust. It does mean that there are only two ways that memory violations can happen:

  • A bug in the compiler.
  • Code that's explicitly declared unsafe.

The Rust standard library has quite a bit of code that is marked unsafe, though less than many assume. This does not make the statement vacuous, though. With the (rare) exception of needing to write unsafe code yourself, memory violations result from the underlying infrastructure.
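To make the distinction concrete, here is a minimal sketch with two hypothetical helper functions. The safe version cannot read out of bounds. The second one opts out of the bounds check inside an unsafe block, so correctness there becomes the programmer's responsibility.

```rust
/// Safe lookup: an out-of-range index yields None instead of reading stray memory.
fn safe_lookup(data: &[u8], index: usize) -> Option<u8> {
    data.get(index).copied()
}

/// Unchecked lookup: the compiler trusts the unsafe block, so the check is on us.
fn unchecked_lookup(data: &[u8], index: usize) -> u8 {
    assert!(index < data.len(), "index out of range");
    // SAFETY: the assert above guarantees that index is in bounds.
    unsafe { *data.get_unchecked(index) }
}

fn main() {
    let data = [1u8, 2, 3];
    println!("{:?}", safe_lookup(&data, 10));
    println!("{}", unchecked_lookup(&data, 2));
}
```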

Why does Rust exist?

Why did people create Rust? What problem was not addressed by existing languages?

Rust was designed to combine high performance with memory safety. This combination is increasingly important in a networked world.

The quintessential use case for Rust is low-level parsing of protocols. The data to be parsed often comes from untrusted sources and may need to be parsed in a performant way.

If this sounds like what a web browser does, it is no coincidence. Rust was developed at Mozilla as a way to improve the Firefox browser.

In the modern world, browsers are no longer the only things on which there is pressure to be safe and fast. Even the common microservice architecture, combined with defense-in-depth principles, must be able to unpack untrusted data quickly.

Counting characters

To understand a "wrapping in Rust" example, there needs to be a problem to solve. Not just any problem will do. The issue needs to be:

  • Easy enough to solve
  • Helped by the ability to write high-performance loops
  • Somewhat realistic

The toy problem in this case is whether a character appears at least X times in a string. This is something that's not easily amenable to performant regular expressions. Even dedicated NumPy code can be slower than necessary because often there is no need to scan the entire string.

You can imagine some combination of Python libraries and tricks that make this possible. However, the obvious algorithm is pretty fast if implemented in a low-level language and makes things more readable.

A twist is added to make the problem slightly more interesting and demonstrate some fun parts of Rust. The algorithm supports resetting the count on a newline (does the character appear at least X times in a line?) or on a space (does the character appear at least X times in a word?).

This is the only nod given to "realism." Any more realism will make the example not useful pedagogically.

Enum support

Rust supports enumerations (enums). You can do many interesting things with enums.

For now, a three-way enum without any other twirls is used. The enum encodes what character resets the count.

#[derive(Clone, Copy)]
enum Reset {
    NewlinesReset,
    SpacesReset,
    NoReset,
}
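As one hedged illustration of the more interesting things enums can do (this is not part of the article's example, and ResetOn and resets are hypothetical names), an enum variant can carry data. Here the variant itself stores the character that triggers a reset.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum ResetOn {
    /// Never reset the count.
    Never,
    /// Reset the count whenever this character is seen.
    Char(char),
}

/// Returns true when the given character should reset the count.
fn resets(rule: ResetOn, c: char) -> bool {
    match rule {
        ResetOn::Never => false,
        ResetOn::Char(trigger) => c == trigger,
    }
}

fn main() {
    println!("{}", resets(ResetOn::Char('\n'), '\n'));
    println!("{}", resets(ResetOn::Never, '\n'));
}
```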

Struct support

The next Rust component is a bit more substantial: a struct. A Rust struct is somewhat close to a Python dataclass. Again, you can do more sophisticated things with a struct.

#[pyclass]
struct Counter {
    what: char,
    min_number: u64,
    reset: Reset, 
}

Implementation blocks

You add a method to a struct in a separate block in Rust: the impl block. The details are outside the scope of this article.

In this example, the method calls an external function. This is mostly done to break up the code. A more sophisticated use would instruct the Rust compiler to inline the function to allow readability without any runtime cost.

#[pymethods]
impl Counter {
    #[new]
    fn new(what: char, min_number: u64, reset: Reset) -> Self {
        Counter{what: what, min_number: min_number, reset: reset}
    }
    
    fn has_count(
        &self,
        data: &str,
    ) -> bool {
        has_count(self, data.chars())
    }
}
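A sketch of what such an inlining hint can look like, using hypothetical functions unrelated to the counter. Note that #[inline] is only a suggestion to the compiler, which remains free to ignore it or to inline small functions without being asked.

```rust
/// Small helper kept separate for readability; #[inline] suggests that the
/// compiler paste its body into callers instead of emitting a call.
#[inline]
fn is_vowel(c: char) -> bool {
    matches!(c, 'a' | 'e' | 'i' | 'o' | 'u')
}

/// Counts lowercase ASCII vowels by calling the helper once per character.
fn count_vowels(text: &str) -> usize {
    text.chars().filter(|c| is_vowel(*c)).count()
}

fn main() {
    println!("{}", count_vowels("opensource"));
}
```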

Function

By default, Rust variables are immutable. Because the current count has to change, it is declared as a mutable variable.

fn has_count(cntr: &Counter, chars: std::str::Chars) -> bool {
    let mut current_count : u64 = 0;
    for c in chars {
        if got_count(cntr, c, &mut current_count) {
            return true;
        }
    }
    false
}

The loop goes over the characters and calls the function got_count. Again, this is done to break the code into slide-sized pieces. It also shows how to pass a mutable reference (a mutable borrow) to a function.

Even though current_count is mutable, both the sending and receiving sites explicitly mark the reference as mutable. This makes it clear which functions might modify a value.
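The same idea in isolation, as a minimal sketch with made-up function names: the callee declares &mut in its signature, and the caller writes &mut at the call site, so a read-only borrow and a mutating borrow look different wherever they appear.

```rust
/// Takes a mutable borrow, so it is allowed to change the caller's value.
fn bump(counter: &mut u64) {
    *counter += 1;
}

/// Takes a shared borrow, so it can only read.
fn peek(counter: &u64) -> u64 {
    *counter
}

/// Declares a mutable variable and lends it out twice for modification.
fn bump_twice() -> u64 {
    let mut count: u64 = 0;
    bump(&mut count); // the call site shows that count may change here
    bump(&mut count);
    peek(&count) // a plain & borrow cannot modify count
}

fn main() {
    println!("{}", bump_twice());
}
```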

Counting

The got_count function resets the counter, increments it, and then checks it. Rust's semicolon-separated sequence of expressions evaluates to the result of the last expression, in this case, whether the threshold was met.

fn got_count(cntr: &Counter, c: char, current_count: &mut u64) -> bool {
    maybe_reset(cntr, c, current_count);
    maybe_incr(cntr, c, current_count);
    *current_count >= cntr.min_number
}
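The "last expression is the value" rule applies to any block, not only to function bodies. A small sketch with a hypothetical function:

```rust
/// Returns true when width * height reaches the threshold.
fn area_is_large(width: u64, height: u64, threshold: u64) -> bool {
    let area = {
        let w = width;
        w * height // no trailing semicolon: this is the value of the block
    };
    area >= threshold // likewise, this is the function's return value
}

fn main() {
    println!("{}", area_is_large(3, 4, 12));
}
```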

Reset code

The reset code shows another useful thing in Rust: matching. A complete description of the matching abilities in Rust would fill a semester-level class, so this example shows just one of them: matching a tuple against either of two options.

fn maybe_reset(cntr: &Counter, c: char, current_count: &mut u64) {
    match (c, cntr.reset) {
        ('\n', Reset::NewlinesReset) | (' ', Reset::SpacesReset) => {
            *current_count = 0;
        }
        _ => {}
    };
}
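For another view of tuple matching with alternatives, here is a hedged sketch on an unrelated, made-up problem. It combines the | alternative syntax with range patterns inside the tuple.

```rust
/// Hypothetical rule: quiet hours last until 09:59 on weekends
/// and until 06:59 on weekdays.
fn is_quiet_time(is_weekend: bool, hour: u8) -> bool {
    match (is_weekend, hour) {
        // Either tuple pattern selects this arm.
        (true, 0..=9) | (false, 0..=6) => true,
        // Everything else falls through to the catch-all arm.
        _ => false,
    }
}

fn main() {
    println!("{}", is_quiet_time(true, 9));
    println!("{}", is_quiet_time(false, 9));
}
```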

Increment support

The increment compares the character to the desired one and, if matched, increments the count.

fn maybe_incr(cntr: &Counter, c: char, current_count: &mut u64) {
    if c == cntr.what {
        *current_count += 1;
    };
}

Note that I optimized the code in this article for slides. It is not necessarily a best-practice example of Rust code or how to design a good API.
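To try the counting logic before adding any Python wrapping, the pieces can be assembled into a plain Rust program. This is a sketch that leaves out the #[pyclass], #[pymethods], and #[new] attributes (they need the PyO3 crate) and derives Clone alongside Copy, which the compiler requires. The main function is an addition for demonstration only.

```rust
#[derive(Clone, Copy)]
enum Reset {
    NewlinesReset,
    SpacesReset,
    NoReset,
}

struct Counter {
    what: char,
    min_number: u64,
    reset: Reset,
}

impl Counter {
    fn new(what: char, min_number: u64, reset: Reset) -> Self {
        Counter { what, min_number, reset }
    }

    fn has_count(&self, data: &str) -> bool {
        // Calls the free function below, not this method.
        has_count(self, data.chars())
    }
}

fn has_count(cntr: &Counter, chars: std::str::Chars) -> bool {
    let mut current_count: u64 = 0;
    for c in chars {
        if got_count(cntr, c, &mut current_count) {
            return true; // stop scanning as soon as the threshold is met
        }
    }
    false
}

fn got_count(cntr: &Counter, c: char, current_count: &mut u64) -> bool {
    maybe_reset(cntr, c, current_count);
    maybe_incr(cntr, c, current_count);
    *current_count >= cntr.min_number
}

fn maybe_reset(cntr: &Counter, c: char, current_count: &mut u64) {
    match (c, cntr.reset) {
        ('\n', Reset::NewlinesReset) | (' ', Reset::SpacesReset) => {
            *current_count = 0;
        }
        _ => {}
    }
}

fn maybe_incr(cntr: &Counter, c: char, current_count: &mut u64) {
    if c == cntr.what {
        *current_count += 1;
    }
}

fn main() {
    let by_line = Counter::new('c', 3, Reset::NewlinesReset);
    let by_word = Counter::new('c', 2, Reset::SpacesReset);
    let overall = Counter::new('c', 3, Reset::NoReset);
    println!("{}", by_line.has_count("hello-c-c-c-goodbye"));
    println!("{}", by_word.has_count("c c c"));
    println!("{}", overall.has_count("c\nc c"));
}
```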

Wrap Rust code for Python

To wrap Rust code for Python, you can use PyO3. The PyO3 Rust "crate" (or library) allows inline hints for wrapping Rust code into Python, making it easier to modify both together.

Include PyO3 crate primitives

First, you must include the PyO3 crate primitives.

use pyo3::prelude::*;

Wrap enum

The enum needs to be wrapped. The derive clauses are necessary for wrapping the enum for PyO3 because they allow the class to be copied and cloned, making it easier to use from Python.

#[pyclass]
#[derive(Clone)]
#[derive(Copy)]
enum Reset {
    /* ... */
}

Wrap struct

The struct is similarly wrapped. Attributes such as #[pyclass] invoke Rust "macros," which generate the needed interface bits.

#[pyclass]
struct Counter {
    /* ... */
}

Wrap impl

Wrapping the impl is more interesting. In addition to the #[pymethods] macro on the impl block, the method named new is marked as #[new], letting PyO3 know how to expose a constructor for the wrapped object.

#[pymethods]
impl Counter {
    #[new]
    fn new(what: char, min_number: u64,
          reset: Reset) -> Self {
        Counter{what: what,
          min_number: min_number, reset: reset}
    }
    /* ... */
}

Define module

Finally, define a function that initializes the module. This function has a specific signature, must be named the same as the module, and must be decorated with #[pymodule].

#[pymodule]
fn counter(_py: Python, m: &PyModule
) -> PyResult<()> {
    m.add_class::<Counter>()?;
    m.add_class::<Reset>()?;
    Ok(())
}

The ? operator returns early with the error if a call fails (for example, if the class was not appropriately configured). A failing PyResult is translated into a Python exception at import time.
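The ? operator is not specific to PyO3. A minimal standard-library sketch, with a hypothetical parse_sum function, shows the same early-return behavior on an ordinary Result:

```rust
use std::num::ParseIntError;

/// Parses two integers and adds them. Each ? returns the error to the
/// caller immediately if parsing fails; otherwise it unwraps the value.
fn parse_sum(a: &str, b: &str) -> Result<i64, ParseIntError> {
    let x: i64 = a.trim().parse()?;
    let y: i64 = b.trim().parse()?;
    Ok(x + y)
}

fn main() {
    println!("{:?}", parse_sum("2", " 40 "));
    println!("{:?}", parse_sum("2", "forty").is_err());
}
```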

Maturin develop

For quick checking, maturin develop builds and installs the library into the current virtual environment. This helps iterate quickly.

$ maturin develop

Maturin build

The maturin build command builds a manylinux wheel, which can be uploaded to PyPI. The wheel is specific to the CPU architecture.

Python library

Using the library from Python is the nice part. Nothing indicates a difference between this and writing the code in Python. One useful aspect of this is that if you optimize an existing library in Python that already has unit tests, you can use the Python unit tests for the Rust library.

Import

Whether you used maturin develop or pip install to install it, importing the library is done with import.

import counter

Construct

The constructor was defined exactly so the object could be built from Python. This is not always the case. Sometimes objects are only returned from more sophisticated functions.

cntr = counter.Counter(
    'c',
    3,
    counter.Reset.NewlinesReset,
)

Call

The final pay-off is here at last. Check whether this string has at least three "c" characters:

>>> cntr.has_count("hello-c-c-c-goodbye")
True

Adding a newline causes the reset to happen, and there aren't three "c" characters without an intervening newline:

>>> cntr.has_count("hello-c-c-\nc-goodbye")
False

Using Rust and Python is easy

My goal is to convince you that combining Rust and Python is easy. I wrote little code to "glue" them. Rust and Python have complementary strengths and weaknesses.

Rust is great for high-performance, safe code. Rust has a steep learning curve and can be awkward for quickly prototyping a solution.

Python is easy to get started with and supports incredibly tight iteration loops. Python does have a "speed cap." Beyond a certain level it is harder to get better performance from Python.

Combining them is perfect. Prototype in Python and move performance bottlenecks to Rust.

With maturin, your development and deployment pipelines are easier to make. Develop, build, and enjoy the combo!

Moshe has been involved in the Linux community since 1998, helping in Linux "installation parties". He has been programming Python since 1999, and has contributed to the core Python interpreter. Moshe has been a DevOps/SRE since before those terms existed, caring deeply about software reliability, build reproducibility and other such things.


This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.