WebAssembly for speed and code reuse

Find out how WebAssembly supplements JavaScript by providing better performance on compute-bound tasks.

Image by: WOCinTech Chat. Modified by Opensource.com. CC BY-SA 4.0

Imagine translating a non-web application, written in a high-level language, into a binary module ready for the web. This translation could be done without any change whatsoever to the non-web application's source code. A browser can download the newly translated module efficiently and execute the module in the sandbox. The executing web module can interact seamlessly with other web technologies—with JavaScript (JS) in particular. Welcome to WebAssembly.

As befits a language with assembly in the name, WebAssembly is low-level. But this low-level character encourages optimization: the just-in-time (JIT) compiler of the browser's virtual machine can translate portable WebAssembly code into fast, platform-specific machine code. A WebAssembly module thereby becomes an executable suited for compute-bound tasks such as number crunching.

Which high-level languages compile into WebAssembly? The list is growing, but the original candidates were C, C++, and Rust. Let's call these three the systems languages, as they are meant for systems programming and high-performance applications programming. The systems languages share two features that suit them for compilation into WebAssembly. The next section gets into the details, which sets up full code examples (in C and TypeScript) together with samples from WebAssembly's own text format language.

Explicit data typing and garbage collection

The three systems languages require explicit data types such as int and double for variable declarations and values returned from functions. For example, the following code segment illustrates 64-bit addition in C:

long n1 = random();
long n2 = random();
long sum = n1 + n2;

The library function random is declared with long as the return type:

long random(); /* returns a long */

During the compilation process, C source is translated into assembly language, which is then translated into machine code. In x86-64 assembly language (AT&T syntax), the last C statement above would be something like the following (with ## introducing comments):

addq %rax, %rdx ## %rdx = %rdx + %rax (64-bit addition)

The %rax and %rdx are 64-bit registers, and the addq instruction means add quadwords, where a quadword is 64 bits in size, which is the size of a C long on typical 64-bit Unix-like systems. The assembly language underscores that executable machine code involves types, with the type given through some mix of the instruction and the arguments, if any. In this case, the add instruction is addq (64-bit addition) rather than, for example, addl, which adds 32-bit values typical of a C int. The registers in use are the full 64-bit ones (the r in %rax and %rdx) rather than 32-bit chunks thereof (e.g., %eax is the lower 32 bits of %rax, and %edx is the lower 32 bits of %rdx).

The assembly-language addition performs well because the operands are stored in CPU registers, and a reasonable C compiler (at even the default level of optimization) would generate assembly code equivalent to what is shown here.

The three systems languages, with their emphasis on explicit types, are good candidates for compilation into WebAssembly because this language, too, has explicit data types: i32 for a 32-bit integer value, f64 for a 64-bit floating-point value, and so on.

Explicit data types encourage optimization for function calls as well. A function with explicit data types has a signature, which specifies the data types for the arguments and the value, if any, returned from the function. Below is the signature for a WebAssembly function named $add, which is written in the WebAssembly text format language discussed below. The function takes two 32-bit integers as arguments and returns a 64-bit integer:

(func $add (param $lhs i32) (param $rhs i32) (result i64))

The browser's JIT compiler should have the 32-bit integer arguments and the returned 64-bit value stored in registers of the appropriate sizes.

When it comes to high-performance web code, WebAssembly is not the only game in town. For example, asm.js is a JS dialect designed, like WebAssembly, to approach native speed. The asm.js dialect invites optimization because the code mimics the explicit data types in the three aforementioned languages. Here's an example with C and then asm.js. The sample function in C is:

int f(int n) {       /** C **/
  return n + 1;
}

Both the parameter n and the returned value are explicitly typed as int. The equivalent function in asm.js would be:

function f(n) {      /** asm.js **/
  n = n | 0;
  return (n + 1) | 0;
}

JS, in general, does not have explicit data types, but a bitwise-OR operation in JS always yields a 32-bit integer value. This explains the otherwise pointless bitwise-OR operation:

n = n | 0;  /* bitwise-OR of n and zero */

The bitwise-OR of n and zero evaluates to n, but the purpose here is to signal that n holds an integer value. The return statement repeats this optimizing trick.
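The effect of the coercion is easy to check directly. The sketch below is written as TypeScript so that it type-checks, but the expressions are plain JS; toInt32 is a helper name introduced here for illustration, and the code shows only the coercion itself, not a complete validated asm.js module.

```typescript
// Bitwise OR with zero converts any JS number to a 32-bit signed integer.
function toInt32(x: number): number {
  return x | 0;
}

// The asm.js function from the text, with type annotations added.
function f(n: number): number {
  n = n | 0;
  return (n + 1) | 0;
}

console.log(toInt32(7));      // an in-range integer passes through unchanged
console.log(toInt32(3.9));    // the fractional part is dropped
console.log(toInt32(-3.9));   // truncation is toward zero
console.log(f(41));           // ordinary increment
console.log(f(2147483647));   // wraps around, as 32-bit arithmetic does
```

The last call is the interesting one: the result is confined to the 32-bit range instead of growing into a larger floating-point value.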

Among the JS dialects, TypeScript stands out for adopting explicit data types, which makes this language attractive for compilation into WebAssembly. (A code example below illustrates this.)

The second feature shared by the three systems languages is that they execute without a garbage collector (GC). For dynamically allocated memory, the Rust compiler automatically writes both the allocation and the deallocation code; in the other two systems languages, the programmer who dynamically allocates memory is responsible for explicitly deallocating this same memory. The systems languages avoid the overhead and complication of automated GC.

This quick overview of WebAssembly can be summarized as follows. Almost any article on the WebAssembly language mentions near-native speed as one of the language's major goals. The native speed is that of the compiled systems languages; hence, these three languages were also the originally designated candidates for compilation into WebAssembly.

WebAssembly, JavaScript, and the separation of concerns

All rumors to the contrary, the WebAssembly language is not designed to replace JS, but rather to supplement JS by providing better performance on compute-bound tasks. WebAssembly also has an advantage when it comes to downloading. A browser fetches a JS module as text, an inefficiency that WebAssembly addresses. A module in WebAssembly has a compact binary format, which speeds up downloading.

Of equal interest is how JS and WebAssembly are meant to work together. JS is designed to read and write the Document Object Model (DOM), the tree representation of a web page. By contrast, WebAssembly does not come with any built-in functionality for the DOM; but WebAssembly can export functions that JS can then call as needed. This separation of concerns means a clean division of labor:

DOM<----->JS<----->WebAssembly

JS, in whatever dialect, still should manage the DOM, but JS also can use the general-purpose functionality delivered through WebAssembly modules. A code example helps illustrate the division of labor. (The code examples in this article are available in a ZIP file on my website.)

Hailstone sequences and the Collatz conjecture

A production-grade example would have WebAssembly code perform a heavy compute-bound task such as generating large cryptographic key pairs or using such pairs for encryption and decryption. A simpler example fits the bill as a stand-in that is easy to follow. There is number crunching, but of the routine sort that JS could handle with ease.

Consider the function hstone (for hailstone), which takes a positive integer as an argument. The function is defined as follows:

             3N + 1 if N is odd
hstone(N) =
             N/2 if N is even

For example, hstone(12) returns six, whereas hstone(11) returns 34. If N is odd, then 3N+1 is even; but if N is even, then N/2 could be either even (e.g., 4/2 = 2) or odd (e.g., 6/2 = 3).

The hstone function can be used iteratively by passing the returned value as the next argument. The result is a hailstone sequence such as this one, which starts with 24 as the original argument, the returned value 12 as the next argument, and so on:

24,12,6,3,10,5,16,8,4,2,1,4,2,1,...

It takes 10 calls for the sequence to converge to one, at which point the sequence of 4,2,1 repeats indefinitely: (3x1)+1 is 4, which is halved to yield two, which is halved to yield one, and so on. Plus magazine offers an explanation of why hailstone seems an appropriate name for such sequences.

Note that powers of two converge quickly: an argument of 2ᵏ requires just k divisions by two to reach one. For example, 32 = 2⁵ has a convergence length of five, and 64 = 2⁶ has a convergence length of six. Of interest here is the sequence length from the initial argument to the first occurrence of one. My code examples in C and TypeScript compute the length of a hailstone sequence.
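As a quick check of the sequence and the lengths quoted above, here is a minimal sketch in plain TypeScript. The helper names hstoneStep, hailstoneSequence, and hailstoneLength are introduced here for illustration and are not part of the article's downloadable code.

```typescript
// One application of the hailstone rule.
function hstoneStep(n: number): number {
  return n % 2 === 0 ? n / 2 : 3 * n + 1;
}

// Collects the terms from the starting value to the first occurrence of 1.
function hailstoneSequence(start: number): number[] {
  if (Math.floor(start) !== start || start < 1) {
    throw new RangeError("start must be a positive integer");
  }
  const seq: number[] = [start];
  let n = start;
  while (n !== 1) {
    n = hstoneStep(n);
    seq.push(n);
  }
  return seq;
}

// The length counts calls to the rule, so it is one less than the number of terms.
function hailstoneLength(start: number): number {
  return hailstoneSequence(start).length - 1;
}

console.log(hailstoneSequence(24).join(",")); // 24,12,6,3,10,5,16,8,4,2,1
console.log(hailstoneLength(24));             // 10
console.log(hailstoneLength(32));             // 5
console.log(hailstoneLength(64));             // 6
```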

The Collatz conjecture is that a hailstone sequence converges to one no matter what the initial argument N > 0 happens to be. No one has found a counterexample to the Collatz conjecture, nor has anyone found a proof to elevate the conjecture to a theorem. The conjecture, simple as it is to test with a program, remains a profoundly challenging problem in mathematics.
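Testing the conjecture over a range takes only a few lines. The sketch below is an illustration under stated assumptions: the step cap maxSteps and the names stepsToOne and checkRange are introduced here, and the cap exists only so that the loop is guaranteed to terminate. A run that finds no failure proves nothing, of course; only a counterexample would settle anything.

```typescript
// Counts the steps from start to 1, or returns null if maxSteps is exceeded.
function stepsToOne(start: number, maxSteps: number): number | null {
  let n = start;
  let steps = 0;
  while (n !== 1) {
    if (steps >= maxSteps) return null; // give up rather than loop forever
    n = n % 2 === 0 ? n / 2 : 3 * n + 1;
    steps++;
  }
  return steps;
}

interface RangeReport {
  allConverged: boolean; // false if any starting value hit the step cap
  longestStart: number;  // starting value with the longest sequence seen
  longestSteps: number;  // length of that sequence
}

// Tries every starting value from 1 through limit.
function checkRange(limit: number, maxSteps: number): RangeReport {
  const report: RangeReport = { allConverged: true, longestStart: 1, longestSteps: 0 };
  for (let start = 1; start <= limit; start++) {
    const steps = stepsToOne(start, maxSteps);
    if (steps === null) {
      report.allConverged = false;
    } else if (steps > report.longestSteps) {
      report.longestStart = start;
      report.longestSteps = steps;
    }
  }
  return report;
}

const rangeReport = checkRange(1000, 10000);
console.log(rangeReport);
```

For much larger starting values, intermediate terms can exceed the range in which a JS number represents integers exactly, so a serious search would need a wider integer type.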

From C to WebAssembly in one step

The hstoneCL program below is a non-web application that can be compiled with a regular C compiler (e.g., GNU or Clang). The program generates a random integer value N > 0 eight times and computes the length of the hailstone sequence starting with N. Two programmer-defined functions, main and hstone, are of interest when the app is later compiled into WebAssembly.

Example 1. The hstone function in C

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int hstone(int n) {
  int len = 0;
  while (1) {
    if (1 == n) break;           /* halt on 1 */
    if (0 == (n & 1)) n = n / 2; /* if n is even */
    else n = (3 * n) + 1;        /* if n is odd  */
    len++;                       /* increment counter */
  }
  return len;
}

#define HowMany 8

int main() {
  srand(time(NULL));  /* seed random number generator */
  int i;
  puts("  Num  Steps to 1");
  for (i = 0; i < HowMany; i++) {
    int num = rand() % 100 + 1; /* + 1 to avoid zero */
    printf("%4i %7i\n", num, hstone(num));
  }
  return 0;
}

The code can be compiled and run from the command line (with % as the command-line prompt) on any Unix-like system:

% gcc -o hstoneCL hstoneCL.c  ## compile into executable hstoneCL
% ./hstoneCL                  ## execute

Here is the output from a sample run:

  Num  Steps to 1
  88      17
   1       0
  20       7
  41     109
  80       9
  84       9
  94     105
  34      13

The systems languages, including C, require specialized toolchains to translate source code into a WebAssembly module. For the C/C++ languages, Emscripten is a pioneering and still widely used option, one built upon the well-known LLVM (Low-Level Virtual Machine) compiler infrastructure. My examples in C use Emscripten (which you can install with this guide).

The hstoneCL program can be webified by using Emscripten to compile the code, with no change whatsoever, into a WebAssembly module. The Emscripten toolchain also creates an HTML page together with JS glue (in asm.js) that mediates between the DOM and the WebAssembly module that computes the hstone function. Here are the steps:

Compile the non-web program hstoneCL into WebAssembly:
```
% emcc hstoneCL.c -o hstone.html  ## generates hstone.js and hstone.wasm as well
```
The file hstoneCL.c contains the source code shown above, and the -o for output flag specifies the name of the HTML file. Any name would do, but the generated JS code and the WebAssembly binary file then have the same name (in this case, hstone.js and hstone.wasm, respectively). Older versions of Emscripten (prior to 13) may require the flag -s WASM=1 to be included in the compilation command.
Use the Emscripten development web server (or equivalent) to host the webified app:
```
% emrun --no_browser --port 9876 .   ## . is current working directory, any port number you like
```
To suppress warning messages, the flag --no_emrun_detect can be included. This command starts the web server, which hosts all the resources in the current working directory; in particular, hstone.html, hstone.js, and hstone.wasm.
Open a WebAssembly-enabled browser (e.g., Chrome or Firefox) to the URL http://localhost:9876/hstone.html.

This screenshot shows the output from my sample run with Firefox.

Figure 1. The webified hstone program

The result is remarkable, as the full compilation process requires but a single command and no change whatsoever to the original C program.

Fine-tuning the hstone program for webification

The Emscripten toolchain nicely compiles a C program into a WebAssembly module and generates the required JS glue, but these artifacts are typical for machine-generated code. For example, the asm.js file produced is almost 100KB in size. The JS code handles multiple scenarios and does not use the most recent WebAssembly API. A simplified version of the webified hstone program will make it easier to focus on how the WebAssembly module (housed in the hstone.wasm file) interacts with the JS glue (housed in the hstone.js file).

There is another issue: WebAssembly code need not mirror the functional boundaries in a source program such as C. For example, the C program hstoneCL has two user-defined functions, main and hstone. The resulting WebAssembly module exports a function named _main but does not export a function named _hstone. (It's worth noting that the function main is the entry point in a C program.) The body of the C hstone function might be in some unexported function or simply wrapped into _main. The exported WebAssembly functions are exactly the ones that the JS glue can invoke by name. However, there is a directive to specify which source-language functions should be exported by name in the WebAssembly code.

Example 2. The revised hstone program

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <emscripten/emscripten.h>

int EMSCRIPTEN_KEEPALIVE hstone(int n) {
  int len = 0;
  while (1) {
    if (1 == n) break;           /* halt on 1 */
    if (0 == (n & 1)) n = n / 2; /* if n is even */
    else n = (3 * n) + 1;        /* if n is odd  */
    len++;                       /* increment counter */
  }
  return len;
}

The revised hstoneWA program, shown above, has no main function; it is no longer needed because the program is not designed to run as a standalone application but exclusively as a WebAssembly module with a single exported function. The directive EMSCRIPTEN_KEEPALIVE (defined in the header file emscripten.h) instructs the compiler to export an _hstone function in the WebAssembly module. The naming convention is straightforward: A C function such as hstone retains its name—but with a single underscore as its first character in WebAssembly (_hstone, in this case). Other compilers into WebAssembly follow different naming conventions.

To confirm that this approach works, the compilation step can be simplified to produce only the WebAssembly module and the JS glue, but not the HTML:

% emcc hstoneWA.c -o hstone2.js  ## we'll provide our own HTML file

The HTML file now can be simplified to this hand-written one:

<!doctype html>
<html>
  <head>
    <meta charset="utf-8"/>
    <script src="hstone2.js"></script>
  </head>
  <body/>
</html>

The HTML document loads the JS file, which in turn fetches and loads the WebAssembly binary file hstone2.wasm. By the way, the new WASM file is about half the size of the original example.

The application code can be compiled as before and then launched with the built-in web server:

% emrun --no_browser --port 7777 .  ## new port number for emphasis

After requesting the revised HTML document in a browser (in this case, Chrome), the browser's web console can be used to confirm that the hstone function has been exported as _hstone. Here is a segment of my session in the web console, with ## again introducing comments:

> _hstone(27)   ## invoke _hstone by name
< 111           ## output
> _hstone(7)    ## again
< 16            ## output

The EMSCRIPTEN_KEEPALIVE directive is the straightforward way to have the Emscripten compiler produce a WebAssembly module that exports any function of interest to the JS glue, which this compiler likewise produces. A customized HTML document, with whatever hand-crafted JS is appropriate, can then call the functions exported from the WebAssembly module. Hats off to Emscripten for this clean approach.
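Hand-crafted JS needs to wait until the generated glue has finished loading the WebAssembly module before calling _hstone. One way to do this, as I understand Emscripten's conventions, is to define a global Module object with an onRuntimeInitialized hook before the glue script is loaded. The sketch below assumes that convention; the names HstoneModule, makeModuleConfig, and onReady are hypothetical, and a stand-in plays the role of the generated glue so that the code can run outside a browser.

```typescript
// The subset of Emscripten's Module object that this sketch relies on.
interface HstoneModule {
  onRuntimeInitialized?: () => void; // hook invoked by the generated glue
  _hstone?: (n: number) => number;   // attached by the glue for the exported C function
}

// Builds the object that a page would assign to the global Module
// before loading hstone2.js. onReady is a hypothetical callback.
function makeModuleConfig(onReady: (hstone: (n: number) => number) => void): HstoneModule {
  const config: HstoneModule = {};
  config.onRuntimeInitialized = () => {
    if (config._hstone) onReady(config._hstone);
  };
  return config;
}

// Stand-in for the generated glue: attach an _hstone function written in JS,
// then fire the hook as the real glue would after loading hstone2.wasm.
const results: number[] = [];
const moduleConfig = makeModuleConfig(hstone => { results.push(hstone(27)); });
moduleConfig._hstone = (n: number): number => {
  let len = 0;
  while (n !== 1) {
    n = n % 2 === 0 ? n / 2 : 3 * n + 1;
    len++;
  }
  return len;
};
if (moduleConfig.onRuntimeInitialized) moduleConfig.onRuntimeInitialized();
console.log(results); // [ 111 ]
```

In a real page, the stand-in section would be dropped and the object would be assigned to the global Module in a script element placed before the one that loads hstone2.js.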

Compiling TypeScript into WebAssembly

The next code example is in TypeScript, which is JS with explicit data types. The setup requires Node.js and its npm package manager. The following npm command installs AssemblyScript, a compiler that translates a strict variant of TypeScript into WebAssembly:

% npm install -g assemblyscript  ## install the AssemblyScript compiler

The TypeScript program hstone.ts consists of a single function, again named hstone. Data types such as i32 (32-bit integer) now follow rather than precede parameter and local variable names (in this case, n and len, respectively):

export function hstone(n: i32): i32 { // will be exported in WebAssembly
  let len: i32 = 0;
  while (true) {
    if (1 == n) break;            // halt on 1
    if (0 == (n & 1)) n = n / 2;  // if n is even
    else n = (3 * n) + 1;         // if n is odd
    len++;                        // increment counter
  }
  return len;
}

The function hstone takes one argument of type i32 and returns a value of the same type. The function's body is essentially the same as in the C example. The code can be compiled into WebAssembly as follows:

% asc hstone.ts -o hstone.wasm  ## compile a TypeScript file into WebAssembly

The WASM file hstone.wasm is only about 14KB in size.

To highlight details of how a WebAssembly module is loaded, the hand-written HTML file below (index.html in the ZIP on my website) includes the script to fetch and load the WebAssembly module hstone.wasm and then to instantiate this module so that the exported hstone function can be invoked for confirmation in the browser's console.

Example 3. The HTML page for the TypeScript code

<!doctype html>
<html>
  <head>
    <meta charset="utf-8"/>
    <script>
      fetch('hstone.wasm').then(response =>              // Line 1
        response.arrayBuffer()                           // Line 2
      ).then(bytes =>                                    // Line 3
        WebAssembly.instantiate(bytes, {imports: {}})    // Line 4
      ).then(results => {                                // Line 5
        window.hstone = results.instance.exports.hstone; // Line 6
      });
    </script>
  </head>
  <body/>
</html>

The script element in the HTML page above can be clarified line by line. The fetch call in Line 1 uses the Fetch API to get the WebAssembly module from the web server that hosts the HTML page. When the HTTP response arrives, its body is a sequence of bytes, which Line 2 reads into an ArrayBuffer. These bytes constitute the WebAssembly module, which is all of the code compiled from the TypeScript file. This module has no imports, as indicated at the end of Line 4.

At the start of Line 4, the WebAssembly module is instantiated. A WebAssembly module is akin to a non-static class with non-static members in an object-oriented language such as Java. The module contains variables, functions, and various support artifacts; but the module, like the non-static class, must be instantiated to be usable, in this case in the web console, but more generally in the appropriate JS glue code.
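The distinction between a module and its instances can be seen with the smallest possible module, which consists of nothing but the eight-byte header. The sketch below uses the synchronous constructors of the JS WebAssembly API, which are convenient for a tiny module like this one; the variable names are introduced here for illustration.

```typescript
// The smallest valid module: the "\0asm" magic number followed by version 1.
const emptyModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic number
  0x01, 0x00, 0x00, 0x00  // binary format version
]);

// Compile once: the module plays the role of the class.
const compiled = new WebAssembly.Module(emptyModuleBytes);

// Instantiate twice: each instance plays the role of an object.
const first = new WebAssembly.Instance(compiled, {});
const second = new WebAssembly.Instance(compiled, {});

console.log(first !== second);                            // true: two distinct instances
console.log(WebAssembly.Module.exports(compiled).length); // 0: nothing exported
console.log(WebAssembly.Module.imports(compiled).length); // 0: nothing imported
```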

The script's Line 6 assigns the exported function to window.hstone, which makes the original TypeScript function hstone available globally under the same name. This WebAssembly function is available now to any JS glue code, as another session in the browser's console will confirm.

WebAssembly has a more concise API for fetching and instantiating a module. The new API reduces the script above to only the fetch and instantiate operations. The longer version shown here has the benefit of exhibiting details; in particular, the representation of a WebAssembly module as a byte array that gets instantiated as an object with exported functions.
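The more concise route is WebAssembly.instantiateStreaming, which accepts the pending fetch directly and so skips the explicit byte-array step. Here is a minimal sketch; the function name loadHstone and the type alias Hstone are introduced here, and the import object simply mirrors the one in the longer script.

```typescript
type Hstone = (n: number) => number;

// Fetches, compiles, and instantiates the module in one call,
// then hands back the exported hstone function.
function loadHstone(url: string): Promise<Hstone> {
  return WebAssembly.instantiateStreaming(fetch(url), { imports: {} })
    .then(result => result.instance.exports.hstone as Hstone);
}

// Usage in the page's script, assuming hstone.wasm is served alongside it:
//   loadHstone("hstone.wasm").then(hstone => console.log(hstone(27)));
```

One caveat: streaming instantiation expects the server to deliver the file with the application/wasm content type, so a misconfigured server is a reason to fall back on the longer version.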

The plan is to have a web page load a WebAssembly module in the same way as a JS ES2015 module:

<script type='module'>...</script>

JS then would fetch, compile, and otherwise handle the WebAssembly module as if it were just another JS module.

The text format language

WebAssembly binaries can be translated to and from text format equivalents. The binaries usually reside in files with a WASM extension, whereas their human-readable text counterparts reside in files with a WAT extension. WABT is a set of nearly a dozen tools for dealing with WebAssembly, including ones to translate to and from formats such as WASM and WAT. The conversion tools include the wasm2wat, wasm2c, and wat2wasm utilities.

The text-format language adopts the S-expression (S for symbolic) syntax popularized by Lisp. An S-expression (sexpr for short) represents a tree as a list with arbitrarily many sublists. For example, this sexpr occurs near the end of the WAT file for the TypeScript example:

(export "hstone" (func $hstone)) ## export function $hstone by the name "hstone"

The tree representation is:

        export        ## root
          |
     +----+----+
     |         |
  "hstone"    func    ## left and right children
               |
            $hstone   ## single child

In text format, a WebAssembly module is a sexpr whose first term is module, which is the root of the tree. Here is an elementary example of a module that defines and exports a single function, which takes no arguments but returns the constant 9876:

(module
  (func (result i32)
    (i32.const 9876)
  )
  (export "simpleFunc" (func 0)) ;; 0 is the unnamed function's index
)

The function is defined without a name (i.e., as a lambda) and exported by referencing its index 0, which is the index of the first function defined in the module. The export name is given as a string; in this case, "simpleFunc."
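To connect the text format with the binary format, the sketch below spells out bytes that encode this same module and then instantiates them. The bytes were assembled by hand from the binary format rules rather than copied from a tool's output, so treat them as an illustration; a converter such as wat2wasm may also emit optional sections that are omitted here.

```typescript
const simpleModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,                   // magic number "\0asm"
  0x01, 0x00, 0x00, 0x00,                   // binary format version 1
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f, // type section: one type, () -> i32
  0x03, 0x02, 0x01, 0x00,                   // function section: function 0 uses type 0
  0x07, 0x0e, 0x01, 0x0a,                   // export section: one export, 10-byte name
  0x73, 0x69, 0x6d, 0x70, 0x6c,             // "simpl"
  0x65, 0x46, 0x75, 0x6e, 0x63,             // "eFunc"
  0x00, 0x00,                               // export kind: function, index 0
  0x0a, 0x08, 0x01, 0x06, 0x00,             // code section: one 6-byte body, no locals
  0x41, 0x94, 0xcd, 0x00,                   // i32.const 9876 (signed LEB128)
  0x0b                                      // end of function body
]);

const simpleModule = new WebAssembly.Module(simpleModuleBytes);
const simpleInstance = new WebAssembly.Instance(simpleModule, {});
const simpleFunc = simpleInstance.exports.simpleFunc as () => number;

console.log(simpleFunc()); // 9876
```

Note how the export entry carries the string name and the index 0, just as the sexpr does.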

Functions in text format have a standard pattern, which can be depicted as follows:

(func <signature> <local vars> <body>)

The signature specifies the arguments (if any) and the returned value (if any). For example, here's the signature for an unnamed function that takes two 32-bit integer arguments but returns a 64-bit integer value:

(func (param i32) (param i32) (result i64)...)

Names can be given to functions, arguments, and local variables. A name begins with a dollar sign:

(func $foo (param $a1 i32) (param $a2 f32) (local $n1 f64)...)

The body of a WebAssembly function reflects the underlying stack machine architecture of the language. The stack serves as a scratchpad. Consider this example of a function that doubles its integer argument and returns the value:

(func $doubleit (param $p i32) (result i32)
  local.get $p
  local.get $p
  i32.add)

Each of the local.get operations (spelled get_local in older versions of the text format), which can work on local variables and parameters alike, pushes the 32-bit integer argument onto the stack. The i32.add operation then pops the top two (and currently only) values from the stack to perform the addition. The sum from the add operation is then the one and only value on the stack and thereby becomes the value returned from the $doubleit function.
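The push and pop behavior can be mimicked with a toy interpreter. The sketch below handles only the two operations that $doubleit uses, with local.get being the current spelling of get_local; the names Instr, pop, and run are introduced here, and the interpreter bears no relation to how a browser actually executes WebAssembly.

```typescript
type Instr =
  | { op: "local.get"; index: number } // push a parameter or local variable
  | { op: "i32.add" };                 // pop two values, push their sum

// Pops a value, failing loudly if the stack is empty.
function pop(stack: number[]): number {
  const value = stack.pop();
  if (value === undefined) throw new Error("stack underflow");
  return value;
}

// Runs a function body against its parameters and locals.
function run(body: Instr[], locals: number[]): number {
  const stack: number[] = [];
  for (const instr of body) {
    if (instr.op === "local.get") {
      stack.push(locals[instr.index]);
    } else {
      const rhs = pop(stack);
      const lhs = pop(stack);
      stack.push((lhs + rhs) | 0); // | 0 keeps the sum in the 32-bit range
    }
  }
  return pop(stack); // the value left on the stack is the result
}

// The body of $doubleit, with parameter $p at index 0.
const doubleIt: Instr[] = [
  { op: "local.get", index: 0 },
  { op: "local.get", index: 0 },
  { op: "i32.add" }
];

console.log(run(doubleIt, [21])); // 42
```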

When the WebAssembly code is translated into machine code, the WebAssembly stack as scratchpad should be replaced, wherever possible, by general-purpose registers. This is the job for the JIT compiler, which translates WebAssembly virtual stack-machine code into real-machine code.

Web programmers are unlikely to write WebAssembly in text format, as compiling from some high-level language is far too attractive an option. Compiler writers, by contrast, might find it productive to work at this fine-grained level.

Wrapping up

Much has been made of WebAssembly's goal of achieving near-native speed. But as the JIT compilers for JS continue to improve, and as dialects well-suited for optimization (e.g., TypeScript) emerge and evolve, it may be that JS also achieves near-native speed. Would this imply that WebAssembly is wasted effort? I think not.

WebAssembly addresses another traditional goal in computing: meaningful code reuse. As even the short examples in this article illustrate, code in a suitable language, such as C or TypeScript, translates readily into a WebAssembly module, which plays well with JS code, the glue that connects a range of technologies used on the web. WebAssembly is thus an inviting way to reuse legacy code and to broaden the use of new code. For example, a high-performance program for image processing, written originally as a desktop application, might also be useful in a web application. WebAssembly then becomes an attractive path to reuse. (For new web modules that are compute-bound, WebAssembly is a sound choice.) My hunch is that WebAssembly will thrive as much for reuse as for performance.