Optimising a JavaScript library with WebAssembly, a failed attempt!

Rust is increasingly becoming the language of choice for building tools for the web. The latest example to date: Rome announced it will be using Rust.

Historically, Rust has also been one of the languages of choice to target WebAssembly, which has now shipped in all major browsers. One of the main benefits of using WebAssembly is that, in most cases, it is more performant than plain JavaScript code. Hence the idea to try and optimise the latest library I released (https://github.com/AntonioVdlC/range) by rewriting it in Rust!


But first things first. A very smart person once said that you can only improve what you can measure. So before going any further, let's look at how we can measure the performance of the @antoniovdlc/range library.

There are a few good options for running benchmarks in Node (for example, the aptly named benchmark library, or tiny-benchy, which is used by Parcel), but for the sake of this exercise, let's look into a lower-level API and directly use Node's perf_hooks.

#!/usr/bin/env node

const { performance, PerformanceObserver } = require("perf_hooks");

const range = require("../dist/index.cjs");

const testBenchmark = performance.timerify(function testBenchmark() {
  let sum = 0;
  let i = 0;

  // The size of the range is passed in via the SIZE environment variable
  const r = range(0, Number(process.env.SIZE));
  while (!r.next().done) {
    sum += i;
    i++;
  }

  return sum;
});

// The observer collects the "function" entries recorded by timerify and
// reports the average duration (in milliseconds) across all runs
const obs = new PerformanceObserver((list) => {
  const entries = list.getEntries();
  const avgDuration =
    entries.reduce((sum, cur) => sum + cur.duration, 0) / entries.length;

  console.log(`range(0, ${process.env.SIZE}): ${avgDuration}ms`);
  obs.disconnect();
});

obs.observe({ entryTypes: ["function"] });

for (let i = 0; i < 1000; i++) {
  testBenchmark();
}

The code above runs, 1,000 times, a function that loops over a range of a given size (passed in through the SIZE environment variable) and performs a simple sum operation on each iteration. The benchmark result is then the average duration of those 1,000 runs.

With this in hand, let's first look at the current implementation's performance:

range(0, 100): 0.007962769627571106ms
range(0, 1000): 0.015898147106170653ms
range(0, 10000): 0.08853049981594086ms
range(0, 100000): 0.8147728093862534ms
range(0, 1000000): 7.5012646638154985ms

Honestly, not too shabby! Can we do better with Rust and WebAssembly?


If you want to follow along, please make sure to install Rust and its toolchain.

To compile our Rust code to WebAssembly, we will be using wasm-pack. It can be installed either with Cargo, or directly via npm:

npm i -D wasm-pack

We can then add the following script to our package.json:

{
  ...
  "scripts": {
    ...
    "build:wasm": "wasm-pack build --target nodejs"
  }
}
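
wasm-pack also expects the Rust crate itself to be set up as a wasm-bindgen library. The post doesn't show the crate's manifest, but a minimal Cargo.toml sketch along these lines should do (the package name, edition, and versions here are assumptions):

[package]
name = "range"
version = "0.1.0"
edition = "2018"

[lib]
# wasm-pack needs a C-compatible dynamic library to produce the .wasm binary
crate-type = ["cdylib", "rlib"]

[dependencies]
# Glue between Rust and JavaScript
wasm-bindgen = "0.2"
# Bindings to JavaScript's built-in objects (used below for `Error`)
js-sys = "0.3"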

Now let's write some Rust code!

The first thing we will do is declare a struct called Range, which will look very similar to our implementation of ranges in JavaScript.

// This import is needed at the top of lib.rs
use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub struct Range {
    _start: i32,
    _stop: i32,
    _step: i32,
    _inclusive: bool,

    // Counter used for iteration, so that we can iterate multiple times over
    // the same range
    i: i32,
}

#[wasm_bindgen]
impl Range {
    #[wasm_bindgen(constructor)]
    pub fn new(start: i32, stop: i32, step: i32, inclusive: bool) -> Range {
        Range {
            _start: start,
            _stop: stop,
            _step: if step != 0 { step } else { 1 },
            _inclusive: inclusive,
            i: start,
        }
    }
}

To surface a similar API to what we first implemented in JavaScript, we also write the following range function:

// js_sys gives us access to JavaScript's built-in Error type
use js_sys::Error;

#[wasm_bindgen]
pub fn range(start: i32, stop: i32, step: i32, inclusive: bool) -> Result<Range, JsValue> {
    if start > stop {
        return Err(Error::new(
            (format!("Cannot create a range from {} to {}", start, stop)).as_str(),
        )
        .into());
    }

    return Ok(Range::new(start, stop, step, inclusive));
}
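
Once compiled, wasm-bindgen exposes range to JavaScript, with the Err variant surfacing as a thrown exception. Purely as an illustration (the require path assumes a crate named range built with --target nodejs), calling it from Node would look something like this:

const { range } = require("./pkg/range");

// Ok(...) comes back as a Range instance
const r = range(0, 10, 1, false);

// Err(...) is thrown on the JavaScript side:
// range(10, 0, 1, false); // Error: Cannot create a range from 10 to 0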

We could go on and implement the getters and other methods, but before investing too much in this exercise, let's focus on implementing the .next() method so that we can run our benchmarks on the compiled WebAssembly code.

#[wasm_bindgen]
pub struct JsIteratorResult {
    pub value: Option<i32>,
    pub done: bool,
}

#[wasm_bindgen]
impl Range {
    #[wasm_bindgen]
    pub fn next(&mut self) -> JsIteratorResult {
        if (self._inclusive && self.i <= self._stop) || self.i < self._stop {
            let value = self.i;
            self.i = self.i + self._step;

            return JsIteratorResult {
                value: Some(value),
                done: false,
            };
        }

        // Reset the counter so that the same range can be iterated over again
        self.i = self._start;

        return JsIteratorResult {
            value: None,
            done: true,
        };
    }
}

The above implementation is again extremely similar to the JavaScript code.
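
To benchmark the WebAssembly build, the only change needed on the JavaScript side is to point the benchmark script at the package generated by wasm-pack and pass the extra arguments the Rust range function expects. A sketch of the adapted test function, keeping the rest of the earlier script unchanged (the ../pkg path mirrors the ../dist path used earlier and is an assumption):

const { range } = require("../pkg/range");

const testBenchmark = performance.timerify(function testBenchmark() {
  let sum = 0;
  let i = 0;

  // The Rust function takes explicit step and inclusive arguments
  const r = range(0, Number(process.env.SIZE), 1, false);
  while (!r.next().done) {
    sum += i;
    i++;
  }

  return sum;
});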

After compiling the above Rust code into WebAssembly, let's look at the benchmark ...

range(0, 100): 0.018000024318695067ms
range(0, 1000): 0.09116293668746948ms
range(0, 10000): 2.4152168154716493ms
...

... and unfortunately, the numbers were more than disappointing.


It seems like the WebAssembly version of this specific library is orders of magnitude slower. This is probably mostly due to my inexperience with Rust and WebAssembly in general (one likely culprit is that every call to .next() now has to cross the JavaScript/WebAssembly boundary and allocate a new JsIteratorResult, which is much more expensive than returning a plain JavaScript object). There are definitely ways to dig deeper into what is causing such lackluster performance, but it is also OK to fail, stop, and look for the next challenge!

This was an interesting experiment, and even though the end result wasn't as expected, it was still a great learning opportunity!

Antonio Villagra De La Cruz

Multicultural software engineer passionate about building products that empower people.