Can multiple web workers boost the speed of your web application?
What I learned from creating a web application by experimenting with using multiple web workers (similar to a client-side actor system)
Introduction
In this article I'll tell you about some experiments I did while creating a web application called cloc-web; you can find it here.
The application is a single web page that implements a very simple version of CLOC (Count Lines Of Code). If you don't know what I'm talking about, I suggest you go see the original one here; I think it's a "legendary" library. The TL;DR is that it's a piece of software that receives a file or a directory as input and returns the number of lines of code as output (with some additional information, like comments being counted separately from code).
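To make the idea concrete, here is a minimal and deliberately naive sketch of what a CLOC-style counter does for a single file (the function name is mine; real tools handle per-language comment syntax, block comments, string literals, and much more):

```javascript
// Naive per-file counter: classifies each line as blank, comment, or code.
// Only "//" line comments are recognized, so this is a sketch, not real CLOC.
function countLines(source) {
  const counts = { blank: 0, comment: 0, code: 0 };
  for (const line of source.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") counts.blank++;
    else if (trimmed.startsWith("//")) counts.comment++;
    else counts.code++;
  }
  return counts;
}
```

Running this over every file in a project and summing the counts is, at its core, all a CLOC tool does.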
Disclaimer: the original library is AMAZING, and by no means is my application meant to be an attempt to do something better; mine was just an experiment. If you are interested in using it to CLOC one of your projects, know that my version has A LOT of missing features, so for anything serious either use the original one or one of these two cool newer implementations, one in Rust and one in Go.
Why
CLOC is a cool type of application for learning a new language, for example, so it's something I occasionally build for experimentation: it forces you to touch different aspects of a language, and it's fun to build.
By its nature, you will also face the potential issue of application slowness: if you want to count the lines of code of a big project, the time it takes can quickly ramp up, and in my opinion that's a perfect use case for web workers.
Imagine you press the button to CLOC a project. If it takes 10 seconds and the UI is frozen the entire time, you are probably not going to like that web application, are you? If, instead, during those 10 seconds there is a loader or something else that lets you know that something is going on, it becomes acceptable (at least you know it's doing something).
When I started the project, as far as I know, there was no web app implementing CLOC entirely client-side in the browser (every version was either a CLI application or a client that did the calculation server-side).
This is probably reinforced by the fact that, until recently, browsers had no nice API to interact with the file system.
With Chrome 86 (released in October 2020), a new set of APIs called the File System Access API was released, and they are amazing. I wanted to try them because I think they enable a whole new class of web applications that interact with the file system (e.g. nicer web editors).
Beware that these APIs are not widely supported yet, so not all devices will let you try my application.
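As a sketch of how these APIs look in practice (the helper name `collectFileHandles` is mine): `window.showDirectoryPicker()` gives you a `FileSystemDirectoryHandle`, which is async-iterable, so recursing over a project tree takes only a few lines. The recursion itself depends only on the handle interface, so any object with the same shape works too:

```javascript
// Recursively collect all file handles under a directory handle.
// In the browser the entry point is the File System Access API:
//   const dir = await window.showDirectoryPicker(); // Chrome 86+
// Each entry yielded by .values() has a `kind` of "file" or "directory".
async function collectFileHandles(dirHandle, out = []) {
  for await (const entry of dirHandle.values()) {
    if (entry.kind === "file") out.push(entry);
    else await collectFileHandles(entry, out);
  }
  return out;
}
```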
Note about benchmarks
All the data you will read below uses the Vuetify monorepo as the counted project. From my tests, it had almost 7,000 files and more than 850,000 lines of code at the time I developed the application; it's not a huge repo, but it isn't small either. I'm not counting some files and folders, e.g. I'm ignoring node_modules.
Values are averages over 10 runs on my Mac.
First, simple version with one web worker
The first version is not even an experiment; it's the most straightforward implementation you can come up with (still involving web workers). The app creates one web worker, and when the user selects the folder of the project to count, the DirectoryHandle is sent from the main thread to the web worker. This worker then recursively reads the number of lines of all the files in the specified directory.
This solution is very simple and works fine, but it's not fast: I get the result in 8–9 seconds, and I don't like that we're not leveraging the multi-core architecture of modern devices.
The good part is that, by running everything inside a web worker, we keep the user interface interactive and snappy.
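A sketch of this first version (file names are hypothetical, and the browser-only wiring is shown as comments). The main thread only picks the directory and hands it to the worker; the worker does all the walking and counting:

```javascript
// main.js (UI thread), browser-only, shown as comments:
//   const worker = new Worker(new URL("./cloc-worker.js", import.meta.url));
//   const dir = await window.showDirectoryPicker();
//   worker.postMessage(dir);                 // DirectoryHandle is cloneable
//   worker.onmessage = (e) => render(e.data);

// cloc-worker.js: recursive count, runnable with any handle-shaped object.
async function clocDirectory(dirHandle) {
  let files = 0;
  let lines = 0;
  for await (const entry of dirHandle.values()) {
    if (entry.kind === "directory") {
      const sub = await clocDirectory(entry); // recurse into subfolders
      files += sub.files;
      lines += sub.lines;
    } else {
      const text = await (await entry.getFile()).text();
      files += 1;
      lines += text.split("\n").length;
    }
  }
  return { files, lines };
}
// In the worker:
//   self.onmessage = async (e) => self.postMessage(await clocDirectory(e.data));
```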
Second version, one web worker per file
For the second version I wanted to try something almost 'extreme'. Inspired by the actor model, I wanted to create one worker per file. If this sounds crazy, you are right, but I still wanted to try because I had seen backend applications in the past that used massive numbers of actors.
Let me say first that this is only possible because the DirectoryHandle (and the FileHandle too) is serializable. I was very surprised that these references can be serialized (kudos to the people who created this set of APIs).
In this solution I have one "main" web worker: it cycles through all the directories and files, and for each file it creates a new worker and sends it the FileHandle. The new worker just counts the number of lines and returns the result to the main worker. When the main worker has received all the results from the spawned workers, it puts everything together and sends the final result back to the main thread to be shown to the user.
This version was a complete failure (as expected). It crashed while counting the Vuetify monorepo on my Mac; I'm pretty sure I was spawning too many workers (one per file, so around 7,000 in theory).
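The fan-out looks roughly like this (a sketch, not the app's actual code; worker creation is injected as `countInWorker` so the coordination logic runs outside a browser). In the real app the injected factory would be something like a `Worker` per file:

```javascript
// One worker per file. In the browser, `countInWorker` would be e.g.:
//   (fileHandle) => new Promise((resolve) => {
//     const w = new Worker(new URL("./count-worker.js", import.meta.url));
//     w.onmessage = (e) => { w.terminate(); resolve(e.data); };
//     w.postMessage(fileHandle);             // FileHandle is cloneable too
//   });
async function clocOneWorkerPerFile(fileHandles, countInWorker) {
  // One in-flight worker per file: with ~7,000 files, this fan-out is
  // exactly what made this version crash.
  const results = await Promise.all(fileHandles.map((h) => countInWorker(h)));
  return results.reduce((total, r) => total + r.lines, 0);
}
```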
Third version, pool of workers
Clearly I cannot spawn one worker per file (and it probably doesn't make any sense anyway), but I still want to do more work in parallel and utilise the multi-core architecture that new computers have.
So I decided to create a pool of workers (I tried different sizes: 4, 8, 16, to see if I got different results) that the main worker can use. When the application starts, it creates the main worker, which in turn creates the pool of workers. Similar to the previous version, when the main worker receives the user input it cycles through the directories and files and sends each FileHandle to another worker, this time picked from the pool.
I wanted to make sure to pick a free worker, not one that is already counting something. To achieve this I tried two solutions:
Waiting free workers with promises
This is the first approach that came to my mind; because of the nature of promises, it looked perfect.
When the main worker sends a FileHandle to a worker, it associates a Promise with it and resolves it only when it gets the response back; while a worker has a pending promise, the main worker cannot send it another FileHandle.
I'm not sure if I got something wrong, but this solution gave me very bad results:
- with 8 workers it took between 25–30 seconds
- with 16 workers it took between 38–42 seconds
Increasing the number of workers actually degraded performance. I could not figure out what caused this; it may have been a dumb error on my part.
I then decided to try a different approach:
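A simplified sketch of the promise-based idea (my own reconstruction, not the app's code; `runJob` stands in for the postMessage/onmessage round trip). Each pool slot chains its next job onto the promise of the previous one, so a slot never holds more than one pending file:

```javascript
// Promise-based pool: each slot tracks the promise of its current job,
// and a new job for that slot starts only once the previous one resolves.
function createPromisePool(size, runJob) {
  const slots = Array.from({ length: size }, () => Promise.resolve());
  let next = 0;
  return {
    submit(fileHandle) {
      const slot = next++ % size;
      // Chain onto whatever this slot is currently doing.
      slots[slot] = slots[slot].then(() => runJob(fileHandle));
      return slots[slot];
    },
    // Resolves once every slot has drained its chain of jobs.
    drain: () => Promise.all(slots),
  };
}
```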
Waiting free workers with polling
I tried to replace the "waiting for promises" with a "polling mechanism" to find free workers. I wasn't expecting this to actually improve anything, so I was quite surprised when I saw the measurements:
- with 8 workers it took around 3.9s
- with 16 workers around 3.0s
- with 24 workers around 3.5s
These are nice results: it's taking less than half the time of the solution with just one worker. I was pretty happy about this.
I'm not sure whether the previous, Promise-based approach was slower due to some overhead in the Promise API (in the end we were creating a lot of promises); it's just a hypothesis.
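The polling version can be sketched like this (again my reconstruction, with `runJob` injected in place of the real postMessage/onmessage round trip). Each worker just carries a `busy` flag, and the dispatcher loops until some flag is clear:

```javascript
// Polling pool: workers expose a `busy` flag; the dispatcher polls until
// a slot frees up, then hands it the file without awaiting the result.
function createPollingPool(size, runJob) {
  const workers = Array.from({ length: size }, () => ({ busy: false }));
  async function submit(fileHandle) {
    let worker;
    // Poll until some worker is free (a short timer loop, not a promise
    // chained per job).
    while (!(worker = workers.find((w) => !w.busy))) {
      await new Promise((r) => setTimeout(r, 1));
    }
    worker.busy = true;
    runJob(fileHandle).then((result) => {
      worker.busy = false; // the slot is free for the next file
      // ...aggregate `result` into the running totals here...
    });
  }
  return { submit, workers };
}
```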
Fourth version, count number of files and wait for that amount of worker responses
The fourth (and last) version works a little differently from the previous one. This time I'm not trying to find a free worker: I'm sending FileHandles to the workers continuously (even if they are still counting), then waiting for all the results.
So the main web worker iterates through the file system and, for each file found, increments a number that represents the number of responses to wait for, then sends the FileHandle to a worker (picked in round-robin fashion).
When a worker finishes counting a file, it sends the data back to the main worker, which checks whether every response has arrived; if so, it returns the aggregated results to the main thread to be shown to the user.
Results:
- with 8 workers: 4.5s
- with 16 workers: 4.2s
- with 24 workers: 4.2s
Honestly, I was expecting this solution to be faster than the polling one (because it doesn't have any "waiting" mechanism), but after some reasoning I may have found a possible explanation.
In this solution we send the first file to the first worker, the second to the second, and (if we have 8 workers) the ninth to the first worker again (round robin), so there is a chance that one of the workers gets unlucky and receives all the heaviest files of the project, which would create a small bottleneck on its side. Again, this is just a hypothesis.
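The round-robin fan-out with a pending counter can be sketched as follows (my reconstruction; `runJob(workerIndex, handle)` stands in for posting to the pool worker at that index):

```javascript
// Round-robin fan-out: every file is pushed to a worker immediately,
// busy or not; the main worker only counts outstanding responses.
function clocRoundRobin(fileHandles, poolSize, runJob) {
  return new Promise((resolve) => {
    let pending = fileHandles.length; // responses still to wait for
    let totalLines = 0;
    fileHandles.forEach((handle, i) => {
      // Worker i % poolSize gets this file regardless of its load,
      // which is how one unlucky worker can end up with the heavy files.
      runJob(i % poolSize, handle).then((result) => {
        totalLines += result.lines;
        if (--pending === 0) resolve(totalLines); // all responses received
      });
    });
  });
}
```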
Conclusion
It was pretty fun to build cloc-web, and you can play with it here: https://cloc-web.netlify.app/ (it works 100% client-side; no data is sent anywhere, so don't worry about secrets etc. The only HTTP request you'll see in the network tab is one I use for analytics and error tracking).
The deployed version uses the third solution (with polling); if you open the "settings", you can also play with the size of the worker pool and set some files to ignore.
From the results of the different approaches, I can see that client-side we can gain performance by using multiple web workers to leverage multi-core devices.
The improvements are not incredible, and the higher complexity of the codebase is (probably) not worth it in 99% of use cases, but there may be some particular web apps that can benefit from this.
I still recommend using at least one web worker to run heavy processes client-side, considering that nowadays it's straightforward to integrate workers into web applications using modern bundlers (or even easier if you use frameworks like Next, Nuxt, and similar, e.g. https://nextjs.org/docs/basic-features/script#offloading-scripts-to-a-web-worker-experimental).
Resources
- App link https://cloc-web.netlify.app/
- Repository https://github.com/albertodeago/cloc-web
- Web Worker API https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API
- FileSystemAccessAPI https://developer.mozilla.org/en-US/docs/Web/API/File_System_Access_API