sn-bindgen-web - Typelevel stack on Scala Native 0.5

Goals
Architecture
Stack
Slow and steady
Valley of despair
The new hope
Conclusion

TLDR:

Web frontend for sn-bindgen using Scala Native 0.5, http4s, skunk, and smithy4s
Try it out https://sn-bindgen-web.indoorvivants.com/
Github repo is a treasure trove of useful information

In December 2021 I started work on sn-bindgen – a CLI and SBT plugin to generate idiomatic Scala 3 Native bindings to C libraries. This project enabled me to develop various applications, from a pure Scala Native Twitter clone, through a small game with Raylib and a macOS desktop application with Scala and Swift, to a Postgres library using the official C protocol implementation.

The project is not finished or perfect by any means, but it has enabled me to explore the world at the intersection of Scala and C/C++ ecosystems. One thing that always bothered me is the fact that depending on Clang limits the ability to "just run it" on various platforms – making it harder to demonstrate the value of proper bindings when helping someone on Discord. And getting rid of Clang is pretty much impossible at this point, even if I was a trillion dollar corporation, and not a tired solo dev.

So I decided to build a web application that can help showcase some aspects of the generated code without having to install anything.

This work started in 2023, at a time when lots of libraries were already published for Scala Native 0.4, in particular http4s and smithy4s, my trusted companions from previous jobs. The project is creatively named sn-bindgen-web, and what follows are some highlights from my time working on it.

demo

Goals

The user should be able to submit C header code and a package name and receive back generated (and hopefully valid...) Scala Native code, along with optional glue C code. Generated bindings should be accessible via a permalink indefinitely (or nearly indefinitely).

In terms of non-functional requirements, bindings should be processed asynchronously, separating the traffic pressure on the service from the CPU-intensive process of compiling C code, using a Postgres database as a buffer.

The app and all of its native dependencies should be deployable as a single Docker container, bootstrappable from a single docker build . -t sn-bindgen-web command.

From a development perspective, there should be a way to run the app locally without much effort, and have a live-reloading setup for both frontend and all parts of the backend.

Architecture

When the project started, Scala Native was single-threaded (if you haven't heard, it's multi-threaded now). This informed some of the architectural decisions – I needed to make sure that all parts of the application could scale horizontally. This led me to separate the application into two main parts – web and worker. If you've ever done Rails, you remember just how much stuff those applications would stuff into background jobs handled by something like Sidekiq.

The web group will contain multiple processes responsible for handling all the HTTP requests coming from the user. This process group has no direct connection to the database – it can only interrogate the worker group and submit bindings.
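Since the web group can only ask the worker group about progress, retrieving a result boils down to polling. The sketch below is an illustration in plain Scala 3, not code from the project: `Poll` and `pollUntilReady` are hypothetical names, and a real http4s service would use a non-blocking sleep from Cats Effect rather than `Thread.sleep`.

```scala
import scala.annotation.tailrec

// Outcome of a single status check. Hypothetical type, for illustration only.
enum Poll[+A]:
  case NotReady
  case Ready(value: A)

// Call `check` up to `maxAttempts` times, sleeping `delayMs` between attempts.
// Returns None if the result never became ready within the attempt budget.
@tailrec
def pollUntilReady[A](check: () => Poll[A], maxAttempts: Int, delayMs: Long): Option[A] =
  if maxAttempts <= 0 then None
  else
    check() match
      case Poll.Ready(value) => Some(value)
      case Poll.NotReady =>
        // No point sleeping after the final attempt.
        if maxAttempts > 1 then Thread.sleep(delayMs)
        pollUntilReady(check, maxAttempts - 1, delayMs)
```

Capping the number of attempts keeps a stuck binding from holding a request open forever; the caller can report "still processing" and let the user come back via the permalink.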

The worker group is horizontally scalable, and exposes an HTTP API to submit binding information and retrieve things like status and results. Workers themselves can steal work that hasn't been updated in a while – ensuring that even if the processes die, eventually all work should be processed, unless there is a catastrophic flaw in the worker itself.
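The "steal work that hasn't been updated in a while" rule can be sketched as a small pure function. This is an assumption-laden model, not the project's implementation: `Job`, `JobState`, `claimable`, `claimNext`, and the `staleAfter` threshold are all hypothetical names, and the job table is modelled as an in-memory list instead of Postgres rows.

```scala
import java.time.{Duration, Instant}

// Hypothetical model of a row in the jobs table.
enum JobState:
  case Pending, Processing, Done

final case class Job(id: String, state: JobState, updatedAt: Instant)

// A job can be claimed if nobody picked it up yet, or if a worker picked it
// up but has not touched it for longer than `staleAfter` (it probably died).
def claimable(job: Job, now: Instant, staleAfter: Duration): Boolean =
  job.state match
    case JobState.Pending    => true
    case JobState.Processing => Duration.between(job.updatedAt, now).compareTo(staleAfter) > 0
    case JobState.Done       => false

// Pick the least recently updated claimable job and mark it as being
// processed now. Returns the claimed job together with the updated job list.
// Against a real database this has to be one atomic operation, so that two
// workers cannot claim the same row.
def claimNext(jobs: List[Job], now: Instant, staleAfter: Duration): Option[(Job, List[Job])] =
  jobs
    .filter(claimable(_, now, staleAfter))
    .minByOption(_.updatedAt.toEpochMilli)
    .map { picked =>
      val claimed = picked.copy(state = JobState.Processing, updatedAt = now)
      (claimed, jobs.map(j => if j.id == picked.id then claimed else j))
    }
```

Because a healthy worker keeps refreshing `updatedAt` while it works, only jobs abandoned by a dead process ever cross the staleness threshold.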

All communications over HTTP are handled via Smithy4s-generated protocols, ensuring the APIs are strongly typed and synchronised across the entire project.
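To show why a shared, typed contract helps, here is a hand-written stand-in for the kind of interface that both sides program against. It is not Smithy4s output and not the project's real API: every name below (`BindingSpec`, `BindingStatus`, `BindingService`, and so on) is hypothetical, and the in-memory implementation exists only to make the sketch runnable.

```scala
// Hypothetical data types shared by the web and worker sides.
final case class BindingId(value: String)
final case class BindingSpec(headerCode: String, packageName: String)
final case class GeneratedCode(scalaCode: String, glueCode: Option[String])

enum BindingStatus:
  case Queued
  case Processing
  case Completed(code: GeneratedCode)
  case Failed(message: String)

// Server and client are both written against this one interface, so a change
// to the contract breaks compilation on both sides instead of failing at runtime.
trait BindingService[F[_]]:
  def submit(spec: BindingSpec): F[BindingId]
  def status(id: BindingId): F[BindingStatus]

// Identity effect, so the sketch runs without any effect library.
type Id[A] = A

// Toy in-memory implementation, standing in for the real worker.
final class InMemoryBindingService extends BindingService[Id]:
  private val store = scala.collection.mutable.Map.empty[BindingId, BindingStatus]

  def submit(spec: BindingSpec): BindingId =
    val id = BindingId(s"binding-${store.size + 1}")
    store.update(id, BindingStatus.Queued)
    id

  def status(id: BindingId): BindingStatus =
    store.getOrElse(id, BindingStatus.Failed("unknown binding id"))
```

With code generation, the interface and data types come from a single specification rather than being written by hand, which is what keeps the processes in sync.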

At the outset, using NGINX Unit and its Scala Native wrapper SNUnit was almost a requirement in order to orchestrate this multi-process architecture and put every process group behind an HTTP load balancer.

With Scala Native being multi-threaded, there's less of a need to have those horizontally scalable layers – but given their relatively low resource requirements, running multiple stateless replicas helps distribute the load and isolate possible catastrophic issues.

Stack

Even though I saw the entire Typelevel stack (and ecosystem libraries, such as Skunk and Smithy4s) published for (then single-threaded) Scala Native 0.4, I didn't have any obvious ideas on how to utilise this, for a blog post or otherwise. Planning the work on sn-bindgen-web forced my hand (thankfully):

to invoke the bindgen programmatically, the entire app has to be in Scala Native
to make the service less prone to abuse, bindings should be generated asynchronously – this involves queues, persistent state, polling, and many other things I would rather not write from scratch
multi-process architecture for asynchronous processing would require communication boundaries – and with polling, statuses, delivery of errors, etc. I'd rather not write all this protocol code by hand

So I decided to bite the bullet and implement the entire app with as much Scala Native as I could possibly fit in it.

I have written in great detail about the libraries I've used in the README; it's well worth a look. For example, I used smithy4s-fetch in the frontend to reduce the frontend bundle size by 40%. Neat, huh?

Slow and steady

The development process initially was similar to developing any other application – on the code level, you're just writing some services using Http4s and Smithy4s. Lots of companies already do that.

Idiosyncrasies start to creep in as you start working with native dependencies (which I try to alleviate by using my vcpkg wrapper) and with NGINX Unit and its process model.

On top of that, Scala Native 0.4 was quite slow when trying to build http4s applications, mostly due to the optimizer having no restraint and going deep into the weeds of http4s classes and functions (of which there are a lot).

But nevertheless, through writing custom SBT tasks, bash scripts, and Dockerfiles, I slowly progressed to having the app working properly locally. I was extremely pleased with myself and grateful for the work people put into the Scala Native and Cats Effect ecosystems to make it all work relatively painlessly.

Valley of despair

The joy turned sour in my mouth when things came to deploying the app. I struggled tremendously with native dependencies, as even though they were available in vcpkg, some of them required various arcane tools to build – from Python 2 to Ruby, via autoconf and autoreconf. Make, automake, cmake, ninja, libtool, and often many others. Before you go to sleep tonight, hug your JVM libraries and their delightfully easy going Maven dependencies tightly – you never know what horrors will tomorrow bring, you might have to work with this fucking mess of building C and C++ libraries.

After wading through the poisonous bog of trying to build the four C libraries I needed for my app, I entered a new rung of hell, unable to make sqlite (yes, I started with sqlite! some dudes on the internet were very convincing about it) reliably work with Fly.io's volumes and my app, constantly seeing weird artifacts like data corruption. When I gave up and switched to Postgres, I had trouble connecting to the provided Postgres URL using Skunk, and opted for Supabase instead, which seemed to work better. Note that I'm not laying any blame on Fly.io – I know things work there, I deployed many a thing there. Just in this particular case there was a multitude of factors that compounded the frustration.

All of this was accompanied by slow deployment processes, due to the self-imposed need to have the application build cleanly from a single Dockerfile.

And after going through all this, I had the app deadlock reliably on startup on the server, something I couldn't reproduce on any device I had.

At that point I was going to give up.

The new hope

A couple of years have passed, and a sense of déjà vu came over me – a new version of Scala Native, and finally, a new release of Cats Effect for Native, with full support for multi-threading!

Excitedly, I hurried to see which of the libraries I would need to publish locally to build my app against the micron-thick bleeding edge of the bleeding edge. A fair few, it turns out. Getting that script and all associated PRs to work took me a couple of days, scattered over early mornings and late evenings.

After giving the build a few long overdue touchups, but without actually modifying any of the logic, I finally had the full app running on Native 0.5! Getting there was frankly much easier than I anticipated, and build times were dramatically quicker. Proving my excellent ability to prioritise what matters, I dove deep into CSS and frontend rework which took me days.

But it was all worth it as the app no longer deadlocked on the server! Finally I could show the results of my work. I was ecstatic. I don't know if it's "just" the multi-threading or some other bug that was fixed in the years between my attempts, but that didn't matter – it just worked!

I then spent a whole Saturday deploying the app on a Hetzner VPS instead of Fly.io (because of the extortionate pricing of the latter) using Besom (more on that in a later post), and finally I had the app deployed and working the way I wanted.

https://sn-bindgen-web.indoorvivants.com/

Look at her go.

Conclusion

I've made a lot of things harder for myself in order to accelerate the learning process – the hammer of knowledge swung ferociously, bashing the nuggets of knowledge into my skull while the problem at hand appeared to remain unsolved.

But I have no regrets – the final result is far better than what it was in the beginning, and I walk away a tad more knowledgeable about things long forgotten.

With careful framing, Scala Native 0.5 is a perfectly workable stack for all sorts of things – despite requiring bold and ambitious operators to truly realise its potential. That said, who doesn't want to be bold and ambitious?

Don't wait for someone to tell you when something is ready for production – you are the production, you make the rules.