Open Source Scala I: versions, platforms, artifacts
- Who is this for?
- Users of Scala libraries
- Scala platforms
- What are Scala artifacts?
- How are Scala dependencies resolved?
- Where are the artifacts published?
A frequent (and not entirely incorrect) take on developing and maintaining Scala libraries is that it is unnecessarily complicated.
Rather than frothing against that notion, I would like to explore how the complexity of maintaining a Scala library can grow dramatically once we extend our reach further out to:
- platforms other than JVM
- major, minor, and experimental Scala versions
- code quality tools (in a later installment)
- major releases of the dependencies we build upon
- documentation tools (in a later installment)
Each direction brings its own complexity. As any software engineer knows, solving several relatively simple problems does not prepare you for solving a combination of them.
More than anything, this is a brain dump of a lot of things I found necessary to understand the ecosystem of modern Scala libraries. A lot of the things will be already known to the reader, some - less so. All of them were useful when I started contributing to existing open source projects, which boosted my confidence and skillset.
The tools and techniques I've learned studying the Scala open source ecosystem directly led to improvements in development processes at work.
Where and how will my library be used?
For Scala, the answer can get pretty complex. In its simplest form, when you want to make your libraries only available on the JVM for a particular binary Scala version, your life will be very, very easy.
TL;DR: publishing your library only for the JVM and Scala 2.13 will already reach most of the Scala users who need such a library.
That said, in many cases the overhead of publishing for other Scala versions and platforms can be kept quite low, and can ensure an even wider audience.
Currently, the most used Scala versions are 2.12 and 2.13.
Spark users have been locked to Scala 2.11 for a long time, but more recent versions have started using 2.12 exclusively, and some work is ongoing to provide support for 2.13.
In the 2.x lineage, minor releases are binary incompatible with each other. This means that if the library is only built using 2.13.*, projects that are only built with 2.12.* will not be able to use it. On the other hand, patch versions are fully binary compatible with each other. That means that our library can keep being built with one pair of 2.12 and 2.13 patch versions while the users run different ones (2.13.6, for example) and consume our library's artifacts without any issues.
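To make the rule concrete, here is a minimal sketch of how a Scala 2 binary version can be derived from a full version string. The object and method names are mine, purely for illustration, and the logic only covers the Scala 2.x scheme described above.

```scala
object ScalaVersions {
  // For Scala 2.x the binary version is "major.minor":
  // patch releases are interchangeable, minor releases are not.
  def binaryVersion(fullVersion: String): String =
    fullVersion.split('.').take(2).mkString(".")

  // Two Scala 2 versions can share artifacts only if their binary versions match.
  def binaryCompatible(a: String, b: String): Boolean =
    binaryVersion(a) == binaryVersion(b)

  def main(args: Array[String]): Unit = {
    println(binaryCompatible("2.13.4", "2.13.6"))  // only the patch version differs
    println(binaryCompatible("2.12.15", "2.13.6")) // the minor version differs
  }
}
```

The first call returns true and the second returns false, which is exactly why a 2.12-only project cannot pick up a 2.13-only artifact.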
While the usage of Scala 2.12 is shrinking, there's still a huge amount of actively developed code in the wild that is locked to 2.12 for a variety of reasons (missing dependencies, incompatibilities, reliance on 2.12-only language features, the list goes on).
Because of that, most libraries are cross-published for Scala 2.12 and 2.13 at least. Some library maintainers go as far as removing 2.12 from their builds to ease the maintenance burden, but I personally feel that's very optimistic. That said, it may well be the lever that convinces more engineering managers to invest in upgrading to Scala 2.13.
As long as the library we're building is published for any patch version of 2.13, we can cover a very large group of potential users.
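In SBT, cross-publishing for both lineages can be sketched roughly like this. The patch versions are placeholders I picked for illustration, not recommendations.

```scala
// build.sbt (sketch; version numbers are placeholders)
ThisBuild / scalaVersion       := "2.13.6"
ThisBuild / crossScalaVersions := Seq("2.12.15", "2.13.6")
```

With that in place, prefixing a task with `+` (for example `sbt +test` or `sbt +publishLocal`) runs it once per entry in `crossScalaVersions`.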
Scala 3 (previously called Dotty) is a new Scala compiler, written completely from scratch.
A lot of effort has been put into making sure that the vast majority of Scala 2 libraries can be built with Scala 3. A subset of Scala 3's syntax, type system, and features is compatible with Scala 2.13, which helped the transition.
The various communities within the Scala ecosystem made significant efforts to provide Scala 3 artifacts of their libraries:
At the time of writing, most of the functional ecosystem (be it Cats/Cats Effect, Monix, or the many libraries built on top of them) is fully available for Scala 3 and has been for some time. Trailblazers already report Scala 3 services running in production, which is very impressive and terrifying.
Play and Akka ecosystems are in the process of making their artifacts available for Scala 3. The process is usually stifled by the usage of Scala 2 macros, which are not supported and will be very hard to port to Scala 3's metaprogramming system.
As of version 2.6.18, Akka publishes artifacts for Scala 3
Most of the
os-lib) is also published for Scala 3. Ammonite added initial Scala 3 support in 2.4.0, while Fastparse is currently Scala 2 only as it relies heavily on Scala 2 macros
All of this is a long way of saying that if you want to future-proof your library today, it's better to make sure it's being published for Scala 3. We will come back to this subject in part II.
The good news is that once Scala 3 pushes out most Scala 2 versions (years from now), the binary compatibility story will be much better - similar to the binary compatibility that Scala.js has been offering:
...a library that was compiled against the Scala 3.2 standard library can be safely used with Scala 3.4. There is no need for library maintainers to re-publish when a new Scala 3.x minor release becomes available
As with many things, compatibility is harder than it looks at first sight, so the Scala 3 team is working on improving forward compatibility after the experience of Scala 3.1 propagating through the ecosystem and forcing library maintainers and downstream users to upgrade.
The JVM is the most popular platform, and the one where the experience is the most polished. That doesn't mean it's perfect, but most library maintainers tend to put the most effort into this platform, as this is where the vast majority of Scala users are.
Main pain points here are related mostly to the following:
Different versions of the Java platform (the good old days of assuming that nobody runs anything above JDK 1.8 are over)
This mostly relates to incompatible versions of the bytecode produced when the library artifact was built, or the usage of APIs that are removed/deprecated/added in the version of the JDK that the users of the library will be using.
The bytecode compatibility story is simpler, as by default the Scala compiler will produce bytecode version 8, which can be read by all the versions of the Java runtime above it - as long as you're not messing with the -target flag of the compiler.
The Scala 2 compiler maintains compatibility with Java 8 (while also adding support for newer JDKs) and possibly will retain it forever.
In terms of features, one must be careful in two situations:
If you have Java sources in the project, make sure they're compiled targeting JRE 8 by using the --release 8 flag. From SBT you can pass flags to the Java compiler using the javacOptions setting (we'll touch on that in a later post).
If you interact with some features of the JDK that were added in later versions, your users might not be able to run the binaries you produce. A useful site for comparing different JDK releases is Java Version Almanac.
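As a rough sketch of the first situation, this is how the flag mentioned above could be passed to the Java compiler from SBT. Scoping the setting to `Compile` is my own choice here; adjust it to your build.

```scala
// build.sbt (sketch)
// Ask javac to compile Java sources against the Java 8 API and bytecode level.
// Note: the --release flag is only understood by JDK 9 and newer.
Compile / javacOptions ++= Seq("--release", "8")
```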
JVM's execution model with its super-late linking.
What it means in practice is that prior to running your application, the code you wrote and the code from external libraries is represented as a loose, flat collection of .class files, which reference methods, classes, and values from each other by name.
This means that if you (or the build tool, or the external library author) get this list of .class files wrong (incompatibilities in defined methods/classes/parameter lists, etc.), you will get a nasty, non-recoverable exception only at runtime.
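A loose analogy, using reflection rather than real linkage: the sketch below looks classes up by name at runtime, which is roughly what the JVM does behind the scenes when one class references another. The class name `cats.DoesNotExist` is made up, and real linkage failures surface as errors such as NoSuchMethodError rather than the exception shown here.

```scala
import scala.util.{Failure, Success, Try}

object LateLinkingDemo {
  // Resolve a class purely by its name, the way references between
  // .class files are resolved: nothing is checked until this call runs.
  def lookup(className: String): Either[String, Class[_]] =
    Try(Class.forName(className)) match {
      case Success(cls) => Right(cls)
      case Failure(err) => Left(err.getClass.getSimpleName)
    }

  def main(args: Array[String]): Unit = {
    println(lookup("java.lang.String"))  // present on every JVM
    println(lookup("cats.DoesNotExist")) // hypothetical name, fails only at runtime
  }
}
```

The code compiles happily with either name; the missing class is only discovered when the lookup actually executes.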
Potential difficulties that publishing for Scala.js brings are:
Scala.js compiler's own versioning system.
As the compiler is evolving, it might need to make some breaking changes in the APIs, meaning the libraries built for one version of Scala.js might not be usable on another.
For example, here's the note about what a minor release means in Scala.js versioning system (i.e. a bump from 1.6.0 to 1.7.0):
It is backward binary compatible with all earlier versions in the 1.x series: libraries compiled with 1.0.x through 1.6.x can be used with 1.7.0 without change.
It is not forward binary compatible with 1.6.x: libraries compiled with 1.7.0 cannot be used with 1.6.x or earlier.
It is not entirely backward source compatible: it is not guaranteed that a codebase will compile as is when upgrading from 1.6.x (in particular in the presence of -Xfatal-warnings).
In practice this is usually quite simple - most projects bump Scala.js to the latest release without a second thought.
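The quoted rules can be condensed into a small check. This is only a sketch of my reading of them: it assumes versions of the form major.minor.patch, ignores patch-level differences, and treats different major series as incompatible. The names are hypothetical.

```scala
object ScalaJsCompat {
  final case class Version(major: Int, minor: Int, patch: Int)

  def parse(version: String): Version =
    version.split('.').map(_.toInt) match {
      case Array(major, minor, patch) => Version(major, minor, patch)
      case _ =>
        throw new IllegalArgumentException(s"Expected major.minor.patch, got: $version")
    }

  // Backward compatible: a library built with an older minor release works
  // in a project on a newer one. Not forward compatible: the reverse fails.
  def canUse(libraryBuiltWith: String, projectUses: String): Boolean = {
    val lib     = parse(libraryBuiltWith)
    val project = parse(projectUses)
    lib.major == project.major && lib.minor <= project.minor
  }

  def main(args: Array[String]): Unit = {
    println(canUse("1.6.0", "1.7.0")) // library older than the project
    println(canUse("1.7.0", "1.6.0")) // library newer than the project
  }
}
```

The first call returns true and the second returns false, mirroring the 1.6.x and 1.7.0 example from the release notes.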
Scala Native is an old experimental project that has recently been taken under the Scala Center's wing and has received numerous improvements, with added support for Scala 2.13 being the most welcome.
Experimental support for Scala 3 has been released in version
As Scala Native is more active than ever before, a lot of maintainers add Native support to their libraries.
Another way a library's build can become more complex is if we want to target two different incompatible versions of some major library. In that case, we need to produce two distinct artifacts (different in version and/or name itself) for users of different versions of the dependency.
Examples of that can be different versions of
AWS SDK (v1 and v2 are completely incompatible), or
Cats Effect (versions 2.x and 3.x are both source and binary incompatible, and both are used in the wild extensively), or
Http4s which has incompatible lineages for Cats Effect 2 and Cats Effect 3
Many other libraries that are following different bincompat guarantees (like Play ecosystem, which allows breaking changes in minor versions, e.g. 2.7.x to 2.8.x)
In some cases it might be impossible or unjustifiably difficult to create and maintain a codebase that caters to and publishes for different versions of the same library.
As a personal note, if your dependencies maintain two binary lineages, then you can either do the same, or choose one and force the users to upgrade. With the reality of open source maintenance often being a burden, choose what is right for your mental health and the amount of time you have to dedicate to OSS.
No one is getting any younger or healthier.
The general process is always the same - someone wrote the code, that code was compiled, and the resulting artifacts are packaged in some way and uploaded somewhere where the user's build tool can find and download those dependencies:
- Discovery of source files depending on your module structure
- Interacting with the necessary compiler (Scala or Java) to produce .class files
- (optionally) Injecting Scala.js or Scala Native compiler plugins into the compilation pipeline to produce necessary intermediary files
- Packaging the results into .jar artifacts with necessary metadata
The .jar format is used in an overwhelming majority of the scenarios, and it houses both regular .class files understood by the JVM, and the .sjsir and .nir intermediate files for Scala.js and Scala Native respectively.
For example, here's the location of Cats' jar file on the Maven Central repository:
❯ curl -s -Lo cats.zip https://repo1.maven.org/maven2/org/typelevel/cats-core_2.13/2.6.1/cats-core_2.13-2.6.1.jar
❯ unzip -l cats.zip | grep class | head
     6357  2010-01-01 00:00   cats/Align$$anon$1.class
    29657  2010-01-01 00:00   cats/Align$$anon$2.class
     4737  2010-01-01 00:00   cats/Align$.class
      345  2010-01-01 00:00   cats/Align$AllOps.class
     4111  2010-01-01 00:00   cats/Align$Ops.class
     3373  2010-01-01 00:00   cats/Align$ToAlignOps$$anon$4.class
     1147  2010-01-01 00:00   cats/Align$ToAlignOps.class
     1279  2010-01-01 00:00   cats/Align$nonInheritedOps$.class
     3324  2010-01-01 00:00   cats/Align$ops$$anon$3.class
      959  2010-01-01 00:00   cats/Align$ops$.class
We can do the same trick if we use the location of the Scala Native version of this artifact:
❯ curl -s -Lo cats.zip https://repo1.maven.org/maven2/org/typelevel/cats-core_native0.4_2.13/2.6.1/cats-core_native0.4_2.13-2.6.1.jar
❯ unzip -l cats.zip | grep Align | head
     2641  2010-01-01 00:00   cats/Align$$Lambda$1.nir
     2571  2010-01-01 00:00   cats/Align$$Lambda$2.nir
     1889  2010-01-01 00:00   cats/Align$$Lambda$3.nir
     2567  2010-01-01 00:00   cats/Align$$Lambda$4.nir
     3237  2010-01-01 00:00   cats/Align$$Lambda$5.nir
     6357  2010-01-01 00:00   cats/Align$$anon$1.class
    17824  2010-01-01 00:00   cats/Align$$anon$1.nir
    29657  2010-01-01 00:00   cats/Align$$anon$2.class
    95940  2010-01-01 00:00   cats/Align$$anon$2.nir
     4737  2010-01-01 00:00   cats/Align$.class
And you can see the .nir files that Scala Native will need when linking (producing a single binary/dynamic library/static library) the application that depends on Cats. A similar picture can be seen in the Scala.js version of this artifact, but instead you'll see .sjsir files.
With the .jar format being relatively simple, the craft lies in supplying the correct combination of compiler options, compile dependencies, Scala sources, etc., to ensure the produced artifacts can be pulled by the end user and relied on without problems.
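The same kind of inspection can be done from Scala itself, since a jar is just a zip archive. The sketch below counts archive entries by file extension. To keep it runnable without network access it builds a tiny stand-in archive with made-up entries; pointing `countByExtension` at a real downloaded jar works the same way. It assumes Scala 2.13 or newer for `scala.jdk.CollectionConverters`.

```scala
import java.io.{File, FileOutputStream}
import java.util.zip.{ZipEntry, ZipFile, ZipOutputStream}
import scala.jdk.CollectionConverters._

object JarContents {
  // Count the entries of an archive by file extension (e.g. "class", "nir").
  def countByExtension(jar: File): Map[String, Int] = {
    val zip = new ZipFile(jar)
    try {
      zip.entries().asScala
        .filterNot(_.isDirectory)
        .map(_.getName)
        .toList
        .groupBy(name => name.drop(name.lastIndexOf('.') + 1))
        .map { case (extension, names) => extension -> names.size }
    } finally zip.close()
  }

  // Build a small stand-in archive with hypothetical, empty entries.
  def sampleJar(): File = {
    val file = File.createTempFile("sample", ".jar")
    file.deleteOnExit()
    val out = new ZipOutputStream(new FileOutputStream(file))
    try {
      Seq("cats/Align.class", "cats/Align$.class", "cats/Align.nir").foreach { name =>
        out.putNextEntry(new ZipEntry(name))
        out.closeEntry()
      }
    } finally out.close()
    file
  }

  def main(args: Array[String]): Unit =
    println(countByExtension(sampleJar()))
}
```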
The multitude of Scala versions and Scala platforms leads to questions about how those artifacts are named, resolved, and uniquely identified - and whether build tools need to be aware of those.
When it comes to dependency resolution, one of the goals of the build tool is to transform some metadata that we specify about a dependency into a physical URL of the JAR that could be located in one of the repositories specified in the build.
The formation of such a URL is very much convention based, and that convention comes from the Maven build tool and its notion of Maven coordinates.
Here's an example of defining a dependency on Cats Effect, and the resulting URL that will be tried by the build tool.
We are using the dependency specification format used by SBT, but Mill has something similar, using : instead of % in most places.
"org.typelevel" %% "cats-effect" % "3.3.5"
        ▼
<repo>/org/typelevel/cats-effect_2.13/3.3.5/cats-effect_2.13-3.3.5.jar

(<repo> is https://repo1.maven.org/maven2/ for Maven Central)
In this case, Maven terminology defines these named components:
- groupId: org.typelevel
- artifactId: cats-effect
- version: 3.3.5
You can see that two transformations occurred:
- dots in groupId (also known as the organization setting in SBT) are replaced with /
- _2.13 was appended to artifactId
The latter point is very important: Maven does not understand Scala's binary versions or Scala's platform - at its heart it's a flat storage of uniquely identified artifacts.
Therefore, to support the various incompatible Scala versions (2.12, 2.13, 3) and platforms (JVM, Scala.js, Scala Native), build tools publish and resolve artifacts using pre-defined suffixes in a particular order.
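The convention is simple enough to sketch in a few lines. This is not how SBT or Mill implement resolution; it's only an illustration of the mapping from coordinates to a URL, with names I made up, and it covers the JVM case with a Scala binary version suffix only.

```scala
object MavenUrls {
  final case class Dependency(groupId: String, artifactId: String, version: String)

  // Turn Maven coordinates plus a Scala binary version into the jar's URL.
  def jarUrl(repo: String, dep: Dependency, scalaBinaryVersion: String): String = {
    val base           = repo.stripSuffix("/")
    val groupPath      = dep.groupId.replace('.', '/')              // dots become path segments
    val fullArtifactId = s"${dep.artifactId}_${scalaBinaryVersion}" // Scala suffix is appended
    s"${base}/${groupPath}/${fullArtifactId}/${dep.version}/${fullArtifactId}-${dep.version}.jar"
  }

  def main(args: Array[String]): Unit = {
    val catsEffect = Dependency("org.typelevel", "cats-effect", "3.3.5")
    println(jarUrl("https://repo1.maven.org/maven2/", catsEffect, "2.13"))
  }
}
```

Running it prints https://repo1.maven.org/maven2/org/typelevel/cats-effect_2.13/3.3.5/cats-effect_2.13-3.3.5.jar, the same location the build tool would try.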
Let's consider a few examples of how artifact name varies depending on Scala version and platform:
- JVM platform, Scala 2.12: cats-core_2.12
- JVM platform, Scala 2.13: cats-core_2.13
- JVM platform, Scala 3: cats-core_3 (no minor version)
- Scala.js platform (version 1.x), Scala 2.13: cats-core_sjs1_2.13
- Scala Native platform (version 0.4.x), Scala 2.12: cats-core_native0.4_2.12
The exact suffixes depend on how committed the maintainers of Scala.js, Scala, and Scala Native are to binary compatibility guarantees:
- In the case of Scala.js, the 1.x lineage maintains some level of binary compatibility, and therefore the artifacts don't need the full Scala.js version in the name.
- In the case of Scala Native, the 0.4.x lineage is deemed stable, and therefore the suffix is _native0.4. I will speculate that once Scala Native reaches 1.x status, it will follow Scala.js' rules and practices.
- Maintainers of the main Scala 2 compiler commit to binary compatibility up to the minor version, and this is why we have _2.13 suffixes (which always come last).
- Scala 3 changes the way binary compatibility works, and all Scala 3 artifacts are published with a sole _3 suffix. This is potentially a game changer for library maintainers' sanity, as it means minor releases will no longer require maintainers to re-publish everything.
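Putting the suffix rules together, here is a small sketch that assembles an artifact name from a base name, a platform, and a Scala binary version. The types are hypothetical. The `_native` prefix comes from the URL shown earlier, while `_sjs` is, to my knowledge, the prefix Scala.js artifacts use on Maven Central.

```scala
object ArtifactNames {
  sealed trait Platform { def suffix: String }
  case object Jvm extends Platform { val suffix = "" }
  final case class ScalaJs(binaryVersion: String) extends Platform {
    val suffix = s"_sjs${binaryVersion}"
  }
  final case class ScalaNative(binaryVersion: String) extends Platform {
    val suffix = s"_native${binaryVersion}"
  }

  // The platform suffix (if any) comes first, the Scala suffix always comes last.
  def artifactName(name: String, platform: Platform, scalaBinaryVersion: String): String =
    s"${name}${platform.suffix}_${scalaBinaryVersion}"

  def main(args: Array[String]): Unit = {
    println(artifactName("cats-core", Jvm, "2.12"))
    println(artifactName("cats-core", Jvm, "3"))
    println(artifactName("cats-core", ScalaJs("1"), "2.13"))
    println(artifactName("cats-core", ScalaNative("0.4"), "2.13"))
  }
}
```

The last call produces cats-core_native0.4_2.13, matching the Scala Native jar we downloaded from Maven Central above.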
Note, however, that out of the box SBT only handles Scala versions in this artifactId transformation. Both Scala.js and Scala Native (at least their SBT plugin versions) depend on a plugin that adds a new operator to SBT, %%%, which will produce the correct artifact depending on whether the project being built is a JVM, Scala.js, or Scala Native one.
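Side by side, the three operators look like this in a build definition. This is a sketch: the value names are mine, the version is the Cats release used earlier in this post, and the last form only compiles once the Scala.js or Scala Native SBT plugin is enabled for the build.

```scala
// build.sbt fragment (sketch): three ways of pointing at the same library;
// a real build would pick exactly one of them.
val verbatim      = "org.typelevel" %   "cats-core_2.13" % "2.6.1" // name is used as written
val scalaAware    = "org.typelevel" %%  "cats-core"      % "2.6.1" // Scala suffix is appended
val platformAware = "org.typelevel" %%% "cats-core"      % "2.6.1" // platform suffix, then Scala suffix

libraryDependencies += platformAware
```

For cross-platform projects the `%%%` form is usually the one to reach for, since the same line works in JVM, Scala.js, and Scala Native modules.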
Maven Central is the most popular location, the most trusted (implicitly, granted), and the defaultest of the defaults in any build tool.
If you want to release your library and make it easily discoverable by your users, it has to be on Maven Central. I personally recommend using the sbt-ci-release plugin, which also includes detailed and easy-to-follow instructions on setting up your publishing credentials on Sonatype.
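For orientation, a minimal sbt-ci-release setup looks roughly like the sketch below. Every value is a placeholder, and the plugin coordinates and version are my assumption at the time of writing, so check the plugin's README for the current ones.

```scala
// project/plugins.sbt (sketch; coordinates and version are assumptions)
addSbtPlugin("com.github.sbt" % "sbt-ci-release" % "1.5.10")

// build.sbt (sketch; all values below are placeholders)
inThisBuild(List(
  organization := "com.example",
  homepage     := Some(url("https://github.com/example/my-library")),
  licenses     := List("Apache-2.0" -> url("http://www.apache.org/licenses/LICENSE-2.0")),
  developers   := List(
    Developer("example", "Example Maintainer", "maintainer@example.com", url("https://example.com"))
  )
))
```

Maven Central requires this kind of project metadata, so it's worth filling it in before the first release.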
Both SBT and Mill (and any other JVM build tool out there) have this repository enabled by default, without any user configuration.
It's managed by Sonatype OSS, and can be publicly searched.
A much better view of the same data is MVNRepository, which understands Scala artifacts very intimately, down to the different platforms and binary versions. In my experience, it is indispensable when upgrading dependencies and doing general updates management.
Another aggregator of this data (maintained by the Scala Center) is called Scaladex, and it contains various platform matrices and the ability to issue graphical badges indicating the latest versions of the artifact for each major Scala version/platform.
Bintray was considered to be a lower barrier of entry for authors publishing JVM artifacts. In particular, it was the distribution mechanism of choice for authors of SBT plugins.
Bintray was shut down on May 1st, 2021.
The shutdown was gradual: at first new uploads were rejected, and by May 1st all Bintray services (download and upload) were shut down.
If you are in the process of helping someone's library get up to speed with newer Scala versions and platforms, it is possible you will discover Bintray publishing logic, which will no longer work.
You will have to work with the maintainer of the library to set up a Sonatype account, credentials on the CI, and the new publishing logic.
Companies set up their own instances (sometimes public) of Maven-compatible services, using, for example, JFrog's Artifactory.
Jitpack allows on-demand building of artifacts based solely on their GitHub coordinates - and it supports SBT.
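Unlike Maven Central, these locations have to be added to the build explicitly. A sketch of what that looks like in SBT follows. The Artifactory URL is entirely hypothetical, and the Jitpack URL is the one I believe their documentation gives.

```scala
// build.sbt fragment (sketch)
resolvers ++= Seq(
  "jitpack" at "https://jitpack.io",
  // hypothetical company-internal repository
  "company-releases" at "https://artifactory.example.com/artifactory/libs-release"
)
```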
Here are the key takeaways:
- Dependencies and artifacts in Scala are just archives with .class files with special names, uploaded somewhere
- Scala 2 has several main versions, and these versions are incompatible with each other
- Scala 3 aims to make the compatibility story easier for maintainers and users