Smithy4s full stack (p.1): Introduction


Series TL;DR

Hello and welcome to yet another installment of "this was supposed to be short and sweet". We're building yet another full-stack Scala application and deploying it to a new, small cloud provider.

This time, my plan was simple and I was going to use things that I know well (Cats Effect, JVM, Weaver-test), along with things I can manage (Scala 3, Laminar), with the addition of things I have very little experience with (Smithy4s, Skunk).

And even though I've already built a full-stack app before, it was using Scala Native with libraries and tools that I had to either build myself or heavily modify. To make this particular installment a challenge that will (hopefully?) be interesting to follow, I've decided to see whether we can use Smithy4s to do the heavy lifting for us - namely generating HTTP routes and an API client for the frontend, along with a plethora of newtypes for improved type safety.

In terms of rules of engagement, this series will be shorter than the previous one (turns out, not having to build your own library for Postgres access is a tremendous time saver), and showing you all the code will not be its goal. The code is available in the GitHub repository, and we'll focus on the system design and the specifics I find interesting.

I've briefly introduced Smithy and Smithy4s in my previous blog post, and you can listen to this podcast appearance by Olivier Melois (still outrageously twitterless), who has been the driving force behind the adoption of Smithy at Disney Streaming and the creation of Smithy4s.

Let's dig into the specifics of why I feel Smithy4s is attractive for full-stack Scala development.

Why Smithy and Smithy4s?

Frameworks vs libraries

Let's start from enough distance to make the topic unnecessarily contentious. For as long as I have worked in the Scala ecosystem, there raged a quiet war - frameworks versus libraries. Bloodshed, destruction, and muted rumblings of offended parties have wreaked havoc on an otherwise peaceful ecosystem.

To summarise in a way that annoys either side:

  1. The appeal of frameworks is a low upfront cost and an opinionated set of tools designed around some notion of a mainstream developer and their needs.

    A common theme among frameworks is the existence of either templates or starter projects, often available as part of the framework's CLI tool (not just talking about Scala here, of course).

  2. The appeal of (good) libraries is improved composability at the cost of high investment upfront.

    Libraries are intended to solve a particular part of the larger problem, and the exercise of putting libraries together is usually left to the user.

Frameworks are often accompanied by a snappy GIF or a video, demonstrating how blazing fast it is to set up a simple CRUD application with an auto-generated user interface, database models, etc.

Library proponents laugh at such childish desire to get started quickly, and they laugh loudly as they walk back to their multi-day grind to add a new database model.

My own take on this, after having worked with several frameworks in PHP, JavaScript, and (briefly) Scala, is that the upfront cost of cobbling together libraries is real, but so is the struggle against the rigidity of frameworks as the project evolves and nothing is any longer as simple as running a nifty, GIFable CLI command to generate an endpoint.

Cutting down on boilerplate

Nobody likes boilerplate. Zealots will argue until blue in the face, protecting boilerplate as the cost of doing things the right way, but if the cost is too high, it takes a really strong culture to get people over the initial difficulties of trying to achieve something simple.

If seen from that perspective, Smithy4s touts the following promise:

If your service can be defined in terms of a restricted language, that service can be generated with high fidelity models, HTTP endpoint definitions, and performance equal to or surpassing that of a hand-written HTTP server

This promise is not on Smithy4s' website, but this is how I interpret the claims and my own understanding of generated code.

Additionally, by delegating the HTTP code generation to Smithy4s we make an explicit choice to deal only with high level entities within our domain. In other words, when we write business logic we don't think about

POST /cart/add?id=25
Content-type: application/json 
Content-length: 34


{"item": "pot-ah-to-es", "id": 11}

instead, we want to think in terms of

case class ItemAttributes(item: ItemTitle, id: ItemId)

and

trait CartService:
   def add(cartId: CartId, item: ItemAttributes): Unit

and for the large majority of applications this data model should be expressible in a very simple language, and therefore fully processable by a relatively simple program.

That program is composed of the Smithy language parser and model builder, and Smithy4s - a code generator which produces HTTP server and client definitions using the following libraries:

Instead of replicating the tutorial in full, I recommend you skim through the Quick Start section that covers all the fundamentals.
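To make the "high level entities" idea a bit more concrete, here is a minimal hand-written sketch of the cart example in Scala 3. It is not Smithy4s output: the wrapper classes, the items method, and InMemoryCartService are all hypothetical names I'm introducing purely for illustration.

```scala
// Hand-written illustration, NOT generated by Smithy4s.
// Plain wrapper classes stand in for the newtypes mentioned in the text.
final case class ItemTitle(value: String)
final case class ItemId(value: Int)
final case class CartId(value: Int)

final case class ItemAttributes(item: ItemTitle, id: ItemId)

trait CartService:
  def add(cartId: CartId, item: ItemAttributes): Unit
  // `items` is an assumption added so the sketch has something to observe
  def items(cartId: CartId): List[ItemAttributes]

// Hypothetical in-memory implementation: business logic only,
// no HTTP verbs, headers, or JSON anywhere in sight.
final class InMemoryCartService extends CartService:
  private var carts = Map.empty[CartId, List[ItemAttributes]]

  def add(cartId: CartId, item: ItemAttributes): Unit =
    carts = carts.updated(cartId, carts.getOrElse(cartId, Nil) :+ item)

  def items(cartId: CartId): List[ItemAttributes] =
    carts.getOrElse(cartId, Nil)
```

The point is only that nothing in this code knows about POST, Content-type, or query parameters - that translation is exactly the part we'd like a generator to own.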

Language agnostic protocol definition

Lucky are the services that get to live out the rest of their days in full isolation. Alas, the majority of services end up succumbing to the worst plague a piece of software can suffer - users. Those users can be actual human persons, clicking and submitting with gay abandon, or they can be mobile applications, web frontends, or other services.

On a lower level, the services handling those interactions accept a set of JSON payloads they deem "well formed" - and that definition can either come from a set of request/response examples, or be derived from an Interface Definition Language (IDL) specification.

In that sense, Smithy is an IDL, allowing us to describe both data and interactions (operations, services, errors) in a clear and concise manner. It can be used as the input to guide generation of any sort of client or server code, in any language - even if the generator itself has to be written in a JVM language, as that's what the Smithy compiler currently targets.

AWS (authors of Smithy) themselves use it to generate SDKs for various languages.

Application and requirements

Unlike last time, I won't bother with the specification of every possible feature of the app, but rather will focus on the main theme, general feature set, and, most importantly, the techniques and aspects of a fully working app I want to demonstrate.

So what is it that we're building? The thing that brings joy to any child's heart - a basic job website, where users can create companies and post vacancies.

I chose this particular app and domain because there's at least 3 distinct "services" I could identify:

  1. Jobs
  2. Companies
  3. Users

each of which manages a relatively small number of models, but is nonetheless rich enough to demonstrate various features of Smithy for data modeling.

This project is fairly light on features:

  1. Users can login and register
  2. Registered users can create companies
    • Companies have name, URL, and description
  3. Registered users can add jobs in the companies they created
    • Jobs have name, URL, description, and a salary range

The important difference between this and the previous app I've built is that we will be using Smithy specs as the ground truth for the API supported by our backend - we shall avoid adding any manually defined HTTP endpoints, only staying within the confines of what Smithy allows us to express.

This in turn will make it easier to use the Scala.js version of the generated code on the frontend, and avoid having to handcraft any HTTP interactions whatsoever.

Project structure

The Smithy4s plugin will be enabled on two projects - Shared JVM and Shared Scala.js. The App module will be the one producing a deployable package for our cloud platform of choice.

This post will focus on writing all of the Smithy definitions for our application, along with configuring the shared module.

Hosting

Speaking of platform of choice - our app will be hosted on the PaaS called Platform.sh.

There were a few reasons why I chose it:

  1. Support for Java applications

  2. Postgres as a service

  3. Swanky website

  4. Free trial

    Which ended up expiring because I screwed around for too long. So now I'm paying for all this. And for Heroku which I wanted to keep to handle multi-cloud failover.

  5. CLI operations

    I enjoyed the experience with Fly.io and how simple everything was to set up on GitHub Actions. Platform.sh turned out to have a decent CLI for the operations that I need (like browsing logs), but by default the deployment strategy seems to be based on pushing to a particular Git remote.

    Personally, I dislike this style of deployments, especially for how laborious the setup for an existing GitHub repository is.

We will configure all the bits and bobs Platform.sh needs in the second part of this series, which focuses on the backend.

Heroku makes a wild appearance

I spent so much time procrastinating on this post, that my Platform.sh trial ran out and I decided to move the app to Heroku.

Reasons being:

  1. I already have minimal experience with setting up a Scala + Postgres app on it
  2. It's damn trivial to deploy from GitHub Actions
  3. It works with just docker images if you wish, which is what originally drew me towards Fly.io - I still think "throw a self-contained Docker container somewhere" is the best deployment strategy.

Rather than lose the information about Platform.sh I've written down, I will duplicate instructions for both.

Apologies for the rather jagged narrative.

Smithy definitions

Now we will define the Smithy specifications for the features our app will support at this point.

It's worth noting that defining all your API specifications upfront is an excellent dream, but for the most part an unachievable one - I've been making tweaks to the specs continuously, adding features, changing and renaming definitions, restructuring payloads, etc.

For narrative purposes we'll assume that the author of these specs possesses the ultimate power of hindsight.

SBT build setup

First of all, we need to set up a basic SBT build and introduce the Smithy4s plugin to it.

project/build.properties

sbt.version=1.7.1

project/plugins.sbt

// plugins we need for this section
addSbtPlugin("com.disneystreaming.smithy4s" % "smithy4s-sbt-codegen" % "0.14.2")
addSbtPlugin("com.eed3si9n"     % "sbt-projectmatrix"   % "0.9.0")
// plugins that we'll need for the build anyways
addSbtPlugin("io.spray"         % "sbt-revolver"        % "0.9.1")
addSbtPlugin("com.github.sbt"   % "sbt-native-packager" % "1.9.9")
addSbtPlugin("org.scala-js"     % "sbt-scalajs"         % "1.10.1")
addSbtPlugin("org.scalameta"    % "sbt-scalafmt"        % "2.4.6")

The only two plugins we need right now are Smithy4s' own codegen plugin and sbt-projectmatrix.

build.sbt

val Versions = new {
  val Scala          = "3.1.3"
  ...
}

lazy val shared = projectMatrix
  .in(file("modules/shared"))
  .defaultAxes(defaults*)
  .jvmPlatform(Seq(Versions.Scala))
  .jsPlatform(Seq(Versions.Scala))
  .enablePlugins(Smithy4sCodegenPlugin)
  .settings(
    libraryDependencies ++= Seq(
      "com.disneystreaming.smithy4s" %%% "smithy4s-http4s" % smithy4sVersion.value
    ),
    Compile / doc / sources := Seq.empty
  )

lazy val defaults =
  Seq(VirtualAxis.scalaABIVersion(Versions.Scala), VirtualAxis.jvm)

[the more you know] The defaults definition is only there so that sbt-projectmatrix doesn't add the Scala 3 version to generated project names. It still assumes 2.13 by default, so our project would've been named shared3 instead of shared.

This definition will generate projects named shared and sharedJS, both of which will have the Smithy4s plugin enabled, meaning all the *.smithy files in the folder modules/shared/src/main/smithy/ will be processed by the plugin, triggering code generation.

And this will be the folder where we will put all of our Smithy specs. Note that the input folders can be configured, and the *.smithy files can even come from dependency jars. This supports two potential ways that service definitions can be shared with consumers:

  1. Specs are stored in a centralised location, and each service embeds it using some form of source dependency (or who am I kidding - using Git submodules).

  2. Specs are published as JARs from a centralised location, and each service treats them as any other dependency.

Editor setup

If you've read my notes about working with Smithy files then you already know that for the most part it's facilitated by means of an LSP server.

The LSP server itself cannot interpret the SBT build definition (it's build-tool agnostic), so it needs to be told that in the case of Smithy4s projects, some definitions may come from other places.

To do that, we will place a file named .smithy-build.json at the root of our project, with the following contents:

{
    "mavenDependencies": [
        "com.disneystreaming.smithy4s:smithy4s-protocol_2.13:latest.stable"
    ]
}

And the LSP will download this artifact and extract Smithy definitions from it.

Shared errors

_globals.smithy

There will be some definitions that we want to share across services:

  1. Validation error

    A generic way to represent operation failing due to user input. We will use a primitive version of it, but there are ways one can improve on it, which might have a payoff if you are dealing with larger models or multi-step forms.

  2. User authentication errors

    Some operations are only available to authenticated users.

  3. User authorization errors

    Certain operations are allowed only to specific users - for example the only user that can delete a company is the one that created it.

With those in mind, our first Smithy spec will look like this:

namespace jobby.spec

@error("client")
@httpError(400)
structure ValidationError { // 1
  @required
  message: String
}

@error("client")
@httpError(401)
structure UnauthorizedError { // 2
  message: String
}

@error("client")
@httpError(403)
structure ForbiddenError {} // 3

string AuthHeader // 4

  1. We define a structure called ValidationError, which has a single required field named message of type String

    Additionally, we indicate that this structure represents a client-side error, and within HTTP semantics it should be reported with HTTP code 400

  2. UnauthorizedError is similar, only message is optional, and HTTP code is 401

  3. ForbiddenError has no fields whatsoever

  4. This definition is interesting.

    Basically, we're defining a newtype called AuthHeader which is a plain string under the hood.

The exact way the structures are rendered in input/output and whether AuthHeader will indeed be a newtype in generated code is completely down to the code generator.
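For intuition only, here is one way these four shapes could be represented in hand-written Scala 3. This is a sketch under my own assumptions (the ApiError trait and the httpStatus member are hypothetical), and it is not what Smithy4s actually generates.

```scala
// Hand-written approximation of the shapes in _globals.smithy.
// ApiError and httpStatus are illustrative names, not Smithy4s API.
sealed trait ApiError:
  def httpStatus: Int

// @required message => plain String
final case class ValidationError(message: String) extends ApiError:
  val httpStatus = 400

// message without @required => Option[String]
final case class UnauthorizedError(message: Option[String]) extends ApiError:
  val httpStatus = 401

// no members at all
final case class ForbiddenError() extends ApiError:
  val httpStatus = 403

// `string AuthHeader` modelled as a simple wrapper around String
final case class AuthHeader(value: String)
```

Note how the presence or absence of @required maps naturally onto String versus Option[String] - that is the kind of decision a code generator gets to make for us.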

Users

users.smithy

Moving on, let's define the API for various user and auth operations.

Here are the operations I would like to support:

  1. Login

    No special requirements, just login and password as input, access token as output.

  2. Register

    Login and password as input, no output

  3. Refresh

    This will be a bit of a spoiler for the backend part of this series. In the previous project we implemented authentication on the client side by persisting a long-lived access token on the client, which is quite insecure and is riddled with bad practices.

In this project we will pay a lot more attention to the security aspect, and instead implement a complex token refresh loop, where access tokens are very short lived and not persisted anywhere.

Before we define services and operations, let's define newtypes and errors:

namespace jobby.spec

use smithy4s.api#simpleRestJson
use smithy4s.api#uuidFormat

// ...

@uuidFormat
string UserId
string UserLogin
string UserPassword

string AccessToken
string RefreshToken
string Cookie
integer TokenExpiration

@error("client")
@httpError(400)
structure CredentialsError {
  @required
  message: String
}

Most are obvious, apart from UserId - it's annotated with @uuidFormat, which is imported from the smithy4s.api namespace. What is it?

Well, as part of Smithy4s, you're receiving certain definitions that the codegen will actually understand and interpret accordingly. This particular trait (they are called traits in Smithy, and they use annotation syntax you might be familiar with from Java/Scala/Kotlin) is defined in smithy4s protocol as such:

@trait(selector: "string") // "only applies to strings"
structure uuidFormat {
}

And traits are very important, for they are extensibility points that can be interpreted by code generators and other tools. This particular trait, when applied to a string-like newtype, will instruct Smithy4s to generate a newtype definition backed by a java.util.UUID, instead of String:

package jobby.spec

import java.util.UUID
import smithy4s.Newtype
import smithy4s.syntax._

object UserId extends Newtype[UUID] {
  val id: smithy4s.ShapeId = smithy4s.ShapeId("jobby.spec", "UserId")
  val hints : smithy4s.Hints = smithy4s.Hints(
    id,
    smithy4s.api.UuidFormat(),
  )
  // ...
}

Smithy4s provides a few other traits, but we will use only this one and simpleRestJson that I will introduce next.
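To illustrate what a UUID-backed newtype buys us, here is a small standalone sketch that does not depend on Smithy4s at all. The parse helper and its error message are my own assumptions; the generated UserId shown above has a different shape.

```scala
import java.util.UUID
import scala.util.Try

// Standalone illustration of a UUID-backed identifier,
// not the Smithy4s-generated UserId.
final case class UserId(value: UUID)

object UserId:
  // Hypothetical helper: turn a raw string (e.g. from a URL path)
  // into a UserId, or explain why it can't be done.
  def parse(raw: String): Either[String, UserId] =
    Try(UUID.fromString(raw)).toEither
      .map(UserId(_))
      .left
      .map(_ => s"'$raw' is not a valid UUID")
```

With this in place, malformed identifiers are rejected at the edge, and the rest of the code only ever sees a well-formed UUID.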

Let's define the input/output structures needed for each operation.

  1. Login. Input: LoginInput, output: Tokens

    structure LoginInput {
      @required 
      login: UserLogin,
    
      @required 
      password: UserPassword
    }
    
    structure Tokens {
      @required
      access_token: AccessToken,
    
      @httpHeader("Set-Cookie")
      cookie: Cookie,
      expires_in: TokenExpiration
    }
    
  2. Register. Input: RegisterInput, output: N/A

    structure RegisterInput {
      @required 
      login: UserLogin,
    
      @required 
      password: UserPassword
    }
    
  3. Refresh. Input: RefreshInput, output: RefreshOutput

    structure RefreshInput {
      @httpHeader("Cookie")
      refreshToken: RefreshToken,
    
      @httpQuery("logout")
      logout: Boolean
    }
    
    structure RefreshOutput {
      access_token: AccessToken,
    
      @httpHeader("Set-Cookie")
      logout: Cookie,
    
      @required
      expires_in: TokenExpiration
    }
    

A few things stand out - some of the fields we've annotated with either httpQuery or httpHeader - why is that?

This is where our HTTP semantics are starting to peek through a little bit - we're using built-in Smithy traits which will be interpreted by Smithy4s as instructions to either

  • write a particular part of the output structure into a named header (like logout in RefreshOutput, which will end up as a Set-Cookie header) or

  • read a particular part of the input structure from a named header (like refreshToken in RefreshInput) or a query parameter (like logout in RefreshInput)

Fields marked with those special traits will not be rendered as part of JSON output, and will not be read from JSON input either.
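The split between header-bound and body-bound fields can be sketched by hand like this. It is a toy model of the idea, not the Smithy4s implementation: RenderedResponse and render are hypothetical, and values are kept as plain strings rather than real JSON.

```scala
// Toy model of HTTP bindings for RefreshOutput, NOT Smithy4s internals.
final case class RefreshOutput(
    accessToken: Option[String], // access_token, goes into the body
    logout: Option[String],      // @httpHeader("Set-Cookie"), goes into a header
    expiresIn: Int               // expires_in, @required, goes into the body
)

// Hypothetical container for the two destinations a field can end up in
final case class RenderedResponse(
    headers: Map[String, String],
    bodyFields: Map[String, String]
)

def render(out: RefreshOutput): RenderedResponse =
  // header-bound members never appear among the body fields
  val headers = out.logout.toList.map(c => "Set-Cookie" -> c).toMap
  val body =
    out.accessToken.toList.map(t => "access_token" -> t).toMap +
      ("expires_in" -> out.expiresIn.toString)
  RenderedResponse(headers, body)
```

For example, rendering RefreshOutput(Some("abc"), Some("refresh_token=xyz"), 900) puts the cookie into headers and leaves only access_token and expires_in in the body.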

Now that we have inputs and outputs for all the operations, let's define them:

@http(method: "POST", uri: "/api/users/login", code: 200)
operation Login {
  input: LoginInput,
  output: Tokens,
  errors: [CredentialsError]
}

@idempotent
@http(method: "PUT", uri: "/api/users/register", code: 204)
operation Register {
  input: RegisterInput,
  errors: [ValidationError]
}

@http(method: "POST", uri: "/api/users/refresh", code: 200)
operation Refresh {
  input: RefreshInput,
  output: RefreshOutput,
  errors: [CredentialsError, UnauthorizedError]
}

Smithy documentation has more details on API operations and HTTP traits but I believe the definitions themselves are quite readable and understandable.

All that remains is to put all of these operations into a service:

@simpleRestJson
service UserService {
  version: "1.0.0",
  operations: [Login, Register, Refresh]
}

Here we use the @simpleRestJson annotation - it's provided by Smithy4s and will be rendered according to the (opinionated) protocol specification that Smithy4s implements.

And that is it! We won't be commenting on the rest of the specs in such detail.
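As a rough mental model of what a service with operations and per-operation errors means on the Scala side, here is a simplified hand-written interface for Login and Register. The real generated interface is more general than this; UserError, InMemoryUserService, the password length rule, and the fake token format are all assumptions made for the sketch (and passwords are kept in plain text only to keep it short).

```scala
// Simplified, hand-written model of the UserService spec.
enum UserError:
  case Credentials(message: String) // CredentialsError, HTTP 400
  case Validation(message: String)  // ValidationError, HTTP 400

final case class Tokens(accessToken: String, expiresIn: Int)

trait UserService:
  def register(login: String, password: String): Either[UserError, Unit]
  def login(login: String, password: String): Either[UserError, Tokens]

// Hypothetical in-memory implementation, for illustration only
final class InMemoryUserService extends UserService:
  private var users = Map.empty[String, String]

  def register(login: String, password: String): Either[UserError, Unit] =
    if password.length < 8 then
      Left(UserError.Validation("password is too short")) // assumed rule
    else if users.contains(login) then
      Left(UserError.Validation("login is taken"))
    else
      users = users.updated(login, password)
      Right(())

  def login(login: String, password: String): Either[UserError, Tokens] =
    if users.get(login).contains(password) then
      Right(Tokens(accessToken = s"token-for-$login", expiresIn = 900))
    else Left(UserError.Credentials("wrong login or password"))
```

Each operation can only fail with the errors listed in its errors block, which is what the Either types are standing in for here.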

Companies

companies.smithy

This service will support the following operations:

  1. CreateCompany
  2. DeleteCompany
  3. GetCompany
  4. MyCompanies - companies added by the authenticated user
  5. GetCompanies - bulk get companies (one of those operations that was inspired by frontend needs)

Models

@error("client")
@httpError(404)
structure CompanyNotFound {}

@uuidFormat
string CompanyId
string CompanyUrl
string CompanyName
string CompanyDescription

structure CompanyAttributes {
  @required
  name: CompanyName,

  @required 
  description: CompanyDescription,

  @required 
  url: CompanyUrl
}


structure Company {
  @required
  id: CompanyId,

  @required
  owner_id: UserId,

  @required 
  attributes: CompanyAttributes
}

Inputs and outputs

GetCompanies - where we are introduced to the very verbose way Smithy defines list-like shapes.

structure GetCompaniesInput {
  @required
  ids: CompanyIdList
}

list CompanyIdList {
  member: CompanyId
}

structure GetCompaniesOutput {
  @required
  companies: CompaniesList
}

list CompaniesList {
  member: Company
}

DeleteCompany - which has no outputs and requires an AuthHeader to be present. This is how we will be modeling authentication, which in general is a hard problem in terms of the ergonomics/generality trade-off - you can read more in a related issue.

structure DeleteCompanyInput {
  @httpHeader("Authorization")
  @required
  auth: AuthHeader, 
  
  @httpLabel
  @required 
  id: CompanyId
}

Smithy really doesn't like us messing with the Authorization header (and I agree), so to silence this voice of reason we need to add a suppression to the file:

metadata suppressions = [
    {
        id: "HttpHeaderTrait",
        namespace: "jobby.spec",
        reason: "I totally know what I'm doing"
    }
]

namespace jobby.spec
// ..

GetCompany - note that the output of that operation will be Company structure directly.

structure GetCompanyInput {
  @httpLabel
  @required 
  id: CompanyId
}

MyCompanies

structure MyCompaniesInput {
  @httpHeader("Authorization")
  @required
  auth: AuthHeader, 
}

structure MyCompaniesOutput {
  @required
  companies: CompaniesList
}

And finally, we can define all of our operations:

@idempotent
@http(method: "POST", uri: "/api/companies/", code: 200)
operation GetCompanies {
  input: GetCompaniesInput,
  output: GetCompaniesOutput,
  errors: [CompanyNotFound]
}


@readonly
@http(method: "GET", uri: "/api/companies/{id}", code: 200)
operation GetCompany {
  input: GetCompanyInput,
  output: Company,
  errors: [CompanyNotFound]
}

@idempotent
@http(method: "PUT", uri: "/api/companies", code: 200)
operation CreateCompany {
  input: CreateCompanyInput,
  output: CreateCompanyOutput,
  errors: [ValidationError]
}

@readonly
@http(method: "GET", uri: "/api/my_companies", code: 200)
operation MyCompanies {
  input: MyCompaniesInput,
  output: MyCompaniesOutput,
  errors: [ValidationError, UnauthorizedError]
}

@idempotent
@http(method: "DELETE", uri: "/api/companies/{id}", code: 204)
operation DeleteCompany {
  input: DeleteCompanyInput,
  errors: [UnauthorizedError, ForbiddenError]
}
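The difference between UnauthorizedError and ForbiddenError on DeleteCompany can be sketched as plain logic. This is an assumed interpretation of the spec rather than the actual backend code: the function below, its String-based ids, and the Map-based storage are all hypothetical.

```scala
// Illustrative only: ids are plain String wrappers here to keep it short.
enum DeleteError:
  case Unauthorized // no valid caller      -> HTTP 401
  case Forbidden    // caller is not owner  -> HTTP 403

final case class UserId(value: String)
final case class CompanyId(value: String)
final case class Company(id: CompanyId, ownerId: UserId)

def deleteCompany(
    caller: Option[UserId], // None models a missing or invalid auth header
    id: CompanyId,
    companies: Map[CompanyId, Company]
): Either[DeleteError, Map[CompanyId, Company]] =
  caller match
    case None => Left(DeleteError.Unauthorized)
    case Some(user) =>
      companies.get(id) match
        case Some(c) if c.ownerId != user => Left(DeleteError.Forbidden)
        // deleting an absent company succeeds too, in the spirit of @idempotent
        case _ => Right(companies - id)
```

Treating a missing company as success is my reading of @idempotent on a DELETE; the spec itself lists only UnauthorizedError and ForbiddenError for this operation.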

Jobs

jobs.smithy

This spec follows the same structure as the companies one, so we won't be putting it here in full - you can see it in the repository [[TODO]].

One thing I will call out is how we will model the salary range:

integer MinSalary
integer MaxSalary

structure SalaryRange {
  @required 
  min: MinSalary,

  @required
  max: MaxSalary,

  @required 
  currency: Currency
}

@enum([
   {value: "USD", name: "USD"}, 
   {value: "GBP", name: "GBP"}, 
   {value: "EUR", name: "EUR"}
])
string Currency

This introduces the @enum construct from Smithy, which allows you to define named enums. Smithy4s will generate special code for it, where the companion object for Currency will contain all the permitted values named accordingly:

object Currency /* snip */ {

  case object USD extends Currency("USD", 0)
  case object GBP extends Currency("GBP", 1)
  case object EUR extends Currency("EUR", 2)

  val values: List[Currency] = List(
    USD,
    GBP,
    EUR
  )

  // ...
}
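If you squint, the same model can be written by hand with a Scala 3 enum. The sketch below is not the generated code: fromString and validate are hypothetical helpers, and the rule that min must not exceed max is my own assumption - it isn't stated anywhere in the spec.

```scala
// Hand-written Scala 3 counterpart of the Currency and SalaryRange shapes.
enum Currency(val value: String):
  case USD extends Currency("USD")
  case GBP extends Currency("GBP")
  case EUR extends Currency("EUR")

object Currency:
  // Hypothetical lookup by the wire value
  def fromString(raw: String): Option[Currency] =
    Currency.values.find(_.value == raw)

final case class SalaryRange(min: Int, max: Int, currency: Currency)

// Assumed business rule: a cross-field check like this has to
// live in application code rather than in the shape definitions above.
def validate(range: SalaryRange): Either[String, SalaryRange] =
  if range.min < 0 then Left("min must be non-negative")
  else if range.min > range.max then Left("min must not exceed max")
  else Right(range)
```

A failed check like this is a natural candidate for the ValidationError we defined in the shared errors.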

After all the definitions are done, you can run shared/compile or sharedJS/compile in your SBT shell to see the Smithy4s plugin generate the code and put it into the folders that SBT will automatically pick up.

Because we put all the definitions in the jobby.spec Smithy namespace, generated Scala definitions will be put into the jobby.spec Scala package.

That's it!

In the next part we will actually use those generated Scala definitions to define our entire backend API and deploy it to Platform.sh and Heroku.