Smithy4s full stack (p.2): Backend and deployment


Series TL;DR

In part 1 our focus was on getting the APIs and domain models right. During prototyping, this usually runs alongside the backend implementation phase, as it's impossible to predict upfront the full expressiveness the API will need to support things like the frontend, downstream users, event analytics pipelines, etc.

Because of the size of the application and the short attention span we're working with (I was told to re-tell these stories in dance TikToks for better audience engagement), we'll just write the backend as if the perfect API spec has already been handed to us by the Golden Gods themselves.

Structure

  • Backend - SBT module and code, which

    • defines HTTP routes
    • implements business logic
    • implements authentication
    • handles database interactions
    • defines interface and configuration required to bootstrap the HTTP server
  • App - SBT module and associated code responsible for

    • reading database configuration from environment
    • initiating launch sequence for the app
    • starting HTTP server and binding it to correct port/hostname
    • packaging the application in a way that the cloud provider can run it
    • embedding optimised frontend code into app's package

Only the App has an entry point that the JVM recognises (something like def main(args: Array[String])).

Our goal is to implement everything in the Backend module in such a way that any externalities (configuration, environment, filesystem) are abstracted away, which makes testing much easier and cleaner. App is there to fulfill those interfaces with values from the runtime environment.

Backend

Build configuration

To avoid information overload, I will provide a minimal SBT project definition first, and then extend it as necessary in each of the following sections.

val Versions = new {
  val Scala          = "3.1.2"
  // ...
}

lazy val backend = projectMatrix
  .in(file("modules/backend"))
  .dependsOn(shared)
  .defaultAxes(defaults*)
  .jvmPlatform(Seq(Versions.Scala))
  .settings(
    scalaVersion            := Versions.Scala,
    // ... more to follow
  )

This is a good starting point; the meaning of defaults and Versions was explained in the previous part.

Libraries

We will use the following awesome libraries:

  • Http4s as our HTTP server implementation

    It also happens to be the HTTP library used by Smithy4s in code generation, and it's the de-facto standard in the functional Scala ecosystem.

  • Skunk as our database access library

Skunk is remarkable because it implements Postgres' binary protocol from scratch, without relying on any JDBC implementation. It also has absolutely best-in-class error reporting and an excellent API. I've used Skunk before in private personal projects and it has served me well.

Even so, I would still consider it highly experimental and not yet suitable for mission-critical production applications.

  • JWT Scala, in particular its uPickle flavour

    This is the JWT library I've used multiple times before; it has no dependencies of its own and a good API.

    I'm using the uPickle flavour in particular because I failed to understand how to construct and parse under-specified JSON objects using jsoniter, which is brought in by Smithy4s.

  • Scribe for logging

    The reason for choosing Scribe is plain and simple: it's not a wrapper around the SLF4J interface, it's not configured with XML (COME ON PEOPLE), it has recently added native Cats Effect support, and it has an optional SLF4J bridge.

  • Flyway for database migrations

    On the surface, Flyway seems to be the only game in town when it comes to database migrations on the JVM (I have little experience here, so let's assume that's the case).

    We'll be using it without any external wrappers, even though I was recently pointed to a decent looking one currently in development.

Database access

We will once again choose Postgres as our main database - not only because I want to force myself to actually learn more about it, but also because it's often used as a dunking weapon in the "NoSQL ain't no shit" type of deep takes on Twitter.

Configuration

Just like on Fly.io or Heroku, on Platform.sh if you have a Postgres "relationship" added to your application, the access configuration is exposed through environment variables.

Unlike Heroku or Fly.io, the configuration is a Base64-encoded JSON string. The platform does provide a pure-Java artifact to read the config from the environment, but it was last released in 2019 and it has 5 CVEs coming from its dependencies - so let's do it ourselves.

To access Postgres, the minimal set of parameters we need is as follows:

case class PgCredentials(
    host: String,
    port: Int,
    user: String,
    database: String,
    password: Option[String]
)

And for local development we will provide a way to create an instance from the environment, with sensible defaults:

object PgCredentials:
  def from(mp: Map[String, String]) =
    PgCredentials(
      host = mp.getOrElse("PG_HOST", "localhost"),
      port = mp.getOrElse("PG_PORT", "5432").toInt,
      user = mp.getOrElse("PG_USER", "postgres"),
      database = mp.getOrElse("PG_DB", "postgres"),
      password = mp.get("PG_PASSWORD")
    )
end PgCredentials

Note that we are not taking sys.env directly (which is also a Map[String, String]) - accepting the map as a parameter will make testing simpler later on.
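For example, a test can hand in a synthetic environment and rely on the defaults for the rest:

// Hypothetical test usage: a synthetic environment instead of sys.env
val creds = PgCredentials.from(Map("PG_HOST" -> "test-db", "PG_PASSWORD" -> "hunter2"))
// creds == PgCredentials("test-db", 5432, "postgres", "postgres", Some("hunter2"))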

For Platform.sh though we will need to parse the object in the special environment variable. It can contain multiple relationships, so let's wrap this logic in a class:

import java.util.Base64
import scala.util.Try
import scala.util.control.NonFatal

class PlatformShLoader(env: Map[String, String]):
  private val loaded = env
    .get("PLATFORM_RELATIONSHIPS")
    .flatMap { rels =>
      val decoded = new String(Base64.getDecoder.decode(rels))
      Try(ujson.read(decoded))
        .fold(
          { case NonFatal(error) =>
            scribe.error("Failed to parse PLATFORM_RELATIONSHIPS", error)
            Option.empty
          },
          Option.apply(_)
        )
    }

  def loadPgCredentials(relationshipName: String): Option[PgCredentials] =
    loaded.flatMap { json =>
      try
        val db = json.obj(relationshipName).arr(0).obj

        Some(
          PgCredentials(
            host = db("host").str,
            port = db("port").num.toInt,
            user = db("username").str,
            database = db("path").str,
            password = Some(db("password").str)
          )
        )
      catch
        case NonFatal(exc) =>
          scribe.error("Failed to read relationships configuration", exc)
          None
    }
end PlatformShLoader

If later on we want to load, for example, Redis credentials, we just modify this class.
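Using the loader is then a one-liner; the relationship name ("database") has to match what we'll configure on Platform.sh later:

val credentials: Option[PgCredentials] =
  PlatformShLoader(sys.env).loadPgCredentials("database")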

For Skunk itself we'll also group the parameters in a case class:

case class SkunkConfig(
    maxSessions: Int,
    strategy: skunk.Strategy,
    debug: Boolean
)

These values will be hardcoded in the app, so we won't add any environment reading logic.
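For example (the values are my choice, not something Skunk prescribes):

val skunkConfig = SkunkConfig(
  maxSessions = 8,                      // size of the session pool
  strategy = skunk.Strategy.SearchPath, // resolves custom types, such as our currency enum
  debug = false                         // when true, dumps the protocol exchanges to the console
)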

The loader of configuration for Heroku is slightly simpler, as the database access configuration is passed as a single postgres:// URL:

class HerokuLoader(env: Map[String, String]):
  def loadPgCredentials: Option[PgCredentials] =
    env.get("DATABASE_URL").flatMap { url =>
      Try {
        val parsed = new java.net.URI(url)

        val host     = parsed.getHost()
        val port     = parsed.getPort()
        val userInfo = parsed.getUserInfo()
        val dbName   = parsed.getPath().tail // dropping the first slash

        val userName = userInfo.split(":").apply(0)
        val password = userInfo.split(":").apply(1)

        PgCredentials(
          host = host,
          port = port,
          user = userName,
          password = Some(password),
          database = dbName
        )
      }.toOption
    }
end HerokuLoader
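For example, a typical Heroku-style URL (illustrative value) parses like this:

val creds = HerokuLoader(
  Map("DATABASE_URL" -> "postgres://jobby:s3cret@ec2-1-2-3-4.compute.amazonaws.com:5432/d4f9abc")
).loadPgCredentials
// Some(PgCredentials("ec2-1-2-3-4.compute.amazonaws.com", 5432, "jobby", "d4f9abc", Some("s3cret")))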

Schema and migrations

For Flyway to correctly pick up our migrations during application startup, they will have to be placed into modules/backend/src/main/resources/db/migration/ folder.

Our initial schema is fairly simple:

V001__Initial_Schema.sql

CREATE TABLE users(
  user_id uuid PRIMARY KEY,
  login character varying(50) not null,
  salted_hash character(81) not null -- 64 (hash) + 16 (salt) + 1 (':' separator)
);

CREATE UNIQUE INDEX users_login_idx ON users (LOWER(login));

CREATE TABLE companies(
  company_id uuid PRIMARY KEY,
  owner_id uuid not null,
  name character varying(128) not null,
  description text,
  url character varying(512) not null,
  CONSTRAINT fk_owner FOREIGN KEY(owner_id) REFERENCES users(user_id) ON DELETE CASCADE
);

CREATE UNIQUE INDEX company_name_idx ON companies (LOWER(name));

CREATE TABLE jobs(
  job_id uuid primary key,
  company_id uuid not null,
  job_title character varying(256) not null,
  job_description text not null,
  job_url character varying(512) not null,
  min_salary int4 not null,
  max_salary int4 not null,
  CONSTRAINT fk_company FOREIGN KEY(company_id) REFERENCES companies(company_id) ON DELETE CASCADE
);

And since then, I've only applied two migrations:

V002__Add_Job_Timestamp.sql

ALTER TABLE jobs ADD COLUMN added timestamp with time zone not null;

V003__Add_Currency_Enum.sql

This one defines a new Postgres type to specifically represent the currency enum:

CREATE TYPE currency_enum AS ENUM ('USD', 'GBP', 'EUR');
ALTER TABLE jobs ADD COLUMN currency currency_enum not null default 'GBP';

Codecs

Smithy4s will generate a lot of newtype definitions for us - things like UserId, UserLogin, JobTitle, etc. are strings and UUIDs at runtime, but at compile time they're represented as completely different types.

I've been writing and re-writing the codebase with these newtypes many times over, and, perhaps being late to the party, I feel like newtypes are the single biggest boost to productivity, as long as they don't come with boilerplate. Which, in the case of Smithy4s' generated code, they don't.

Skunk does have a lot of codecs for built-in types like String or UUID, but we need to provide some extra constructs for our newtypes. Thankfully it's easy, both due to the fact that Smithy4s newtypes are expressed using a Newtype[T] abstract class, and the fact that Skunk's Codec has an imap method:

package jobby
package database

import skunk.Codec
import skunk.codec.all.*
import smithy4s.Newtype

import jobby.spec.*

object codecs:
  extension [T](c: Codec[T])
    private[database] def as(obj: Newtype[T]): Codec[obj.Type] =
      c.imap(obj.apply(_))(_.value)
// ...

And we can use this extension like this:

val userId: Codec[UserId]       = uuid.as(UserId)
val userLogin: Codec[UserLogin] = varchar(50).as(UserLogin)
// etc
val companyId          = uuid.as(CompanyId)
val hashedPassword     = bpchar(81).imap(HashedPassword(_))(_.ciphertext)
val companyName        = varchar(128).as(CompanyName)
val companyDescription = text.as(CompanyDescription)
val companyUrl         = varchar(512).as(CompanyUrl)
val minSalary          = int4.as(MinSalary)
val maxSalary          = int4.as(MaxSalary)
val jobId              = uuid.as(JobId)
val jobTitle           = varchar(256).as(JobTitle)
val jobDescription     = text.as(JobDescription)
val jobUrl             = varchar(512).as(JobUrl)

Where uuid and varchar(50) are built-in Skunk codecs.

But what to do about currency? In the database we defined it as such:

CREATE TYPE currency_enum AS ENUM ('USD', 'GBP', 'EUR');

thus creating a new type in Postgres. For Skunk to be able to read it, we need to name the type and provide an enum definition:

val currency =
    `enum`[Currency](_.value, Currency.fromString, Type("currency_enum"))

Smithy4s generates the required methods (value and fromString) to go to and from a string representation, which is the basis of all Postgres enums.
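Concretely, assuming the generated enum has cases like Currency.USD, the round trip looks roughly like this:

Currency.USD.value         // "USD"
Currency.fromString("USD") // Some(Currency.USD)
Currency.fromString("JPY") // None - not part of the enum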

One thing that is not represented by Smithy4s (as it's really an implementation detail we don't want to expose in our API specs) is the hashed password we store and manipulate in the database:

class HashedPassword(val ciphertext: String):
  override def toString()           = "<hashed-password>"
  def process[A](f: String => A): A = f(ciphertext)

//... in codecs 

val hashedPassword     = bpchar(81).imap(HashedPassword(_))(_.ciphertext)

Additionally, because we decided to implement our timestamps as timestamptz, we need to provide a codec between that and the Timestamp type that Smithy4s uses. Skunk represents timestamps with time zone using an OffsetDateTime, and Smithy4s' Timestamp has just the conversions for that:

val added = timestamptz
            .imap(Timestamp.fromOffsetDateTime)(_.toOffsetDateTime)
            .as(JobAdded)

To produce more complicated codecs that convert a flat sequence of fields into a case class, we will use gimap:

val jobAttributes =
    (jobTitle ~
      jobDescription ~
      jobUrl ~
      salaryRange).gimap[JobAttributes]

val job =
    (jobId ~
      companyId ~
      jobAttributes ~
      added).gimap[Job]

val companyAttributes =
    (companyName ~
      companyDescription ~
      companyUrl).gimap[CompanyAttributes]

val company =
    (companyId ~
      userId ~
      companyAttributes).gimap[Company]

Operations

From my previous experience with writing raw SQL queries in Scala, I wanted to abstract that part away as much as possible, hoping it would help with testing as well. Let's represent a database operation as a set of generic inputs and outputs, along with a fully formed Skunk query:

sealed abstract class SqlQuery[I, O](val input: I, query: skunk.Query[I, O]):
  def prepare(session: Session[IO]): Resource[IO, PreparedQuery[IO, I, O]] =
    session.prepare(query)

Then, for example, a query to get user credentials can look like this:

case class GetCredentials(login: UserLogin)
    extends SqlQuery(
      login,
      sql"""
        select user_id, salted_hash 
        from users where lower(login) = lower($userLogin)
      """.query(userId ~ hashedPassword)
    )

And creating a job would look like this:

case class CreateJob(
    company: CompanyId,
    attributes: JobAttributes,
    jobAdded: JobAdded
) extends SqlQuery(
      ((company, attributes), jobAdded),
      sql"""
        insert into jobs(job_id, company_id, job_title, job_description, job_url, min_salary, max_salary, currency, added)
        values          (gen_random_uuid(), $companyId, $jobAttributes, $added)
        returning job_id
      """.query(jobId)
    )

Notice how we're structuring the input parameters: ((company, attributes), jobAdded). This is because Skunk's ~ operator (twiddle), used to build codecs, is left-associative, so composite codecs operate on left-nested pairs.
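To illustrate, using the codecs defined earlier:

// ~ is left-associative: a ~ b ~ c parses as (a ~ b) ~ c,
// so the combined codec works on left-nested pairs
val codec: Codec[((CompanyId, JobAttributes), JobAdded)] =
  companyId ~ jobAttributes ~ added
// ...which is why CreateJob's input is ((company, attributes), jobAdded)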

This is possible to solve, and in fact I've solved it before in Roach and I believe Scodec doesn't suffer from this problem either. Hopefully at some point I will have the time to contribute this to Skunk.

Query execution

Instead of working directly with the interfaces that Skunk offers (things like Session, PreparedQuery, etc.), let's build up our operations abstractions a bit more, and design an interface that can execute those queries:

trait Database:
  def stream[I, O](query: SqlQuery[I, O]): fs2.Stream[IO, O]

  def vector[I, O](query: SqlQuery[I, O]): IO[Vector[O]] =
    stream(query).compile.toVector

  def option[I, O](query: SqlQuery[I, O]): IO[Option[O]] =
    vector(query).map(_.headOption)

The only abstract member is stream - the rest can be implemented in terms of it, but implementers can also provide an override, if it's more efficient.

For the implementation of this trait that uses Skunk, all we need is a Resource[IO, Session[IO]] - where Session[IO] represents access to the actual connection to Postgres.

The stream implementation looks like this:

class SkunkDatabase(sess: Resource[IO, Session[IO]]) extends Database:
  def stream[I, O](query: SqlQuery[I, O]): fs2.Stream[IO, O] =
    for
      sess     <- fs2.Stream.resource(sess)
      prepared <- fs2.Stream.resource(query.prepare(sess))
      q        <- prepared.stream(query.input, 128)
    yield q
// ...

Ignore the hardcoded chunk size (128) - it could easily become a parameter, but I decided not to complicate matters.

For the purposes of this exercise, we'll leave vector to be implemented in terms of stream as is by default, but will override option as Skunk has a built-in method for that:

override def option[I, O](query: SqlQuery[I, O]): IO[Option[O]] =
  sess.use { s =>
    query.prepare(s).use(_.option(query.input))
  }

Having the Skunk interactions abstracted in this way will allow us to simplify testing further down the line.
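For instance, a test double doesn't need Skunk at all. A minimal sketch:

// Hypothetical stub: pretends the database has no rows at all
object EmptyDatabase extends Database:
  def stream[I, O](query: SqlQuery[I, O]): fs2.Stream[IO, O] =
    fs2.Stream.empty

A fancier stub could pattern-match on the query and serve canned rows.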

Authentication

Now let's work towards the actual functionality, starting with matters of authentication, namely:

  1. Password hashing
  2. JWT tokens

Because we're in the sweet warm embrace of the JVM again, there's no need to build bindings to OpenSSL like last time - we'll just use the built-in MessageDigest API.

Same with JWTs - we already agreed to use the JWT Scala library, so there's nothing really to set up.

To outline our set of requirements:

  1. Passwords should be hashed with SHA-256, using a 16-character salt.

  2. JWT tokens come in two flavours:

    enum Kind:
      case AccessToken, RefreshToken
    
    • Refresh tokens are long-lived, and their sole purpose in life is to be exchanged for an access token

    • Refresh tokens cannot themselves be refreshed; once they expire, the user will need to re-login

    • Access tokens are short-lived (15 minutes tops) and give the user access to all the operations.

This is a more convoluted and restricted approach to tokens than what I've done in Twotm8, and it will dramatically affect the frontend logic and parts of the backend logic.

Password hashing

A very inefficient implementation can look like this:

crypto.scala

import jobby.spec.*
import cats.effect.*
import cats.effect.std.Random
import java.security.MessageDigest

object Crypto:
  def hashPassword(raw: UserPassword): IO[HashedPassword] =
    Random.javaSecuritySecureRandom[IO].flatMap { r =>
      for
        seed <- r.nextString(16)
        seeded = seed + ":" + raw.value
        digest = sha256(seeded)
      yield HashedPassword(seed + ":" + digest)
    }

  def sha256(s: String): String =
    val hash = MessageDigest.getInstance("SHA-256")
    hash.update(s.getBytes)
    bytesToHex(hash.digest)

  private def bytesToHex(bytes: Array[Byte]): String =
    val sb = StringBuilder()
    bytes.foreach { b =>
      sb.append(String.format("%02x", b))
    }
    sb.result
end Crypto

It's inefficient because we're constructing a new instance of the random generator every time the function is invoked.
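If we cared, the fix would be to construct the generator once and reuse it - a sketch (not the code used in this post):

// Hypothetical: the Random[IO] is created once at startup and reused
class Hasher(random: Random[IO]):
  def hashPassword(raw: UserPassword): IO[HashedPassword] =
    for seed <- random.nextString(16)
    yield HashedPassword(seed + ":" + Crypto.sha256(seed + ":" + raw.value))

object Hasher:
  val create: IO[Hasher] = Random.javaSecuritySecureRandom[IO].map(Hasher(_))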

Next LinkedIn this app is not, so let's look at this inefficiency in disgust and move on.

JWT tokens

The two main operations we need are

  1. "Minting" the tokens
  2. Validating tokens

The contents of the token will be controlled by some configuration, which we'll specify as such:

case class JwtConfig(
    secretKey: Secret,
    algorithm: JwtHmacAlgorithm,
    audience: JWT.Kind => String,
    expiration: JWT.Kind => FiniteDuration,
    issuer: JWT.Kind => String
)
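For example (the values are illustrative, and Secret is assumed to simply wrap a plaintext key):

import pdi.jwt.JwtAlgorithm
import scala.concurrent.duration.*

val jwtConfig = JwtConfig(
  secretKey = Secret("change-me-please"),
  algorithm = JwtAlgorithm.HS256,
  audience = kind => s"jobby:$kind",
  expiration = {
    case JWT.Kind.AccessToken  => 15.minutes
    case JWT.Kind.RefreshToken => 14.days
  },
  issuer = _ => "jobby"
)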

And create/validate use this config as such:

def create(kind: Kind, userId: UserId, config: JwtConfig) =
  val claim = JwtClaim(
    issuer = Some("jobby"),
    expiration = Some(
      Instant.now
        .plusSeconds(config.expiration(kind).toSeconds)
        .getEpochSecond
    ),
    issuedAt = Some(Instant.now.getEpochSecond),
    audience = Option(Set(config.audience(kind))),
    subject = Option(userId.value.toString)
  )

  JwtUpickle.encode(claim, config.secretKey.plaintext, config.algorithm)
end create

def validate(token: String, kind: Kind, config: JwtConfig): Try[UserId] =
  JwtUpickle
    .decode(token, config.secretKey.plaintext, Seq(config.algorithm))
    .flatMap { claim =>
      val aud      = claim.audience.getOrElse(Set.empty)
      val expected = Set(config.audience(kind))
      if aud == expected then Success(claim)
      else
        Failure(
          new Exception(s"Audience: $aud didn't match expected: $expected")
        )
    }
    .flatMap { claim =>
      claim.subject match
        case None    => Failure(new Exception("no subject in JWT"))
        case Some(i) => Try(UUID.fromString(i)).map(UserId.apply)
    }

Token security

The presence of two types of tokens is inspired by the OWASP JWT Cheatsheet, which states several rules for secure token storage.

We make our access token short-lived, so that if it's stolen, it is not useful for a long time.

We will also store the access token only in memory on the web page, while the refresh token will be stored as a secure, hardened cookie with a restricted path, so that scripts running on the page can't access it.

We're certainly not going nearly far enough, but it's acceptable for now.

Services

Users

Smithy4s will generate the necessary models and service traits from our specs, so all we need to do is provide the implementations, i.e.

class UserServiceImpl(
    db: Database,
    auth: HttpAuth,
    logger: Scribe[IO],
    deployment: Deployment
) extends UserService[IO]:

  override def login(login: UserLogin, password: UserPassword): IO[Tokens] = ???

  override def register(login: UserLogin, password: UserPassword): IO[Unit] = ???

  override def refresh(
      refreshToken: Option[RefreshToken],
      logout: Option[Boolean]
  ): IO[RefreshOutput] = ???

For example, the implementation of login roughly looks like this:

db.option(op.GetCredentials(login)).flatMap {
  case None => IO.raiseError(CredentialsError("User not found"))
  case Some(id -> hashed) =>
    val seed :: hash :: Nil = hashed.process(_.split(":").toList)
    val requestHash         = Crypto.sha256(seed + ":" + password.value)

    if !requestHash.equalsIgnoreCase(hash) then
      IO.raiseError(CredentialsError("Wrong credentials"))
    else
      val (refresh, maxAgeRefresh) = auth.refreshToken(id)
      val (access, maxAgeAccess)   = auth.accessToken(id)

    // ...
    // Produce a Tokens object with the generated tokens
}

In other words:

  1. Look up the user in the database using the GetCredentials operation
  2. If the user is not found, raise an error
  3. If the user is found, we get their id and hashed password back
  4. Verify the supplied password against the hash stored in the database; if they don't match, raise an error
  5. If everything looks good, generate a pair of refresh and access tokens

In the Scala code, at this point we just need to return a Tokens object, which is generated by Smithy4s from this specification:

structure Tokens {
  @required
  access_token: AccessToken,

  @httpHeader("Set-Cookie")
  cookie: Cookie,

  expires_in: TokenExpiration
}

You can see that the cookie field will actually end up in a Set-Cookie header - so it needs to have the proper format. Luckily, we already have the http4s library brought in, so we can just use its ResponseCookie:

private def secureCookie(name: String, value: String, expires: HttpDate) =
  ResponseCookie(
    name,
    value,
    httpOnly = true,
    secure = deployment == Deployment.Live,
    path = Some("/api/users/refresh"),
    expires = Some(expires),
    sameSite = Some(SameSite.Strict)
  ).renderString

You can see that this cookie is only valid on a single path, is HTTP-only (i.e. JS has no access to it), and only works on the site that set it.

As long as we return a Tokens object, Smithy4s will transform it into the correct HTTP response, rendering part as JSON, and part as the designated header.
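So a successful login roughly translates into a response of this shape (values are illustrative):

HTTP/1.1 200 OK
Set-Cookie: refresh_token=eyJhbG...; Expires=...; Path=/api/users/refresh; SameSite=Strict; HttpOnly
Content-Type: application/json

{"access_token": "eyJhbG...", "expires_in": 900}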

The implementation of register is thankfully much shorter:

override def register(login: UserLogin, password: UserPassword): IO[Unit] =
  val validation = List(
    validateUserLogin(login),
    validateUserPassword(password)
  ).traverse(IO.fromEither)

  validation *>
    Crypto
      .hashPassword(password)
      .flatMap { hash =>
        db.option(op.CreateUser(login, hash))
          .onError(ex => logger.error("Registration failed", ex))
          .adaptErr { case _ =>
            ValidationError("Failed to register")
          }
      }
      .void
end register

Those validate methods are interesting - ideally we'd want to give the user some immediate feedback in the frontend, without relying exclusively on server-side validation.

We can achieve this by putting the validator code into the shared module, which makes it available to the frontend as well!

package jobby
package validation

import jobby.spec.*

private[validation] def err[T](msg: String) =
  Left[ValidationError, T](ValidationError(msg))

private[validation] def ok =
  Right[ValidationError, Unit](())

def validateUserLogin(login: UserLogin) =
  val str = login.value.trim
  if str.length == 0 then err("Login cannot be empty")
  else if str.length < 5 || str.length > 50 then
    err("Login cannot be shorter than 5, or longer than 50 characters")
  else ok

def validateUserPassword(password: UserPassword) =
  val str = password.value
  if str.exists(_.isWhitespace) then
    err("Password cannot contain whitespace characters")
  else if str.length < 12 || str.length > 128 then
    err("Password cannot be shorter than 12 or longer than 128 characters")
  else ok

Now, finally, the refresh operation is the most complicated, but here's the implementation plan:

  1. if the logout parameter is set to true, we need to "unset" the existing cookie on the client - this is done by setting maxAge attribute to 0 in the cookie.

  2. if the refresh_token cookie is present and is valid (not expired), then we can mint a new access token, and return it to the user

Note that at no point do we issue a new refresh token - that's a job for the login operation.

Refresh also presents a prime opportunity to verify that the user in the refresh token exists, that their credentials have not changed since the time this refresh token was issued, and many other important things. Of which we will do none.
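A condensed sketch of that plan (the helpers and the exact RefreshOutput constructors are illustrative, not the real generated names):

override def refresh(
    refreshToken: Option[RefreshToken],
    logout: Option[Boolean]
): IO[RefreshOutput] =
  if logout.contains(true) then
    IO.pure(logoutOutput) // hypothetical helper: a cookie with Max-Age=0 "unsets" it
  else
    for
      token  <- IO.fromOption(refreshToken)(CredentialsError("No refresh token"))
      userId <- auth.refresh(token) // hypothetical: verifies signature, audience, expiration
      (access, expiresIn) = auth.accessToken(userId)
    yield accessOutput(access, expiresIn) // hypothetical constructor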

Companies

Thankfully, the companies service is a lot simpler.

The longest method is createCompany:

class CompaniesServiceImpl(db: Database, httpAuth: HttpAuth)
    extends CompaniesService[IO]:

  override def createCompany(
      auth: AuthHeader,
      attributes: CompanyAttributes
  ): IO[CreateCompanyOutput] =
    httpAuth.access(auth).flatMap { userId =>
      val validation = List(
        validateCompanyName(attributes.name),
        validateCompanyDescription(attributes.description),
        validateCompanyUrl(attributes.url)
      ).traverse(IO.fromEither)

      validation *>
        db.option(
          op.CreateCompany(userId, attributes)
        ).flatMap {
          case None => IO.raiseError(ValidationError("Company already exists"))
          case Some(id) => IO.pure(CreateCompanyOutput(id))
        }
    }
  end createCompany
  // ...

The rest are basically 2-3 lines each:

  override def deleteCompany(auth: AuthHeader, id: CompanyId): IO[Unit] =
    httpAuth.access(auth).flatMap { userId =>
      db.option(op.DeleteCompanyById(id, userId)).flatMap {
        case None     => IO.raiseError(ForbiddenError())
        case Some(id) => IO.unit
      }
    }

  override def getCompanies(ids: List[CompanyId]) =
    ids
      .traverse(id => db.option(op.GetCompanyById(id)))
      .map(_.flatten)
      .map(GetCompaniesOutput.apply)

The DeleteCompanyById database operation has to be implemented like this to enforce ownership:

case class DeleteCompanyById(company: CompanyId, user: UserId)
    extends SqlQuery(
      company -> user,
      sql"""
        delete from companies where company_id = $companyId and owner_id = $userId
        returning 'ok'::varchar
      """.query(varchar)
    )

There's nothing really interesting about it, so let's move on...

Jobs

...to another uninteresting service - jobs!

The only difference here is that we will abstract away the operation of getting the current time (used as the added attribute on a job), so that we can later manipulate time in tests, for example when asserting on the ordering of jobs.

To do that we'll define a poorly named trait TimeCop with this minimal interface and default implementation:

trait TimeCop:
  def nowODT: IO[OffsetDateTime]
  def timestamp: IO[Timestamp] = nowODT.map(Timestamp.fromOffsetDateTime)
  def timestampNT(nt: Newtype[Timestamp]): IO[nt.Type] =
    timestamp.map(nt.apply)

object TimeCop:
  val unsafe: TimeCop = new:
    def nowODT = IO.realTimeInstant.map(_.atOffset(ZoneOffset.UTC))

Note the timestampNT method - it will be useful when working with any of Smithy4s' newtypes that are backed by the Timestamp class.
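For instance, passing in the JobAdded companion object gives us back the correctly-typed newtype:

val added: IO[JobAdded] = timeCop.timestampNT(JobAdded)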

With that, here's the implementation of createJob:

override def createJob(
    authHeader: AuthHeader,
    companyId: CompanyId,
    attributes: JobAttributes
): IO[CreateJobOutput] =
  for
    userId        <- auth.access(authHeader)
    companyLookup <- db.option(op.GetCompanyById(companyId))
    company <- IO.fromOption(companyLookup)(
      ValidationError("company not found")
    )

    _ <- IO.raiseUnless(company.owner_id == userId)(ForbiddenError())

    _ <- List(
      validateJobTitle(attributes.title),
      validateJobDescription(attributes.description),
      validateJobUrl(attributes.url),
      validateSalaryRange(attributes.range)
    ).traverse(IO.fromEither)

    added <- timeCop.timestampNT(JobAdded)

    createdJob <-
      db.option(
        op.CreateJob(
          companyId,
          attributes,
          added
        )
      )
    jobId <- IO.fromOption(createdJob)(
      ValidationError("well you *must have* done something wrong")
    )
  yield CreateJobOutput(jobId)
  end for
end createJob

The validator implementations are omitted for brevity.

Believe it or not, that's it - we now have implementations for our services, which for the most part don't deal with HTTP semantics or routes.

Let's see what it takes to turn this into an HTTP application.

HTTP routes

Serving frontend

Let's get the simpler part out of the way. When we bundle the application, the generated Scala.js frontend will be added to a resource location recognised by the JVM.

This is the responsibility of the build tool, and we'll cover how it's done later on, in the frontend part of this series.

What's important is that the backend can expect that if the client requests a .js file, it should be served from the app's resources.

We can encode it like this:

object Static:
  def routes =
    val indexHtml = StaticFile
      .fromResource[IO](
        "index.html",
        None,
        preferGzipped = true
      )
      .getOrElseF(NotFound())

    HttpRoutes.of[IO] {
      case req @ GET -> Root / "assets" / filename
          if filename.endsWith(".js") || filename.endsWith(".js.map") =>
        StaticFile
          .fromResource[IO](
            Paths.get("assets", filename).toString,
            Some(req),
            preferGzipped = true
          )
          .getOrElseF(NotFound())
      case req @ GET -> Root        => indexHtml
      case req if req.method == GET => indexHtml

    }
  end routes
end Static

This will serve any *.js (or *.js.map) file requested under the /assets/ path.

Note the last two branches in the match block:

case req @ GET -> Root        => indexHtml
case req if req.method == GET => indexHtml

This ensures that the Single-Page Application URLs that client requests will serve the index.html page - and the frontend code loaded from it will parse the URL by itself.

Service routes

Smithy4s provides SimpleRestJsonBuilder, which takes a service implementation and renders it as http4s' HttpRoutes, implementing a custom json-in/json-out protocol aptly named SimpleRestJson: https://disneystreaming.github.io/smithy4s/docs/protocols/simple-rest-json/overview

Let's roughly split the entities in our system into two categories:

  1. Things that either change (write to) the outside world or read from it
  2. Things that interact with the outside world only by invoking methods on entities from group 1

Group 1: Logging, Database, AppConfig, TimeCop

Group 2: UserServiceImpl, CompaniesServiceImpl, JobsServiceImpl

Group 1 is what we will most often see injected as parameters in group 2 - and the entire dependency graph of our app can be built by starting from group 1. So if we want to build the HTTP routes, we can define a method with this signature:

def Routes(
    db: Database,
    config: AppConfig,
    logger: Scribe[IO],
    timeCop: TimeCop
) = //...

And the services themselves can be bootstrapped from those dependencies. This is not strictly about HTTP routes per se - it's more the approach I take when thinking about dependency graphs and method signatures, as it necessarily affects what I can and cannot test by providing fake implementations.

With the addition of a generic error handler, this is all we need to do to create the full HTTP API for our app:

def Routes(
    db: Database,
    config: AppConfig,
    logger: Scribe[IO],
    timeCop: TimeCop
): Resource[IO, HttpApp[IO]] = 
  def handleErrors(routes: HttpRoutes[IO]) =
    routes.orNotFound.onError { exc =>
      Kleisli(request => logger.error("Request failed", request.toString, exc))
    }

  val auth = HttpAuth(config.jwt, logger)

  for
    companies <- SimpleRestJsonBuilder
      .routes(CompaniesServiceImpl(db, auth))
      .resource

    jobs <- SimpleRestJsonBuilder
      .routes(JobServiceImpl(db, auth, timeCop))
      .resource

    users <- SimpleRestJsonBuilder
      .routes(UserServiceImpl(db, auth, logger, config.http.deployment))
      .resource
  yield handleErrors(jobs <+> companies <+> users <+> Static.routes)
  end for
end Routes

Isn't this just beautiful?

The last thing we will add is a way to bootstrap the entire app from just the configuration:

class JobbyApp(
    val config: AppConfig,
    db: Database,
    logger: Scribe[IO],
    timeCop: TimeCop
)(using natchez.Trace[IO]):
  def routes = Routes(db, config, logger, timeCop)
end JobbyApp

object JobbyApp:
  def bootstrap(config: AppConfig, logger: Scribe[IO])(using
      natchez.Trace[IO]
  ) =
    for db <- SkunkDatabase.load(config.postgres, config.skunk)
    yield JobbyApp(config, db, logger, TimeCop.unsafe)

Note that we're not bootstrapping from the values of the environment - that will be the job of the module we define next, the one with the actual entry point for the JVM to run.

App and deployment

Now let's define a module which will handle the following responsibilities:

  • Apply database migrations with Flyway
  • Launch the actual HTTP server with the application routes
  • Read command line arguments and environment variables and pass them to config bootstrap logic

And lastly, this module will be packaged using sbt-native-packager to deploy the entire app to the cloud.

SBT configuration

No special configuration required for now:

lazy val app = projectMatrix
  .in(file("modules/app"))
  .dependsOn(backend)
  .defaultAxes(defaults*)
  .jvmPlatform(Seq(Versions.Scala))
  .settings(
    scalaVersion            := Versions.Scala,
    Compile / doc / sources := Seq.empty,
    libraryDependencies ++= Seq(
      "org.http4s"    %% "http4s-ember-server" % Versions.http4s,
      "org.postgresql" % "postgresql"          % Versions.Postgres,
      "org.flywaydb"   % "flyway-core"         % Versions.Flyway
    )
  )

Flyway migrations

Let's wrap Flyway's API in a very primitive method that will work reasonably well with the rest of our codebase:

import org.flywaydb.core.Flyway
import org.flywaydb.core.api.exception.FlywayValidateException

def migrate(postgres: PgCredentials) =
  import postgres.*
  val url =
    s"jdbc:postgresql://$host:$port/$database"

  val flyway =
    IO(Flyway.configure().dataSource(url, user, password.getOrElse("")).load())
      .flatMap { f =>
        val migrate = IO(f.migrate()).void
        val repair  = IO(f.repair()).void

        migrate.handleErrorWith {
          case _: FlywayValidateException =>
            repair.redeemWith[Unit](
              ex => IO.raiseError(ex),
              _ => migrate
            )
          case other => IO.raiseError(other)
        }
      }

  Resource.eval(flyway)
end migrate

Not much to say here - this works reasonably well, but it may well be a completely wrong way of using Flyway - if that's the case I'd love to know.

HTTP server

Assuming we have an HttpConfig structure:

enum Deployment:
  case Live, Local

case class HttpConfig(host: Host, port: Port, deployment: Deployment)

Our server logic would look like this:

package jobby

import org.http4s.ember.server.EmberServerBuilder
import cats.effect.IO
import org.http4s.HttpApp

def Server(config: HttpConfig, app: HttpApp[IO]) =
  EmberServerBuilder
    .default[IO]
    .withPort(config.port)
    .withHost(config.host)
    .withHttpApp(app)
    .build
end Server

Startup sequence

Finally, we have enough for the entire app launch sequence:

object Main extends IOApp:
  def run(args: List[String]) =
    import natchez.Trace.Implicits.noop

    Resource
      .eval(AppConfig.load(sys.env, args)) // Load config from env 
      .flatMap(JobbyApp.bootstrap(_, scribe.cats.io)) // bootstrap app 
      .flatTap(app => migrate(app.config.postgres)) // apply migrations
      .flatMap(jobbyApp =>
        jobbyApp.routes.flatMap(Server(jobbyApp.config.http, _))
      ) // create HTTP routes and start HTTP server
      .use(_ => IO.never)
  end run
end Main

Packaging

We will use sbt-native-packager to create an application bundle.

To enable packaging, add this line to the app project definition in build.sbt:

lazy val app = projectMatrix
  // ...
  .defaultAxes(defaults*)
  .jvmPlatform(Seq(Versions.Scala))
  .enablePlugins(JavaAppPackaging) // <--- the important bit!
  .settings(
    scalaVersion            := Versions.Scala,
  // ...

Now if we run sbt 'app/stage', the modules/app/target/jvm-3/universal/stage/ directory will contain our app bundle, which we can run using the generated script in bin/app.

Platform.sh configuration

I won't replicate the full tutorial here, but at the end of the setup we should have:

  1. Project visible in Platform.sh dashboard
  2. Our SSH key added to the project
  3. A special Git remote (unique per-project) added to the local Git repo (remote will be called platform)

Once all of this is done, all we need to set up is

  1. Build command
  2. Routes
  3. Relationships (e.g. PostgreSQL)

Platform.sh supports Java applications up to version 14 (not sure why this particular version; the list probably hasn't been updated in a while), and by default the build environment doesn't have SBT installed. This is not a problem though, as SBT can bootstrap itself from a small bash script.

Let's download and check in the sbt script at the root of our repository.

curl -Lo sbt https://raw.githubusercontent.com/sbt/sbt/v1.6.2/sbt && chmod +x ./sbt

Then packaging our application is as simple as running the ./sbt app/stage command.

To let Platform.sh build environment know how to build and run the app, we need to create a file called .platform.app.yaml at the root of the project:

name: app

type: "java:14"

disk: 1024

hooks:
    build: './sbt app/stage'

relationships:
    database: "db:postgresql"

variables:
    env:
        JAVA_OPTS: '-Xmx3G'

web:
    commands:
        start: modules/app/target/jvm-3/universal/stage/bin/app $PORT

A few notes:

  1. The hooks.build block sets up a command that needs to be run at build time

  2. The relationships block creates a relationship named database, which references a service called db of type postgresql

    We'll define the db service in a second.

  3. web.commands.start is the command that needs to be run to launch the application. Note the usage of $PORT env variable - it's set by Platform.sh and we pass it into the app itself, to make sure the HTTP server binds to the right port.

Currently we're referencing a service called db which we need to define - this is done in .platform/services.yaml:

db:
  type: postgresql:13
  disk: 1024

And Platform.sh requires us to be explicit about the routing of requests, which can be set up in .platform/routes.yaml, sending all requests to app upstream (for that is what we named it):

"https://www.{default}/":
    type: upstream
    upstream: "app:http"

"https://{default}/":
    type: redirect
    to: "https://www.{default}/"

Which seems to be just a very simplified DSL to generate Nginx configurations.

If everything is set up, you should be able to just git push platform main and see the project being built and deployed! The whole setup process was quite simple, and I was quite happy to see the API responding:

> http https://www.main-bvxea6i-scixcepouwedi.uk-1.platformsh.site/api/companies/37f09cd4-4e18-49f6-ab65-a5cb358c8d03
HTTP/1.1 200 OK
Content-Length: 303
Content-Type: application/json
Date: Tue, 14 Jun 2022 11:18:24 GMT
Strict-Transport-Security: max-age=0
traceresponse: 00-16f878a2e3b6e900a83537ebaeea6f63-835198f9a4a24d07-00

{
    "attributes": {
        "description": "Just doing things with stuff, sometimes it works, sometimes it doesn't. what else do you want? how many characters is this supposed to be?",
        "name": "People doing things",
        "url": ""
    },
    "id": "37f09cd4-4e18-49f6-ab65-a5cb358c8d03",
    "owner_id": "33e9a360-b843-4e93-ab90-cee6213d3ac8"
}

Heroku configuration

Deployment with Heroku is much easier and much more flexible - we will be using Docker containers to achieve that.

In the app module's SBT configuration we need to set some basic Docker settings:

build.sbt

dockerBaseImage         := "eclipse-temurin:17",
Docker / packageName    := "jobby-smithy4s",

To bind the container to the correct port, we make sure that our HttpConfig is created using the $PORT environment variable that Heroku sets.
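A sketch of how that could look in the config loading, assuming ip4s' Port (which http4s uses):

import com.comcast.ip4s.*

// Hypothetical: prefer Heroku's $PORT, fall back to a local default
val httpPort: Port =
  sys.env.get("PORT").flatMap(Port.fromString).getOrElse(port"8080")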

The app can be deployed directly from GitHub Actions once we configure the necessary secrets and add this step to the pipeline:

    - name: Deploy
      if: startsWith(github.ref, 'refs/tags/v') || (github.ref == 'refs/heads/main')        
      env:
        HEROKU_API_KEY: ${{ secrets.HEROKU_API_KEY }}
        HEROKU_DEV_APP_NAME: ${{ secrets.HEROKU_DEV_APP_NAME }}
      run: | 
        sbt --client app/Docker/publishLocal
        # SECURITY YO
        curl https://cli-assets.heroku.com/install.sh | sh
        heroku container:login
        docker tag jobby-smithy4s:0.1.0-SNAPSHOT registry.heroku.com/$HEROKU_DEV_APP_NAME/web
        docker push registry.heroku.com/$HEROKU_DEV_APP_NAME/web 
        heroku container:release web -a $HEROKU_DEV_APP_NAME

And it seems to work!

~ > http https://jobby-smithy4s.herokuapp.com/api/companies/4a35faad-67c2-4f35-ae41-0ec5526ba3ff
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 414
Content-Type: application/json
Date: Sun, 24 Jul 2022 12:05:07 GMT
Server: Cowboy
Via: 1.1 vegur

{
    "attributes": {
    ...

I liked this setup so much that I stopped bothering to redeploy to Platform.sh.

But instead of testing things manually, let's write some actual tests in the laziest way possible - see you in part 3!