Twotm8 (p.4): Building the backend

scala, scala3, scala-native, fly.io, sn-bindgen, c, nginx-unit, series:twotm8



Meta note - this post is huge, as we're building the entire backend from start to finish. Please use the Table of contents above and the navigation on the side.


Requirements

As mentioned in the introduction to the series, we are building what is essentially a superior version of Twitter, with all the possible sources of positivity removed.

Due to the limited amount of time and margin space, we need to reduce our feature set to just the most important features, and take some shortcuts. This means that the design we're developing will very likely not scale beyond a few thousand users, and our approach to authentication goes against all possible recommendations.

With this important disclaimer out of the way, let's go over the features we must support.

Functional

  • Thought leaders must be able to register

    No emails, or any other personal information. Just the nickname and password, where

    • password must be between 8 and 64 characters long
    • nickname must be between 4 and 32 characters long
    • nicknames must be unique
  • Registered thought leaders must be able to login

    After successfully logging in, their authentication must be preserved for 2 weeks

  • Authenticated thought leaders must be able to create and delete twots

    • twots must be between 1 and 128 characters long
  • Thought leaders can follow each other

    • following yourself is prohibited
  • Thought leaders have their own "wall"

    • wall consists of twots by the authenticated user and all the thought leaders they follow
  • Thought leaders can react with "uwotm8" to any twots, by any users

    • each user can only react once to each twot, and that reaction may be withdrawn at any point
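
The length limits above boil down to a handful of tiny checks. Here's a minimal sketch, assuming plain String inputs; the helper names (checkLength, validateNickname and friends) and the Either-based error reporting are made up for illustration and are not part of the actual codebase:

```scala
// Hypothetical helpers: names and error messages are assumptions for illustration.
def checkLength(what: String, value: String, min: Int, max: Int): Either[String, String] =
  if value.length >= min && value.length <= max then Right(value)
  else Left(s"$what must be between $min and $max characters long")

// Limits taken from the functional requirements above
def validateNickname(nickname: String): Either[String, String] =
  checkLength("nickname", nickname, 4, 32)

def validatePassword(password: String): Either[String, String] =
  checkLength("password", password, 8, 64)

def validateTwot(text: String): Either[String, String] =
  checkLength("twot", text, 1, 128)
```

Uniqueness of nicknames can't be checked this way, it needs the database.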

Security

  • Passwords should not be stored in plaintext
  • Password hashes should be salted with exactly 16 random characters per salt
  • Authentication should be hard to spoof, and should use signed JWTs
  • It should be impossible to use specific API routes without a valid token

Non-functional

  • Application should handle failures gracefully

    • This excludes failures designed to restart the app, as per our deployment model
  • All requests must complete within 1 second

    • This is not very ambitious and in reality our app is much, much faster

Database schema

Code

When I embarked on this project, I realised just how long I've gone without working with an RDBMS - most of my career has revolved around various NoSQL databases.

So I attempted to design a database schema that mostly works, and it should be pretty obvious which bits just won't scale at all.

Thought leaders table

  • If in the future we want to allow thought leaders to change their nickname (even though by their very definition they should get it "right" on the first attempt), we want to have a stable identifier in the database and a potentially mutable nickname

  • As we have no profile information, all we need to ensure is that credentials are stored alongside stable ID, and there's a unique constraint on the nickname

  • Additionally, we will create an index on the nicknames, for faster lookups

Here's the schema:

CREATE TABLE thought_leaders(
  thought_leader_id uuid PRIMARY KEY,
  nickname character varying(50) not null,
  salted_hash character(81) not null -- 64 for the hash + 16 for the salt + ':'
);
CREATE UNIQUE INDEX thought_leaders_nickname_idx ON thought_leaders (LOWER(nickname));

Twots table

Twots don't have any information outside of

  • unique twot ID
  • author's ID
  • text
  • "added at" timestamp, for sorting

This, plus a foreign key constraint, leads to this table schema:

CREATE TABLE twots(
  twot_id uuid primary key,
  author_id uuid not null,
  content character varying(128),
  added timestamptz not null,
  CONSTRAINT fk_author FOREIGN KEY(author_id)
    REFERENCES thought_leaders(thought_leader_id)
);

Followers table

This will be our first truly non-scalable decision - we will store pairs of <leader, follower> in a separate table, which for people with lots and lots of followers will lead to reading a huge amount of data.

CREATE TABLE followers(
  leader_id uuid not null,
  follower uuid not null,
  UNIQUE (leader_id, follower),
  CONSTRAINT fk_leader FOREIGN KEY(leader_id) 
    REFERENCES thought_leaders(thought_leader_id),
  CONSTRAINT fk_follower FOREIGN KEY(follower) 
    REFERENCES thought_leaders(thought_leader_id)
);

CREATE INDEX followers_leader_idx ON followers(leader_id);

CREATE INDEX followers_follower_idx ON followers(follower);

Uwotm8 storage

Continuing with our non-scalable decisions, we will store each uwotm8 as a row in the database. So if there's a lot of negativity on the platform, it will cost us a pretty penny in storage.

CREATE TABLE uwotm8s(
  author_id uuid not null,
  twot_id uuid not null,
  UNIQUE (author_id, twot_id),
  CONSTRAINT fk_author FOREIGN KEY(author_id) 
    REFERENCES thought_leaders(thought_leader_id),
  CONSTRAINT fk_twot FOREIGN KEY(twot_id) 
    REFERENCES twots(twot_id)
);

CREATE INDEX twot_uwotm8s_idx ON uwotm8s(twot_id);

CREATE VIEW uwotm8_counts AS (
  SELECT twot_id, count(*)::int4 AS uwotm8Count 
  FROM uwotm8s 
  GROUP BY twot_id
);

Note that we are enforcing the uniqueness of <author_id, twot_id> pair. We're also creating a uwotm8_counts view to make our queries a bit more readable.

Domain model

Code

Now let's write a bit of Scala, finally!

Just from a cursory look at our database schema, we can see several identifiers that are backed by the same type, but are semantically different (say, AuthorId vs TwotId, Nickname vs Text, etc.), so we would like to avoid using raw types (UUID or String in Scala land), to enhance compile-time safety.

Additionally, we have an interesting case of identifiers referring to same entity types, but being semantically different in particular context - when talking about leaders and followers, both of them are AuthorId, but are very different and should not be mistaken one for another.

When writing Scala 3 code, I've gotten into the following habit a lot:

  1. Define opaque types
  2. Define companion object skeleton with some general methods

Let's define a general OpaqueValue trait:

trait OpaqueValue[T, X](using ap: T =:= X):
  self =>
  inline def apply(s: X): T = ap.flip(s)
  inline def value(t: T): X = ap(t)
  extension (k: T)
    inline def raw = ap(k)
    inline def into[T1](other: OpaqueValue[T1, X]): T1 = other.apply(raw)
    inline def update(inline f: X => X): T =
      apply(f(raw))
end OpaqueValue

Where X is the underlying, runtime type (like UUID) and T is the semantically appropriate, higher level type (like AuthorId).

Note the T =:= X evidence we request - the compiler will synthesise it for us if possible, and it gives us both compile-time assurances, and runtime apply(...) and flip.apply(...) functions to go between types known to be the same.

We can use it to define types like this:

opaque type AuthorId = UUID
object AuthorId extends OpaqueValue[AuthorId, UUID]

opaque type Follower = UUID
object Follower extends OpaqueValue[Follower, UUID]

opaque type Nickname = String
object Nickname extends OpaqueValue[Nickname, String]

opaque type TwotId = UUID
object TwotId extends OpaqueValue[TwotId, UUID]

opaque type Text = String
object Text extends OpaqueValue[Text, String]

opaque type Uwotm8Count = Int
object Uwotm8Count extends OpaqueValue[Uwotm8Count, Int]

opaque type JWT = String
object JWT extends OpaqueValue[JWT, String]

At compile-time, AuthorId and Follower will be treated as completely different types. At run-time, they will be completely erased to their underlying UUID type.

Using this generic OpaqueValue trait allows us to express going between different opaque types, like the function into that we defined. It allows us to do the following:

def makeFollower(aid: AuthorId): Follower = 
  aid.into(Follower)
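
To play with this outside of the project, here is a self-contained sketch. It uses a simplified, non-inline variant of the trait, and wraps everything in a hypothetical `domain` object so that the opaque types are actually opaque to the code that uses them:

```scala
import java.util.UUID

object domain:
  // Simplified (non-inline) variant of the OpaqueValue trait
  trait OpaqueValue[T, X](using ap: T =:= X):
    def apply(x: X): T = ap.flip(x)
    extension (t: T)
      def raw: X = ap(t)
      def into[T1](other: OpaqueValue[T1, X]): T1 = other.apply(ap(t))

  opaque type AuthorId = UUID
  object AuthorId extends OpaqueValue[AuthorId, UUID]

  opaque type Follower = UUID
  object Follower extends OpaqueValue[Follower, UUID]
end domain

import domain.*

// Outside of `domain`, an AuthorId cannot be passed where a Follower
// is expected; the conversion has to be explicit.
def makeFollower(aid: AuthorId): Follower =
  aid.into(Follower)
```

The extension methods are found without extra imports because the companion object of an opaque type is part of its implicit scope.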

For booleans (not that we have many...) we can add pre-defined values:

trait YesNo[A](using ev: Boolean =:= A) extends OpaqueValue[A, Boolean]:
  val Yes: A = ev.apply(true)
  val No: A = ev.apply(false)

opaque type Uwotm8Status = Boolean
object Uwotm8Status extends YesNo[Uwotm8Status]

This is convenient (and allocation-free) for types where we want to preserve some inherent characteristics, namely things like hashCode and toString.

But for passwords and secrets that's the opposite of what we want - we don't want to accidentally log the password in plaintext, or even its hash, or the secret we will be using for signing JWTs.

For those cases we will resort to the regular classes:

class Secret(val plaintext: String):
  override def toString() = "<secret>"

class Password(plaintext: String):
  override def toString() = "<password>"
  def process[A](f: String => A): A = f(plaintext)

class HashedPassword(val ciphertext: String):
  override def toString() = "<hashed-password>"

For passwords we're going arguably too far in terms of paranoia and only allow processing the password, slightly complicating the direct retrieval.

The rest of the domain models are basically aggregators of our primitive types:

case class ThoughtLeader(
    id: AuthorId,
    nickname: Nickname,
    following: Vector[AuthorId],
    followers: Vector[Follower],
    twots: Vector[Twot]
)

case class Twot(
    id: TwotId,
    author: AuthorId,
    authorNickname: Nickname,
    content: Text,
    uwotm8Count: Uwotm8Count,
    uwotm8: Uwotm8Status
)

case class Token(
    jwt: JWT,
    expiresIn: Long
)

Note that our models will sometimes be used both for interpreting database results and for rendering JSON responses.

For composite models (like Twot) it's a bad idea in general, but for our small app it will be fine. Note that we're also not putting any typeclass instances in companion objects - we will define them separately and bring them into scope when necessary.

App database interface

Code

Now that we have some definition of domain, we can refine the raw database interface we defined in part 2, and define a new one, that reflects the level of interaction we need from the database.

The interface I arrived at is as such:


trait DB:
  def get_twots(authorId: AuthorId): Vector[Twot]
  def get_twots_perspective(authorId: AuthorId, viewedBy: AuthorId): Vector[Twot]
  def connectionIsOkay(): Boolean
  def get_credentials(nickname: Nickname): Option[(AuthorId, HashedPassword)]
  def get_wall(authorId: AuthorId): Vector[Twot]
  def delete_uwotm8(authorId: AuthorId, twot_id: TwotId): Uwotm8Status
  def add_uwotm8(authorId: AuthorId, twot_id: TwotId): Uwotm8Status
  def add_follower(follower: Follower, leader: AuthorId): Unit
  def delete_follower(follower: Follower, leader: AuthorId): Unit
  def get_followers(leader: AuthorId): Vector[Follower]
  def get_following(leader: AuthorId): Vector[AuthorId]
  def get_thought_leader_nickname(id: AuthorId): Option[Nickname]
  def get_thought_leader_id(id: Nickname): Option[AuthorId]
  def create_twot(authorId: AuthorId, text: Text): Option[TwotId]
  def delete_twot(authorId: AuthorId, twotId: TwotId): Unit
  def register(nickname: Nickname, pass: HashedPassword): Option[AuthorId]
end DB

Most of the methods should be self-explanatory through our generous use of rich data types. One that might stand out is get_twots_perspective, as a dual to get_twots.

We have two versions to more easily differentiate between the cases when a guest is browsing a thought leader's twots, vs an authenticated user. In the first case, the guest won't see any uwotm8 reactions of their own, and in the second - they might.

So it's important from whose perspective the twots are being retrieved.

Note also the usage of Follower and AuthorId in add_follower - getting the order of those wrong would be disastrous, but we can help ourselves with different types.

The reason we define this as a separate interface is to make testing easier - not that I plan to write any tests at this point. Mocking database access is hard on its own, and in the land of native code it's (I think) impossible. Which leaves us with fakes/stubs/duals - which is the testing strategy I prefer anyways.
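
To show what a fake could look like, here's a minimal sketch. It covers only a deliberately reduced, hypothetical slice of the interface (FollowersDB is not a real trait in the project) and uses raw UUIDs instead of our opaque types to stay self-contained:

```scala
import java.util.UUID
import scala.collection.mutable

// Hypothetical, reduced slice of the DB interface
trait FollowersDB:
  def add_follower(follower: UUID, leader: UUID): Unit
  def delete_follower(follower: UUID, leader: UUID): Unit
  def get_followers(leader: UUID): Vector[UUID]

// In-memory fake: a set of (leader, follower) pairs mirrors the
// UNIQUE (leader_id, follower) constraint of the followers table.
class InMemoryFollowersDB extends FollowersDB:
  private val pairs = mutable.LinkedHashSet.empty[(UUID, UUID)]

  def add_follower(follower: UUID, leader: UUID): Unit =
    pairs += (leader -> follower)

  def delete_follower(follower: UUID, leader: UUID): Unit =
    pairs -= (leader -> follower)

  def get_followers(leader: UUID): Vector[UUID] =
    pairs.collect { case (`leader`, f) => f }.toVector
```

Anything written against the trait can then be exercised without a running Postgres.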

So we will define a private class that implements this interface, and it will be backed by the connection to Postgres:

object DB:
  def postgres(db: roach.Database)(using Zone): DB = new PostgresDB(db)

  private class PostgresDB(db: Database)(using Zone) extends DB:
    import roach.codecs.*

  //.. method implementations go here

As this is purely a database interface, any necessary logic should have happened before accessing any of these methods. So most of our methods will be purely of "execute this prepared query and read the results" nature. With that in mind, let's define a few helper methods for 3 distinct use cases:

  1. Returning zero or one results (methods returning Option[..])
  2. Returning all retrieved results (methods returning Vector[..])
  3. Returning nothing (Unit)

private def one[T, X](
    prep: roach.Database.Prepared[T],
    value: T,
    codec: Codec[X]
): Option[X] =
  Using.resource(prep.execute(value).getOrThrow)(_.readOne(codec))

private def all[T, X](
    prep: roach.Database.Prepared[T],
    value: T,
    codec: Codec[X]
): Vector[X] =
  Using.resource(prep.execute(value).getOrThrow)(_.readAll(codec))

private def exec[T](
    prep: roach.Database.Prepared[T],
    value: T
): Unit =
  Using.resource(prep.execute(value).getOrThrow)(_ => ())
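
All three helpers lean on scala.util.Using.resource to release the query result once we're done reading it. A small sketch of that behaviour, with a hypothetical FakeResult standing in for the real result type:

```scala
import scala.util.Using

// Hypothetical stand-in for a query result that holds on to a native resource
final class FakeResult(rows: Vector[String]) extends AutoCloseable:
  var closed = false
  def readOne: Option[String] = rows.headOption
  def readAll: Vector[String] = rows
  def close(): Unit = closed = true

def firstRow(result: FakeResult): Option[String] =
  // The result is closed when the function returns, even if it throws
  Using.resource(result)(_.readOne)
```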

Note that the codecs we have for Postgres types are expressed in terms of raw Scala types. To lift them into our richer opaque types, we can define the following extension method:

extension [X](c: Codec[X])
  private[db] inline def wrap[T](o: OpaqueValue[T, X]): Codec[T] =
    c.bimap(o.apply(_), o.value(_))

Which means we can do the following:

def asAuthorId(c: Codec[UUID]): Codec[AuthorId] = 
  c.wrap(AuthorId)
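
If bimap is unfamiliar, the idea fits in a few lines. The Codec below is a hypothetical, string-based toy and not the actual codec API of our Postgres library; it only demonstrates how a pair of functions lifts a codec from one type to another:

```scala
// Hypothetical toy codec, for illustration only
trait Codec[A]:
  self =>
  def decode(raw: String): A
  def encode(value: A): String
  // Lift Codec[A] to Codec[B] given conversions in both directions
  def bimap[B](f: A => B, g: B => A): Codec[B] =
    new Codec[B]:
      def decode(raw: String): B = f(self.decode(raw))
      def encode(value: B): String = self.encode(g(value))

val int: Codec[Int] = new Codec[Int]:
  def decode(raw: String): Int = raw.toInt
  def encode(value: Int): String = value.toString

// A wrapper type standing in for something like Uwotm8Count
final case class Count(value: Int)

val count: Codec[Count] = int.bimap(Count(_), _.value)
```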

And, as the final piece of the puzzle, we can finally apply the fruits of our labour with codec composition and define complicated codecs, for example for some of our classes!

private val hashedPasswordCodec =
  bpchar.bimap(s => HashedPassword(s), _.ciphertext)

private val twotCodec: Codec[Twot] =
  (uuid.wrap(TwotId) ~
    uuid.wrap(AuthorId) ~
    varchar.wrap(Nickname) ~
    varchar.wrap(Text) ~
    int4.wrap(Uwotm8Count) ~
    bool.wrap(Uwotm8Status)).as[Twot]

Finally! What follows will be the actual SQL queries used to implement the methods in our desired interface. If you don't want to read repetitive walls of SQL, feel free to skip to the next section.

Twot related queries

Creating twots: this one is pretty straightforward

def create_twot(authorId: AuthorId, text: Text): Option[TwotId] =
  one(create_twot_prepared, authorId -> text, uuid.wrap(TwotId))

private lazy val create_twot_prepared =
  db.prepare(
    """
    insert into twots(twot_id, author_id, content, added) 
              values (gen_random_uuid(), $1, $2, NOW()) 
              returning twot_id
              """,
    "create_twot",
    uuid.wrap(AuthorId) ~ varchar.wrap(Text)
  ).getOrThrow

Deleting twots

def delete_twot(authorId: AuthorId, twotId: TwotId): Unit = 
  exec(delete_twot_prepared, authorId -> twotId)

private lazy val delete_twot_prepared =
  db.prepare(
    """delete from twots where author_id = $1 and twot_id = $2""",
    "delete_twot",
    uuid.wrap(AuthorId) ~ uuid.wrap(TwotId)
  ).getOrThrow

Twots viewed by guest

def get_twots(authorId: AuthorId): Vector[Twot] =
  all(get_twots_prepared, authorId, twotCodec)

private lazy val get_twots_prepared =
  db.prepare(
    """
      select 
        t.twot_id, 
        t.author_id, 
        a.nickname, 
        t.content, 
        coalesce(u.uwotm8Count, 0), 
        false 
      from 
        twots t 
        left outer join uwotm8_counts u on t.twot_id = u.twot_id 
        inner join thought_leaders a on t.author_id = a.thought_leader_id 
      where 
        t.author_id = $1 
      order by 
        t.added desc
    """,
    "get_twots",
    uuid.wrap(AuthorId)
  ).getOrThrow

Twots viewed by a logged in user

Note the usage of CASE WHEN w.author_id IS NULL THEN false ELSE true END - we optionally join with the table that contains all the uwotm8s reactions, and check whether the viewer (logged in user) has indeed reacted to a particular twot.

def get_twots_perspective(
    authorId: AuthorId,
    viewedBy: AuthorId
): Vector[Twot] =
  all(get_twots_perspective_prepared, authorId -> viewedBy, twotCodec)

private lazy val get_twots_perspective_prepared =
  db.prepare(
    """
      select 
        t.twot_id, 
        t.author_id, 
        a.nickname, 
        t.content, 
        coalesce(u.uwotm8Count, 0), 
        CASE WHEN w.author_id IS NULL THEN false ELSE true END 
      from 
        twots t 
          left outer join uwotm8_counts u on t.twot_id = u.twot_id 
          inner join thought_leaders a on t.author_id = a.thought_leader_id 
          left outer join uwotm8s w on t.twot_id = w.twot_id 
                                                and w.author_id = $2 
      where 
        t.author_id = $1 
      order by 
        t.added desc
    """,
    "get_twots_viewed_by_authed_user",
    uuid.wrap(AuthorId) ~ uuid.wrap(AuthorId)
  ).getOrThrow

Wall

It's painfully similar to some of the queries above, but we need to compose the wall out of user's twots and the twots of thought leaders they follow.

def get_wall(authorId: AuthorId): Vector[Twot] =
  all(get_wall_prepared, authorId, twotCodec)

private val get_wall_prepared =
  db.prepare(
    """
    select 
      t.twot_id, 
      t.author_id, 
      a.nickname, 
      t.content, 
      coalesce(u.uwotm8Count, 0),
      CASE WHEN w.author_id IS NULL 
        THEN false 
        ELSE true 
      END
    from 
      twots t 
        left outer join uwotm8_counts u on t.twot_id = u.twot_id 
        inner join thought_leaders a on t.author_id = a.thought_leader_id
        left outer join uwotm8s w on t.twot_id = w.twot_id and w.author_id = $1
    where 
      t.author_id in (select distinct leader_id from followers where follower = $1) or 
      t.author_id = $1
    order by t.added desc
    """,
    "get_wall",
    uuid.wrap(AuthorId)
  ).getOrThrow
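
If SQL is not your thing, the logic of the wall query can be modelled with plain collections. This is only an in-memory sketch: TwotRow, WallEntry and the wall function are hypothetical, and authors are identified by plain strings for brevity:

```scala
// Hypothetical in-memory rows; fields loosely follow the tables above
final case class TwotRow(id: Int, author: String, content: String, added: Long)

final case class WallEntry(twot: TwotRow, uwotm8Count: Int, reactedByViewer: Boolean)

def wall(
    viewer: String,
    twots: Vector[TwotRow],
    followers: Set[(String, String)], // (leader, follower) pairs
    uwotm8s: Set[(String, Int)] // (author, twot id) pairs
): Vector[WallEntry] =
  // Equivalent of the uwotm8_counts view
  val counts: Map[Int, Int] =
    uwotm8s.toVector.groupMapReduce(_._2)(_ => 1)(_ + _)
  // Equivalent of the subquery over the followers table
  val leaders: Set[String] =
    followers.collect { case (leader, `viewer`) => leader }

  twots
    .filter(t => t.author == viewer || leaders.contains(t.author))
    .sortBy(t => -t.added) // order by t.added desc
    .map { t =>
      WallEntry(
        t,
        counts.getOrElse(t.id, 0), // coalesce(u.uwotm8Count, 0)
        uwotm8s.contains(viewer -> t.id) // the CASE WHEN ... END column
      )
    }
```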

Uwotm8 related queries

def delete_uwotm8(authorId: AuthorId, twot_id: TwotId): Uwotm8Status =
  exec(delete_uwotm8_prepared, authorId -> twot_id)
  Uwotm8Status.No

def add_uwotm8(authorId: AuthorId, twot_id: TwotId): Uwotm8Status =
  exec(add_uwotm8_prepared, authorId -> twot_id)
  Uwotm8Status.Yes

private lazy val add_uwotm8_prepared =
  db.prepare(
    """
    insert into uwotm8s(author_id, twot_id) 
              values ($1, $2)
              on conflict do nothing
              returning 'ok'::text
              """,
    "add_uwotm8",
    uuid.wrap(AuthorId) ~ uuid.wrap(TwotId)
  ).getOrThrow

private lazy val delete_uwotm8_prepared =
  db.prepare(
    """delete from uwotm8s where author_id = $1 and twot_id = $2""",
    "delete_uwotm8",
    uuid.wrap(AuthorId) ~ uuid.wrap(TwotId)
  ).getOrThrow

Follower related queries

def add_follower(follower: Follower, leader: AuthorId): Unit =
  exec(add_follower_prepared, leader -> follower)

def delete_follower(follower: Follower, leader: AuthorId): Unit =
  exec(delete_follower_prepared, leader -> follower)

def get_followers(leader: AuthorId): Vector[Follower] =
  all(get_followers_prepared, leader, uuid.wrap(Follower))

def get_following(leader: AuthorId): Vector[AuthorId] =
  all(get_following_prepared, leader, uuid.wrap(AuthorId))

private lazy val add_follower_prepared =
  db.prepare(
    """
    insert into followers(leader_id, follower) 
              values ($1, $2)
              on conflict do nothing
              returning 'ok'::text
              """,
    "add_follower",
    uuid.wrap(AuthorId) ~ uuid.wrap(Follower)
  ).getOrThrow

private lazy val delete_follower_prepared =
  db.prepare(
    """delete from followers where leader_id = $1 and follower = $2""",
    "delete_follower",
    uuid.wrap(AuthorId) ~ uuid.wrap(Follower)
  ).getOrThrow

private lazy val get_followers_prepared =
  db.prepare(
    """
    select follower from followers where leader_id = $1
    """,
    "get_followers",
    uuid.wrap(AuthorId)
  ).getOrThrow

private lazy val get_following_prepared =
  db.prepare(
    """
    select leader_id from followers where follower = $1
    """,
    "get_following",
    uuid.wrap(AuthorId)
  ).getOrThrow

Thought leader related queries


def get_credentials(
    nickname: Nickname
): Option[(AuthorId, HashedPassword)] =
  one(
    get_thought_leader_credentials_prepared,
    nickname,
    uuid.wrap(AuthorId) ~ hashedPasswordCodec
  )

def register(nickname: Nickname, pass: HashedPassword): Option[AuthorId] =
  one(
    register_thought_leader_prepared,
    nickname -> pass,
    uuid.wrap(AuthorId)
  )

def get_thought_leader_nickname(id: AuthorId): Option[Nickname] =
  one(get_thought_leader_by_id, id, varchar.wrap(Nickname))

def get_thought_leader_id(id: Nickname): Option[AuthorId] =
  one(get_thought_leader_by_nickname, id, uuid.wrap(AuthorId))

private lazy val register_thought_leader_prepared =
  db.prepare(
    """
    insert into thought_leaders(thought_leader_id, nickname, salted_hash) 
              values (gen_random_uuid(), lower($1), $2) 
              on conflict do nothing
              returning thought_leader_id
              """,
    "register_thought_leader",
    varchar.wrap(Nickname) ~
      hashedPasswordCodec
  ).getOrThrow

private lazy val get_thought_leader_credentials_prepared =
  db.prepare(
    "select thought_leader_id, salted_hash from thought_leaders where lower(nickname) = lower($1::varchar)",
    "get_thought_leader",
    varchar.wrap(Nickname)
  ).getOrThrow

private val get_thought_leader_by_id =
  db.prepare(
    "select nickname from thought_leaders where thought_leader_id = $1",
    "get_thought_leader_by_id",
    uuid.wrap(AuthorId)
  ).getOrThrow

private val get_thought_leader_by_nickname =
  db.prepare(
    "select thought_leader_id from thought_leaders where lower(nickname) = lower($1)",
    "get_thought_leader_by_nickname",
    varchar.wrap(Nickname)
  ).getOrThrow

And at the very end, we will provide a method to be called by the future health check:

def connectionIsOkay() = db.connectionIsOkay

For an app as simple as this, it turns out that having just those methods is often enough to implement the required logic in full. So we will avoid copying over the method signatures and will try to delegate directly to DB in cases where it's appropriate.

Authentication

Code

This is a toy application and cannot in good conscience be called secure. Please educate yourself on best practices and don't copy this approach.

The simplest thing for us would be requiring users to re-enter their credentials every time they visit our website. Regrettably, people will complain, citing silly reasons like "I like visiting other sites, too", "closing browser occasionally is okay", etc.

Short of popping their password in the cookie, we would like to store some token that authenticates them. We would also like to verify that the token is issued by us.

And when I tried for 5 minutes and failed to compile libjwt, I decided to write the most minimal implementation of a single-algorithm JWT I could. After all - nothing bad ever came from trying to write your own cryptography utilities without having any appropriate knowledge.

We will be persisting a cryptographically signed long-lived token on the client, using local storage. Note that this violates several security best practices immediately:

  1. Local storage is accessible to any scripts running on the page
  2. Long lived tokens are very problematic, as they cannot be revoked without centralised storage
  3. Tokens that don't have a secure fingerprint sent along in a hardened cookie are vulnerable to being stolen and used anywhere the attacker wants.

You can read about these problems in an OWASP Cheatsheet.

With that said, this is not a tutorial on secure authentication, and refresh token loops add a lot of complexity, so let's dive into the JWT implementation.

JWT implementation

At its core, a JWT is really two pieces of Base64URL-encoded JSON (the header and the payload), along with a cryptographically strong signature over both. The latter is implemented using HMAC, and we will delegate this part of the process to OpenSSL, as it's the most sensitive to incompetence.

Computing HMAC

As with the SHA256 example in part 2, this is merely a Scala adaptation of a C example found on the internet:

object OpenSSL: 
  // ...
  private val enc = Base64.getUrlEncoder().withoutPadding()

  def hmac(plaintext: String, key: String)(using Zone) =
    import libhmac.functions.*
    import libhmac.types.*
    val message = toCString(plaintext)
    val ckey = toCString(key)
    val mdctx = EVP_MD_CTX_new()
    val pkey = EVP_PKEY_new_mac_key(
      Crypto.get_EVP_MAC_KEY(),
      null,
      ckey.asInstanceOf[Ptr[CUnsignedChar]],
      string.strlen(ckey).toInt
    )

    assert(pkey != null, "EVP PKEY is null")
    assert(mdctx != null, "EVP ctx is null")

    val md_len = stackalloc[libhmac.types.size_t](1)

    assert(EVP_DigestSignInit(mdctx, null, EVP_sha256(), null, pkey) == 1)
    assert(EVP_DigestUpdate(mdctx, message, string.strlen(message)) == 1)
    assert(EVP_DigestSignFinal(mdctx, null, md_len) == 1)
    val md_value = stackalloc[CUnsignedChar](!md_len)

    assert(EVP_DigestSignFinal(mdctx, md_value, md_len) == 1)

    val ar = Array.newBuilder[Byte]

    for i <- 0 until (!md_len).toInt do ar.addOne(md_value(i).toByte)

    EVP_MD_CTX_free(mdctx)

    EVP_PKEY_free(pkey)

    new String(enc.encode(ar.result()))
  end hmac

The specific type of the Base64 encoder is very important: JWT signatures use the URL-safe Base64 alphabet without padding, so any other encoder will produce results that differ from the specification.
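
For comparison, or for checking the output of the native version, the same computation can be sketched on the JVM. This is a stand-in, not our implementation: it swaps OpenSSL for javax.crypto.Mac, and the name hmacSha256 is made up:

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// JVM stand-in for the OpenSSL-backed hmac: HMAC-SHA256, encoded with
// the URL-safe Base64 alphabet and no padding.
def hmacSha256(message: String, key: String): String =
  val mac = Mac.getInstance("HmacSHA256")
  mac.init(new SecretKeySpec(key.getBytes(UTF_8), "HmacSHA256"))
  val digest = mac.doFinal(message.getBytes(UTF_8))
  Base64.getUrlEncoder().withoutPadding().encodeToString(digest)
```

A SHA-256 based HMAC is always 32 bytes long, so the encoded signature is always 43 characters, with no trailing = signs.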

Creating tokens

Because we're only supporting one type of signing algorithm (HS256), we can hardcode the header part of the JWT:

auth.scala

val enc = Base64.getUrlEncoder().withoutPadding()
val dec = Base64.getUrlDecoder()

val headers =
  ujson.Obj(
    "alg" -> ujson.Str("HS256"),
    "typ" -> ujson.Str("JWT")
  )

val headersString = new String(
  enc.encodeToString(upickle.default.writeToByteArray(headers))
)

Let's extract both the signing key and the token expiration time into a separate class that we will be passing implicitly:

settings.scala

package twotm8

import scala.concurrent.duration.FiniteDuration

case class Settings(
  tokenExpiration: FiniteDuration,
  secretKey: Secret
)

With that, we can fully define the method that constructs a token:

auth.scala

def token(authorId: AuthorId)(using Zone)(using
    config: Settings
): Token =
  val exp =
    (System.currentTimeMillis / 1000) + config.tokenExpiration.toSeconds

  val content = upickle.default.writeToByteArray(
    ujson.Obj(
      "sub" -> ujson.Str(authorId.raw.toString),
      "exp" -> ujson.Num(exp),
      "iss" -> ujson.Str("io:twotm8:token")
    )
  )

  val payload = new String(enc.encodeToString(content))

  val signature =
    OpenSSL.hmac(headersString + "." + payload, config.secretKey.plaintext)

  Token(JWT(headersString + "." + payload + "." + signature), exp)
end token

Note that the only piece of data we're putting into the token is the ID of the user this token can authenticate. We're using their stable UUID identifier, which means it won't be invalidated if the user changes their nickname.

There's also nothing in the token that indicates that it was produced for a specific version of the credentials - which means that if the attacker steals your token, for two weeks they will be posting under your name. Not great, and hopefully we can address it in the future, before this inevitably pops up on The Register.

Validating tokens

Now, to validate a token that was sent to us, say, in an Authorization header, we just need to break the token up into its constituents, re-compute the signature with the secret that we know, and verify that it matches. Additionally, we need to make sure that the token hasn't expired.

Overall, it's pretty simple:

auth.scala

case class AuthContext(author: AuthorId)

object Auth:
  // ..
  def validate(jwt: JWT)(using Zone)(using
      settings: Settings
  ): Option[AuthContext] =
    val fragments = jwt.raw.split('.')
    if fragments.size != 3 then None
    else
      val header = fragments(0)
      val payload = fragments(1)
      val signature = fragments(2)
      if header != headersString then None
      else
        val expectedSignature =
          OpenSSL.hmac(
            headersString + "." + payload,
            settings.secretKey.plaintext
          )

        Option
          .when(expectedSignature == signature) {
            // signature is valid, let's look at the token's contents
            try
              val js = ujson.read(dec.decode(payload))
              val id = js.obj("sub").str
              val exp = js.obj("exp").num.toLong

              assert(js.obj("iss").str == "io:twotm8:token")

              if (exp > System.currentTimeMillis / 1000) then
                Some(AuthContext(AuthorId(UUID.fromString(id))))
              else None
            catch
              case exc =>
                scribe.error("Processing a JWT threw an exception", exc)
                None
          }
          .flatten
      end if
    end if
  end validate

If the token is valid, then the information from it is extracted into an AuthContext, which contains only the user's ID.
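
The core of that flow (split, re-sign, compare, check expiry) can also be sketched without any dependencies. Everything here is an assumption made for illustration: validSubject is a made-up name, sign stands in for the HMAC function with the secret already applied, the header and issuer checks are left out, and the JSON fields are pulled out with naive regular expressions instead of a proper parser:

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest
import java.util.Base64

// `sign` is a hypothetical stand-in for the HMAC function with the secret applied;
// `now` is the current time in seconds since the epoch.
def validSubject(token: String, sign: String => String, now: Long): Option[String] =
  token.split('.') match
    case Array(header, payload, signature) =>
      val expected = sign(header + "." + payload)
      // MessageDigest.isEqual is case-sensitive and compares in constant time
      val signatureOk =
        MessageDigest.isEqual(expected.getBytes(UTF_8), signature.getBytes(UTF_8))
      if !signatureOk then None
      else
        val json = new String(Base64.getUrlDecoder().decode(payload), UTF_8)
        // Naive field extraction, only to keep the sketch dependency-free
        val exp = """exp"\s*:\s*(\d+)""".r.findFirstMatchIn(json).map(_.group(1).toLong)
        val sub = """sub"\s*:\s*"([^"]+)""".r.findFirstMatchIn(json).map(_.group(1))
        for
          e <- exp if e > now
          s <- sub
        yield s
    case _ => None
```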

App logic

Code

As I mentioned before, our app is so simple that most of the operations can go directly through database with no additional processing.

To give the rest of the business logic a place to live, let's create a class called App:

package twotm8

import openssl.OpenSSL
import twotm8.db.DB

import scala.scalanative.unsafe.*

class App(db: DB)(using z: Zone, config: Settings):
  // ...

So, rather than copying the method signatures of DB one by one, let's use the new export clause:

class App(db: DB)(using z: Zone, config: Settings):
  export db.{
    get_wall,
    create_twot,
    delete_twot,
    get_twots,
    add_follower,
    delete_follower,
    add_uwotm8,
    delete_uwotm8
  }
  // ..

So if the caller has a reference to an App, they can call get_wall and others on it directly, and DB can remain an implementation detail only known to App's implementation.
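
Here's the same trick in isolation, as a self-contained sketch. Store, MapStore and Service are hypothetical names that have nothing to do with our app:

```scala
import scala.collection.mutable

trait Store:
  def get(key: String): Option[String]
  def put(key: String, value: String): Unit
  def dropEverything(): Unit

class MapStore extends Store:
  private val data = mutable.Map.empty[String, String]
  def get(key: String): Option[String] = data.get(key)
  def put(key: String, value: String): Unit = data.update(key, value)
  def dropEverything(): Unit = data.clear()

// Only get and put become members of Service; dropEverything stays
// reachable solely through the private `store` parameter.
class Service(store: Store):
  export store.{get, put}
```

Calling dropEverything on a Service simply doesn't compile, which is exactly the kind of narrowing we want from App.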

Now that is composition vs inheritance gore I'm here for.

With those methods exported, we can actually define implementations for more complicated ones.

Login

Assuming that the user of App passed in the nickname and plaintext password obtained from the user, we need to do the following:

  1. Look up the thought leader by nickname in the database, proceed only if found
  2. Extract the salt and hashed password from user's database entry
  3. Hash the provided password, compare to the database value, proceed only if they match
  4. Issue a JWT token for the found user

Here's what the implementation can look like:

auth.scala

// class App...
// ...
def login(nickname: Nickname, plaintextPassword: Password): Option[Token] =
  db.get_credentials(nickname) match
    case None => None
    case Some(authorId -> hashedPassword) =>
      val List(salt, hash) = hashedPassword.ciphertext.split(":").toList
      val expected =
        plaintextPassword.process(pl => OpenSSL.sha256(salt + ":" + pl))

      Option.when(expected.equalsIgnoreCase(hash))(
        Auth.token(authorId)
      )
end login

Registration

Even simpler: after we salt and hash the password, we can directly send it to the database - if the nickname is taken, the database will just reject it.

auth.scala

def register(nickname: Nickname, pass: Password): Option[AuthorId] =
  val salt = scala.util.Random.alphanumeric.take(16).mkString
  val hash = pass.process(pl => OpenSSL.sha256(salt + ":" + pl))
  val saltedHash = HashedPassword(salt + ":" + hash)

  db.register(nickname, saltedHash)
end register
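
The salt:hash format shared by login and register can be tried out on the JVM with a small sketch. It is not our implementation: MessageDigest replaces OpenSSL.sha256 (assumed here to return a lowercase hex string), SecureRandom replaces scala.util.Random, and all the function names are made up:

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.{MessageDigest, SecureRandom}

// Assumption: hex-encoded SHA-256, 64 characters long
def sha256Hex(input: String): String =
  MessageDigest
    .getInstance("SHA-256")
    .digest(input.getBytes(UTF_8))
    .map(b => "%02x".format(b))
    .mkString

val saltAlphabet = ('a' to 'z') ++ ('A' to 'Z') ++ ('0' to '9')

// 16 alphanumeric characters, drawn from SecureRandom
def newSalt(random: SecureRandom = new SecureRandom()): String =
  Vector.fill(16)(saltAlphabet(random.nextInt(saltAlphabet.length))).mkString

// What gets stored: 16 (salt) + 1 (':') + 64 (hash) = 81 characters
def saltedHash(password: String, salt: String): String =
  salt + ":" + sha256Hex(salt + ":" + password)

def matches(password: String, stored: String): Boolean =
  stored.split(":") match
    case Array(salt, hash) =>
      MessageDigest.isEqual(
        sha256Hex(salt + ":" + password).getBytes(UTF_8),
        hash.getBytes(UTF_8)
      )
    case _ => false
```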

Retrieving thought leader profile

The way we're doing it is quite inefficient, basically stitching several queries sequentially:

def get_thought_leader(id: AuthorId): Option[ThoughtLeader] =
  db.get_thought_leader_nickname(id).map { nickname =>
    ThoughtLeader(
      id,
      nickname,
      db.get_following(id),
      db.get_followers(id),
      db.get_twots(id)
    )
  }

This version of the operation will be used when rendering a logged in user's profile, because the ID is coming from the token.

And when someone is browsing another leader's profile, we need to slightly modify the function:

def get_thought_leader(
    nick: Nickname,
    watching: Option[AuthorId]
): Option[ThoughtLeader] =
  db.get_thought_leader_id(nick).map { id =>
    ThoughtLeader(
      id,
      nick,
      db.get_following(id),
      db.get_followers(id),
      watching match
        case None          => db.get_twots(id)
        case Some(watcher) => db.get_twots_perspective(id, watcher)
    )
  }

This version can be used for both guests and logged-in users; the only difference is how the list of twots is retrieved.

As a last operation, we want to provide a version of JWT validation that doesn't depend on having a Zone or Settings available:

def validate(token: JWT): Option[AuthContext] =
  Auth.validate(token)

Backend API

Code

Now it's finally time to put together everything we've made so far, plus the helpers defined in part 3, to write out the full definition of our backend API.

Our API is JSON-based, and we use the HTTP methods in a specific way to help the frontend decide whether certain requests can be retried. And they occasionally will need to be retried, because we will be putting strict timeouts on each API request on the client side. We will be following the Semantics of HTTP methods wherever possible.
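As a rough illustration of why the method choice matters for retries, here's a sketch of the decision a client could make. The Method enum and canRetry helper are hypothetical names, not part of the Twotm8 codebase; the classification itself follows the HTTP specification, where GET, PUT and DELETE are idempotent and POST is not.

```scala
// Hypothetical model of the HTTP methods our API uses
enum Method:
  case GET, POST, PUT, DELETE

// Idempotent methods can be repeated without changing the outcome,
// so a client may retry them after a timeout.
def canRetry(method: Method): Boolean =
  method match
    case Method.GET | Method.PUT | Method.DELETE => true
    case Method.POST                             => false
```

That's presumably why creating a twot is a POST in the routes further down, while adding a uwotm8 or following someone is a PUT: repeating the latter leaves things in the same state.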

Our entire backend API will go into a single class, named Api:

package twotm8
package api

class Api(app: App):
  import ApiHelpers.{*, given}

The set of client-submitted payloads is quite simple:

object Payload:
  case class Login(nickname: Nickname, password: Password)
  case class Register(nickname: Nickname, password: Password)
  case class Create(text: Text)
  // used for both adding and deleting uwotm8
  case class Uwotm8(twot_id: TwotId)
  // used for both following and unfollowing
  case class Follow(thought_leader: AuthorId)

JSON codecs

Code

First thing we need to solve is how to provide JSON codecs for our rich types, like AuthorId and Nickname.

Luckily, we can define a general function for the types defined using OpaqueValue:

json.scala

package twotm8
package json

import upickle.default.{ ReadWriter, Reader }

object codecs:
  inline def opaqValue[T, X](obj: OpaqueValue[T, X])(using
      rw: ReadWriter[X]
  ): ReadWriter[T] =
    rw.bimap(obj.value(_), obj.apply(_))
  
  // primitive types
  given ReadWriter[AuthorId] = opaqValue(AuthorId)
  given ReadWriter[Follower] = opaqValue(Follower)
  given ReadWriter[Nickname] = opaqValue(Nickname)
  given ReadWriter[TwotId] = opaqValue(TwotId)
  given ReadWriter[Text] = opaqValue(Text)
  given ReadWriter[JWT] = opaqValue(JWT)
  given ReadWriter[Uwotm8Count] = opaqValue(Uwotm8Count)
  given ReadWriter[Uwotm8Status] = opaqValue(Uwotm8Status)
  given Reader[Password] = summon[Reader[String]].map(Password(_))

We're using the fact that ReadWriter from upickle has a bimap method, used similarly to what we defined on our database Codec.
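If bimap is new to you, here's the idea in isolation. This is a toy codec written from scratch, NOT upickle's real ReadWriter; ToyCodec, stringCodec and Nick are all hypothetical names, and the "JSON" handling is deliberately naive.

```scala
// Toy codec, just enough to show what bimap does
trait ToyCodec[T]:
  def write(t: T): String
  def read(s: String): T

  // Derive a ToyCodec[U] from a ToyCodec[T]:
  // convert U to T before writing, and T back to U after reading
  def bimap[U](to: U => T, from: T => U): ToyCodec[U] =
    val self = this
    new ToyCodec[U]:
      def write(u: U): String = self.write(to(u))
      def read(s: String): U = from(self.read(s))

// Naive string codec: wraps in quotes, no escaping
val stringCodec: ToyCodec[String] = new ToyCodec[String]:
  def write(t: String): String = "\"" + t + "\""
  def read(s: String): String = s.stripPrefix("\"").stripSuffix("\"")

// Hypothetical newtype standing in for something like Nickname
case class Nick(raw: String)

val nickCodec: ToyCodec[Nick] = stringCodec.bimap[Nick](_.raw, Nick(_))
```

The wrapper type never needs to know anything about the wire format, it only needs a way in and out of the underlying type.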

There's a way to make this derivation automatic, by putting special instances in the companion object of each newtype (as that's what they are), but in this instance I prefer to manually control what is allowed to be serialised to JSON, and what isn't.

For the classes and case classes we can either use bimap, or the macroRW macro, which will derive the codec for us:

json.scala

object codecs:
  // ..
  given ReadWriter[Twot] = upickle.default.macroRW[Twot]
  given ReadWriter[Token] = upickle.default.macroRW[Token]
  given ReadWriter[ThoughtLeader] = upickle.default.macroRW[ThoughtLeader]

And finally, our payloads are also case classes, so we follow the same approach, but this time we only define the JSON reader, not the writer. Our payloads are only supposed to be read and never serialised (especially Login and Register, which carry passwords), so there's no point in defining more than we need.

json.scala

object codecs:
  // ..
  import upickle.default.{Reader, macroR}

  given Reader[api.Payload.Login] =
    macroR[api.Payload.Login]
  given Reader[api.Payload.Create] =
    macroR[api.Payload.Create]
  given Reader[api.Payload.Uwotm8] =
    macroR[api.Payload.Uwotm8]
  given Reader[api.Payload.Register] =
    macroR[api.Payload.Register]
  given Reader[api.Payload.Follow] =
    macroR[api.Payload.Follow]

With those definitions in mind, we can always import them using

import twotm8.json.codecs.given

and in such scope the JSON helpers we defined in part 3 will work nicely.

Protected routes

As mentioned in the requirements, there are certain routes which should never be accessed without a valid access token.

Let's build some utilities to make marking routes as protected easier. First, we need to handle the processing of authorization headers - we will be using the Authorization: Bearer <access-token> type of authorization, and it should be pretty easy to extract it from a request:

inline def extractAuth(request: Request): Either[String, AuthContext] =
  val auth = request.headers.find(_._1.equalsIgnoreCase("Authorization"))

  auth match
    case None => Left("Unauthorized")
    case Some((_, value)) =>
      if !value.startsWith("Bearer ") then Left("Invalid bearer")
      else
        val jwt = JWT(value.drop("Bearer ".length))
        app.validate(jwt) match
          case None =>
            Left("Invalid token")
          case Some(auth) =>
            Right(auth)
  end match
end extractAuth

This method will either return a reason why the header is missing or invalid, or the extracted AuthContext - which can then be passed to route implementations.

So what would it mean to "protect a route"? Let's say we have a route handler, defined as ArgsHandler[A], where A is the type of argument this route takes. What I would like to do is to "lift it" into a route handler ArgsHandler[Authenticated[A]], where Authenticated is simply:

case class Authenticated[A](auth: AuthContext, value: A)

i.e. we're attaching the AuthContext to an arbitrary value.

And we can go in the opposite direction: if we have an ArgsHandler[Authenticated[A]], we can transform it into a new handler ArgsHandler[A] which will just respond with Unauthorized if there's no valid authorization, and proceed as normal otherwise.

The latter idea is implemented as such:

inline def protect[A](
    inline unprotected: ArgsHandler[Authenticated[A]]
): ArgsHandler[A] =
  (request, a) =>
    extractAuth(request) match
      case Left(msg) =>
        request.unauthorized(msg)
      case Right(auth) =>
        unprotected.handleRequest(request, Authenticated(auth, a))

You will see how this is used in a second, but let's tackle another requirement - there are some routes that we might want to handle differently depending on the user's authentication status.

Retrieving another thought leader's profile is valid regardless of whether you do it as a guest or not, and ideally we'd want it to happen with the same URL.

We can express this idea as having an ArgsHandler[Either[A, Authenticated[A]]]:

inline def optionalAuth[A](
    unprotected: ArgsHandler[Either[A, Authenticated[A]]]
): ArgsHandler[A] =
  (request, a) =>
    extractAuth(request) match
      case Left(msg) =>
        unprotected.handleRequest(request, Left(a))
      case Right(auth) =>
        unprotected.handleRequest(request, Right(Authenticated(auth, a)))
end optionalAuth
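To get a feel for how these two combinators behave without dragging in SNUnit, here's a toy model. Everything in it is a simplified stand-in: Request only carries an optional token, handlers return a String instead of sending a response, and the "user:<name>" token format is invented purely for this sketch.

```scala
// Simplified stand-ins for the real types (all hypothetical)
case class Request(token: Option[String])
case class AuthContext(author: String)
case class Authenticated[A](auth: AuthContext, value: A)
type Handler[A] = (Request, A) => String

// Pretend validation: any token of the form "user:<name>" is valid
def extractAuth(req: Request): Either[String, AuthContext] =
  req.token match
    case Some(s"user:$name") => Right(AuthContext(name))
    case Some(_)             => Left("Invalid token")
    case None                => Left("Unauthorized")

// Reject the request unless it carries valid auth
def protect[A](h: Handler[Authenticated[A]]): Handler[A] =
  (req, a) =>
    extractAuth(req) match
      case Left(msg)   => s"401 $msg"
      case Right(auth) => h(req, Authenticated(auth, a))

// Let the handler decide what to do with or without auth
def optionalAuth[A](h: Handler[Either[A, Authenticated[A]]]): Handler[A] =
  (req, a) =>
    extractAuth(req) match
      case Left(_)     => h(req, Left(a))
      case Right(auth) => h(req, Right(Authenticated(auth, a)))

val me: Handler[Unit] =
  protect[Unit]((_, i) => s"200 ${i.auth.author}")

val profile: Handler[String] = optionalAuth[String] { (_, i) =>
  i match
    case Left(nick)                       => s"200 $nick (guest view)"
    case Right(Authenticated(auth, nick)) => s"200 $nick (seen by ${auth.author})"
}
```

Note how neither me nor profile ever touches headers or tokens: by the time they run, the question of who is asking has already been answered.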

Full API

We now finally have enough to put together the full backend API.

Remember that we defined handleException, builder, and api helpers in part 3. With them, the full API definition for our backend looks like this:

// All within the Api class
inline def routes = handleException(
  api(
    Method.GET ->
      builder(
        Root / "api" / "thought_leaders" / "me" -> protect(get_me),
        Root / "api" / "thought_leaders" / Arg[String] -> optionalAuth(get_thought_leader),
        Root / "api" / "twots" / "wall" -> protect(get_wall),
        Root / "api" / "health" -> get_health
      ),
    Method.POST ->
      builder(
        Root / "api" / "auth" / "login" -> login,
        Root / "api" / "twots" / "create" -> protect(create_twot)
      ),
    Method.PUT ->
      builder(
        Root / "api" / "auth" / "register" -> register,
        Root / "api" / "twots" / "uwotm8" -> protect(add_uwotm8),
        Root / "api" / "thought_leaders" / "follow" -> protect(add_follower)
      ),
    Method.DELETE ->
      builder(
        Root / "api" / "thought_leaders" / "follow" -> protect(delete_follower),
        Root / "api" / "twots" / "uwotm8" -> protect(delete_uwotm8),
        Root / "api" / "twots" / Arg[UUID] -> protect(delete_twot)
      )
  )
)

You can see we have a mixture of protected routes (get_me), unprotected (register) and optionally protected (get_thought_leader).

To support the Arg[UUID] parameter, which is not included in trail by default, let's define a Trail codec for it:

api.helpers.scala

  given Codec[UUID] with
    def encode(t: UUID) = Some(t.toString)
    def decode(v: Option[String]) =
      v match
        case None => None
        case Some(str) =>
          try Some(UUID.fromString(str))
          catch case _: IllegalArgumentException => None
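The decoding half can be tried out on its own. Here's a standalone sketch of the same logic using scala.util.Try; decodeUuid is a hypothetical name and not part of trail or of our helpers.

```scala
import java.util.UUID
import scala.util.Try

// Hypothetical standalone version of the decoding logic:
// a missing segment or an unparseable string both become None
def decodeUuid(v: Option[String]): Option[UUID] =
  v.flatMap(str => Try(UUID.fromString(str)).toOption)
```

When decoding yields None, the path segment simply isn't a UUID, so the route should not match it.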

The actual route handler definitions are simple functions.

adding/deleting uwotm8 reactions

private val add_uwotm8: ArgsHandler[Authenticated[Unit]] = (req, i) =>
  import twotm8.json.codecs.given
  json[Payload.Uwotm8](req) { uwot =>
    req.sendJson(StatusCode.OK, app.add_uwotm8(i.auth.author, uwot.twot_id))
  }

private val delete_uwotm8: ArgsHandler[Authenticated[Unit]] = (req, i) =>
  import twotm8.json.codecs.given
  json[Payload.Uwotm8](req) { uwot =>
    req.sendJson(
      StatusCode.OK,
      app.delete_uwotm8(i.auth.author, uwot.twot_id)
    )
  }

following/unfollowing

private val add_follower: ArgsHandler[Authenticated[Unit]] = (req, i) =>
  import twotm8.json.codecs.given
  json[Payload.Follow](req) { follow =>
    if i.auth.author == follow.thought_leader then
      req.badRequest("You cannot follow yourself")
    else
      app.add_follower(
        follower = i.auth.author.into(Follower),
        leader = follow.thought_leader
      )
      req.noContent()
  }

private val delete_follower: ArgsHandler[Authenticated[Unit]] = (req, i) =>
  import twotm8.json.codecs.given
  json[Payload.Follow](req) { follow =>
    app.delete_follower(
      follower = i.auth.author.into(Follower),
      leader = follow.thought_leader
    )
    req.noContent()
  }

logged in thought leader's profile

private val get_me: ArgsHandler[Authenticated[Unit]] = (req, i) =>
  import twotm8.json.codecs.given
  app.get_thought_leader(i.auth.author) match
    case None => req.unauthorized()
    case Some(tl) =>
      req.sendJson(StatusCode.OK, tl)

thought leader's personal wall

private val get_wall: ArgsHandler[Authenticated[Unit]] = (req, i) =>
  import twotm8.json.codecs.given
  val twots = app.get_wall(i.auth.author)
  req.sendJson(StatusCode.OK, twots)

creating twots

Should data validation be intertwined with the HTTP logic? Probably not.

private val create_twot: ArgsHandler[Authenticated[Unit]] = (req, i) =>
  import twotm8.json.codecs.given
  json[Payload.Create](req) { createPayload =>
    val text = createPayload.text.update(_.trim)
    if (text.raw.length == 0) then req.badRequest("Twot cannot be empty")
    else if (text.raw.length > 128) then
      req.badRequest(
        s"Twot cannot be longer than 128 characters (you have ${text.raw.length})"
      )
    else
      app.create_twot(i.auth.author, text.update(_.toUpperCase)) match
        case None =>
          req.badRequest("Something went wrong, and it's probably your fault")
        case Some(_) =>
          req.noContent()
    end if
  }
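If we did want to pull the validation out of the handler, one possible shape is a pure function that knows nothing about requests. This is only a sketch: validateTwot is a hypothetical name, and it works on plain String rather than our Text type.

```scala
// Hypothetical pure validator: no Request, no HTTP, just the rules.
// Left carries the error message, Right the cleaned-up twot text.
def validateTwot(raw: String): Either[String, String] =
  val text = raw.trim
  if text.isEmpty then Left("Twot cannot be empty")
  else if text.length > 128 then
    Left(s"Twot cannot be longer than 128 characters (you have ${text.length})")
  else Right(text.toUpperCase)
```

The handler would then shrink to mapping Left to req.badRequest and Right to app.create_twot, and the rules could be tested without spinning up a server.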

deleting twots

private val delete_twot: ArgsHandler[Authenticated[UUID]] = (req, uuid) => 
  val authorId = uuid.auth.author
  val twotId = TwotId(uuid.value)

  app.delete_twot(authorId, twotId)
  req.noContent()

registration

private val register: ArgsHandler[Unit] = (req, i) =>
  import twotm8.json.codecs.given
  json[Payload.Register](req) { reg =>
    val length = reg.password.process(_.length)
    val hasWhitespace = reg.password.process(_.exists(_.isWhitespace))
    val nicknameHasWhitespace =
      reg.nickname.raw.exists(_.isWhitespace)

    if hasWhitespace then
      req.badRequest("Password cannot contain whitespace symbols")
    else if length == 0 then req.badRequest("Password cannot be empty")
    else if length < 8 then
      req.badRequest("Password cannot be shorter than 8 symbols")
    else if length > 64 then
      req.badRequest("Password cannot be longer than 64 symbols")
    else if reg.nickname.raw.length < 4 then
      req.badRequest("Nickname cannot be shorter than 4 symbols")
    else if reg.nickname.raw.length > 32 then
      req.badRequest("Nickname cannot be longer than 32 symbols")
    else if nicknameHasWhitespace then
      req.badRequest("Nickname cannot have whitespace in it")
    else
      app.register(reg.nickname, reg.password) match
        case None =>
          req.badRequest("This nickname is already taken")
        case Some(_) =>
          req.noContent()
    end if
  }
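The chain of else-ifs above can also be written as an ordered list of rules, where the first failing rule wins. A minimal sketch, assuming plain Strings instead of our Nickname and Password types; validateRegistration is a made-up name.

```scala
// Hypothetical pure validator mirroring the handler's checks, in the same order.
// Each rule is a pair of (condition that signals a problem, error message).
def validateRegistration(nickname: String, password: String): Either[String, Unit] =
  val rules = List(
    password.exists(_.isWhitespace) -> "Password cannot contain whitespace symbols",
    password.isEmpty                -> "Password cannot be empty",
    (password.length < 8)           -> "Password cannot be shorter than 8 symbols",
    (password.length > 64)          -> "Password cannot be longer than 64 symbols",
    (nickname.length < 4)           -> "Nickname cannot be shorter than 4 symbols",
    (nickname.length > 32)          -> "Nickname cannot be longer than 32 symbols",
    nickname.exists(_.isWhitespace) -> "Nickname cannot have whitespace in it"
  )
  // Report the first rule that is violated, or succeed if none are
  rules.collectFirst { case (true, msg) => msg }.toLeft(())
```

Uniqueness of the nickname is deliberately not part of this: that check still belongs to the database.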

thought leader's profile

This is where we handle the optional auth context.

private val get_thought_leader
    : ArgsHandler[Either[String, Authenticated[String]]] =
  (req, i) =>
    import twotm8.json.codecs.given
    val nickname = i match
      case Left(name)                    => name
      case Right(Authenticated(_, name)) => name

    val watcher = i.toOption.map(_.auth.author)

    app.get_thought_leader(Nickname(nickname), watcher) match
      case None =>
        req.notFound()
      case Some(tl) =>
        req.sendJson(StatusCode.OK, tl)

login

private val login: ArgsHandler[Unit] = (req, _) =>
  import twotm8.json.codecs.given
  json[Payload.Login](req) { login =>
    app.login(login.nickname, login.password) match
      case None      => req.badRequest("Invalid credentials")
      case Some(tok) => req.sendJson(StatusCode.OK, tok)
  }

Server entrypoint and configuration

Code

We are nearly there. All that remains is the entrypoint for our application and the NGINX Unit configuration.

When NGINX Unit starts our binary, we want to notify it of the request handlers we've defined. This is the job of SNUnit's SyncServerBuilder.

To initialise the rest of our application, we are missing two pieces:

  1. The correct Postgres connection string, provided by Fly.io
  2. Secret string for signing JWTs

The JWT secret is easy. Let's say you generated a long, secret string in a file called secret. Then, to make Fly.io provide this secret as an env variable to your app, you need to use the following command:

cat secret | flyctl secrets set JWT_SECRET=-

We're using - as the value to indicate to flyctl that the value should be taken from standard input.

With Postgres, we first need to create the database:

$ flyctl postgres create
# answer some questions here, choose the cluster name as twotm8-web-db
$ flyctl postgres attach --postgres-app twotm8-web-db

What these two commands will do is

  • Provision a Postgres cluster
  • Attach said cluster to our app - this involves creating a DATABASE_URL secret in our app, which will have the correct Postgres connection string.

After setting up the cluster, I would drop into the psql tool by connecting to my cluster:

$ flyctl postgres connect -a <db-app-name> --database <app-name>

And create all the necessary tables. We haven't created anything for applying migrations, so it will have to do for now.

For local testing, I would usually point the app to a local database, where it made sense to use individual environment variables. We can use a function like this:

def connection_string() =
  sys.env.getOrElse(
    "DATABASE_URL", {
      val host = sys.env.getOrElse("PG_HOST", "localhost")
      val port = sys.env.getOrElse("PG_PORT", "5432")
      val password = sys.env.getOrElse("PG_PASSWORD", "mysecretpassword")
      val user = sys.env.getOrElse("PG_USER", "postgres")
      val db = sys.env.getOrElse("PG_DB", "postgres")

      s"postgresql://$user:$password@$host:$port/$db"
    }
  )
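If you'd like to check the fallback logic without touching real environment variables, one option is a variant that takes the environment as a parameter. This is a sketch of that idea, not what the app uses; connectionString is a hypothetical name.

```scala
// Hypothetical variant: the environment is passed in, so it can be faked in tests
def connectionString(env: Map[String, String]): String =
  env.getOrElse(
    "DATABASE_URL", {
      // Only evaluated when DATABASE_URL is absent
      val host = env.getOrElse("PG_HOST", "localhost")
      val port = env.getOrElse("PG_PORT", "5432")
      val password = env.getOrElse("PG_PASSWORD", "mysecretpassword")
      val user = env.getOrElse("PG_USER", "postgres")
      val db = env.getOrElse("PG_DB", "postgres")

      s"postgresql://$user:$password@$host:$port/$db"
    }
  )
```

In the real entrypoint this would be called as connectionString(sys.env).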

If you are running Unit locally, you can add any variables to the environment object under the application specification (see the Unit documentation).

After this, all we need to do is wire our dependencies correctly:

@main def launch =
  val postgres = connection_string()

  scribe.Logger.root
    .clearHandlers()
    .withHandler(
      writer = scribe.writer.SystemErrWriter,
      outputFormat = scribe.output.format.ANSIOutputFormat
    )
    .replace()

  Zone { implicit z =>
    Using.resource(Database(postgres).getOrThrow) { pgConnection =>
      given Settings = Settings(
        tokenExpiration = 14.days,
        secretKey = Secret(
          sys.env.getOrElse[String](
            "JWT_SECRET",
            throw new Exception("Missing token configuration")
          )
        )
      )

      val app = App(DB.postgres(pgConnection))
      val routes = api.Api(app).routes

      SyncServerBuilder.build(routes).listen()
    }
  }
end launch

I'm explicitly reconfiguring Scribe because for some reason on Scala Native it is required.

To make sure our routing is correct, we should use the following config:

config.json

{
  "listeners": {
    "*:8080": {
      "pass": "routes"
    }
  },
  "routes": [
    {
      "match": {
        "uri": "/api/*"
      },
      "action": {
        "pass": "applications/app"
      }
    },
    {
      "match": {
        "uri": "~^((/(.*)\\.(js|css|html))|/)$"
      },
      "action": {
        "share": "/www/static$uri"
      }
    },
    {
      "action": {
        "share": "/www/static/index.html"
      }
    }
  ],
  "applications": {
    "app": {
      "processes": {
        "max": 16,
        "spare": 2,
        "idle_timeout": 180
      },
      "type": "external",
      "executable": "/usr/bin/twotm8",
      "limits": {
        "timeout": 1,
        "requests": 1000
      }
    }
  }
}

The rules are very simple:

  • all requests to /api/* are routed to our backend app

  • all static js/css/html requests are served from the /www/static/ folder

  • all other requests serve the index.html from static folder

    This is very important for our Single Page Application, which we will develop in the next part.
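To make the order of these rules concrete, here's a toy model of the decision in Scala. It only mimics the three routes from config.json, first match wins; it is not how Unit implements matching, and route is a made-up helper.

```scala
// Toy model of the routing rules above: the first matching rule wins
def route(uri: String): String =
  // Same regex as in config.json, without the leading "~" marker
  val staticFile = """^((/(.*)\.(js|css|html))|/)$""".r

  if uri.startsWith("/api/") then "applications/app" // backend
  else if staticFile.pattern.matcher(uri).matches() then
    s"/www/static$uri" // static asset, or the root
  else "/www/static/index.html" // everything else goes to the SPA
```

So a URL like /thought_leaders/bob, which only the frontend router knows about, still ends up loading index.html.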

And now we can simply run flyctl deploy and it will

  • Build our builder container (from part 3)
  • Build the app
  • Build the runtime container
  • Deploy the container to Fly.io's infrastructure

And we should have a running backend in the cloud!

Conclusion

We've built the entire backend for our app in one go, including database schema, JWT management, authentication, and all the necessary API endpoints.