Series TL;DR
- We are deploying a Smithy4s full stack application with Scala.js frontend (with Laminar) on Platforms.sh and Heroku
- We are using Scala 3 heavily
- Code on Github
- Deployed app
- Full series
Originally I planned to just ignore tests completely. Not only for the all-important reason of laziness, but also because testing software is somehow a contentious topic.
Apart from the "write tests first" vs "write implementation first" conflict, the one the online Scala community has warmly adopted is whether to use mocking or not. Mocking in this case refers to runtime deep-patching of method implementations and stubbing out calls.
Not wanting to write a whole piece on just that subject alone (it's been done before, and has been debated to unsavoury death), I will approach testing in a way I see fit given the very comfortable circumstances this app was born in:
- I'm the only developer, yay! No arguing about naming, structuring, frameworks, testing depth, etc.
- I contribute to and maintain Weaver, which turns my admiration for it into a visibly biased obsession
- I'm lazy! So it should be possible to set what it means for the codebase to be appropriately tested without having to satisfy a grumpy engineer waving a 30 year-old book in my face.
Another reason for covering testing is the relative simplicity of our backend - we're not really utilising the more complex features of Smithy, and we're aiming for a well-defined happy path, so there isn't enough meat on the bones of this project to warrant a multi-part series otherwise.
So here's our ambitious plan for testing:
- Three levels of testing:
  - "Unit" testing - testing of pure functions and classes containing only pure functions
  - "Stub" testing - testing side-effectful functions (and whole services) by turning them pure: plugging the impure boundaries with in-memory fakes. (I was going to call that "Fake" testing but didn't want to infringe on the runtime mocking terminology.)
  - "Integration" testing - where we test entire services against the real parts of our system where side-effects are performed
- "Stub" and "Integration" testing must share as much code as possible - specifically, the specs themselves must be exactly the same
- Single framework for everything
For Stub testing there aren't that many holes we need to plug to make the entire implementation pure - we mostly just log things and access the database. The routes implementation can just process requests without starting any actual HTTP servers - thankfully Http4s' model is ideal for that.
For Integration, we would like to stand up and tear down a real Postgres database using Testcontainers, and the exercised spec should actually invoke HTTP endpoints for each service, with requests going through the actual network stack.
The only testing framework we will use is Weaver, and we'll opt into the ScalaCheck integration as well:
lazy val server =
projectMatrix
// ...
.settings(
libraryDependencies ++= Seq(
"com.disneystreaming" %% "weaver-cats" % Versions.Weaver,
"com.disneystreaming" %% "weaver-scalacheck" % Versions.Weaver
),
testFrameworks += new TestFramework("weaver.framework.CatsEffect")
)
Unit testing
There are some tests you want to get absolutely right - things like correct JWT config propagation, password-hashing roundtrip, config processing, input validation, etc. - but they are easy to set up: there's rarely any interaction with anything but stateless library code. Think about verifying JWT tokens - the library itself is stateless and performs no side effects, your code takes at most 1 parameter (some static JWT config) and you directly assert on the result.
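To give a flavour of that shape, here's a minimal sketch of a password-hashing roundtrip test - Crypto.hashPassword appears later in this post, while Crypto.verifyPassword is a hypothetical counterpart invented purely for illustration:
package jobby
package tests
package unit

import weaver.*
import jobby.spec.*

object CryptoTests extends SimpleIOSuite:
  test("password hashing roundtrip") {
    val password = UserPassword("correct horse battery staple")
    for
      // Crypto.hashPassword is used later in this post;
      // Crypto.verifyPassword is an assumed name for the reverse check
      hash    <- Crypto.hashPassword(password)
      matches <- Crypto.verifyPassword(password, hash)
    yield expect(matches)
  }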
We won't demonstrate many of them here, but let's consider an example: writing property tests for validation logic.
To enable ScalaCheck-specific functionality on a Weaver spec, all we need to do is mix in the Checkers trait:
package jobby
package tests
package unit
import weaver.*
import jobby.spec.*
import weaver.scalacheck.*
import org.scalacheck.Gen
object ValidationPropertyTests extends SimpleIOSuite with Checkers:
override def checkConfig: CheckConfig =
CheckConfig.default.copy(
minimumSuccessful = 500,
initialSeed = Some(13378008L)
)
What we're additionally doing here is explicitly modifying the property checking config to require slightly more successful examples for each test, and explicitly setting the generator's seed for reproducibility - in case a subtle bug is introduced, it's better to be able to reliably break the tests.
Our approach to property testing the validation rules can be summarised as follows:
Validation can either succeed, or some non-empty subset of distinct rules is violated
In the current state of our validation rules there's a duplication in terms of where the rules are mentioned. Take, for example, validateJobDescription:
def validateJobDescription(login: JobDescription) =
val minLength = 100
val maxLength = 5000
val str = login.value.trim
if str.length == 0 then err("Description cannot be empty")
else if str.length < minLength || str.length > maxLength then
err(
s"Description cannot be shorter than $minLength or longer than $maxLength characters"
)
else ok
end validateJobDescription
There are two problems (that I see) with it:
- The min/max values are constants within the function - ideally they should be part of some configuration object, taken as (using ValidationConfig) for ergonomics
- The exact rules are expressed as boolean conditions, locked inside of the function, and must necessarily be duplicated in our tests
As a side-project to this side-project, I would love to over-engineer this whole thing.
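A sketch of the first fix could look something like this - ValidationConfig and its field are hypothetical names, and err/ok are the helpers already used above:
// Hypothetical sketch: the limits move into a configuration object
// which the validation function receives as a context parameter
case class ValidationConfig(
    jobDescriptionLength: Range = 100 to 5000
)

def validateJobDescription(description: JobDescription)(using
    config: ValidationConfig
) =
  val str = description.value.trim
  if str.isEmpty then err("Description cannot be empty")
  else if !config.jobDescriptionLength.contains(str.length) then
    err(
      s"Description must be between ${config.jobDescriptionLength.start} " +
        s"and ${config.jobDescriptionLength.end} characters"
    )
  else ok
end validateJobDescription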
But let's see how a property test could look for this:
test("jobs: description") {
forall(org.scalacheck.Gen.asciiPrintableStr) { str =>
val trimmed = str.trim
val isValid =
jobby.validation.validateJobDescription(JobDescription(str)).isRight
expect(isValid) or
expect(
trimmed.trim.isEmpty
|| trimmed.length < 100
|| trimmed.length > 5000
)
}
}
There's those duplicated constants again!
Running 1000 of these tests takes ~400ms on my laptop, but thankfully weaver runs them in parallel and the entire spec executes in less than a second:
[info] jobby.tests.unit.ValidationPropertyTests
[info] + users: username 382ms
[info] + users: password 411ms
[info] + companies: name 381ms
[info] + companies: description 362ms
[info] + jobs: title 407ms
[info] + jobs: description 396ms
[info] + jobs: salary range 420ms
[info] Passed: Total 7, Failed 0, Errors 0, Passed 7
[success] Total time: 1 s, completed 23 Jun 2022, 20:58:44
Apart from extracting the properties and constants, can we improve the usefulness of these tests?
Even if our validation functions and tests work perfectly, so far we have not confirmed that important parts of our system actually do invoke these validation functions.
If we take a look at the register function in the UserServiceImpl:
override def register(login: UserLogin, password: UserPassword) =
val validation = (validateUserLogin(login), validateUserPassword(password))
.traverse(IO.fromEither)
validation *>
Crypto
.hashPassword(password)
.flatMap { hash =>
db.option(op.CreateUser(login, hash))
.onError(ex => logger.error("Registration failed", ex))
.adaptErr { case _ =>
ValidationError("Failed to register")
}
}
.void
end register
With a database implementation that always succeeds for the CreateUser operation, this function can only fail if either the user login validation or the user password validation fails.
So we could express our test differently - generate random logins and passwords, and assert that register can only fail if either the login or the password doesn't match the validation rules.
I will leave it as an exercise to the reader because scaling it seems hard and I don't wanna.
Stub and Integration testing
These tests are much more high-level, and are better expressed as testing user journeys and use cases. For example:
- Users can register and login, receiving valid auth tokens
- Users can't use incorrect credentials to login
- Users can create companies
- Users can only delete companies they created
etc.
To write such test cases succinctly, we would like to have some high-level tools available inside each test case.
Probe (no, not that one 👆)
Those high level tools include:
- API client (Api)
- Data generator (Generator)
- Config used (to, say, compare returned values with one supposedly injected into the app) (AppConfig)
- Collection of common API "snippets" that we will call Fragments (i.e. createUser, createCompany, etc.)
- Something to inspect the logs accumulated by the app, if possible
We will group all of them under the same class called Probe:
case class Probe(
api: Api,
auth: HttpAuth,
gen: Generator,
config: AppConfig,
getLogs: IO[Vector[scribe.LogRecord]]
):
def fragments = Fragments(this)
API client
Let's start with the API client. One big promise of Smithy is that, using exactly the same Scala interface generated for your services, you can construct an HTTP client and point it to an arbitrary URL.
Our API then is just an aggregation of the services we need:
case class Api(
companies: CompaniesService[IO],
jobs: JobService[IO],
users: UserService[IO]
)
And all we need to build it is:
- An actual HTTP client implementation from Http4s (Client[IO])
- A base Uri
Using SimpleRestJsonBuilder from Smithy4s, we can construct Api like this:
object Api:
def build(client: Client[IO], uri: Uri): IO[Api] =
val companies = IO.fromEither(
SimpleRestJsonBuilder(CompaniesService)
.client(client, uri)
)
val jobs = IO.fromEither(
SimpleRestJsonBuilder(JobService)
.client(client, uri)
)
val users = IO.fromEither(
SimpleRestJsonBuilder(UserService)
.client(client, uri)
)
(companies, jobs, users).mapN(Api.apply)
end build
end Api
In-memory logger
One feature of Weaver that I really like is the way it prints out the logs only if the tests have failed. One caveat - this applies only to logs written through Weaver's logger.
Regardless, I'm quite partial to the quiet, pristine view of test results - if tests at your job are not polluted by walls of SLF4J printouts, I envy you!
For these tests, I would like for all the logs sent to Scribe loggers to eventually be reported through Weaver's logger.
To do that, let's first create a log collector. Good news: if you have an instance of a Scribe logger, you can give it a LogHandler, and do with the message as you please. Bad news: the interface of LogHandler is as such:
trait LogHandler {
def log(record: LogRecord): Unit
}
No bother! We can use the excellent Dispatcher to execute any IO actions our logger requires. Those IO actions will just be writing to a Ref[IO, Vector[LogRecord]].
Note that Dispatcher doesn't strictly guarantee any ordering - that is, until Cats Effect 3.4.0 lands with its configurable dispatchers. It's not an issue for us though: each LogRecord comes with a timestamp we can order the logs by, which gives us "good enough for tests" results.
Our in-memory logger is a pair of inter-connected things:
- A Scribe logger that writes to some Ref
- An action to read the current state of that Ref:
class InMemoryLogger private (
val logs: IO[Vector[LogRecord]],
val scribeLogger: Scribe[IO]
)
And here's how we build it:
object InMemoryLogger:
def build: Resource[IO, InMemoryLogger] =
// create a dispatcher
Dispatcher[IO].evalMap { disp =>
// create a ref
Ref.ofEffect(IO(Vector.empty[LogRecord])).map { ref =>
// create a Scribe LogHandler, that uses the dispatcher
// to execute an `IO` action writing the log message into the
// ref
val handler = scribe.handler.LogHandler(Level.Info) { msg =>
disp.unsafeRunSync(ref.update(_.appended(msg)))
}
// an orphan logger with no handlers but the one we
// created above
val logger =
scribe.Logger.empty
.orphan()
.clearHandlers()
.withHandler(handler)
.f[IO]
new InMemoryLogger(
ref.get,
logger
)
}
}
end InMemoryLogger
Data generator
Our data space is very simple - we mostly operate on UUIDs and strings, with occasional restriction on length.
We'll also provide some helpers to work with newtypes that Smithy4s provides.
First, our Generator class starts like this:
import cats.effect.*
import cats.effect.std.*
import cats.syntax.all.*
case class Generator private (random: Random[IO], uuid: UUIDGen[IO]):
//...
A method that generates uuid-backed newtypes is simple:
def id(nt: Newtype[UUID]): IO[nt.Type] =
uuid.randomUUID.map(nt.apply)
Same with a method for int-backed newtypes (like MinSalary/MaxSalary):
def int(nt: Newtype[Int], min: Int, max: Int): IO[nt.Type] =
  random.betweenInt(min, max).map(nt.apply)
And now onto strings, where my main requirement was being able to easily identify the random strings generated for particular newtypes - so let's prefix them with the newtype's name, while preserving the length requirements.
def str(
nt: Newtype[String],
lengthRange: Range = 0 to 100
): IO[nt.Type] =
for
length <- random.betweenInt(lengthRange.start, lengthRange.end)
chars <- random.nextAlphaNumeric.replicateA(length).map(_.mkString)
str = nt.getClass.getSimpleName.toString + "-" + chars
yield nt(str.take(lengthRange.end))
Why go through all this ungodly trouble if we already have ScalaCheck in dependencies? I don't know. I really don't remember why.
Instantiating our Generator is simple:
object Generator:
def create: IO[Generator] =
(Random.scalaUtilRandom[IO], IO(UUIDGen[IO])).mapN(Generator.apply)
We've now defined everything we need to instantiate the Probe:
object Probe:
def build(
client: Client[IO],
uri: Uri,
config: AppConfig,
logger: InMemoryLogger
) =
Resource.eval {
for
gen <- Generator.create
api <- Api.build(client, uri)
auth = HttpAuth(
config.jwt,
logger.scribeLogger
)
yield Probe(api, auth, gen, config, logger.logs)
}
end Probe
Weaver integration
Probe will be the resource that we share across individual tests in Stub tests, and across whole specs in Integration tests.
Let's provide a base trait for our specs that will propagate Scribe logs into Weaver's logs.
The trait starts like this:
trait JobbySuite extends IOSuite:
override type Res = Probe
// ...
where we indicate that the shared resource is our Probe class.
We can then provide a probeTest method that will delegate to one of the methods implemented by Weaver - specifically the version that takes both the shared resource and the log as parameters:
def probeTest(name: weaver.TestName)(f: Probe => IO[weaver.Expectations]) =
test(name) { (probe, log) =>
// ...
where f is the body of the test.
Let's write a sub-program that transfers the logs:
val dumpLogs = probe.getLogs.flatMap {
_.sortBy(_.timeStamp).traverse_ { msg =>
val msgText = msg.logOutput.plainText
msg.level match
case Level.Info => log.info(msgText)
case Level.Error => log.error(msgText)
case Level.Warn => log.warn(msgText)
case _ => log.debug(msgText)
}
}
We get the logs, sort them by timestamp, and write them into the Weaver logger.
Now all we need to do is run the test body, pass the logs, and re-raise any error or result back to Weaver's default test implementation:
f(probe).attempt
.flatTap(_ => dumpLogs)
.flatMap(IO.fromEither)
And that's it! If any of our stub tests fail, the logs for that test will be printed out. The output can certainly be tweaked, but this will do - you only see a wall of text in case of a failure.
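For reference, here's the assembled probeTest, put together from the fragments above (imports for cats syntax and scribe's Level are assumed, as in the earlier snippets):
// Inside JobbySuite: the full probeTest, combining the pieces above
def probeTest(name: weaver.TestName)(f: Probe => IO[weaver.Expectations]) =
  test(name) { (probe, log) =>
    // transfer the accumulated Scribe logs into Weaver's logger
    val dumpLogs = probe.getLogs.flatMap {
      _.sortBy(_.timeStamp).traverse_ { msg =>
        val msgText = msg.logOutput.plainText
        msg.level match
          case Level.Info  => log.info(msgText)
          case Level.Error => log.error(msgText)
          case Level.Warn  => log.warn(msgText)
          case _           => log.debug(msgText)
      }
    }
    // run the test body, pass along the logs, then re-raise the outcome
    f(probe).attempt
      .flatTap(_ => dumpLogs)
      .flatMap(IO.fromEither)
  }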
Slow TimeCop
In part 2 we created an interface called TimeCop for performing the side-effect of getting the current date and time.
This was foreshadowing - the ability to override that interface will be important to us, to avoid dealing with real time (our tests execute very fast) and the minuscule differences between real timestamps. Instead, our TimeCop will generate sequential timestamps, 1 day apart:
package jobby
import cats.effect.*
import java.time.OffsetDateTime
import java.time.ZoneOffset
object SlowTimeCop:
def apply: IO[TimeCop] =
IO.realTimeInstant.flatMap { inst =>
val start = inst.atOffset(ZoneOffset.ofHours(0))
Ref.of[IO, Int](0).map { daysRef =>
new TimeCop:
def nowODT = daysRef.getAndUpdate(_ + 1).map { days =>
start.plusDays(days)
}
}
}
end SlowTimeCop
In-memory database
One of the major places where side-effects happen in our app is the database. To provide a fast feedback loop, we would like to have an in-memory implementation that is just good enough for our tests.
We won't use something that can interpret SQL (like H2), for two reasons:
- Our SQL code is Postgres-centric and will remain so
- We use Skunk, which doesn't use a JDBC layer, making it harder to fit a JDBC-based connector into our current model
For those reasons, our database will be just backed by Scala data structures in memory.
The state model is quite simple:
object InMemoryDB:
import jobby.spec.*
case class State(
jobs: Vector[Job] = Vector.empty,
companies: Vector[Company] = Vector.empty,
users: Vector[(UserId, UserLogin, HashedPassword)] = Vector.empty
)
// ...
And the database itself will need the state, the data generator (for identifiers), and an instance of TimeCop to generate timestamps:
case class InMemoryDB(
state: Ref[IO, InMemoryDB.State],
gen: Generator,
timecop: TimeCop
) extends Database:
// ...
As a reminder, the only abstract method we need to implement is this:
trait Database:
def stream[I, O](query: SqlQuery[I, O]): fs2.Stream[IO, O]
And all we need to do is pattern match on query and implement the handling of the various operations. You'll know you're done when the compiler stops complaining - the SqlQuery class is sealed, after all!
Let's define a small helper method that will help us express the situation where something is not found in the state:
private def opt[T](s: InMemoryDB.State => Option[T]): fs2.Stream[IO, T] =
fs2.Stream
.eval(state.get)
.map(s)
.flatMap(fs2.Stream.fromOption(_))
With that, our first operation (get company by ID) is implemented trivially:
def stream[I, O](query: SqlQuery[I, O]) =
query match
case GetCompanyById(companyId) =>
opt(_.companies.find(c => c.id == companyId))
Finding user credentials is simple as well:
case GetCredentials(login) =>
opt(st => st.users.find(_._2 == login))
.map { case (id, _, password) =>
id -> password
}
and so is creating the user:
case CreateUser(login, hashedPassword) =>
val insert =
gen.id(UserId).flatMap { userId =>
val user = (userId, login, hashedPassword)
state
.update(st => st.copy(users = st.users.appended(user)))
.as(userId)
}
fs2.Stream.eval(insert)
Adding a job is more complicated - we need to generate both the id and the timestamp:
case CreateJob(companyId, attributes, _) =>
val insert = gen.id(JobId).flatMap { jobId =>
timecop.timestampNT(JobAdded).flatMap { ja =>
val job = Job(
id = jobId,
companyId = companyId,
attributes = attributes,
added = ja
)
state.update(st => st.copy(jobs = st.jobs.appended(job))).as(jobId)
}
}
fs2.Stream.eval(insert)
You should be able to spot a deficiency here - we're not checking that the company with that id exists! In an integration test, the database will be enforcing this constraint (well, at least your code will hope that the constraint is enforced).
Through gradual improvements to the in-memory DB you should achieve parity with your DB constraints, and keep them in sync - because the same specs should execute successfully against both the in-memory stubs and the real DB.
The question is - is it worth it? I believe it is: implementing those constraints is significantly simpler than in the real database, it's low risk as it only affects tests, and you get a pretty functional in-memory DB out of it - something that can be published as part of the service's testkit, useful for other components of the system.
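For instance, here's a sketch of how the CreateJob handler above could start enforcing the "company must exist" constraint - the exact error raised here is an assumption; the real database would instead fail with a foreign key violation:
// Sketch: teach the in-memory CreateJob handler to reject jobs
// for companies that don't exist in the current state
case CreateJob(companyId, attributes, _) =>
  val insert =
    state.get.flatMap { st =>
      if !st.companies.exists(_.id == companyId) then
        IO.raiseError(new Exception(s"Company $companyId does not exist"))
      else
        gen.id(JobId).flatMap { jobId =>
          timecop.timestampNT(JobAdded).flatMap { ja =>
            val job = Job(
              id = jobId,
              companyId = companyId,
              attributes = attributes,
              added = ja
            )
            state.update(s => s.copy(jobs = s.jobs.appended(job))).as(jobId)
          }
        }
    }
  fs2.Stream.eval(insert)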
Stub: fixture and resources
All we need to do now is:
- Create a method that will tie together all the stubs, fakes, and what have you, into a single Resource[IO, Probe]
- Fill in various configs we have lying around with bogus values - most of these values won't be asserted on anyways
Here's what the method will look like for our stub tests:
package jobby
package tests
package stub
// imports...
object Fixture:
def resource(using natchez.Trace[IO]): Resource[IO, Probe] =
for
db <- Resource.eval(InMemoryDB.create)
timeCop <- Resource.eval(SlowTimeCop.apply)
logger <- InMemoryLogger.build
// Create the app using our stubbed DB, logger and timecop
routes <- JobbyApp(
appConfig,
db,
logger.scribeLogger,
timeCop
).routes
// (1) sick!
client = Client.fromHttpApp(routes)
generator <- Resource.eval(Generator.create)
// finally construct and return the probe
probe <-
Probe.build(
client,
Uri.unsafeFromString("http://localhost"),
appConfig,
logger
)
yield probe
end for
end resource
- Yes, it should say sick. We use the built-in method from Http4s that turns an HTTP app definition (HttpApp) into a Client[IO] which just invokes the desired endpoints directly on the HttpApp, without running any servers.
This Fixture.resource method is fully self-contained - we can run as many probes in parallel as we want.
And for our stub tests this might well be the ticket, because the setup doesn't require any global resources like Postgres or a running HTTP server. This means we don't need to resort to Weaver's global resources - we can use per-suite resources, which are much easier to set up.
We can express this as a StubSuite trait:
package jobby
package tests
package stub
import weaver.*
import cats.effect.*
import natchez.Trace.Implicits.noop
trait StubSuite extends JobbySuite:
override def sharedResource: Resource[IO, Res] = Fixture.resource
We now have everything to write our first actual spec!
Specifications
All of our specs will be expressed as traits that are mixed into some class or object along with JobbySuite (which contains our probeTest method).
For example, for users we'd like to test that you need to use correct credentials to successfully login:
package jobby
package tests
import jobby.spec.*
import cats.effect.IO
trait UsersSuite:
self: JobbySuite =>
probeTest("Using wrong credentials") { probe =>
import probe.*
for
// generate data
login <- gen.str(UserLogin, 5 to 50)
login1 <- gen.str(UserLogin, 5 to 50)
password <- gen.str(UserPassword, 12 to 128)
password1 <- gen.str(UserPassword, 12 to 128)
// invoke API methods
_ <- api.users.register(login, password)
ok <- api.users.login(login, password).attempt
wrongLogin <- api.users.login(login1, password).attempt
wrongPass <- api.users.login(login, password1).attempt
everythingWrong <- api.users.login(login1, password1).attempt
yield expect.all(
ok.isRight,
wrongLogin.isLeft,
wrongPass.isLeft,
everythingWrong.isLeft
)
end for
}
//...
And here's how you can test that returned access/refresh tokens can be used:
probeTest("Registration and authentication") { probe =>
import probe.*
for
login <- gen.str(UserLogin, 5 to 50)
password <- gen.str(UserPassword, 12 to 128)
_ <- api.users.register(login, password)
resp <- api.users.login(login, password)
// extract access and refresh token
// from response
refreshCookie <- IO
.fromOption(resp.cookie)(
new Exception("Expected a refresh cookie ")
)
.map(_.value)
accessToken = resp.access_token.value
authHeader = AuthHeader("Bearer " + accessToken)
refreshToken = refreshCookie.split(";")(0).split("=", 2)(1)
validAccess <- auth.access(authHeader)
validRefresh <- auth.refresh(RefreshToken(refreshToken))
yield expect(validAccess == validRefresh)
end for
}
Note that we're directly invoking the methods we defined in the services - there's no JSON or HTTP serialisation happening at any point; we're just getting Scala values back and asserting on them.
By design, these tests will not catch protocol errors - be it in the JSON protocol or the HTTP protocol. We're testing the business logic, operating purely with Scala values.
We can do the same for companies, for example verify that authenticated users can create companies, which will be associated with the user:
package jobby
package tests
import jobby.spec.*
import cats.effect.IO
trait CompaniesSuite:
self: JobbySuite =>
test("Creation by authenticated user") { probe =>
import probe.*
for
authHeader <- fragments.authenticateUser
userId <- auth.access(authHeader)
attributes <- fragments.companyAttributes
companyId <- api.companies
.createCompany(
authHeader,
attributes
)
.map(_.id)
retrieved <- api.companies.getCompany(companyId)
yield expect.all(
attributes.name == retrieved.attributes.name,
attributes.url == retrieved.attributes.url,
attributes.description == retrieved.attributes.description,
userId == retrieved.owner_id
)
end for
}
Here we are referencing fragments, which haven't been properly introduced yet. Fragments are just reusable parts of our specifications. For example, here's the fragment for user authentication:
package jobby
package tests
import jobby.spec.*
class Fragments(probe: Probe):
import probe.*
def authenticateUser =
for
login <- gen.str(UserLogin, 5 to 50)
password <- gen.str(UserPassword, 12 to 128)
_ <- api.users.register(login, password)
resp <- api.users.login(login, password)
refreshToken = resp.cookie
accessToken = resp.access_token.value
authHeader = AuthHeader(s"Bearer $accessToken")
yield authHeader
It uses the same structure and the same Probe as the tests themselves. It should be especially useful for extracting things like the attributes generator:
def companyAttributes =
for
companyName <- gen.str(CompanyName, 3 to 100)
companyUrl <- gen.str(CompanyUrl)
companyDescription <- gen.str(CompanyDescription, 100 to 500)
attributes = CompanyAttributes(
companyName,
companyDescription,
companyUrl
)
yield attributes
Runnable Stub tests
To make the tests discoverable, we need to make them either:
- objects that extend Weaver's IOSuite, or
- classes with a single GlobalRead parameter
For stub tests, there's no global resource sharing, so we can just make them objects:
package jobby
package tests
package stub
import weaver.*
object UsersTests
extends StubSuite
with jobby.tests.UsersSuite
object CompaniesTests
extends StubSuite
with jobby.tests.CompaniesSuite
object JobsTests
extends StubSuite
with jobby.tests.JobsSuite
And we can now run our tests in SBT using this command:
sbt:root> backend/testOnly jobby.tests.stub.*
[info] jobby.tests.stub.UsersTests
[info] + Using wrong credentials 251ms
[info] + Registration and authentication 250ms
[info] jobby.tests.stub.JobsTests
[info] + Creating jobs by authenticated company owner 87ms
[info] + Listing latest jobs 216ms
[info] jobby.tests.stub.CompaniesTests
[info] + Creation by authenticated user 20ms
[info] + Deletion by the owner 40ms
[info] Passed: Total 6, Failed 0, Errors 0, Passed 6
[success] Total time: 2 s, completed 25 Jun 2022, 13:53:08
Let's alias it in the build.sbt:
addCommandAlias("stubTests", "backend/testOnly jobby.tests.stub.*")
addCommandAlias("unitTests", "backend/testOnly jobby.tests.unit.*")
addCommandAlias(
"fastTests",
"backend/testOnly jobby.tests.stub.* jobby.tests.unit.*"
)
Note that in fastTests I didn't rely on the already defined commands because I want SBT and Weaver to run all the tests interleaved and in parallel - not, say, unit tests first and then stub tests.
And I'm pleased to say that for our integration tests we won't need to touch the specs or fragments at all!
Integration: fixture and resources
What exactly do we mean by integration tests? We want to test different components that talk to the outside world (i.e. network, filesystem, any kinds of I/O) working together.
This means significant changes to how our Probe is constructed:
- We no longer wish to use the in-memory database - this should be a real Postgres database, with the latest schema
- Requests processed in memory need to be replaced with requests serialised and sent over an actual socket
To solve the database problem we'll use Testcontainers - a JVM interface for launching and managing containers for popular services, like Redis, Postgres, MySQL, etc.
There even exists a Scala wrapper for it, with a bit of extra type safety and idiomatic APIs.
Running actual Postgres will require DB schema migrations as well, so we need the same dependencies that we use for our app, but in the Test scope:
libraryDependencies ++=
  Seq(
    "com.dimafeng" %% "testcontainers-scala-postgresql" % Versions.TestContainers,
    "org.postgresql" % "postgresql" % Versions.Postgres,
    "org.flywaydb" % "flyway-core" % Versions.Flyway,
    "org.http4s" %% "http4s-blaze-client" % Versions.http4s,
    "org.http4s" %% "http4s-blaze-server" % Versions.http4s
  ).map(_ % Test)
Note that we also added actual HTTP server and client implementations - we will be exercising the HTTP layer in these tests.
We need to define a new lifecycle resource for our integration tests, which will still have the signature of Resource[IO, Probe], but it will do much more when that resource is used:
- Start Postgres container with TestContainers (capture the JDBC url and credentials)
- Point Flyway at that Postgres instance and run the migrations
- Parse the JDBC URL into a config our own Skunk connector can understand
- Connect to the database
- ... proceed with the rest of initialisation
Starting the container is easy:
package jobby
package tests
package integration
// ..imports..
object Fixture:
// ...
private def postgresContainer: Resource[IO, PostgreSQLContainer] =
val start = IO(
PostgreSQLContainer(dockerImageNameOverride =
DockerImageName("postgres:14")
)
).flatTap(cont => IO(cont.start()))
Resource.make(start)(cont => IO(cont.stop()))
It could be even shorter if we didn't try to use the latest and greatest of what Postgres has to offer.
Note that we're making it a resource to make sure there are no lingering containers even if our tests fail.
Flyway migration is equally easy, provided we have all the necessary credentials:
private def migrate(
url: String,
user: String,
password: String
): IO[MigrateResult] =
IO(Flyway.configure().dataSource(url, user, password).load()).flatMap { f =>
IO(f.migrate())
}
Note that I am not an expert (or even a confident user) of Flyway, so I'm not sure if there's anything else this method needs to do (the word baseline comes to mind but I don't know what it means).
Combining these two operations and returning a workable Skunk-backed Database implementation just needs an extra method to parse the JDBC URL correctly:
private def parseJDBC(url: String) = IO(java.net.URI.create(url.substring(5)))
def skunkConnection(using
natchez.Trace[IO]
): Resource[IO, (PgCredentials, Database)] =
postgresContainer // start Postgres
.evalMap(cont => parseJDBC(cont.jdbcUrl).map(cont -> _)) // read the configuration
.evalTap { case (cont, _) =>
// run flyway migrations
migrate(cont.jdbcUrl, cont.username, cont.password)
}
.flatMap { case (cont, jdbcUrl) =>
// parse configuration into our own config object
val pgConfig = PgCredentials.apply(
host = jdbcUrl.getHost,
port = jdbcUrl.getPort,
user = cont.username,
password = Some(cont.password),
database = cont.databaseName
)
// create a Skunk-backed Database instance
SkunkDatabase.load(pgConfig, skunk).map(pgConfig -> _)
}
Then, to get our Probe, the lifecycle is similar to what we had for stubs, except for the whole database initialisation:
def resource(using natchez.Trace[IO]): Resource[IO, Probe] =
for
res <- skunkConnection
pgConfig = res._1
db = res._2
appConfig = AppConfig(pgConfig, skunk, http, jwt, misc)
generator <- Resource.eval(Generator.create)
timeCop <- Resource.eval(SlowTimeCop.apply)
logger <- InMemoryLogger.build
routes <- JobbyApp(
appConfig,
db,
logger.scribeLogger,
timeCop
).routes
// ..
But now these routes need to be used to launch an actual HTTP server, to which we need to point our HTTP client:
uri <- BlazeServerBuilder[IO]
.withHttpApp(routes)
.bindHttp()
.resource
.map(_.baseUri)
client <- BlazeClientBuilder[IO].resource
probe <-
Probe.build(
client,
uri,
appConfig,
logger
)
yield probe
end for
end resource
Now if you use this resource, you will receive a fully functioning Probe that will execute HTTP requests writing to the actual database with your actual schema. I like this so much, it's wild.
So how do we make the final leap from having our specs and this new probe definition, to something that Weaver can actually run?
Runnable integration tests
We have two options:
- Use this as a per-spec resource, meaning that if you have 3 specs (e.g. Users, Companies, Jobs) then you'll have 3 Postgres containers and 3 HTTP servers running in parallel
- Utilise Weaver's global resource sharing to launch only 1 HTTP server and 1 Postgres container, no matter how many specs you have
At the time of writing, I think option (1) is actually not as bad as I initially thought - it definitely consumes a lot more resources but theoretically you have more control over how the probe is initialised, if you want to make changes to, say, configuration, or routes, or both.
Option (2) is great because it doesn't require a great deal of setup, and it's a lot lighter on consumed resources, meaning the difference between 5 and 50 different specs running in parallel is not as severe.
To make it a global resource, the first thing we'll need to do is tell Weaver how to initialise the resource:
package jobby
package tests
package integration
import cats.effect.*
import cats.effect.std.*
import cats.syntax.all.*
import jobby.spec.*
import natchez.Trace.Implicits.noop
import weaver.*
object Resources extends GlobalResource:
override def sharedResources(global: GlobalWrite): Resource[IO, Unit] =
baseResources.flatMap(global.putR(_))
def baseResources: Resource[IO, Probe] = Fixture.resource
Two things make this work:
- Weaver is able to reflectively find all objects that extend the special GlobalResource trait, and initialise them before all the tests start up
- When Weaver invokes Resources.sharedResources(..), the initialised resource is written (using its type as the key) into the storage maintained by the framework
The baseResources method is not strictly necessary, but we'll use it later for convenience.
Now, our first (inconvenient) implementation of the IntegrationSuite base class can look like this:
abstract class IntegrationSuiteWrong(global: GlobalRead) extends JobbySuite:
override def sharedResource = global.getOrFailR[Probe]()
end IntegrationSuiteWrong
Where we retrieve the resource we need by its type (Probe).
It works well when both Resources and classes implementing IntegrationSuiteWrong are in the same package and you run the entire package, e.g. sbt> testOnly jobby.tests.integration.*.
But if you run an individual spec, like testOnly jobby.tests.integration.UserTests, then the framework actually cannot pick up the Resources object, because the build tool doesn't pass it along. In that scenario, our only option is to re-initialise the required shared resources within the spec itself.
So let's rewrite it as so:
abstract class IntegrationSuite(global: GlobalRead) extends JobbySuite:
// Provides a fallback to support running individual tests via testOnly
private def sharedResourceOrFallback(read: GlobalRead): Resource[IO, Probe] =
read.getR[Probe]().flatMap {
case Some(value) => Resource.eval(IO(value))
case None => Resources.baseResources
}
override def sharedResource = sharedResourceOrFallback(global)
end IntegrationSuite
Defining our actual tests is almost as easy as the stub tests:
package jobby
package tests
package integration
import weaver.*
class UsersTests(global: GlobalRead)
extends IntegrationSuite(global)
with jobby.tests.UsersSuite
class CompaniesTests(global: GlobalRead)
extends IntegrationSuite(global)
with jobby.tests.CompaniesSuite
class JobsTests(global: GlobalRead)
extends IntegrationSuite(global)
with jobby.tests.JobsSuite
And it sure runs!
2022.06.25 19:29:00:888 io-compute-6 INFO 🐳 [testcontainers/ryuk:0.3.3]
Creating container for image: testcontainers/ryuk:0.3.3
Container testcontainers/ryuk:0.3.3 is starting: 8be6f912a462fdd5f3873d27a946e92de4c538ddefb145b794af1a82641fcbb6
Container testcontainers/ryuk:0.3.3 started in PT0.599771S
[info] jobby.tests.integration.CompaniesTests
[info] + Creation by authenticated user 794ms
[info] + Deletion by the owner 866ms
[info] jobby.tests.integration.UsersTests
[info] + Registration and authentication 46ms
[info] + Using wrong credentials 115ms
[info] jobby.tests.integration.JobsTests
[info] + Creating jobs by authenticated company owner 286ms
[info] + Listing latest jobs 1s
[info] Passed: Total 6, Failed 0, Errors 0, Passed 6
[success] Total time: 6 s, completed 25 Jun 2022, 19:29:06
I've purposefully silenced some, but not all, of the loggers, to demonstrate that the containers are indeed started. By the way, here's how you can silence loggers in Scribe:
import scribe.{Logger, Level}
val silenceOfTheLogs =
Seq(
"org.http4s",
"org.flywaydb.core",
"org.testcontainers",
"🐳 [postgres:14]"
)
silenceOfTheLogs.foreach { log =>
Logger(log).withMinimumLevel(Level.Error).replace()
}
And with this, I believe our main goals are achieved - we are using the same test specifications to run in-memory tests as well as tests against running services.
As the number of specifications grows, the execution time for integration tests will grow much quicker than that of stub tests, which can be used in quick feedback loops during feature development.
In fact, to test this difference I added 1000 copies of the same tests to one of the specs:
- stubTests finished in 3 seconds, with each test taking 10-40ms
- integrationTests took 13 seconds just to report the fact that Blaze's wait queue was overfilled, failing 737 out of 1007 tests
Now, we could easily restrict the number of concurrent requests to 256 in our tests, but even the successful tests took 2-3 seconds on average due to severe resource contention over a very limited physical network resource.
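If we wanted such a cap, one simple way to impose it on the test side might be wrapping the HTTP client with a Semaphore - a sketch, not something the jobby codebase actually does:
// Sketch: every request must first acquire a Semaphore permit,
// limiting the number of in-flight requests from the test suite
import cats.effect.*
import cats.effect.std.Semaphore
import cats.syntax.all.*
import org.http4s.client.Client

def limitConcurrency(client: Client[IO], maxInFlight: Long): IO[Client[IO]] =
  Semaphore[IO](maxInFlight).map { semaphore =>
    Client[IO] { request =>
      // the permit is held for the lifetime of the response Resource
      semaphore.permit >> client.run(request)
    }
  }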
To extend this further, you can imagine end-to-end tests, where the fixture is instantiated without a database at all, just a HTTP client pointing at the services. The actual URL can come from environment or a configuration file.
All still using the same test specifications, with perhaps small modifications to add retries.
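Such a fixture might look roughly like this - JOBBY_BASE_URL is a made-up environment variable, and appConfig is assumed to be filled with values matching the deployed app:
// Sketch of an end-to-end fixture: no database, no embedded server,
// just a client pointed at an externally running deployment
object EndToEndFixture:
  def resource: Resource[IO, Probe] =
    for
      logger <- InMemoryLogger.build
      uri <- Resource.eval(
        IO(sys.env("JOBBY_BASE_URL")).map(Uri.unsafeFromString)
      )
      client <- BlazeClientBuilder[IO].resource
      probe  <- Probe.build(client, uri, appConfig, logger)
    yield probe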
Obviously, none of this is relevant to you if you are writing functional Scala - if it compiles, then running it is no longer your responsibility or concern. You didn't spend 10 years studying pure mathematics (as some believe to be a forced pre-requisite for this code) to write pesky tests or worry about impure runtimes.