Testing Strategies in the Atomic TypeScript Stack

For the last few years, we’ve been using a universal TypeScript platform including TypeScript, GraphQL, React, and Express. Having TypeScript on both the client and the server (including command-line utilities and background worker processes) has transformed how we go about ensuring the quality of the software we build. When compared to other stacks we’ve used, we consistently get faster feedback about bugs and better discoverability of our systems.

Prior to putting together our “Atomic TypeScript Stack,” we did a lot of work with a combination of Ember and Rails. With that configuration, automated tests were really our sole feedback mechanism. We’d write unit/integration tests of Rails and Ember code, usually augmented with a system-testing strategy that exercised both the back end and front end at the same time. The system tests were always particularly frustrating, due to timing/synchronization bugs, poor debuggability, and general slowness when using Selenium.

Ember’s excellent integration test solution was an improvement, but it was always awkward to work around the lack of access to back-end APIs for seeding test data. We built solid, well-tested apps, but at great effort and with a sometimes frustratingly high maintenance burden.

Our approach to TypeScript has involved a radical rethinking of how we go about ensuring software quality. The goal is to maintain equivalent or higher levels of quality while shortening developer feedback loops and lowering the cost of implementation and maintenance complexity.

At a high level, we’ve pursued three strategies to improve quality while decreasing the involved costs:

  1. Build more error-resistance into the system itself via Type-First Development and liberal use of schemas for data validation.
  2. Pursue architectural shifts that better leverage TypeScript’s type system and are inherently easier to test, particularly the functional core/imperative shell pattern and hexagonal architecture.
  3. Use a single unified, parallel test suite written in Jest with powerful techniques for different kinds of testing, versus a small, fixed bucket of test types (e.g. “unit” and “integration”).

In the sections below, I’ll share some thoughts about each of these strategies.

Building Inherent Error-Resistance

TypeScript has an extremely powerful type system, and mastery of its capabilities yields many options for making bugs impossible–they’d be caught by the type checker before you could even finish a unit test to cover them.

With a powerful type system, we can aim to make invalid states unrepresentable (including techniques like discriminated unions and flavored/branded types). We see this as the first line of defense to ensure software quality, and it is quite powerful. It catches issues that might be caught by unit tests, but the type system sees across module boundaries and can detect many integration bugs, as well.
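As a rough illustration of the two techniques named above, here is a minimal sketch of a discriminated union and a pair of branded types. All names (`FetchState`, `UserId`, `OrderId`, and the helper functions) are hypothetical and exist only for this example.

```typescript
// A discriminated union: each state carries only the data that is valid for
// it, so "loaded but no data" or "failed but no error" cannot be constructed.
type FetchState<T> =
  | { status: "loading" }
  | { status: "loaded"; data: T }
  | { status: "failed"; error: string };

function describeState(state: FetchState<string[]>): string {
  switch (state.status) {
    case "loading":
      return "Loading...";
    case "loaded":
      return `Loaded ${state.data.length} items`;
    case "failed":
      return `Failed: ${state.error}`;
    default: {
      // Exhaustiveness check: adding a new status without handling it
      // above turns this assignment into a compile-time error.
      const unreachable: never = state;
      return unreachable;
    }
  }
}

// Branded types: both IDs are plain strings at runtime, but the compiler
// refuses to accept one where the other is expected.
type UserId = string & { readonly __brand: "UserId" };
type OrderId = string & { readonly __brand: "OrderId" };

const asUserId = (raw: string): UserId => raw as UserId;
const asOrderId = (raw: string): OrderId => raw as OrderId;

function userLabel(id: UserId): string {
  return `user:${id}`;
}

function orderLabel(id: OrderId): string {
  return `order:${id}`;
}

const exampleUser = asUserId("u-1");
const exampleOrder = asOrderId("o-9");

userLabel(exampleUser); // OK
orderLabel(exampleOrder); // OK
// orderLabel(exampleUser); // rejected by the type checker
```

Neither mistake (forgetting a state, or passing the wrong kind of ID) needs a unit test here, because the code would not compile.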

To make this happen, we practice Type-First Development, projecting our notions about the shape of the system into types before writing any code. This helps make better use of the type system, and it helps make the system self-documenting, improving communication and clarity.

This is a microcosm of an overall approach of leveraging Domain-Driven Design to build clarity at the business and design level. The idea is to establish a Ubiquitous Language to guide a project and ensure that an application can flex in ways congruent with the underlying business.

In addition to making use of the type system itself, we also make heavy use of schemas to declaratively specify valid data. One source of these is GraphQL, which is inherently driven by a schema that governs valid use of an API. We also rely on JSON-schema (via Ajv) to validate data from external sources and to precisely specify document-structured data that might be stored in something like a PostgreSQL jsonb column or document database.

To help bridge the gap between runtime validations specified by GraphQL and JSON schemas and TypeScript’s type system, we generate TypeScript types from both our JSON schemas and our GraphQL schemas and queries/mutations. These integrations allow us, in general, to specify what valid data looks like declaratively and get automatic static checking of production and consumption of that data from TypeScript.
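The relationship between a schema, a runtime check, and a static type can be sketched as follows. This is not Ajv's API and not generated output: the schema, the `Address` interface, and the hand-written `isAddress` guard are all hypothetical stand-ins, kept dependency-free so the example runs on its own. In practice the validator would be compiled from the schema and the interface would be generated from it.

```typescript
// Hypothetical JSON Schema for a document stored in a jsonb column.
const addressSchema = {
  type: "object",
  required: ["street", "postalCode"],
  properties: {
    street: { type: "string" },
    postalCode: { type: "string" },
    unit: { type: "string" },
  },
  additionalProperties: false,
} as const;

// The type a schema-to-TypeScript generator might emit (hand-written here).
interface Address {
  street: string;
  postalCode: string;
  unit?: string;
}

// Hand-rolled stand-in for a compiled schema validator. It enforces the same
// rules as addressSchema and narrows `unknown` to Address on success.
function isAddress(value: unknown): value is Address {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const record = value as Record<string, unknown>;
  const allowed = Object.keys(addressSchema.properties);
  if (!Object.keys(record).every((key) => allowed.includes(key))) {
    return false; // additionalProperties: false
  }
  if (!addressSchema.required.every((key) => typeof record[key] === "string")) {
    return false; // required string fields
  }
  return record["unit"] === undefined || typeof record["unit"] === "string";
}

// External data starts as `unknown` and must pass validation before the
// rest of the code is allowed to treat it as an Address.
function parseAddress(json: string): Address {
  const parsed: unknown = JSON.parse(json);
  if (!isAddress(parsed)) {
    throw new Error("Invalid address document");
  }
  return parsed;
}
```

Once data has passed through `parseAddress`, every consumer downstream is checked statically against `Address`, so the runtime validation only has to happen once, at the boundary.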

Pursuing Testable Architecture

We’ve structured our system using Hexagonal Architecture to make the core of our application easy to test. It allows us to mock key external touch-points with alternative implementations and thoroughly exercise the core of our application. In general, any unit test can exercise any arbitrary subset of either the client or the server. (More on our testing strategies below.)
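A minimal sketch of the idea, assuming two hypothetical ports (`Clock` and `Outbox`) and a hypothetical core function `scheduleReminder`. The ports are synchronous only to keep the example short; real adapters for a database or mail service would typically be asynchronous.

```typescript
// Ports: the application core depends on these interfaces only, never on a
// concrete clock, queue, or mail service.
interface Clock {
  now(): Date;
}

interface OutboundEmail {
  to: string;
  subject: string;
  sendAfter: Date;
}

interface Outbox {
  enqueue(message: OutboundEmail): void;
}

interface Ports {
  clock: Clock;
  outbox: Outbox;
}

// Core logic, written purely against the ports.
function scheduleReminder(ports: Ports, to: string, delayMinutes: number): OutboundEmail {
  const sendAfter = new Date(ports.clock.now().getTime() + delayMinutes * 60 * 1000);
  const message: OutboundEmail = { to, subject: "Reminder", sendAfter };
  ports.outbox.enqueue(message);
  return message;
}

// Test adapters: a frozen clock and an in-memory outbox that records what
// the core asked for, so a test can inspect it afterwards.
function makeTestPorts(fixedNow: Date): { ports: Ports; enqueued: OutboundEmail[] } {
  const enqueued: OutboundEmail[] = [];
  const ports: Ports = {
    clock: { now: () => fixedNow },
    outbox: {
      enqueue: (message) => {
        enqueued.push(message);
      },
    },
  };
  return { ports, enqueued };
}
```

A test builds its ports with `makeTestPorts`, calls `scheduleReminder`, and asserts on `enqueued`, without touching real time or a real queue.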

Additionally, we look for opportunities to employ the Functional Core, Imperative Shell pattern to decouple business logic from external systems (such as the database). This approach pushes side-effecting code to the outermost layer and expresses as much logic as possible as pure functions over inert data. (Redux Saga illustrates strategies for accomplishing this.)

Fundamental entanglements with database technology can make this challenging for the server, but complex subsystems often provide opportunities to create a well-defined data type (or Aggregate Root) where business logic can be implemented as a family of pure functions. This makes excellent use of TypeScript’s type system and yields fast test suites.
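Here is a small sketch of that split, using a hypothetical shopping-cart aggregate. The `Cart` type, the pure functions, and the `CartStore` interface are all invented for illustration.

```typescript
// Inert data describing the aggregate.
interface CartLine {
  readonly sku: string;
  readonly quantity: number;
  readonly unitCents: number;
}

interface Cart {
  readonly lines: ReadonlyArray<CartLine>;
}

const emptyCart: Cart = { lines: [] };

// Functional core: pure functions from data to data. No I/O, no mutation.
function addItem(cart: Cart, sku: string, unitCents: number): Cart {
  const existing = cart.lines.find((line) => line.sku === sku);
  if (existing === undefined) {
    return { lines: [...cart.lines, { sku, quantity: 1, unitCents }] };
  }
  return {
    lines: cart.lines.map((line) =>
      line.sku === sku ? { ...line, quantity: line.quantity + 1 } : line
    ),
  };
}

function totalCents(cart: Cart): number {
  return cart.lines.reduce((sum, line) => sum + line.quantity * line.unitCents, 0);
}

// Imperative shell: the only part that talks to storage. It loads the data,
// delegates every decision to the core, and saves the result.
interface CartStore {
  load(cartId: string): Promise<Cart>;
  save(cartId: string, cart: Cart): Promise<void>;
}

async function addItemToStoredCart(
  store: CartStore,
  cartId: string,
  sku: string,
  unitCents: number
): Promise<number> {
  const cart = await store.load(cartId);
  const updated = addItem(cart, sku, unitCents);
  await store.save(cartId, updated);
  return totalCents(updated);
}
```

Tests for `addItem` and `totalCents` need no database, no mocks, and no async plumbing, which is where most of the speed comes from. The shell is thin enough that a handful of integration tests cover it.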

React’s architecture allows us to express the UI in a functional way–separating presentational and container components. However, colocating GraphQL queries with components via Apollo hooks, combined with the ability to mock GraphQL APIs, gives us good options for testing UI components even when that separation is not present.

Fundamentally, component architecture and colocated queries allow us to make local architecture/testing decisions in a situationally appropriate way.

Using a Unified Test Suite

We use Jest for basically all of our testing: client, server, unit tests, and integration tests. We rely on TypeScript at all layers of our application, and we have structured our codebase to represent the inherent nature of the system instead of siloing everything into “client” and “server” buckets.

In the same way, we use Jest as a universal testing substrate, and we rely on test helpers to implement different kinds of tests that utilize different testing strategies. In our recent codebases, unit and integration tests are not places (folders on disk), but states of mind (testing techniques applied within a test).

By approaching testing from this perspective, we’ve ended up with a richer ecosystem of types of tests. We’re more free to introduce strategies for particular types of tests since we don’t follow a dogma such as, “Unit tests isolate individual modules with mocks, and system tests always use Selenium.”

Our Test Types

Below is a current bestiary of our test types. First and foremost, remember: The type system is the fastest, richest layer of testing. Start with good types.

These are augmented with a few kinds of relatively isolated unit tests:

Simple unit tests

As much logic as possible is moved into simple, side-effect free functions which are easily testable.

Enzyme-based front-end tests

These verify that front-end interactions work as expected. Our GraphQL API and other external systems are mocked. We check front-end behavior against the API as an abstraction, and we separately verify that those queries and mutations work as intended (more below).

React Storybook stories

React Storybook stories are an interactive, visual test suite for components. They speed up manual checking of visual rendering across browsers and device sizes. They are integrated into Jest for simple snapshot validation of component rendering.

Back-end unit/integration tests

Our dependency-injection context–used as the context for all GraphQL resolvers–gives us a uniform way to test our back-end. These tests are all defined the same way, in terms of a dependency injection context which has been instrumented to operate in an isolated environment.

These tests are distinguished by what they test and by what guarantees they add to the system.

  • Data repository tests: These generally validate that database operations work correctly. Test data is added to the database, and data-layer API functions are run end-to-end with data stores (e.g. PostgreSQL or Redis) to ensure they work correctly. Most of our dataloaders live at this level and need to be tested for individual and batched operation correctness.
  • Service layer tests: Our service layer represents business rules, including authorization. It includes our CQRS command pattern and triggering/semantics of background jobs. Whenever possible, complexity is factored into a functional core for a given service, enabling simple function unit testing.
  • Holistic tests: These integrate end-to-end and check that the service layer APIs handle edge cases correctly. We generally try to model permissions as types and use these types as parameters of service-layer routines to enforce correct authorization checking of dangerous operations. Type tests can help here.
  • GraphQL API tests: We can test our GraphQL API with either ad hoc test queries or the actual queries used by the client. For private APIs used internally for one application, we tend to focus our testing on the specific .graphql files used in the client. We think of testing the interactions between mutations and queries as integration tests for the semantic model of the front-end, complementing the shallow Enzyme tests that treat the queries as a system boundary.
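To make the "permissions as types" idea from the list above concrete, here is one possible sketch. The names (`DeleteProjectPermission`, `authorizeDeleteProject`, `deleteProject`) and the authorization rule are hypothetical. The point is only that the dangerous routine cannot be called without a value that the authorization check alone can produce.

```typescript
interface Viewer {
  readonly id: string;
  readonly roles: ReadonlyArray<string>;
}

interface ProjectRecord {
  readonly id: string;
  readonly ownerId: string;
}

// The brand symbol exists only at the type level, so a value of this type
// cannot be written as a plain object literal; it has to come from a cast.
declare const permissionBrand: unique symbol;

type DeleteProjectPermission = {
  readonly projectId: string;
  readonly grantedTo: string;
  readonly [permissionBrand]: "deleteProject";
};

// The single place that performs the check and mints the permission.
function authorizeDeleteProject(
  viewer: Viewer,
  project: ProjectRecord
): DeleteProjectPermission | null {
  const allowed = viewer.roles.includes("admin") || project.ownerId === viewer.id;
  if (!allowed) return null;
  return { projectId: project.id, grantedTo: viewer.id } as DeleteProjectPermission;
}

// The dangerous operation demands the permission as a parameter, so the
// type checker rejects any call site that skipped authorization.
function deleteProject(
  projects: Map<string, ProjectRecord>,
  permission: DeleteProjectPermission
): boolean {
  return projects.delete(permission.projectId);
}
```

With this shape, tests only need to cover `authorizeDeleteProject` for the authorization rules; a forgotten check elsewhere shows up as a compile error instead of a missing test.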
Additional Tests

The above tests represent our core testing strategies, but we have a number of other techniques we can deploy as needed:

Type tests

For particularly important or complex types, we unit test the types themselves via a package like typings-tester. These tests ensure our types correctly catch intended type errors and admit valid states.
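The sketch below shows the general shape of a type test. It does not use typings-tester; it swaps in the compiler's built-in `// @ts-expect-error` directive plus a small type-equality helper so the example has no dependencies. The `Payment` type is hypothetical.

```typescript
// Type-level helpers: Expect<Equal<A, B>> only compiles when A and B are
// the same type.
type Equal<A, B> =
  (<T>() => T extends A ? 1 : 2) extends (<T>() => T extends B ? 1 : 2) ? true : false;
type Expect<T extends true> = T;

// The type under test (hypothetical).
type Payment =
  | { kind: "pending" }
  | { kind: "settled"; settledAt: Date }
  | { kind: "refunded"; settledAt: Date; refundedAt: Date };

// A derived type whose exact shape we want to pin down.
type SettledKinds = Extract<Payment, { settledAt: Date }>["kind"];

// Positive test: this line compiles only if SettledKinds is exactly the
// union we expect.
const settledKindsMatch: Expect<Equal<SettledKinds, "settled" | "refunded">> = true;

// Positive test: a valid state is admitted.
const validPayment: Payment = { kind: "settled", settledAt: new Date(0) };

// Negative test: the assignment below must be a type error. If the type is
// ever loosened enough to accept it, the unused directive fails the build.
// @ts-expect-error a settled payment must carry a settlement date
const invalidPayment: Payment = { kind: "settled" };
```

These checks run as part of ordinary type checking, so they cost nothing at test time. They are most useful for types that encode business rules, where an accidental loosening would otherwise go unnoticed.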

Migration tests

We also use migration testing for testing database migration scripts. We get these for relational databases from Knex. (See, for example, our old Rails gem.)

Property-based tests

Occasionally, we deploy property-based testing to stress-test the system for violations of fundamental invariants. We’ve done so sparingly so far and hope to leverage the technique more as we better integrate the tooling into our application framework.
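For readers unfamiliar with the technique, here is a dependency-free sketch of what a property-based test does. A real project would normally use a dedicated library with shrinking and richer generators; the hand-rolled loop, the seeded random number generator, and the `splitCents` function under test are all hypothetical stand-ins.

```typescript
// Function under test (hypothetical): split an amount into `parts` shares
// that differ from each other by at most one cent.
function splitCents(totalCents: number, parts: number): number[] {
  const base = Math.floor(totalCents / parts);
  const remainder = totalCents % parts;
  return Array.from({ length: parts }, (_, i) => base + (i < remainder ? 1 : 0));
}

// Small deterministic PRNG (mulberry32-style) returning values in [0, 1),
// so any failure can be reproduced from the seed.
function makeRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// The property: for many random inputs, the invariants must always hold.
// Returns a description of the first counterexample, or null if none found.
function checkSplitInvariants(runs: number, seed: number): string | null {
  const random = makeRandom(seed);
  for (let run = 0; run < runs; run++) {
    const total = Math.floor(random() * 1000000);
    const parts = 1 + Math.floor(random() * 12);
    const shares = splitCents(total, parts);
    const sum = shares.reduce((a, b) => a + b, 0);
    const spread = Math.max(...shares) - Math.min(...shares);
    if (shares.length !== parts || sum !== total || spread > 1) {
      return `counterexample: total=${total}, parts=${parts}`;
    }
  }
  return null;
}
```

Instead of asserting on a few hand-picked inputs, the test states what must be true for every input (no cents lost, shares nearly equal) and lets the generator hunt for violations.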

Browser-based tests

Finally, we use browser-based testing (e.g. via Selenium) to fully integrate a system that’s prone to breaking in ways our Enzyme tests and type safety can’t protect against. We treat these as a liability to be used sparingly.

This rich ecosystem of tests has upsides and downsides, but overall, we think it’s a huge win. A rich testing ecosystem makes developers consider how to go about testing features. It can be difficult or ambiguous to decide which testing strategy is best in a given situation. But we also have more ways to test, catching errors and regressions.

At the end of the day, developers should be intentional about what and how they’re testing. Because of that, we see additional conversations about how to test as a pro, not a con.