Working on small, monolithic applications with a small team is a joy. You can understand everything that happens in the system and what everyone else is working on. That also means you can quickly make and deploy meaningful changes. The boundaries of the system and the team are manageable. Beyond a certain scale of scope, usage, and/or team size, though, that monolith you loved will become a burden.
With some preparation and foresight, you can get ready for that growth. The key is knowing what boundaries to create and when to create them.
Why Create Boundaries?
Creating boundaries within a system means creating well-defined borders that can control scope and complexity. This is helpful to the people involved and provides more technical control options. Here are a few reasons I’ve found it helpful to compartmentalize parts of a system.
It’s better for the humans.
Boundaries limit the amount of complexity that can live inside the boundaries (inside the box). That’s good because we all have limits for how much complexity we can keep in our brains at once. When the complexity inside the box is lower than those limits, it’s easier to work in that box.
Having many smaller boxes also creates more opportunities for autonomy and evolution. As long as the behavior at the boundaries meets expectations, you can build one subsystem in a more functional programming style, while another can be more object-oriented. One can use TypeScript and another can use Python or Rust.
There are more options for parallel work.
Additionally, several smaller systems with well-defined boundaries are easier to work on in parallel. Boundaries in code and infrastructure make it less likely that team members will step on each other’s toes.
You can control the scope of builds and deployments.
With a monolith, deployment is often an all-or-nothing affair. The bigger the monolith, the bigger that problem becomes. The right level of separation lets you deploy changes (especially hotfixes) to only the affected parts of your infrastructure. That limits downtime and impact.
It creates options for scaling.
When you separate parts of the system, you have more options for which parts of the infrastructure you scale and how you control load and throughput. For example, if you have one data store for all reads and writes, some users will be performing actions that they expect to run long (reports), while other users will be performing actions that they expect to return quickly (mutations). Both sets of users are competing for the same resources.
It would be difficult to meet expectations consistently for both sets of users with scaling alone. But, if the data stores are separate, then scaling can be set to balance cost and performance for each use case independently.
Where are boundaries helpful?
When you decide that creating boundaries would be helpful, figuring out where to find or create those boundaries is the next step. Some areas I look at for boundary opportunities include:
- Between different functional areas of an application or application suite. One part of the application might be simple CRUD for setup, and another part might be user management. One area may focus on advanced editing of content (by a different set of users than the simple CRUD setup), while another area is for sophisticated reporting and a dashboard. Formal boundaries between those areas would give you four full-stack separation opportunities.
- Between different types of operations. For example, you might separate simple reads and writes from complex reporting, or JSON API reads from operations that generate and deliver files (PDFs, Excel, etc.).
- Between things that take different amounts of time. Long-running operations may have different needs than simpler, faster operations.
- Between actions that depend on external systems and operations that don’t. Dependence on external systems may require extra error handling or modes of operation. An example would be queueing requests while the external system is down, which shouldn’t affect the rest of the application.
- Between deployment targets. For example, it’s helpful to be able to build and deploy a fix to a client application independently from the server or API.
- Between different entity types. Sometimes an update to one type of entity requires a change in a different type of entity. An example would be a change in customer address requiring an update of the address on that customer’s future orders. You can use a decoupling mechanism (events, queues, etc.) to prevent your customer repository from containing the logic to update orders.
- Between parts of the application that don’t naturally share much information or behavior. Maybe they’re working with different types or sources of data and only share a few very basic helpers.
Tools, Practices, and Timing
Some tools and practices work best when employed from the beginning of a project, while some you can bring in later as needed. It’s tempting to avoid over-complicating your project. But making smart investments at the beginning helps to avoid complicated rework later. Here are techniques that have worked to create separation for me or that I have on my radar for future projects.
Domain Event Publishing/Subscription (in the Same Process)
When: From the start
What: A mechanism that lets you publish and subscribe to domain-level events within a single process
Why: Keep the code focused and decoupled from secondary effects
Example: A customer domain entity repository publishes customer changed events, and a customer changed event handler uses the customer orders repository to make necessary updates to those entities.
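Here is a minimal sketch of that example, assuming a synchronous, single-process event bus. Every name in it (EventBus, CustomerChanged, CustomerRepository, OrderRepository, wireUp) is hypothetical and chosen for illustration; a real project would likely add error handling and possibly async handlers.

```typescript
// Hypothetical event payload published when a customer's address changes.
interface CustomerChanged {
  customerId: string;
  newAddress: string;
}

// Hypothetical order shape; "shipped" marks orders that are no longer "future" orders.
interface Order {
  id: string;
  customerId: string;
  shippingAddress: string;
  shipped: boolean;
}

// A tiny synchronous publish/subscribe channel for one event type.
class EventBus<E> {
  private handlers: Array<(event: E) => void> = [];

  subscribe(handler: (event: E) => void): void {
    this.handlers.push(handler);
  }

  publish(event: E): void {
    for (const handler of this.handlers) {
      handler(event);
    }
  }
}

// The customer repository knows nothing about orders. It only publishes an event.
class CustomerRepository {
  private addresses = new Map<string, string>();
  private customerChanged: EventBus<CustomerChanged>;

  constructor(customerChanged: EventBus<CustomerChanged>) {
    this.customerChanged = customerChanged;
  }

  updateAddress(customerId: string, newAddress: string): void {
    this.addresses.set(customerId, newAddress);
    this.customerChanged.publish({ customerId, newAddress });
  }

  getAddress(customerId: string): string | undefined {
    return this.addresses.get(customerId);
  }
}

class OrderRepository {
  private orders: Order[];

  constructor(orders: Order[]) {
    this.orders = orders;
  }

  // Updates the address only on orders that have not shipped yet.
  updateAddressForFutureOrders(customerId: string, address: string): void {
    for (const order of this.orders) {
      if (order.customerId === customerId && !order.shipped) {
        order.shippingAddress = address;
      }
    }
  }

  all(): Order[] {
    return this.orders;
  }
}

// Wiring lives in one place, so the secondary effect stays out of both repositories.
function wireUp(initialOrders: Order[]) {
  const customerChanged = new EventBus<CustomerChanged>();
  const orders = new OrderRepository(initialOrders);
  customerChanged.subscribe((event) =>
    orders.updateAddressForFutureOrders(event.customerId, event.newAddress)
  );
  return { customers: new CustomerRepository(customerChanged), orders };
}
```

Calling updateAddress on the customer repository returned by wireUp updates the unshipped orders for that customer, and the customer repository never imports or references the order code.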
Event Streams / Queues (in Separate Processes)
When: Once differences in scaling needs, team ownership, or other factors make separate processes worthwhile
What: Use messaging infrastructure to decouple components into separate processes
Why: Keep processes focused and constrain the complexity of related code. Create opportunities for different teams to be responsible for different parts of the system. Avoid placing undesirable constraints on an entire system when they only need to apply to one part of it.
Example: Generating Excel files takes longer than API Gateway’s 29-second limit allows. So the request to create the file gets placed in a queue and handled by a separate Lambda that’s not bound by that limit. Or, an update to an entity in one part of the system needs to trigger a report update in another part of the system. In that case, the information is placed in a message stream and handled by a dedicated report updater Lambda.
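The shape of the first scenario can be sketched without any cloud services. This is a sketch under assumptions, not a real SQS or Lambda integration: the Queue interface, InMemoryQueue, handleCreateReport, and runWorker are all hypothetical names, and an in-memory array stands in for the messaging infrastructure so the example is self-contained.

```typescript
// Hypothetical message describing a file that needs to be generated.
interface ReportRequest {
  requestId: string;
  reportName: string;
}

// Minimal queue abstraction. In a real system this might be backed by
// a managed queue; that backing is an assumption, not shown here.
interface Queue<T> {
  send(message: T): void;
  receive(): T | undefined;
}

// In-memory stand-in so the sketch can run in a single process.
class InMemoryQueue<T> implements Queue<T> {
  private messages: T[] = [];

  send(message: T): void {
    this.messages.push(message);
  }

  receive(): T | undefined {
    return this.messages.shift();
  }
}

// API side: enqueue the work and return right away, well inside any request time limit.
function handleCreateReport(
  queue: Queue<ReportRequest>,
  requestId: string,
  reportName: string
): { statusCode: number; requestId: string } {
  queue.send({ requestId, reportName });
  return { statusCode: 202, requestId };
}

// Worker side: drain the queue and do the slow work, free of the API's time limit.
// Returns a map from request ID to the generated file name.
function runWorker(
  queue: Queue<ReportRequest>,
  generate: (request: ReportRequest) => string
): Map<string, string> {
  const results = new Map<string, string>();
  let message = queue.receive();
  while (message !== undefined) {
    results.set(message.requestId, generate(message));
    message = queue.receive();
  }
  return results;
}
```

The API handler and the worker share only the message shape and the queue, so each side can be deployed, scaled, and owned separately.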
Code Separation by Convention (Project Folder Structure)
When: From the start
What: Organize a project into folders that align with the boundaries you want to see in the application
Why: Organizing based on desired boundaries can help make it clear when those boundaries are ignored or broken. Depending on your language and tooling, automated tools may be able to help enforce those boundaries. For instance, lint tools for TypeScript can enforce import restrictions.
Example: Rather than creating directories based on the horizontal slices of an application’s architecture (e.g., records, domain, logic), create directories based on vertical slices of functionality (e.g., pages/report, graphql-api/report, core/product, etc.).
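To make the enforcement idea concrete, here is a small sketch of the kind of check an import-restriction rule performs. It is not how any particular lint tool is implemented. BoundaryRule, ImportEdge, and findViolations are hypothetical names, and the sketch assumes you already have a list of project-relative file paths and the files they import.

```typescript
// Hypothetical rule: files under `from` may not import files under any `forbidden` folder.
interface BoundaryRule {
  from: string;
  forbidden: string[];
}

// Hypothetical record of one import statement, using project-relative paths.
interface ImportEdge {
  fromFile: string;
  importedFile: string;
}

// True when `path` is the folder itself or lives somewhere beneath it.
function isUnder(path: string, folder: string): boolean {
  return path === folder || path.startsWith(folder + "/");
}

// Returns every import that crosses a boundary it should not cross.
function findViolations(imports: ImportEdge[], rules: BoundaryRule[]): ImportEdge[] {
  return imports.filter((edge) =>
    rules.some(
      (rule) =>
        isUnder(edge.fromFile, rule.from) &&
        rule.forbidden.some((folder) => isUnder(edge.importedFile, folder))
    )
  );
}

// Example rule set for the vertical slices named above:
// core code must not reach into page or API code.
const exampleRules: BoundaryRule[] = [
  { from: "core", forbidden: ["pages", "graphql-api"] },
];
```

With exampleRules, an import from core/product/price.ts to pages/report/table.ts is reported, while an import from pages/report/table.ts to core/product/price.ts is allowed.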
Code Separation Into Modules Using Lerna
When: From the start
What: Like the separation by convention above, but a more formal and well-tooled way to separate and independently version and publish modules that live together in a single Git repository. I haven’t used Lerna yet, but it’s high on the list for my next project.
Why: Create a more formal separation between code modules in a monorepo, without sacrificing most of the convenience of the monorepo.
Example: See the Lerna documentation
Preparing for Growth
In the course of a project, you will run into situations where having boundaries in the right places will make your life much easier. Building a foundation with sensible boundaries and tools for creating new boundaries when the need arises is a good start.
Creating the right boundaries is a constant exercise in balance. As future growth becomes clearer, you’ll have more information to help you decide on potential boundaries and separations. Create low-effort, low-impact boundaries (e.g., in-process event pub-sub system) more aggressively. On the other hand, think carefully about where it makes sense to invest in higher-cost, higher-impact separations (e.g., using a Kinesis stream or SQS queue to separate processes at the infrastructure level) and take those on selectively.