Russian Doll Caching at the Edge

Rails gave web developers a durable caching model. Cache the outer fragment. Reuse the inner fragments that did not change. The Rails caching guide calls this Russian doll caching.

You Can Use Russian Doll Caching at the Edge.

The same idea now applies outside the template. The dolls can span the browser cache, CloudFront, the origin validator, the hot store, and the database. This builds on the scaling frame I used in “What it Means to Build Antifragile Cloud Architecture.” Stress exposes the next bottleneck. A good cache design moves that bottleneck outward.

That is the price-performance lens. The goal is not only lower latency. The goal is to spend less compute per correct response. A cache hit at the edge is usually better than a fast render at the origin.

[Figure: Russian doll cache layers showing edge, 304, hot store, database, and done]

Start with Request Shape.

Most server-rendered apps have three route shapes. They should not share one cache policy. The response scope should decide the cache scope.

  1. Public shell. Marketing pages, docs, catalogs, and public index pages.
  2. Tenant shell. Pages that vary by host, locale, path, or a stable tenant key.
  3. User leaf. Pages that vary by session, entitlement, or account state.

Route shape  | Good default                    | Edge stance                                          | Risk
-------------|---------------------------------|------------------------------------------------------|--------------------------
Public shell | public, s-maxage=300            | Cache at CloudFront.                                 | Stale copy after deploy.
Tenant shell | public only when safe.          | Cache by host, path, locale, or tenant resource key. | Cache-key explosion.
User leaf    | private, no-cache or no-store.  | Keep it out of shared caches.                        | User data leakage.
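
The table above can be sketched as one small lookup. This is a minimal sketch, not a prescribed implementation. The method name `cache_control_for` and the `safe_to_share` flag are hypothetical. The tenant value reuses the public-shell window as an assumption, and unknown shapes fall back to no-store as the conservative choice.

```ruby
# Hypothetical helper: map a route shape to a Cache-Control value.
# Header values follow the table; the tenant TTL is an assumption.
def cache_control_for(route_shape, safe_to_share: false)
  case route_shape
  when :public_shell
    "public, s-maxage=300"
  when :tenant_shell
    # Share tenant pages only when the caller has confirmed it is safe.
    safe_to_share ? "public, s-maxage=300" : "private, no-cache"
  when :user_leaf
    "private, no-cache"
  else
    # Unknown shapes get the most conservative policy.
    "no-store"
  end
end
```

The point of the default is that a new route stays out of shared caches until someone classifies it.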

Let HTTP Reject Stale Work First.

Do not start with Redis. Start with HTTP. Browsers already know how to ask whether a representation changed.

RFC 9110 defines ETag and Last-Modified as response validators. It also defines If-None-Match and If-Modified-Since as request preconditions. If the validator still matches, the server can return 304 Not Modified with no response body.

class CatalogsController < ApplicationController
  def show
    items = Current.tenant.items.published.includes(:prices, :media)
    # Relation-level key: changes when the row count or latest updated_at changes.
    relation_cache_key = items.cache_key_with_version

    # Hot store: cache small primitives (IDs and a timestamp), not live records.
    @catalog_payload = Rails.cache.fetch(
      ["catalogs/show", Current.tenant.cache_key_with_version, relation_cache_key],
      expires_in: 5.minutes
    ) do
      {
        item_ids: items.pluck(:id),
        last_change: items.maximum(:updated_at)
      }
    end

    # Freshness window for browsers and shared caches.
    expires_in 5.minutes,
      public: true,
      "s-maxage": 5.minutes,
      stale_while_revalidate: 30.seconds,
      stale_if_error: 5.minutes

    # Validators: when the request's validators still match, Rails
    # responds 304 Not Modified and skips the render.
    fresh_when(
      etag: ["catalogs/show", Current.tenant.cache_key_with_version, relation_cache_key],
      last_modified: @catalog_payload[:last_change],
      public: true
    )
  end
end

This is boring by design. Rails gets to attach freshness to model state. The browser and edge get protocol-native headers. The app avoids rendering when the client already has the current answer.
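
Stripped of Rails, the conditional check is small. The sketch below is plain Ruby and is not how Rails implements `fresh_when`. The names `etag_for` and `conditional_get` are hypothetical. Hashing the body is an assumption made for illustration, since a version key works just as well. Splitting If-None-Match on commas is a simplification that holds for hex digests.

```ruby
require "digest"

# Build a strong ETag by hashing the body (an assumption for this sketch).
def etag_for(body)
  %("#{Digest::SHA256.hexdigest(body)}")
end

# Return [status, headers, body] for a GET, honoring If-None-Match.
def conditional_get(request_headers, body)
  etag = etag_for(body)
  candidates = request_headers.fetch("If-None-Match", "").split(",").map(&:strip)
  # If-None-Match uses weak comparison, so a leading W/ prefix is ignored.
  matched = candidates.any? { |c| c == "*" || c.delete_prefix("W/") == etag }

  if matched
    [304, { "ETag" => etag }, ""]   # validator matches: send no body
  else
    [200, { "ETag" => etag }, body] # changed or first request: send it all
  end
end
```

A client that echoes the ETag from its last 200 response gets a 304 until the body changes.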

Name the Cache Policy Precisely.

RFC 9111 makes the language clear. no-cache does not mean “never store this.” It means “do not reuse this without revalidation.” no-store means “do not store this.”

  • Use no-cache when a stored response must revalidate before reuse.
  • Use no-store when the response should not be stored at all.
  • Use public only when a shared cache can safely reuse the response.
  • Use s-maxage when shared caches should have their own freshness window.

Cache the Outer Doll at the Edge.

CloudFront is not just an asset CDN. It can be the shared outer shell for safe HTML and API responses. That only works when the cache key stays small.

The CloudFront cache-key docs describe how headers, cookies, and query strings shape reuse. Every extra dimension can multiply cache variants. With too many variants, the cache exists, but most requests still miss.

  • Route broad shapes with path-based cache behaviors.
  • Vary only on dimensions that change the response.
  • Avoid forwarding cookies for shared responses.
  • Do not let Set-Cookie leak into public cached objects.
  • Keep session-backed pages private or uncached.
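
To see why the key has to stay small, it helps to model it. In CloudFront the key is configured through a cache policy, not written in application code, so the sketch below is only a model of the idea. The allowlist `CACHE_KEY_PARAMS` and the names `edge_cache_key` and `variant_count` are hypothetical.

```ruby
require "uri"

# Hypothetical allowlist: only these query parameters change the response.
CACHE_KEY_PARAMS = %w[locale page].freeze

# Build a shared-cache key from host, path, and allowlisted query params.
# Cookies and every other parameter are deliberately left out.
def edge_cache_key(url)
  uri = URI.parse(url)
  params = URI.decode_www_form(uri.query || "")
              .select { |name, _| CACHE_KEY_PARAMS.include?(name) }
              .sort
  [uri.host.downcase, uri.path, URI.encode_www_form(params)].join("|")
end

# Upper bound on variants per path: the product of distinct values
# across every dimension that is part of the key.
def variant_count(dimensions)
  dimensions.values.map(&:size).reduce(1, :*)
end
```

Two URLs that differ only in tracking parameters collapse to one key. Three locales, two currencies, and two experiment buckets already allow twelve variants of the same path.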

Use Stale Serving with Care.

stale-while-revalidate lets a cache serve an expired response while it refreshes in the background. stale-if-error lets a cache serve stale content when the origin is unhealthy. CloudFront documents both in its expiration and stale content guide.

These are latency and resilience tools. They are not a reason to cache everything. The response still needs a safe cache key.

Keep the Hot Store Behind the Protocol.

A hot store is still useful. It should save expensive rendering and serialization. It should not hide a weak HTTP cache policy.

Prefer cache values that age well. Rendered strings, JSON blobs, IDs, counts, and small primitive payloads are safer than live ORM objects. The cache key should describe the same semantic version as the validator.

  • Outer shell: CloudFront object.
  • Origin contract: ETag, Last-Modified, and Cache-Control.
  • Hot store: fragments or serialized data.
  • Database: source of truth.
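
The layering can be sketched with one version string feeding both the validator and the hot-store key. This is an illustrative sketch in plain Ruby. `HOT_STORE` is an in-memory stand-in for a store such as Redis or Memcached, and `version_key`, `etag_from`, and `respond` are hypothetical names.

```ruby
require "digest"

# In-memory stand-in for a hot store such as Redis or Memcached.
HOT_STORE = {}

# One semantic version string for a named resource.
def version_key(name, updated_at, count)
  "#{name}/#{count}-#{updated_at.to_i}"
end

# The validator is derived from the same version string as the store key.
def etag_from(key)
  %(W/"#{Digest::MD5.hexdigest(key)}")
end

# Returns [status, body]. Checks the validator first, then the hot store,
# and only calls the block (the expensive render) on a miss.
def respond(key, if_none_match)
  return [304, ""] if if_none_match == etag_from(key)

  body = HOT_STORE[key] ||= yield
  [200, body]
end
```

Because the key and the ETag share a source, a change in the data invalidates both layers at once. Neither can keep serving an answer the other has already rejected.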

This is Not Rails-only.

Rails gives this idea a good name. The same shape exists in Laravel, Django, Phoenix, and most server-rendered stacks. The nouns change. The protocol does not.

Stack          | Lever
---------------|-----------------------------------------------------------
Rails          | fresh_when, expires_in, fragment caching.
Laravel        | Cache::remember, Cache::flexible, response headers.
Django         | per-view cache, template fragments, low-level cache API.
Phoenix / Plug | Plug.Static, ETags, cache-control options.

Compress Last.

Brotli and Gzip help. They reduce bytes over the wire. They do not remove rendering, queries, or serialization.

Compression is the last doll. Conditional GET and shared-cache hits are bigger wins because they avoid sending the body in the first place.

Conclusion

Russian doll caching still works. The dolls moved outward. The outer shell can be CloudFront. The middle shell can be an HTTP validator. The inner shell can be a hot store. The database should be the last stop.

The rule is simple. Cache the largest safe shell as far outward as possible. Revalidate before rendering. Keep personalized computation inside the smallest doll.

The fastest response is the one that stops before Rails renders. The cheapest compute is the render you never had to do.
