Keeping One Domain While Splitting a Monolith Across Regions
We thought the hard part of going multi-region would be standing up another copy of the application. It was actually preserving the simple user experience of one public domain while keeping region-specific data and operational behavior from bleeding across boundaries.
Mathspace had a monolith that was becoming two regional monoliths: AU and US. Asking users to know whether they should visit au.mathspace.co or us.mathspace.co would have made every link, integration, login flow, and support conversation worse. But blindly proxying all traffic through one region would defeat much of the point of the migration, and routing alone would not solve shared state or third-party OAuth races.
Vestibule is the edge layer we built to hold those constraints together. It is a Cloudflare Worker sitting in front of mathspace.co. Every request is classified into one of three categories - global, regional, or broadcast - and the edge decides where the request should go without teaching the monolith about edge routing.
That last clause matters. The design goal was not just "route traffic." It was to keep region pinning as an edge concern, keep authentication as an application concern, and prevent either system from slowly depending on the other's private details.

The System
The public interface is deliberately boring:
Browser -> mathspace.co -> Cloudflare Worker / Vestibule -> upstream
Behind that, Vestibule can send requests to:
- global services such as the marketing website, media, and selected static or integration paths
- the AU monolith
- the US monolith
- both regional monoliths at once for the few flows where the edge cannot determine the target region from the request alone
The current production Terraform config has several global rules, ~30 broadcast rules, and 2 regional rules. The final regional rule is the catch-all: once no earlier special case matches, the request belongs to a regional monolith.
That rule ordering is the product shape of the system. Most traffic should be ordinary. Special cases should be explicit. Anything complicated enough to require cross-region selection should be named in the edge config, not guessed by application code.
Vestibule was not the only multi-region mechanism. It was the front-door piece. Two other small systems handled problems that should not be solved by a request router.

Diagram note: the top lane shows possible destinations, not default fan-out. Regional requests go to one selected monolith; only broadcast routes fan out to both.
Global Models Sync keeps selected "global" tables synchronized between regional databases. It compares primary keys, compares row checksums for rows that exist on both sides, then inserts and updates the destination inside a transaction. In production, the owner region is US; each sync attempt is followed by a delay, and destination-only rows are deleted only for configured tables. That keeps routing code out of the business of manufacturing shared reference data.
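The comparison step can be sketched as a pure planning function. This is a minimal illustration under stated assumptions, not the actual tool: the row shape (plain objects keyed by `id`), the names `rowChecksum` and `planSync`, and the JSON-based checksum stand-in are all hypothetical.

```javascript
// Hypothetical sketch: plan a one-way sync from the owner region's table
// to a destination region's table. Rows are plain objects with an `id` key.

// Stand-in checksum: stable JSON over sorted column names.
// A real tool would more likely compute checksums inside the database.
function rowChecksum(row) {
  const keys = Object.keys(row).sort();
  return JSON.stringify(keys.map((k) => [k, row[k]]));
}

function planSync(sourceRows, destRows, { deleteDestinationOnly = false } = {}) {
  const source = new Map(sourceRows.map((r) => [r.id, r]));
  const dest = new Map(destRows.map((r) => [r.id, r]));
  const plan = { inserts: [], updates: [], deletes: [] };

  for (const [id, row] of source) {
    if (!dest.has(id)) {
      plan.inserts.push(row); // primary key missing on the destination
    } else if (rowChecksum(row) !== rowChecksum(dest.get(id))) {
      plan.updates.push(row); // key exists on both sides, contents differ
    }
  }

  // Destination-only rows are removed only when the table opts in.
  if (deleteDestinationOnly) {
    for (const id of dest.keys()) {
      if (!source.has(id)) plan.deletes.push(id);
    }
  }
  return plan;
}
```

In this sketch the plan is computed first so that the writes can then be applied to the destination inside one transaction, matching the behavior described above.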
OCAP, the Outgoing Caching Proxy, handles a different edge case: outbound one-time token exchanges. Some OAuth integrations require a POST that may only be redeemed once. Behind Vestibule, multiple regional login attempts can race to exchange the same provider token, and the region that wins the race may not be the region that owns the user. OCAP keys deduplication by method, target URL, and request body, normalizing JSON and form bodies first; concurrent callers wait on the same in-flight origin request, and the response is cached for the default 30-second TTL.
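A rough sketch of that deduplication shape follows. It is an illustration only: `createDeduper`, `normalizeBody`, and the injected `fetchOrigin` function are hypothetical names, and responses are treated as plain values rather than streams to keep the example small.

```javascript
// Hypothetical sketch of in-process deduplication for one-time outbound requests.

// Sort object keys recursively so JSON key order does not change the cache key.
function sortKeys(value) {
  if (Array.isArray(value)) return value.map(sortKeys);
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.keys(value).sort().map((k) => [k, sortKeys(value[k])])
    );
  }
  return value;
}

function normalizeBody(contentType, body) {
  try {
    if (contentType.includes("application/json")) {
      return JSON.stringify(sortKeys(JSON.parse(body)));
    }
    if (contentType.includes("application/x-www-form-urlencoded")) {
      const params = new URLSearchParams(body);
      params.sort(); // order-insensitive form bodies
      return params.toString();
    }
  } catch {
    // Unparseable bodies fall through and are keyed as-is.
  }
  return body;
}

function createDeduper(fetchOrigin, { ttlMs = 30_000, now = Date.now } = {}) {
  const cache = new Map();    // key -> { expiresAt, response }
  const inFlight = new Map(); // key -> Promise of the origin response

  return function dedupedRequest({ method, url, contentType = "", body = "" }) {
    const key = [method.toUpperCase(), url, normalizeBody(contentType, body)].join("\n");

    const hit = cache.get(key);
    if (hit && hit.expiresAt > now()) return Promise.resolve(hit.response);
    if (inFlight.has(key)) return inFlight.get(key); // wait on the same request

    const pending = Promise.resolve()
      .then(() => fetchOrigin({ method, url, contentType, body }))
      .then((response) => {
        cache.set(key, { expiresAt: now() + ttlMs, response });
        return response;
      })
      .finally(() => inFlight.delete(key));
    inFlight.set(key, pending);
    return pending;
  };
}
```

Because both maps live in process memory in this sketch, the guarantee only holds while a single instance handles all callers.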
Those helpers are important because they kept Vestibule small. The request router did not become a data replication tool, and it did not become an OAuth protocol adapter. Each system owned one failure mode.
The Problem
Before login, the edge usually does not know which region owns a user. A browser might arrive with no session cookie, no region cookie, and only a path.
There were a few tempting fixes, all bad in different ways:
- Split public domains by region and make users, schools, LMS integrations, SAML providers, and support links remember the right one.
- Put enough application knowledge into the proxy to inspect users, credentials, tenants, or auth state directly.
- Always route anonymous traffic to one default region and hope the application can recover.
- Broadcast too much traffic and turn ordinary page loads into cross-region fan-out.
- Make the monolith itself responsible for every shared table, region switch, and integration race.
The system chose narrower mechanisms. Vestibule does not authenticate users. It does not know database schemas. It does not parse the monolith's domain model. It only answers a routing question: for this HTTP request, do we already know the region, can we safely assume one, or must we ask all regions and infer the winner from the response?
Three Kinds Of Routes
The core abstraction is simple enough to explain in one table.
| Route type | What it means | Example use |
|---|---|---|
| global | Proxy or redirect to a non-region-specific upstream | website, media, selected global pages |
| regional | Send to AU or US based on a region pin, or an assumed region when no pin exists | authenticated app pages, most application traffic |
| broadcast | Fan out to all regions and pick the region whose response matches a success rule | login POST, SAML completion, join-class signup, selected integrations |
The worker compiles route rules at startup. Request matchers can inspect method, path, and payload. Broadcast success rules can inspect status code, Set-Cookie, or response payload.
In pseudocode, the runtime shape looks like this:
    // The first matching rule wins, so rule order is significant.
    for (const rule of orderedRules) {
      if (await rule.matches(request)) {
        return rule.process(request);
      }
    }
    return errorResponse("no rule matched");
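To make the loop above concrete, here is a small runnable sketch of compiling regex rules once and then classifying requests by first match. The rule fields (`type`, `method`, `path`), the example paths, and the function names are assumptions for illustration, not Vestibule's actual config schema; payload matching is left out.

```javascript
// Hypothetical rule shape: { type, method?, path }, with regex source strings.
function compileRule(rule) {
  // new RegExp throws on an invalid pattern, so bad config fails at startup.
  const method = new RegExp(`^(?:${rule.method ?? ".*"})$`, "i");
  const path = new RegExp(rule.path);
  return {
    type: rule.type,
    matches: (req) => method.test(req.method) && path.test(req.path),
  };
}

// Return the type of the first rule that matches, or null if none does.
function classify(compiledRules, req) {
  const rule = compiledRules.find((r) => r.matches(req));
  return rule ? rule.type : null;
}

// Illustrative table: special cases first, regional catch-all last.
const compiledRules = [
  { type: "global", path: "^/media/" },
  { type: "broadcast", method: "POST", path: "^/login/?$" },
  { type: "regional", path: "^/" },
].map(compileRule);
```

With this table, a `POST /login` classifies as broadcast while a `GET /login` falls through to the regional catch-all, which is the ordering property the production config relies on.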
The interesting part is what each processor refuses to do.
Global paths forward only whitelisted cookies. That reduces unrelated browser state sent to global services.
Regional paths use the existing region pin when present. If no region pin exists, the worker can assume a region from Cloudflare geolocation and forward only whitelisted cookies. That means an anonymous request can land on a reasonable login page without creating a durable region assignment too early.
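The two behaviors in this paragraph, cookie filtering and region assumption, can be sketched as small pure functions. The names, the country-to-region mapping, and the fallback are hypothetical; in a Worker, the country code would typically come from `request.cf.country`.

```javascript
// Keep only allowlisted cookies from a Cookie header value.
function filterCookies(cookieHeader, allowlist) {
  return cookieHeader
    .split(";")
    .map((part) => part.trim())
    .filter((part) => part && allowlist.has(part.split("=")[0]))
    .join("; ");
}

// An existing pin always wins. Without one, assume a region from the
// caller's country, and report that the choice was only an assumption
// so the caller knows not to persist it as a pin.
function chooseRegion(pin, country, { countryToRegion, fallback }) {
  if (pin) return { region: pin, assumed: false };
  return { region: countryToRegion[country] ?? fallback, assumed: true };
}
```

For example, `chooseRegion(null, "US", { countryToRegion: { US: "us", AU: "au" }, fallback: "au" })` yields an assumed `us` region in this sketch; the mapping and the fallback value are placeholders.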
Broadcast paths are reserved for flows where the edge cannot determine the target regional origin from the request alone. Vestibule fans out to regional origins and selects a response using explicit success rules. A username/password login POST is the ownership-discovery example: both monoliths can check their own databases, but only the owning region should return a successful session response. Vestibule observes that response, sets a region pin, and future requests become ordinary regional requests.

The Region Pin Is Private To The Edge
Vestibule stores the region pin in a browser cookie. It also accepts a header, for clients that would rather set a header than a cookie. If both are present, the header overrides the cookie.
But neither signal is forwarded to the monolith.
Before proxying to an origin, Vestibule strips its cookie and deletes its header. That choice keeps the application from depending on a routing implementation detail. The upstream sees an ordinary request with ordinary application cookies, plus normal proxy metadata such as the forwarded host and configured upstream headers.
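A minimal sketch of "read the pin, then strip it" might look like the following. The cookie and header names here are placeholders invented for the example, not Vestibule's real names, and `resolvePinAndStrip` is a hypothetical helper.

```javascript
const PIN_COOKIE = "region_pin";   // hypothetical cookie name
const PIN_HEADER = "x-region-pin"; // hypothetical header name

// Parse a Cookie header value into an ordered name -> value map.
function parseCookies(header) {
  const cookies = new Map();
  for (const part of (header ?? "").split(";")) {
    const i = part.indexOf("=");
    if (i > 0) cookies.set(part.slice(0, i).trim(), part.slice(i + 1).trim());
  }
  return cookies;
}

// Resolve the pin (header beats cookie) and build upstream headers
// with both routing signals removed.
function resolvePinAndStrip(headers) {
  const cookies = parseCookies(headers.get("cookie"));
  const pin = headers.get(PIN_HEADER) ?? cookies.get(PIN_COOKIE) ?? null;

  const upstream = new Headers(headers);
  upstream.delete(PIN_HEADER);
  cookies.delete(PIN_COOKIE);
  if (cookies.size > 0) {
    upstream.set("cookie", [...cookies].map(([k, v]) => `${k}=${v}`).join("; "));
  } else {
    upstream.delete("cookie");
  }
  return { pin, upstream };
}
```

The point of returning a fresh `Headers` object is that the origin fetch never has access to the routing signal, even by accident.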
This is a small implementation detail with a large architectural effect. If a developer cannot read Vestibule's cookie inside the monolith, they cannot build product behavior around it. Server-to-server callers can still express a region preference at the Mathspace boundary, but that signal remains a routing input rather than an application dependency.
That was not accidental. One background discussion framed the design rule bluntly: the proxied app should not know about Vestibule or region pinning, and Vestibule should not know the app it is proxying. The result is not perfect isolation, but it is a useful pressure: the easy path is the decoupled path.
Broadcast Is A Scalpel, Not A Default
Broadcasting sounds expensive because it is. Every broadcast request fans out to every configured region. With two regions, that is two origin requests for one browser request. With more regions, the fan-out grows linearly.
So the rule has to earn its place.
For username/password login, the rule may require a Django session cookie; API login may require a 2xx response. For password reset, the signal may be a redirect status. For a public GraphQL signup mutation, it may be a successful 2xx response with an empty errors array. Some routes opt into allow_multiple_success for route-specific reasons.
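One way to picture these success rules is as small predicates over a snapshot of each regional response. The snapshot shape (`status`, `setCookies`, `json`) and the rule names are assumptions made for this sketch; `sessionid` is Django's default session cookie name, and whether production uses the default is also an assumption.

```javascript
// Hypothetical success predicates over a simplified response snapshot:
// { status: number, setCookies: string[], json: any }
const is2xx = (res) => res.status >= 200 && res.status < 300;

const successRules = {
  // Success means the origin set a cookie with the given name.
  sessionCookie: (name) => (res) =>
    res.setCookies.some((cookie) => cookie.startsWith(`${name}=`)),

  // Success means any 2xx status.
  status2xx: () => is2xx,

  // Success means a redirect status.
  redirect: () => (res) => res.status >= 300 && res.status < 400,

  // Success means a 2xx response whose payload has an empty errors array.
  graphqlNoErrors: () => (res) =>
    is2xx(res) && Array.isArray(res.json?.errors) && res.json.errors.length === 0,
};

// Example: a login rule that requires a session cookie.
const loginSucceeded = successRules.sessionCookie("sessionid");
```

Keeping each rule this narrow makes it easy to review why a given route is allowed to pin a region.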
The worker handles the failure modes explicitly:
- exactly one success: set the region pin and return that response
- more than one success: fail unless the rule opted into allow_multiple_success
- zero successes but at least one failed response: return the best regional failure response
- no usable response from any region: return a proxy error
That makes duplicate data and ambiguous ownership visible. If the same credentials unexpectedly succeed in multiple regions, the system should not silently pick one and teach the browser the wrong region. Failing is better than creating a long-lived pin from ambiguous evidence.
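The four outcomes above map onto a short decision function. This is a sketch under assumptions: the result shape and `selectBroadcastResult` are hypothetical, the first usable failure stands in for whatever "best" means in the real worker, and picking the first success when multiple successes are allowed is also a guess.

```javascript
// results: [{ region, response, success }], where response is null when the
// region produced nothing usable (for example, a network error or timeout).
function selectBroadcastResult(results, { allowMultipleSuccess = false } = {}) {
  const usable = results.filter((r) => r.response !== null);
  const successes = usable.filter((r) => r.success);

  // Ambiguous ownership: refuse to pin unless the rule opted in.
  if (successes.length > 1 && !allowMultipleSuccess) {
    return { kind: "ambiguous", regions: successes.map((r) => r.region) };
  }
  // One success (or several, when allowed): pin the region and return it.
  if (successes.length >= 1) {
    const winner = successes[0]; // assumption: first success wins when allowed
    return { kind: "success", pin: winner.region, response: winner.response };
  }
  // No success, but some region answered: surface a regional failure.
  if (usable.length > 0) {
    return { kind: "failure", response: usable[0].response }; // placeholder for "best"
  }
  // Nothing usable from any region.
  return { kind: "proxy_error" };
}
```

Note that only the `success` outcome carries a pin, so an ambiguous or failed broadcast can never teach the browser a region.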
The Tradeoffs We Kept
Vestibule does not make geography disappear. A global edge layer still sees request metadata, and a single public domain does not by itself solve every data sovereignty or privacy requirement. What it does is reduce avoidable processing by non-relevant regional origins and make the remaining exceptions explicit.
It also keeps a regex-driven routing table in Terraform. That is direct and reviewable, but not magic. Rule order matters. Payload matchers need care because request bodies are one-shot streams. Response payload success checks need cloning for the same reason. This is why the worker compiles rules up front and validates impossible configurations early, and why the CI path runs both JavaScript tests and Terraform formatting checks.
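The one-shot body constraint is worth a tiny example. The helper below is a hypothetical sketch (`bodyMatches` is not a Vestibule function); it relies only on the standard `clone()` and `text()` methods that both `Request` and `Response` provide.

```javascript
// Test a request or response payload against a pattern without consuming
// the original body: read from a clone and leave the original stream intact.
// Use a pattern without the `g` flag, since `test` is stateful with it.
async function bodyMatches(message, pattern) {
  const text = await message.clone().text();
  return pattern.test(text);
}
```

After `await bodyMatches(request, /username=/)`, the original `request` can still be forwarded with its body unread. Reading `request.text()` directly in the matcher would leave nothing to send upstream.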
The design also accepts that some edge behavior is product behavior. SAML, LMS integrations, password reset, join-class signup, and staff region switching each have different success criteria. Pretending they are all the same would create a smaller config and a worse system.
The supporting systems have their own tradeoffs. Global Models Sync intentionally works on selected tables, not the whole database. OCAP keeps its cache and in-flight waiters in process memory and is configured as a single node; because that state is not shared between tasks, concurrent instances would weaken its deduplication guarantee. These are constraints, but they are explicit constraints in the right layer.
What Worked
The durable part of Vestibule is not Cloudflare Workers. The durable part is the boundary:
- one public domain for users and integrations
- two regional monolith origins in production
- three route categories that cover ordinary, global, and cross-region selection traffic
- private region-pinning signals that are stripped before origin fetches
- explicit success rules for flows where the target region must be selected
- selected global model sync outside the request path
- outbound OAuth/token exchange caching outside the monolith
That boundary kept the monolith mostly ignorant of the edge. It also gave service-to-service callers a header-based way to express regional intent without reintroducing broad broadcast callbacks through the global proxy. And it made unsafe behavior harder to stumble into: if a route needs broadcast semantics, it has to be named; if shared state needs syncing, it goes through the sync tool; if one-time outbound requests can race, they go through OCAP.
Measured at Cloudflare, the boundary is not just tidy architecture. Over the last 30 days, production handled roughly 555.7 million requests and made only a few percent more subrequests. The reliability picture is useful but bounded: the Worker aggregate reported zero errors, while the sampled status breakdown surfaced 26 exceptions. Despite some gaps in analytics, the evidence is stronger than the qualitative story we started with.
The Lesson
The obvious way to split a monolith across regions is to make every layer a little region-aware. That works at first and then becomes a tax: every feature learns a different fragment of routing, auth, data ownership, and geography.
Vestibule took the opposite approach. It made the edge responsible for routing and then made that responsibility difficult to leak. The code is small (a few hundred lines, backed by a long list of unit and e2e tests) because the boundary is strict. The rule table is explicit because the exceptional flows are real. And the application stays closer to an ordinary application because the proxy refuses to become a second monolith.
That is the kind of infrastructure work that is easy to underestimate. It is not glamorous. It is just a careful answer to a hard question: how do you change the shape of production without making every future engineer carry that shape in their head?
If you like systems where the hardest part is drawing the right boundary and then defending it in code, this is the kind of work we enjoy doing.