The infrastructure firefight, and why your proxy shouldn't live at a school

A domain we didn't fully control went down. Every game broke. What happened next was 20 hours of rebuilds, relearned lessons, and a much better monitoring story.


What shipped

The great proxy migration

Our proxy server at proxy.themultiverse.school went dark, immediately breaking sprite loading and LLM inference across every game. The response: deploy a new llm-token-proxy at api.multiversestudios.xyz, rebuild all three games with the new URL, discover that compiled assets still had the old domain baked in, and rebuild everything again.
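One way to catch a baked-in domain before a second rebuild is to scan the build output for the retired hostname. The sketch below assumes a Node environment and a build directory on disk; the function names and the example directory name are hypothetical and not taken from our pipeline.

```typescript
import { readdirSync, readFileSync, statSync } from "fs";
import { join } from "path";

// The retired domain named in this post; a fresh build should not mention it.
const STALE_DOMAIN = "proxy.themultiverse.school";

// Return the 1-based line numbers of `text` that contain `needle`.
function linesContaining(text: string, needle: string): number[] {
  const hits: number[] = [];
  text.split("\n").forEach((line, index) => {
    if (line.includes(needle)) hits.push(index + 1);
  });
  return hits;
}

// Recursively collect every file path under `dir`.
function listFiles(dir: string): string[] {
  const files: string[] = [];
  for (const name of readdirSync(dir)) {
    const full = join(dir, name);
    if (statSync(full).isDirectory()) {
      files.push(...listFiles(full));
    } else {
      files.push(full);
    }
  }
  return files;
}

// Scan a build directory and return "path:line" for each stale reference.
function findStaleReferences(buildDir: string, needle: string = STALE_DOMAIN): string[] {
  const report: string[] = [];
  for (const file of listFiles(buildDir)) {
    const text = readFileSync(file, "utf8");
    for (const line of linesContaining(text, needle)) {
      report.push(`${file}:${line}`);
    }
  }
  return report;
}
```

A CI step could call `findStaleReferences("dist")` (the directory name is an assumption) and fail the build when the returned list is not empty.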

Cloudflare token permissions — blocking five separate tasks for weeks — were finally fixed by the board. The full website audit is done. Every deploy pipeline now includes a Playwright post-deploy smoke test. GlitchTip webhooks automatically create Paperclip tickets from production errors. Sentry DSN is verified in all environments.

CMS v2 Phase 1a landed: a Postgres schema for all content types with seed migrations from existing TypeScript configs. This is the foundation for moving game content out of source code.
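As a rough illustration of what a seed migration from a TypeScript config can look like, the sketch below turns config objects into one parameterized INSERT. The row shape, table name, and column names are hypothetical; the real CMS v2 schema is not shown in this post. The returned `{ text, values }` object follows the query-config shape accepted by node-postgres, which is an assumption about the client in use.

```typescript
// Hypothetical row shape; the actual content types are not listed in the post.
interface ContentSeed {
  id: string;
  contentType: string;
  data: Record<string, unknown>;
}

interface SeedStatement {
  text: string;
  values: unknown[];
}

// Build one parameterized, idempotent INSERT for a batch of seed rows.
function buildSeedInsert(table: string, rows: ContentSeed[]): SeedStatement {
  if (rows.length === 0) throw new Error("no rows to seed");
  // Table names cannot be bound as parameters, so restrict them to safe identifiers.
  if (!/^[a-z_][a-z0-9_]*$/.test(table)) throw new Error(`unsafe table name: ${table}`);

  const values: unknown[] = [];
  const tuples: string[] = [];
  for (const row of rows) {
    const base = values.length;
    values.push(row.id, row.contentType, JSON.stringify(row.data));
    tuples.push(`($${base + 1}, $${base + 2}, $${base + 3}::jsonb)`);
  }

  // ON CONFLICT DO NOTHING lets the seed run repeatedly without duplicating rows;
  // it relies on a unique constraint on `id`.
  const text =
    `INSERT INTO ${table} (id, content_type, data) VALUES ${tuples.join(", ")} ` +
    `ON CONFLICT (id) DO NOTHING`;
  return { text, values };
}
```

Binding values as parameters instead of interpolating them keeps config strings with quotes or other special characters from breaking the migration.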

Playable again

An HTML-rendering bug made the game unplayable — ARIA's overnight reports, toast notifications, and UI overlays were all dumping raw HTML tags instead of rendered text. Players couldn't click through anything. Fixed.

A security report that litellm was compromised sent us through every repo, Dockerfile, and dependency tree. The package is not in our stack and never has been. Useful audit, false alarm.

Norn sprite families got a full regeneration pass with new animations. Sprite generation was returning 503s in production — fixed and verified live. An auth gate that was blocking new players was replaced with a "Play as Guest" option. Creature spawn counts were wrong (12 at spawn instead of 2+3 eggs) — corrected.

Rebuilt and stable

Rebuilt and redeployed with the new proxy URL. Pay-what-you-can buttons are confirmed visible on the live game.

Real asteroid physics

The asteroid belt got a physics overhaul: asteroids moved to real AU distances (2.0–3.5 AU) and were rescaled from planet-sized objects to the tiny specks they actually are. The pay-what-you-can (PWYC) button was deployed to the live build for the first time. Rebuilt with the new proxy URL alongside everything else.
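For a sense of what "real AU distances" means in numbers, here is a minimal sketch of belt placement. The 2.0 to 3.5 AU range comes from this post; the function names are hypothetical, the uniform sampling is a simplification, and the period formula is Kepler's third law for a small body orbiting the Sun (period in years equals the semi-major axis in AU raised to the power 1.5). None of this is the game's actual physics code.

```typescript
// One astronomical unit in kilometres (IAU 2012 definition).
const KM_PER_AU = 149597870.7;

// Belt bounds taken from the post.
const BELT_INNER_AU = 2.0;
const BELT_OUTER_AU = 3.5;

// Convert a distance in AU to kilometres.
function auToKm(au: number): number {
  return au * KM_PER_AU;
}

// Kepler's third law for a body orbiting the Sun: T [years] = a [AU] ^ 1.5.
function orbitalPeriodYears(semiMajorAxisAu: number): number {
  return Math.pow(semiMajorAxisAu, 1.5);
}

// Pick an orbital radius uniformly inside the belt.
// `random` is injectable so placement can be made deterministic in tests.
function sampleBeltRadiusAu(random: () => number = Math.random): number {
  return BELT_INNER_AU + random() * (BELT_OUTER_AU - BELT_INNER_AU);
}
```

Under these assumptions, an asteroid at the inner edge takes roughly 2.8 years to orbit the Sun and one at the outer edge roughly 6.5 years.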


What we learned

When your proxy server lives at a domain you don't fully control, you're one DNS hiccup away from every game going dark simultaneously. Build-time constants that reference external domains should be environment variables, not hardcoded strings.
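A minimal sketch of that rule: resolve the proxy URL from the environment and fail fast when it is missing, so no domain lives in source code. The variable name `LLM_PROXY_URL` and the https-only check are assumptions for illustration, not our actual configuration.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical variable name; the post does not say what the real one is called.
const PROXY_URL_VAR = "LLM_PROXY_URL";

// Read the proxy base URL from the environment, with no hardcoded fallback.
function resolveProxyUrl(env: Env): string {
  const raw = (env[PROXY_URL_VAR] ?? "").trim();
  if (raw === "") {
    throw new Error(`${PROXY_URL_VAR} is not set`);
  }
  if (!raw.startsWith("https://")) {
    throw new Error(`${PROXY_URL_VAR} must be an https URL, got: ${raw}`);
  }
  // Strip trailing slashes so callers can append paths consistently.
  return raw.replace(/\/+$/, "");
}
```

In a Node build script this would be called as `resolveProxyUrl(process.env)`. If a bundler inlines the value at build time, changing the domain still requires a rebuild, but only one place has to change.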

The migration to api.multiversestudios.xyz consolidates API traffic under infrastructure we own. But the real lesson was rebuilding three games twice because compiled assets had the old URL baked in. The Playwright smoke tests and the GlitchTip-to-Paperclip pipeline are the kind of investment that only looks obvious in hindsight — before today, the only way we knew production was broken was when someone opened a browser and noticed.


What is next

The CMS v2 schema opens the door to moving game content out of TypeScript source files and into a database that designers can edit without a deploy cycle. Infrastructure is consolidated under domains we control, with automated monitoring in place. The next step is making the deploy-verify loop tight enough that no broken build survives more than minutes.
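One possible shape for the verify half of that loop is a check that is retried a bounded number of times right after a deploy. This is a dependency-free sketch, not our Playwright smoke test: the `check` callback stands in for whatever probe the pipeline runs, and all names here are hypothetical.

```typescript
type Check = () => Promise<boolean>;

interface VerifyOptions {
  attempts: number; // how many times to run the check
  delayMs: number; // pause between failed attempts
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Run `check` until it passes or the attempts are used up.
// Returns true on the first passing attempt, false if every attempt fails.
async function verifyDeploy(check: Check, options: VerifyOptions): Promise<boolean> {
  for (let attempt = 1; attempt <= options.attempts; attempt++) {
    try {
      if (await check()) return true;
    } catch {
      // A thrown error counts as a failed attempt.
    }
    if (attempt < options.attempts) await sleep(options.delayMs);
  }
  return false;
}
```

The worst-case time to flag a broken build is roughly `attempts` times the sum of `delayMs` and the time one check takes, which makes "no more than minutes" a number that can be configured.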

Multiverse Studios builds multiple games simultaneously with an AI-native team. All games are pay-what-you-can.