#containers // Ian's AI Thoughtstream

Claude Code as a DevOps Platform

Render sent me a $496 bill last month, and that was the moment I went back to running my own box. SpaceMolt served 1.3 TB of traffic in May, all of it HTTPS MCP servers and WebSocket connections, and Render's bandwidth pricing turned that into $336 of overage on top of $144 for hosting and $15 in fees. The thing that made self-hosting viable again wasn't a cheaper VPS. It was that Claude Code now does the parts I used to dread.

How I ended up on managed hosting in the first place

Last year I got bit by React2Shell, the CVE-2025-55182 pre-auth RCE in React Server Components. The damage on my end was mostly innocuous, but getting exploited at all was enough. I stopped running a long-lived VPS for personal projects and moved everything onto free or nearly-free tiers of Vercel, Cloudflare, and Fly.io.

When SpaceMolt started, Render.com was the obvious pick. Heroku-like push-to-deploy, a clean interface, the tooling you'd expect from a modern cloud service. It was great right up until the traffic grew and the bandwidth limits got tight.

What changed: the agent does the ops work

A year ago I would have built all of this by hand. Hardening, firewalls, log shipping, metrics, Docker Compose, monitoring, backups. That's a meaningful chunk of a weekend, and then it's a meaningful chunk of every future weekend.

An agent like Claude Code only needs SSH. I grabbed a $44/mo box from Hetzner with unlimited bandwidth and more RAM and disk than I'll ever use, told Claude Code I was migrating SpaceMolt off Render, and it wrote and executed a nine-phase plan to provision the machine end to end: a full deploy and rollback process, log shipping to Betterstack, and monitoring with a local Netdata instance.

I'd never heard of Netdata before this. Per-second metrics, near-zero config, a web dashboard that auto-detects services and Docker containers. It's left me impressed.

Monitoring dashboard displaying system storage metrics with line graphs showing pressure trends over time and gauge charts for disk I/O operations and utilization rates

The runbooks are the real artifact

The research, the plans, and the runbooks all live in a private git repo I can hand to the dev team. That's the part that makes this feel different from the old "SSH in and hope you remember what you did" approach. The knowledge isn't in my head or buried in shell history. It's written down, versioned, and reproducible.

The cost of running a server went from a meaningful part of my life to roughly the effort of a hosted service. The bill went the other direction.

Sandboxing AI Coding Agents

Coding agents will happily run whatever they generate, and most of them have your shell, your SSH keys, and your AWS creds one rm -rf away. Sandboxing the agent is the cheapest insurance you can buy, and in 2026 there are finally enough good options that you should pick one.

The landscape splits into a few camps. Full VMs (Firecracker, Lima, OrbStack) give you the strongest isolation and the most overhead. Containers (Docker, Podman, devcontainers) are the default for most people and work fine until the agent needs to touch your real checkout. And then there's the OS-native path: Seatbelt on macOS, seccomp-bpf and Landlock on Linux. Those last two are what the kernel already uses to sandbox App Store apps and Chrome tabs, so the primitives are battle-tested. The friction has always been the ergonomics.

My current favorite is nono. It's a CLI wrapper that uses Landlock on Linux and Seatbelt on macOS to restrict filesystem and network access for any process you launch under it. No container, no VM, no daemon. It ships with profiles for the popular coding agents and lets you write your own, and I've gotten into the habit of creating a profile per project. The agent gets exactly the directories and hosts it needs, and nothing else.

The per-project profile is the part that actually changed my behavior. Once writing a profile takes thirty seconds, you stop talking yourself out of it. The agent can still go off the rails inside the box, but the blast radius is whatever you wrote down, and the rollback story is just git. I'm extremely curious to see where this category goes once more agents ship with sandbox profiles in the box.