
Sandboxing Agents - My setup for agentic coding from anywhere

In my last post, Building The Wall, I mentioned I would write up a more polished version of my setup. This is that.

The goal: a reusable, safe setup where I can spin up isolated VMs and let agents run wild. Basically a dev environment for running untrusted code inside a secure sandbox. And I can attach to it from anywhere, whether I am on my phone or my laptop.

The setup

I think in three "machines":

Finally, I also have Tailscale set up because I like remote access to random household stuff (dishwasher, robot vacuum, and a few Airthings sensors), and I can connect to my local machine over SSH from my phone. Since I already have a private network set up, a dev VM just becomes another node in it.

The loop

1. Spawn a VM: one per task (or reuse an existing one).

2. Bootstrap it: I have bootstrap scripts that set up my tmux configs and aliases, start Tailscale, etc.

3. Scoped repo access: Fine-grained GitHub PAT limited to a single repo (small blast radius).

4. Run the agent inside: The VM comes with Claude/Codex installed. For now I sign in manually.

5. Ralph Wiggum plugin: For long-running tasks I lean on the /ralph-loop plugin. It basically runs Claude in a first-class persistent loop.

6. Attach from wherever: Via phone or Homebox to monitor, nudge, or review.

The key is isolation. I can comfortably let the agent run fast inside a sandbox, even when using modes like --dangerously-skip-permissions, because the blast radius stays inside the VM.
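Steps 4 and 6 of the loop could be sketched roughly like this. It is an illustration, not my actual scripts: the session name "agent" and the host name "dev-vm" are hypothetical placeholders, and it assumes tmux and the claude CLI are on the VM's PATH. The functions only build the commands, so you can print and inspect them before running anything.

```python
import shlex

# Hypothetical names for illustration only.
SESSION = "agent"
VM_HOST = "dev-vm"  # e.g. the VM's hostname on the private network


def start_agent_cmd(session: str = SESSION) -> list[str]:
    """argv that starts the agent in a detached tmux session (run inside the VM)."""
    return [
        "tmux", "new-session", "-d", "-s", session,
        "claude --dangerously-skip-permissions",
    ]


def attach_cmd(host: str = VM_HOST, session: str = SESSION) -> list[str]:
    """argv that attaches to that session over SSH (run from phone or laptop)."""
    # ssh -t forces a TTY, which tmux needs in order to attach.
    remote = f"tmux attach-session -t {shlex.quote(session)}"
    return ["ssh", "-t", host, remote]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    print(shlex.join(start_agent_cmd()))
    print(shlex.join(attach_cmd()))
```

Because the agent lives in a detached tmux session, dropping the SSH connection (say, when the phone goes to sleep) does not stop it. Reattaching just picks up where the session left off.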

VM bootstrap (repeatable)

exe.dev gives you a VM quickly. I add a bootstrap layer on top so every machine looks the same. The bootstrap script:

For repo access, I use a fine-grained GitHub personal access token scoped to a single repository. My bootstrap script stores it using the Linux secret-tool utility, so tools can authenticate without me pasting tokens into prompts.

That's basically it: SSH + tmux + disposable VMs, plus repo-scoped PATs. Enough to let agents iterate fast inside a safe sandbox.

Alternatives worth knowing about

If you like the idea of "sandboxed computers for agents", here are a couple of adjacent ideas I noticed:

For phone terminal emulators, I have also considered ReTerminal and Terminus as alternatives, but haven’t tried them.

I've also been tinkering with oh-my-claude-sisyphus: a multi-agent orchestration plugin for Claude, ported from oh-my-opencode after the whole Anthropic API access restriction saga.