Sandboxing Agents - My setup for agentic coding from anywhere
In my last post, Building The Wall, I mentioned I would write a more polished version of my setup. This is that.
The goal: a reusable, safe setup where I can spin up isolated VMs and let agents run wild. Basically a dev environment for running untrusted code inside a secure sandbox. And I can attach to it from anywhere, whether I am on my phone or my laptop.
The setup
I think in three "machines":
- Homebox (local): My local Ubuntu machine
- Phone:
- Termux + tmux: Termux CLI on Android with tmux to survive network transitions
- WhisperIME: Voice input on my phone for prompts when I don't want to type. Its English-only model is super fast, which makes blabbering to the agent easier than typing.
- exe.dev VMs: disposable workbenches for agents
- exe.dev is built by folks from the Tailscale ecosystem. The VMs come with persistence and sudo, and are perfect for hosting too. Worth checking them out in their own words: Why exe.dev?
- VMs spin up fast, SSH is easy, and they come with Claude Code/Codex and their own agent, Shelley, pre-installed. There is also a browser terminal out of the box, though I prefer my own CLI flow.
Finally, I also have Tailscale set up because I like remote access to random household stuff (dishwasher, robot vacuum, and a few Airthings sensors), and it lets me connect to my local machine over SSH from my phone. Since I already have a private network set up, a dev VM just becomes another node in there.
The loop
1. Spawn a VM: For a task (or reuse one).
2. Bootstrap it: I have bootstrap scripts that set up my tmux configs and aliases, start up Tailscale, etc.
3. Scoped repo access: Fine-grained GitHub PAT limited to a single repo (small blast radius).
4. Run the agent inside: The VM comes with Claude/Codex installed. For now I sign in manually.
5. Ralph Wiggum plugin: For long-running tasks I lean on the /ralph-loop plugin. It basically runs Claude in a first-class persistent loop.
6. Attach from wherever: Via phone or Homebox to monitor, nudge, or review.
The key is isolation. I can comfortably let the agent run fast inside a sandbox, even when using modes like --dangerously-skip-permissions, because the blast radius stays inside the VM.
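Step 6 is the part that makes this work from a phone, so here is a minimal sketch of an attach helper. It assumes plain `ssh` plus tmux on the VM; the host alias `my-vm` and the session name `agent` are placeholders, not names from my actual setup.

```python
import shlex


def attach_command(host: str, session: str = "agent") -> list[str]:
    """Build an ssh command that attaches to (or creates) a tmux session."""
    # `ssh -t` forces a TTY so tmux can draw its UI.
    # `tmux new-session -A -s NAME` attaches if the session already exists
    # and creates it otherwise, so the same command works every time.
    remote = f"tmux new-session -A -s {shlex.quote(session)}"
    return ["ssh", "-t", host, remote]


if __name__ == "__main__":
    # "my-vm" is a placeholder host alias (e.g. an entry in ~/.ssh/config).
    print(shlex.join(attach_command("my-vm")))
```

Because the agent runs inside the tmux session, dropping the connection on the phone only detaches the client; running the same command again picks up where I left off.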
VM bootstrap (repeatable)
exe.dev gives you a VM quickly. I add a bootstrap layer so every machine looks the same. The bootstrap script:
- installs a basic toolchain (tmux, curl, secret-tool, etc.)
- brings up Tailscale
- pulls in my configs (claude, codex, tmux and bash configs)
- sets up helper scripts / aliases
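To make the steps above concrete, here is a rough sketch of a bootstrap runner, not my actual script. It assumes a Debian/Ubuntu-style VM with `apt-get`, Tailscale already installed, and a hypothetical dotfiles repo URL; `bootstrap_plan`, `run_plan`, and `DOTFILES_REPO` are illustrative names.

```python
import subprocess

# Hypothetical dotfiles repo holding the claude, codex, tmux and bash configs.
DOTFILES_REPO = "https://github.com/example/dotfiles.git"


def bootstrap_plan(tailscale_auth_key: str) -> list[list[str]]:
    """Return the bootstrap steps as a list of argv lists, in order."""
    return [
        ["sudo", "apt-get", "update"],
        # On Debian/Ubuntu, secret-tool ships in the libsecret-tools package.
        ["sudo", "apt-get", "install", "-y", "tmux", "curl", "libsecret-tools"],
        # Assumes the tailscale binary is already present on the VM.
        ["sudo", "tailscale", "up", f"--auth-key={tailscale_auth_key}"],
        ["git", "clone", "--depth", "1", DOTFILES_REPO, ".dotfiles"],
    ]


def redact(argv: list[str]) -> list[str]:
    """Mask the Tailscale auth key so it never lands in logs."""
    return ["--auth-key=***" if a.startswith("--auth-key=") else a for a in argv]


def run_plan(plan: list[list[str]], dry_run: bool = True) -> list[str]:
    """Run each step (or only render it when dry_run is True)."""
    rendered = []
    for argv in plan:
        rendered.append(" ".join(redact(argv)))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failing step
    return rendered


if __name__ == "__main__":
    for line in run_plan(bootstrap_plan("tskey-placeholder")):
        print("would run:", line)
```

Keeping the plan as data makes a dry run cheap, which is handy before pointing the script at a fresh VM.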
For repo access, I use a fine-grained GitHub personal access token scoped to a single repository. My bootstrap script stores it using the Linux secret-tool utility, so tools can authenticate without me pasting tokens into prompts.
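A minimal sketch of that token handling, assuming `secret-tool` is on the PATH and a Secret Service provider (such as a keyring daemon) is running on the VM. The attribute names `service` and `repo`, and the repo name used in the example, are my own illustrative choices.

```python
import subprocess


def store_argv(repo: str) -> list[str]:
    """argv for saving a secret; secret-tool reads the value from stdin."""
    return ["secret-tool", "store", f"--label=GitHub PAT for {repo}",
            "service", "github", "repo", repo]


def lookup_argv(repo: str) -> list[str]:
    """argv for reading the secret back by the same attribute pairs."""
    return ["secret-tool", "lookup", "service", "github", "repo", repo]


def store_token(repo: str, token: str) -> None:
    # Passing the token on stdin keeps it out of the process list
    # and out of shell history.
    subprocess.run(store_argv(repo), input=token, text=True, check=True)


def lookup_token(repo: str) -> str:
    result = subprocess.run(lookup_argv(repo), capture_output=True,
                            text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # "example/project" is a placeholder repo name.
    print(" ".join(lookup_argv("example/project")))
```

A git credential helper or a small wrapper script can then call `lookup_token` at push time, so the token itself never shows up in an agent prompt.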
That's basically it: SSH + tmux + disposable VMs, plus repo-scoped PATs. Enough to let agents iterate fast inside a safe sandbox.
Alternatives worth knowing about
If you like "sandboxed computers for agents", a couple of adjacent ideas I noticed:
For phone terminal emulators, I have also considered ReTerminal and Terminus as alternatives, but haven’t tried them.
I've also been tinkering with oh-my-claude-sisyphus: a multi-agent orchestration plugin for Claude, ported from oh-my-opencode after the whole Anthropic API access restriction saga.