How I use Claude Code on the job
I recently gave an internal workshop on agentic coding with Claude Code. Preparing for it forced me to articulate what I do differently now versus when I started. This post is the distilled version.
One thing upfront: as Boris Cherny (creator of Claude Code) has said, there is no single best way to work with Claude. What follows is how I get the most out of it.
In this post:
- First principles: Polya's problem solving
- Context is everything
- The loop in practice
- Agent teams: the next level
- Failure patterns
- Security
- One last thing
First principles: Polya's problem solving
The framework I keep coming back to isn't from AI or software. It's from a 1945 math book.
George Polya's How to Solve It lays out four principles of problem solving: understand, plan, execute, and look back. It maps almost perfectly to how I use Claude:
- 1. Understand - Brainstorm the problem space. What are we solving? What constraints exist?
- 2. Plan - Devise a concrete plan with steps, dependencies, and verification criteria before writing code.
- 3. Implement - Execute the plan. Let Claude carry it out with guard rails in place.
- 4. Verify - Look back. Run tests, check the UI, review what worked.
The most common mistake I see (and made myself early on) is jumping straight to step 3. One-shotting prompts has its place, but for anything non-trivial, just telling Claude "implement this feature" isn't going to cut it. The more time you invest in step 2, the better the outcome. A good plan compounds, a missing one costs you double/triple in time.
Context is everything
Claude Code works inside a finite context window. Everything lives in it: your conversation, files read, command outputs, CLAUDE.md instructions, loaded skills, MCP tool definitions, system prompts. When the window fills up, Claude auto-compacts older content, blurring earlier details. I've had it forget constraints I set ten messages ago.
The single most important habit: context hygiene. One task per conversation. If direction changes, start fresh. I use ccstatusline to keep token usage, context fill percentage, current branch, and active model visible at a glance. Hard to practice context hygiene when you can't see the context.

Claude's extensions are really context controls that manage what's in the window and reduce manual oversight:
- CLAUDE.md - Always-on rules that load every session. Keep it lean; over-specifying wastes context on every interaction.
- Skills - Reusable prompt packages, loaded on demand. I use obra/superpowers for brainstorm, planning and execution workflows. Unlike CLAUDE.md, they only eat context when invoked.
- MCPs - Tool interfaces to external systems (Jira, Playwright, Slack). The catch: tool definitions live in context. Wire up fifteen MCPs and you're burning window space before starting. Tool Search helps with on-demand discovery.
- Sub-agents - Isolated parallel workers with their own context window. You get back a summary, not the whole trace.
- Hooks - Run scripts on events (pre/post tool use) without spending tokens. I use them for lint, test triggers, and notifications.
The loop in practice
Here's how the four Polya steps play out day-to-day.
Step 1: Understand (Brainstorm)
I don't start by describing what I want built. I start by making sure Claude and I understand the problem.
For non-trivial work, I use the obra/superpowers brainstorm skill. Pull the Jira epic/task and associated confluence articles for a given problem (via Atlassian MCP), point to relevant context in codebase, and brainstorm approaches. One thing that helps: have Claude write its findings to a markdown file rather than just discussing in chat. A persistent document you can review catches misunderstandings that a conversation would bury. Sometimes Claude's understanding is wrong, and that's the point: catch it here, not three files deep into implementation.
Step 2: Plan
Imo this step matters most.
I ask Claude to write a detailed implementation plan: what files to modify, what tests to write, what verification to run, and how steps depend on each other. Point Claude at existing open-source implementations of what you're building. It anchors the plan in real patterns instead of theoretical designs.
Crucially, persist the plan as a .md file (e.g. .claude/plan-feature-x.md), not chat messages. A file you can read, edit, and diff beats a plan buried in conversation history that gets compacted away. I edit the plan multiple times before execution. I'll add constraints, remove steps, reorder, or add verification Claude missed. The plan becomes a contract.
A technique I like: structure the plan as a decision tree or a DAG (directed acyclic graph). Instead of a flat list, it becomes a dependency graph where independent steps run in parallel via sub-agents and dependent steps chain:
Step 1: Set up database schema
Step 2: Write API endpoints (depends on 1)
Step 3: Write frontend components (depends on 1)
Step 4: Write integration tests (depends on 2, 3)
Step 5: UI validation with Playwright (depends on 3)
Steps 2 and 3 run as parallel sub-agents. Step 4 waits for both. Faster execution, tighter context per sub-agent.
Step 3: Implement
With a reviewed plan, implementation is the straightforward part. Claude executes step by step, spinning up sub-agents for parallel branches. My role is monitoring: watching for plan drift, checking verification steps pass, nudging when needed.
For larger tasks, hooks automate verification. A PostToolUse hook runs the linter after every file edit, a test hook runs the suite after a module is complete. No tokens spent.
Step 4: Verify (Look Back)
Not just "do tests pass." A broader check:
- Tests pass - Unit, integration, whatever the plan specified.
- UI looks right - Playwright MCP or agentshot for visual verification.
- Plan was followed - Any steps skipped or deviated?
- No regressions - Existing functionality still works?
The whole loop, from Jira ticket to open PR, happens without leaving the terminal.
Agent teams: the next level
Claude recently shipped agent teams, taking the sub-agent pattern further. Instead of isolated workers reporting back, agent teams are full Claude Code sessions that communicate with each other, share a task list, and self-coordinate.
Where I've found this useful: giving teammates distinct personas. For example:
- A frontend specialist focused on component architecture and UX
- A backend engineer on API design and data modeling
- A principal engineer who reviews each teammate's output for architectural consistency, edge cases, and cross-cutting concerns
- A critic whose only job is to poke holes in every proposal
The critic and principal engineer matter most. Without them, Claude settles on the first workable approach. Unlike sub-agents, teammates talk to each other directly, not just back to you. Still experimental and token-heavy, but worth it for complex cross-cutting work.
Failure patterns
These cost me time before I named them:
- Kitchen sink session - Too many unrelated tasks in one conversation. Context gets muddled. One task per session.
- Correcting over and over - Third correction for the same issue? The approach is wrong. Reset context, reframe the problem.
- Over-specified CLAUDE.md - Entire coding style guides in always-on context. Move non-essentials to skills.
- Trust-then-verify gap - Only checking the final result. A wrong assumption in step 2 compounds through steps 3, 4, 5. Bake verification into the plan.
- Infinite exploration - Claude reads file after file without scope. Use sub-agents so only the summary comes back.
- Skipping the plan - The most expensive failure. A thrown-away implementation costs more than five minutes reviewing a plan.
Security
With agentic coding, the attack surface expands. Three things worth attention:
Prompt injection - Untrusted text (a README, a Jira comment, a fetched web page) can steer the agent into unsafe actions.
Permission fatigue - After fifty "approve" clicks, you stop reading. Use allowlists so you only get prompted for things that matter.
Tool/MCP risk - Network-capable tools expand what the agent can reach. Only use trusted providers, scope permissions tightly.
Tools I recommend:
- claude-security-guidance - A security reminder hook that warns about potential security issues when editing files.
- claude-code-safety-net - A PreToolUse hook that blocks destructive git/filesystem commands before they run. Designed to not interfere with normal workflows.
- leash - Runs agents in a containerized environment and manages mounts/config, keeping blast radius contained.
- Sandboxed VMs - I wrote about my sandbox setup previously.
- Scoped access tokens - Fine-grained GitHub PATs limited to a single repo.
One last thing
Even if you don't adopt the full brainstorm-plan-implement-verify loop, one habit worth picking up: before jumping into implementation, just tell Claude to ask you 5 clarifying questions first. It feels like a small thing, but those questions consistently surface assumptions, edge cases, and constraints that a one-shot prompt would miss. The implementation that follows is noticeably better because the agent actually understood the problem before writing code.
The age-old cliché holds: garbage in, garbage out. Agentic coding is no exception.