
When you measure something, it becomes the goal. That's Goodhart's Law. I learned it the hard way last week while watching my ralph wiggum loop run.
I'd gotten clever with session resumption. Each iteration would fork from a previous session, loading all that rich context - file discoveries, architectural understanding, previous decisions. The loops looked smart. They'd start running and immediately seem to know what they were doing.
Here's what I was actually optimizing for: looking productive while I watched.
The loops appeared to take action faster. Less "figuring out" time, more "doing" time. When I'd check in on a running loop, Claude was already deep in the work - reading files, making edits, running tests. No warm-up period. No re-discovery phase.
It felt like progress. I'd built something clever.
I even wrote a whole blog post about the session marker system. Showed off the code. Explained the workflow. Felt pretty good about it.
Context windows are finite. Every byte I pre-loaded for appearances was a byte I couldn't use for thinking through hard problems.
My loops looked fast because they skipped the exploration phase. But exploration isn't waste - it's where Claude builds a mental model of the problem. By pre-loading "answers" from previous sessions, I was forcing Claude to work with stale context instead of fresh understanding.
The hard problems - the ones that actually needed solving - got squeezed. Claude would hit the context limit right when the thinking needed to get deep. The loop would compact, lose nuance, and produce shallow solutions.
I was optimizing for the spectator experience. Making myself feel good watching the terminal. Meanwhile, the actual work suffered.
Strip it back. Only load what's critical:
Let Claude rediscover the codebase each iteration. Yes, it looks slower at the start. Yes, there's a "figuring out" phase. But that phase builds fresh context - context that's available when the hard thinking needs to happen.
Reserve the context budget for problem-solving, not for looking smart.
I struck through the session markers section in my previous post. It's still there - you can see what I thought was clever. You can also see the warning that it was wrong.
That's the deal with building in public. You ship your mistakes alongside your wins. The lesson here isn't "session resumption is bad" - it's that what you optimize for becomes your goal, whether you meant it to or not. I've since updated towles-tool and removed all the session marker stuff.
I measured "looks fast." I got "looks fast." I didn't get "solves hard problems."
Choose your metrics carefully.
Implementing a Ralph Wiggum Loop: The Secret is Session Markers
How I turned towles-tool into an autonomous AI coding system using session markers to preserve context across iterations - and the weekend of mistakes that got me there.
Claude Code's Hidden Multi-Agent System Is Real
I verified the claim - there's a fully-implemented TeammateTool hiding in Claude Code, feature-flagged but ready to go.