Claude Code·

Goodhart's Law Ate My Context Window

I optimized my AI coding loop for the wrong thing. It looked fast. It wasn't effective.
A developer's workspace showing a dramatic split-screen visualization: on the left monitor, a progress bar racing forward with flashy green metrics and spinning activity indicators; on the right monitor, the same task failing silently with truncated output and a warning symbol. A magnifying glass held between the screens reveals the progress bar is hollow - a beautiful shell with nothing inside. Dramatic chiaroscuro lighting with cool blue from the monitors contrasting against warm amber desk lamp, shallow depth of field focusing on the magnifying glass revelation. Cinematic realism, 8k photorealistic quality, noir atmosphere with teal and orange color grading, slightly elevated three-quarter angle capturing the moment of uncomfortable truth.

When you measure something, it becomes the goal. That's Goodhart's Law. I learned it the hard way last week while watching my ralph wiggum loop run.

I'd gotten clever with session resumption. Each iteration would fork from a previous session, loading all that rich context - file discoveries, architectural understanding, previous decisions. The loops looked smart. They'd start running and immediately seem to know what they were doing.

Here's what I was actually optimizing for: looking productive while I watched.

The Seduction

The loops appeared to take action faster. Less "figuring out" time, more "doing" time. When I'd check in on a running loop, Claude was already deep in the work - reading files, making edits, running tests. No warm-up period. No re-discovery phase.

It felt like progress. I'd built something clever.

I even wrote a whole blog post about the session marker system. Showed off the code. Explained the workflow. Felt pretty good about it.

The Reality

Context windows are finite. Every byte I pre-loaded for appearances was a byte I couldn't use for thinking through hard problems.

My loops looked fast because they skipped the exploration phase. But exploration isn't waste - it's where Claude builds a mental model of the problem. By pre-loading "answers" from previous sessions, I was forcing Claude to work with stale context instead of fresh understanding.

The hard problems - the ones that actually needed solving - got squeezed. Claude would hit the context limit right when the thinking needed to get deep. The loop would compact, lose nuance, and produce shallow solutions.

I was optimizing for the spectator experience. Making myself feel good watching the terminal. Meanwhile, the actual work suffered.

The Fix

Strip it back. Only load what's critical:

  • Spend time in planning sessions to actually plan.
  • Review the plan carefully - time spent here is much cheaper than fixing bad execution.
  • Keep context to whats critical

Let Claude rediscover the codebase each iteration. Yes, it looks slower at the start. Yes, there's a "figuring out" phase. But that phase builds fresh context - context that's available when the hard thinking needs to happen.

Reserve the context budget for problem-solving, not for looking smart.

Building in Public Means Showing the Screwups

I struck through the session markers section in my previous post. It's still there - you can see what I thought was clever. You can also see the warning that it was wrong.

That's the deal with building in public. You ship your mistakes alongside your wins. The lesson here isn't "session resumption is bad" - it's that what you optimize for becomes your goal, whether you meant it to or not. I've since updated towles-tool and removed all the session marker stuff.

I measured "looks fast." I got "looks fast." I didn't get "solves hard problems."

Choose your metrics carefully.