
Honest caveat first: this space is moving so fast that things I would have sworn by a month ago I wouldn't do today. Take anything you read with that in mind, including this post.
I wrote a tips post back in July 2025 that covered the tactical side — keybindings, features, tricks. This post is different. It's what I actually tell teams when they're adopting Claude Code and want to know what matters.
The one thing that has held up: go to primary sources first. The people actually building Claude Code will tell you exactly how they use it. The Anthropic engineering blog and Boris Cherny's talks have been way more durable signal than anything else out there.
Developers who are already shipping valuable code every day will get dramatically more powerful. These tools won't turn someone who wasn't into someone who suddenly is. And people pushing code they didn't read or understand just create a mess that everyone else ends up spending time cleaning up.
I've seen this firsthand — a developer uses Claude to generate a feature, ships it without reading the diff, and the next person who touches that code spends twice as long understanding it as it would have taken to write it properly. The tool amplifies habits, good and bad.
I learned this the hard way myself. I tried using Claude Code on my brother Patrick's (slyedoc) Rust repo built on the Bevy game engine. I don't know Rust. I don't know Bevy. The PRs I submitted were worse than useless; I didn't understand what Claude was generating, and I didn't have the taste to steer it. I couldn't tell when to say "no, that's wrong" because I had no baseline for what right looked like. Knowing when to tell Claude "no" is the most critical skill, and you can't do that in a domain you don't understand.
Look hard at GitHub history and what people have actually contributed before picking your cohort. That's a better signal for who will thrive with these tools than almost anything else.
People level up, but there's no way to skip steps yet. Everyone has to earn the reps.
A great codebase before AI makes a great codebase with AI. The same logic that applies to developers applies to the code you give them:
We scope CLAUDE.md files to the folder level where the information is actually relevant, rather than keeping one giant file at the root. The model pulls in what it needs and isn't wading through noise.
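As a hypothetical sketch of what that scoping can look like (the folder names are illustrative, not from a real repo):

```
repo/
├── CLAUDE.md              # short, repo-wide: build commands, conventions
├── services/billing/
│   └── CLAUDE.md          # billing invariants and domain gotchas
└── frontend/
    └── CLAUDE.md          # component patterns, styling rules
```

Each file stays small and specific, so the context Claude pulls in for a change under `services/billing/` is actually about billing.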
One more thing: don't let credentials or secrets anywhere near the context. Treat the AI like a junior developer sitting at your desk — give it access to what it needs and nothing more.
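You can enforce part of this in Claude Code itself with deny rules in its settings file. This is a sketch, and the exact rule syntax is worth checking against the current docs; the file patterns here are examples:

```json
{
  "permissions": {
    "deny": [
      "Read(./.env)",
      "Read(./.env.*)",
      "Read(./secrets/**)"
    ]
  }
}
```

A deny list is a backstop, not a substitute for keeping secrets out of the repo in the first place.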
These agentic workflows work dramatically better when they can evaluate themselves. Tests are the obvious version of this, but it applies everywhere — the more feedback loops you can give the model, the better the output gets without you having to babysit it.
In practice, this means letting Claude Code run its own tests, check its own types, and iterate on failures. If your project has those feedback loops, the model becomes dramatically more autonomous. If it doesn't, you're back to babysitting.
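A minimal sketch of that kind of feedback loop, assuming a Python project. The commands in `CHECKS` are placeholders for whatever your real gates are (test runner, type checker, linter):

```python
# Sketch: run every verification gate the agent can iterate against,
# and collect failures so there is one clear pass/fail signal.
import subprocess

# Placeholder commands; substitute your project's real gates.
CHECKS = [
    ["python", "-m", "pytest", "-q"],
    ["python", "-m", "mypy", "src/"],
]

def run_checks(checks):
    """Return a list of (command, output) pairs for every failing gate."""
    failures = []
    for cmd in checks:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append((cmd, result.stdout + result.stderr))
    return failures  # empty list means every gate passed
```

Claude Code can run a script like this after each change and keep iterating until the failure list is empty, which is exactly the loop that makes it autonomous.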
Skipping Claude Code's built-in features leaves power on the table. I covered the full list in my tips post, but the ones that matter most for teams:
.claude/skills for reuse across the team. Skills give Claude domain knowledge and workflows it can invoke automatically.

When AI generates code, your review process needs to adapt. The reviewer needs to read the diff just as carefully — maybe more carefully — because the code wasn't written by someone who will remember why they made each decision. Treat AI-generated PRs the same way you'd treat a PR from a contractor: thorough review, clear acceptance criteria, and someone who understands the codebase signing off.
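To make the skills point concrete: a skill is typically a folder under `.claude/skills` containing a `SKILL.md` with a short frontmatter block. This example is hypothetical, and the frontmatter fields should be checked against the current docs:

```markdown
---
name: release-checklist
description: Use when preparing or reviewing a release PR for this repo
---

# Release checklist

1. Run the full test suite and the type checker.
2. Confirm the changelog entry matches the diff.
3. Flag any migration that touches production data.
```

Because the skill lives in the repo, everyone on the team (and every Claude session) gets the same workflow for free.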
The real unlock isn't any single session with Claude Code. It's building a system that gets better the more you use it.
Every correct example in your codebase — well-tested code, clean patterns, good CLAUDE.md files — makes the next generation of output better. The model learns from your conventions. Your tests catch its mistakes before they land. Your linter enforces your style. Each iteration tightens the loop. More correct code means better context means more correct code.
Karpathy demonstrated this recently with nanochat. He set an agent loose on hyperparameter tuning for two days. It autonomously tried ~700 changes, found ~20 real improvements that stacked up to an 11% speedup — things like attention scaling oversights, missing regularization, conservative banding he forgot to tune. All on top of code he'd already manually optimized. The agent found real improvements because it had a clear metric to optimize and a fast feedback loop to evaluate against.
That's the flywheel. Any metric you care about that's reasonably efficient to evaluate can be autoresearched by an agent. For most of us, that metric is "do the tests pass and does the code match our conventions." The more of those signals you have, the more autonomous the model becomes, and the better the output gets without you touching it.
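The shape of that loop is simple enough to sketch. This is a toy stand-in, not Karpathy's actual setup: `evaluate` is a placeholder metric (pretend it's runtime or loss, lower is better), and the search is naive accept-if-better hill climbing over one hypothetical parameter:

```python
# Toy autoresearch loop: propose a change, evaluate a cheap metric,
# keep the change only if the metric improves.
import random

def evaluate(params):
    # Stand-in metric: pretend lower is better (e.g. runtime, loss).
    # A real setup would run a benchmark or test suite here.
    return (params["lr"] - 0.3) ** 2

def autoresearch(params, tries=200, seed=0):
    rng = random.Random(seed)
    best = evaluate(params)
    for _ in range(tries):
        # Propose a small random perturbation of the current params.
        candidate = dict(params, lr=params["lr"] + rng.uniform(-0.05, 0.05))
        score = evaluate(candidate)
        if score < best:  # feedback loop: accept only improvements
            params, best = candidate, score
    return params, best

params, best = autoresearch({"lr": 0.5})
```

The agent version is the same loop with a smarter proposer and a real metric; the hard part in practice is making `evaluate` fast and trustworthy, which is exactly what tests and benchmarks buy you.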
This is where I'd focus if I were starting from scratch: not on prompt engineering, but on building the infrastructure that makes every future session better than the last.