AI Tools

Voice to System

How watching a talk by the creator of Anthropic's Claude Code made me realize I wasn't using AI enough, and the journey to OS-level voice dictation for better terminal workflows.

TLDR

Had my mind blown after watching a talk by @bcherny, the creator of Claude Code. I realized I've been massively underutilizing AI, even as a Principal Architect working in AI! (The irony is not lost on me.) The key insight: remove all abstractions between the model and the task. I remember when @karpathy pointed this out years ago!

https://x.com/Chris_Towles/status/1941610508173320352

Having my mind blown

So on Friday, I watched a talk by Anthropic's Claude Code lead developer. I knew he'd recently been in the news: he had jumped to Chief Architect at @anysphere, the company behind Cursor. Pretty relevant timing with all the talent shuffling between OpenAI, Meta, and everyone else right now. I'd seen posts by others referencing him but had never watched one of his talks. This video changed everything. It made me realize that even though I've been trying to increase my usage of AI, I wasn't thinking big enough.

https://youtu.be/Lue8K2jqfKk?si=rF5CoJAZWiFbAFvN

For context: earlier that day I had watched a video by @mattpocockuk about how he's using AI agents to work by themselves. He showed how he uses Claude Code to automate his workflow, and I was genuinely impressed with how much more effectively he's been using it than I have. The key, as he explains in https://x.com/mattpocockuk/status/1940693600737841272, is that his agents can work by themselves because he's invested in his CI/CD and tests...

Back to Boris Cherny's talk about why Claude Code is built the way it is. The more you remove between the model and the files and the system itself, the better the model gets at everything. No more leaky abstractions between the model and the lower-level system and files. He made me realize that, as a lifelong developer, the reason I still use the terminal is its lack of abstractions. The reason I use command-line tools almost every minute I'm on a computer is that they're still the most powerful interface we have.

I use the terminal with my autocomplete, colored output, aliases, and dotfiles because it's the best abstraction we've had since ed in the 1970s. (No, I'm not that old, but I do remember the late-'80s machines running Number Munchers; I'm old enough to remember when that was cutting-edge.)

Note: I found a link to play the game, but it's in full color. I swear it was in black and white when I played it.

And it aligned with the idea that the fewer crutches you put between the model and your CLI, the more powerful it becomes as the model improves, just like the essay he references: http://www.incompleteideas.net/IncIdeas/BitterLesson.html. I've read it before, we all have, but it never occurred to me that all the tools we add on top of the model were actually limiting the model's potential.

https://tenor.com/view/mind-blown-mind-explosion-explode-gif-4740219

I'd read it multiple times! And I remember when Andrej Karpathy checked that we'd all read it, too.

Watching Boris explain why he built Claude Code the way he did, to use a terminal the way I would use those tools, I realized that Claude Code sits right next to the actual files: it edits files the way I would, works the way I would if I were operating at a really high level, and it does all of that on my machine. Everything in between me and the files was really an abstraction, a limiting and leaky abstraction. Every tool added between me and those files, even VS Code (as good as it is), creates distance. This kind of blew my brother's mind as well. We've been talking about it for the last 24 hours, including while I was working on this post during my free time after taking the kids to the pool. (Nothing like a good tech epiphany to ruin a perfectly relaxing Saturday.)

Yes, I'm aware he recently took a new job at Cursor. Good luck to them on that hire, because I think he's definitely onto something.

So while I've had my mind blown all weekend after watching the Claude Code talk, it really made me consider how much we let tooling limit what the agent can do in our repos. Literally a week and a half ago, I posted about how I was using voice-to-text integration in VS Code to help use AI better. But that was actually a hindrance. (Classic case of solving the wrong problem with more complexity.)

One of the most amazing things about the Claude Code talk is how it removes all the distractions. These models get better when you don't constrain them. Anything you bolt on around the model, like any tool you try to give it, is likely harming it by locking it into a local maximum. With Claude Code, the interface is the same tool we've been using for the last 50 years. Watching him iterate looks like a standard batch of terminal commands. By giving the model the same way we would try to fix things at the terminal, the only interface we've ever found that actually works, we get better results.

So my voice dictation setup wasn't efficient. I was still typing into Claude Code instead of just speaking the commands I wanted Claude to execute. When I was writing markdown, I could use VS Code's voice integration in the editor. But here's the kicker...

However, VS Code's voice integration doesn't work the same way outside the editor window. When I'm in the terminal and start dictating, it can't type directly. Instead, I can dictate to ChatGPT or whatever model I'm using for Copilot, but it interprets that command and then puts it in the terminal. I don't want that layer, just like in the Claude Code talk; I don't want VS Code between me and the terminal. So, echoing his point (and this was what blew my mind), I need system-level dictation.

I really struggled with VS Code's audio settings. I spent probably an hour messing around with plugins, thinking maybe they'd handle dictation better. But what I should have done is take VS Code out of the equation entirely. I just want to talk to the terminal window, right? (Revolutionary concept, I know.)

So how do you do that on Linux? You need OS-level dictation. The crazy part: while trying to find the best dictation solution for a Linux terminal, I was led right back to the same man who gave that talk, Boris Cherny. In a GitHub issue reply, he mentioned that he built most of Claude Code using OS-level dictation in a terminal. Mind blown! No VS Code in between, which just validates that he's really onto something.
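To make this concrete, here's a minimal sketch of what OS-level dictation can look like on X11 Linux. To be clear, this isn't Boris's setup (he didn't share one); it's just one way to wire it up that I can reason about: record a few seconds of audio, transcribe it locally with the open-source whisper package, and type the result into whatever window has focus via xdotool. The five-second window, the model size, and the temp-file path are all arbitrary choices for the sketch; you'd also need ffmpeg installed for whisper, and Wayland users would need something like ydotool instead.

```python
# push_to_talk.py - minimal push-to-talk dictation for X11 Linux (a sketch).
# Assumes: pip install openai-whisper sounddevice soundfile
# plus ffmpeg and xdotool installed (e.g. sudo apt install ffmpeg xdotool).
import subprocess

import sounddevice as sd
import soundfile as sf
import whisper

SAMPLE_RATE = 16_000   # Whisper models expect 16 kHz audio
SECONDS = 5            # arbitrary fixed recording window for this sketch


def dictate() -> None:
    # 1. Record from the default microphone.
    print("Listening...")
    audio = sd.rec(int(SECONDS * SAMPLE_RATE), samplerate=SAMPLE_RATE, channels=1)
    sd.wait()  # block until the recording finishes

    # 2. Transcribe locally with a small Whisper model.
    sf.write("/tmp/dictation.wav", audio, SAMPLE_RATE)
    model = whisper.load_model("base.en")
    text = model.transcribe("/tmp/dictation.wav")["text"].strip()

    # 3. Type the words into the focused window: terminal, Claude Code, anything.
    #    xdotool talks to X11 directly, so no editor sits between you and the shell.
    if text:
        subprocess.run(["xdotool", "type", "--delay", "25", text], check=True)


if __name__ == "__main__":
    dictate()  # bind this script to a global hotkey in your window manager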

The idea of getting closer to nothing between the files and the model brings me back to a related concept: nothing between you and the model. (It's like layers of abstraction are the enemy of progress. Who knew?)