What Makes a Good Tool for Claude Code
I’ve been using Claude Code extensively for personal projects, and similar AI coding tools at work. Recently I came across this excellent blog post that resonated with a lot of my experience.
One part stuck with me though: Noah emphasizes that tools fail with LLMs when
they’re “overly complex,” with the Unix philosophy being particularly
well-suited for tool calling. But then I thought about git.
Git breaks the Unix philosophy completely. It’s sprawling, stateful, and complex. And yet Claude Code handles it effortlessly. It composes commands that, even after 10+ years of daily git usage, I wouldn’t think to use. It handles rebasing, cherry-picking, complex resets—stuff that trips up experienced developers regularly.
So if simplicity and the Unix philosophy aren’t the whole story, what else matters?
I’ve come up with three “hallmarks” of a good tool for tool calling with LLMs.
1. It’s been around for a long time and/or is used by lots of people
Examples: Unix tools like cat, sed, awk, grep, and find, but also git, npm, docker, and kubectl. Every Stack Overflow thread, blog post, and tutorial using these tools has likely ended up in the training data. Claude isn’t reasoning from first principles; it’s drawing on millions of examples.
This is why git works despite its complexity: Claude has effectively memorized decades of collective wisdom.
2. It has really good documentation (built-in help or external docs)
Even if a tool isn’t widely used, great documentation can bridge the gap. I’ve been building a finance system on top of Beancount—a double-entry accounting system that’s definitely not mainstream (maybe I’ll write a post about this in the future). Claude Code handles it surprisingly well because Beancount has exceptional documentation. When I point Claude at the docs, it can figure out the directive syntax, transaction formats, and account structures without necessarily having seen millions of examples in its training data.
Good --help text matters. Clear external documentation matters. If Claude can discover how your tool works, it can use it effectively.
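As a rough illustration of what discoverable help text can look like, here is a minimal Python sketch using the standard library's argparse. The tool name ledgerctl, its --account and --since flags, and the example invocations are all hypothetical, invented for this sketch. The only point is that a description, per-flag help, and a couple of worked examples give an LLM something concrete to read when it runs --help.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # "ledgerctl" and its flags are hypothetical, made up for this sketch.
    parser = argparse.ArgumentParser(
        prog="ledgerctl",
        description="Summarize transactions from a plain-text ledger file.",
        # Preserve the line breaks in the description and epilog as written.
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog=(
            "examples:\n"
            "  ledgerctl main.ledger --account Expenses:Food\n"
            "  ledgerctl main.ledger --since 2024-01-01\n"
        ),
    )
    parser.add_argument("ledger", help="path to the ledger file to read")
    parser.add_argument(
        "--account",
        help="only include postings to this account, e.g. Expenses:Food",
    )
    parser.add_argument(
        "--since",
        metavar="YYYY-MM-DD",
        help="only include transactions on or after this date",
    )
    return parser
```

Calling build_parser().parse_args() in a script's entry point would wire this up. The examples in the epilog are probably the most useful part, since they show a complete, valid invocation rather than just a list of flags.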
3. It has good error messages when something is wrong
Error messages, especially those with suggestions like “you used X, did you mean Y?”, can be tremendously helpful for LLMs. The best example is the Rust compiler: when a name doesn’t resolve, it often points at a similarly named one, in effect saying “you typed foobar, did you mean foobaz?”, and Claude Code can use that feedback to correct itself.
This might be one reason why people feel Claude Code is particularly good at Rust programming—the compiler is essentially coaching it through mistakes in real time.
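This kind of hint looks inexpensive to add to your own tools. Below is a minimal Python sketch, assuming a hypothetical CLI with a fixed set of subcommands, that uses the standard library's difflib.get_close_matches to turn an unknown command into a “did you mean” suggestion. The command names and the wording of the message are invented for illustration, and this is not how the Rust compiler produces its suggestions.

```python
import difflib

# Hypothetical subcommands for an imaginary CLI.
KNOWN_COMMANDS = ["balance", "import", "report", "status"]


def unknown_command_message(typed: str) -> str:
    """Build an error message, adding a suggestion when one is close enough."""
    # get_close_matches returns candidates whose similarity to `typed`
    # is at least `cutoff`, ordered from most to least similar.
    matches = difflib.get_close_matches(typed, KNOWN_COMMANDS, n=1, cutoff=0.6)
    message = f"error: unknown command '{typed}'"
    if matches:
        message += f"\n  did you mean '{matches[0]}'?"
    else:
        # No plausible match: point at the help text instead of guessing.
        message += "\n  run with --help to list available commands"
    return message
```

For a typo such as stauts, the message includes a pointer to status. For input that resembles nothing, it falls back to pointing at --help. Either way, the caller gets a concrete next step rather than a bare failure.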
None of this is to say the Unix philosophy is wrong — it’s that Unix tools work
well with Claude Code for different reasons than simplicity. Tools like grep and cat
nail hallmarks 1 and 2: they’ve been around for decades (massive
training data) and have extensive man pages (great documentation). The fact that
they follow “do one thing well” is almost incidental to their success with LLMs.
So if I’m building a tool today, I can’t make it instantly popular, but I can make it understandable. That means good documentation, clear error messages, and a few solid examples of how it’s used. Those aren’t new ideas — they’ve always mattered. It’s just that, with LLMs in the mix, the payoff for doing them well is suddenly much higher.