On tools

A mind with no hands is a commentator. It can describe, advise, warn, and explain, and the world goes on exactly as before, because descriptions do not move anything.

For most of their history, language models were commentators. They produced text, a person read the text, and the person decided what to do about it. Every consequence passed through human hands. This arrangement had a comforting property: the model could be wrong about anything, and nothing happened unless someone believed it.

Tools end the arrangement. A model with a terminal, a browser, an email account, and a payment method is no longer describing the world. It is in the world. The text it produces does not wait for a reader. It executes. This is the line between intelligence as commentary and intelligence as labor, and the field crossed it quietly, without much ceremony, over the past few years.

I think tools are still treated as a feature when they are closer to being the entire frontier. Two observations from our work.

First, capability follows tools more than it follows scale. The same model, given a spreadsheet, becomes useful for financial work it could previously only talk about. Given a browser, it can verify what it could previously only assert. Given access to a company’s internal systems, it stops hallucinating what it can simply look up. When people are surprised by what an agent can do, the thing that changed is usually not the model. It is the surface area of the world the agent can touch.

Second, every tool is a trust decision. Handing an agent a tool is handing it a class of consequences. Read access to a database is one grant. Write access is a very different one. The ability to send email is the ability to embarrass you in front of everyone you know. The ability to move money is the ability to move money. There is no way to make these grants safe in the abstract. They are made safe by structure: scopes, limits, logs, reversibility, and a clear answer to the question of who finds out, and how quickly, when something goes wrong.

This is why tool design at Verse looks less like API integration and more like drafting an employment contract. What exactly can this agent touch? Under what limits? What requires a second signature? What is logged, and who reads the log? The tools define the job. The limits define the trust. Get both right and a mediocre model does excellent work. Get them wrong and a brilliant model is a liability.
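To make the contract metaphor concrete, here is a minimal sketch in Python. It is an illustration, not a description of how Verse builds its tools: the names Grant, Toolbox, Denied, and pay are all hypothetical, and the single numeric "amount" stands in for whatever a real tool would need to bound. The sketch covers a per-call limit, a second signature above a threshold, and a log that records denials as well as successes. Reversibility is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


class Denied(Exception):
    """Raised when a call falls outside what the grant allows."""


@dataclass
class Grant:
    """One tool plus the limits attached to it (hypothetical structure)."""
    action: Callable[[float], str]   # the underlying tool
    limit: float                     # hard cap per call
    second_signature_above: float    # calls above this need an approver


@dataclass
class Toolbox:
    """Everything the agent can touch. Anything not listed is refused."""
    grants: Dict[str, Grant]
    log: List[Tuple[str, float, str]] = field(default_factory=list)

    def call(self, tool: str, amount: float, approver: Optional[str] = None) -> str:
        grant = self.grants.get(tool)
        if grant is None:
            self._deny(tool, amount, "no grant for this tool")
        if amount > grant.limit:
            self._deny(tool, amount, "over per-call limit")
        if amount > grant.second_signature_above and approver is None:
            self._deny(tool, amount, "second signature required")
        result = grant.action(amount)
        self.log.append((tool, amount, "allowed"))
        return result

    def _deny(self, tool: str, amount: float, reason: str) -> None:
        # Denials are logged too: the log should show what was attempted.
        self.log.append((tool, amount, "denied: " + reason))
        raise Denied(reason)


def pay(amount: float) -> str:
    """Stand-in for a real payment tool; it only formats a string."""
    return f"paid {amount:.2f}"


if __name__ == "__main__":
    box = Toolbox(grants={"pay": Grant(pay, limit=500.0, second_signature_above=100.0)})
    print(box.call("pay", 50.0))                           # within limits
    print(box.call("pay", 250.0, approver="finance-lead"))  # needs a second signature
    for entry in box.log:
        print(entry)
```

The design choice worth noticing is that the model never appears in this code. The limits hold regardless of how capable or how mistaken the caller is.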

There is a deeper point underneath. Autonomy does not live in a model’s weights. It lives in the interface between the model and everything the model can affect. That interface is designable, inspectable, and improvable. It is, in fact, the main thing we design.

Hands turned out to be harder than minds. They were harder for evolution too.