What does a database for agents look like if every mutation is a named action with a contract, instead of an UPDATE statement?
Most production-bound agent stacks I look at end up sitting on top of a normal CRUD database. The agent picks a table, picks a row, picks a column, sets a value. The audit log says "agent set status = completed" and you have to reconstruct intent from context. The agent can in principle do anything its connection permits — and "in principle" usually becomes "in practice" the first weekend you stop watching.
The bet I'm prototyping is to invert the surface: the agent never sees tables. It sees actions.
What's wrong with the default surface
- Vector memory is too unstructured. Great for recall, useless for the operational mutations agents actually need to make.
- Raw CRUD is too open. Every column is a foot-gun. The agent's effective permissions are its connection's permissions.
- Workflow engines are too rigid. They assume code-first authoring by engineers. Domain experts can't author or revise the contract.
- Audit logs lose intent. "set status = X" doesn't tell you why. Agents call low-level operations, not domain actions, so the trace is noise.
The shape I'm trying
Two modes, deliberately separated:
- Builder mode (humans, design time). A non-engineer-friendly surface for defining records, fields, actions, and the state machines that gate them. "When a post is in draft, an agent may call requestApproval. From approved, only markPublished is callable."
- Runtime mode (agents, locked down). A different MCP surface that exposes the published actions and nothing else. Agents call markCompleted(postId). They cannot write UPDATE posts SET status='completed'. The platform enforces the transition; the agent doesn't get to be wrong about it.
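A minimal sketch of the runtime half, in Python, assuming a published contract boils down to a table of (state, action) pairs mapped to a next state. Contract, Runtime, IllegalAction, the pending_approval state, and the approve action are hypothetical names invented for illustration; only draft, approved, requestApproval, and markPublished come from the example above. This is not the prototype's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


class IllegalAction(Exception):
    """Raised when an action is not callable from the record's current state."""


@dataclass
class Contract:
    # (current_state, action_name) -> next_state. Any pair not listed is illegal.
    transitions: Dict[Tuple[str, str], str]

    def callable_from(self, state: str) -> List[str]:
        """Names of the actions an agent may call while a record is in `state`."""
        return sorted(action for (s, action) in self.transitions if s == state)


@dataclass
class Runtime:
    contract: Contract
    states: Dict[str, str] = field(default_factory=dict)  # record id -> current state
    audit: List[dict] = field(default_factory=list)       # append-only action log

    def call(self, agent: str, action: str, record_id: str) -> str:
        """Apply a named action or refuse it; this is the only mutation path."""
        current = self.states[record_id]
        nxt = self.contract.transitions.get((current, action))
        if nxt is None:
            allowed = self.contract.callable_from(current)
            raise IllegalAction(
                f"{action} is not callable from {current!r}; allowed: {allowed}"
            )
        self.states[record_id] = nxt
        # The log records the domain action, not a column write.
        self.audit.append(
            {"agent": agent, "action": action, "record": record_id,
             "from": current, "to": nxt}
        )
        return nxt


# Hypothetical contract for a post. "pending_approval" and "approve" are
# invented here to connect draft to approved; the source names neither.
POST_CONTRACT = Contract(
    transitions={
        ("draft", "requestApproval"): "pending_approval",
        ("pending_approval", "approve"): "approved",
        ("approved", "markPublished"): "published",
    }
)

runtime = Runtime(POST_CONTRACT, states={"post-1": "draft"})
runtime.call("agent-7", "requestApproval", "post-1")  # draft -> pending_approval

try:
    runtime.call("agent-7", "markPublished", "post-1")  # refused: not approved yet
except IllegalAction as err:
    print(err)
```

The refusal happens before any state changes, so a rejected call leaves no trace in the record and no entry in the log. Whether rejected calls should be logged too is a design choice this sketch leaves open.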
The audit log becomes readable. The set of legal mutations is finite, named, and typed. Governance moves from "watch the agent" to "publish the contract."
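To make "finite, named, and typed" concrete, here is a sketch of deriving the agent-facing listing from published action definitions. ActionDef, describe_surface, check_args, the descriptions, and the two-entry type table are all assumptions for illustration. The name / description / inputSchema shape follows how MCP describes tools, but no MCP SDK is used here and the real surface may look different.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical mapping from builder field types to Python types.
PY_TYPES = {"string": str, "integer": int}


@dataclass(frozen=True)
class ActionDef:
    name: str
    description: str
    params: Dict[str, str]  # parameter name -> builder field type


def describe_surface(actions: List[ActionDef]) -> List[dict]:
    """Build the complete list of callable actions, one descriptor per action."""
    return [
        {
            "name": a.name,
            "description": a.description,
            "inputSchema": {
                "type": "object",
                "properties": {p: {"type": t} for p, t in a.params.items()},
                "required": sorted(a.params),
                "additionalProperties": False,
            },
        }
        for a in actions
    ]


def check_args(action: ActionDef, args: Dict[str, object]) -> None:
    """Reject calls whose arguments do not match the declared parameters."""
    if set(args) != set(action.params):
        raise TypeError(
            f"{action.name} expects exactly {sorted(action.params)}, got {sorted(args)}"
        )
    for name, type_name in action.params.items():
        if not isinstance(args[name], PY_TYPES[type_name]):
            raise TypeError(f"{action.name}.{name} must be of type {type_name}")


# Hypothetical published actions; the descriptions are invented.
PUBLISHED = [
    ActionDef("requestApproval", "Ask a human to approve a draft post.", {"postId": "string"}),
    ActionDef("markPublished", "Record that an approved post went live.", {"postId": "string"}),
]

surface = describe_surface(PUBLISHED)
print([tool["name"] for tool in surface])
```

Because the listing is generated from the published definitions, there is no way to pass a stray column such as status alongside postId: check_args rejects any argument the contract does not declare.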
Where it is
Working prototype. Core MVP shipped: typed schemas, action runtime, state-machine guardrails, auth, single-process container. Running internally, getting tested against real domain models.
What I'm still thinking about
- Reads vs. queries. Today the design favors open reads with summary projections and locked writes through actions. Where does that stop being enough — at what record volume or query complexity does the platform need real query primitives without becoming a query engine?
- Governance throughput. "Humans review and publish" is the right answer at small scale. At larger scale, with frequent schema changes across many agents, where does that become the bottleneck, and what replaces it?
- The bridge problem. Eventually agents need to push to CRMs, fire Slack messages, sync to external systems. The current non-goal is "no webhooks, no event streaming." What's the smallest external surface that doesn't turn this into another ETL platform?
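On the reads question, "open reads with summary projections" could be as small as the sketch below: any record is readable, but a read returns only a published subset of its fields. SUMMARY_FIELDS, summarize, and the sample post are assumptions made up for illustration, not the prototype's read API.

```python
from typing import Dict, Iterable, List, Sequence

# Hypothetical: the fields published for the summary view of a post.
SUMMARY_FIELDS = ("id", "title", "status")


def summarize(
    records: Iterable[Dict[str, object]],
    fields: Sequence[str] = SUMMARY_FIELDS,
) -> List[Dict[str, object]]:
    """Return copies of the records containing only the published fields."""
    return [{f: r[f] for f in fields if f in r} for r in records]


# Invented sample data.
posts = [
    {"id": "post-1", "title": "Launch notes", "status": "draft",
     "body": "Full text of the post", "reviewer": "someone@example.com"},
]

print(summarize(posts))
```

Filtering, joins, and aggregation are all absent here, which is presumably where the question of real query primitives begins.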
This one feels closer than the fleet idea — the contract is sharper, the value is easier to show in a demo. Still figuring out who the first hundred users actually are.